A Nomogram for Predicting Cardiovascular Diseases in Chronic Obstructive Pulmonary Disease Patients

Cardiovascular diseases (CVDs) are the most common comorbidities in chronic obstructive pulmonary disease (COPD) and increase the risk of hospitalization, length of stay, and death in COPD patients. This study aimed to identify the predictors of CVDs in COPD patients and to construct a prediction model based on these predictors. In total, 1022 COPD patients from the National Health and Nutrition Examination Surveys (NHANES) were included in this cross-sectional study. All subjects were randomly divided into a training set (n = 709) and a testing set (n = 313). The differences before and after the handling of missing data were compared via sensitivity analysis. Univariate and multivariable analyses were employed to screen the predictors of CVDs in COPD patients. The performance of the prediction model was evaluated via the area under the curve (AUC), accuracy, sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and calibration. Subgroup analysis was performed in the testing set according to COPD diagnosis method and smoking status. We found that male sex, older age, a smoking history, overweight, a history of blood transfusion, a history of heart disease in close relatives, and higher levels of white blood cells (WBC) and monocytes (MONO) were associated with an increased risk of CVDs in COPD patients. Higher levels of platelets (PLT) and lymphocytes (LYM) were associated with a reduced risk of CVDs in COPD patients. A prediction model for the risk of CVDs in COPD patients was established based on predictors including gender, age, a smoking history, BMI, a history of blood transfusion, a history of heart disease in close relatives, WBC, MONO, PLT, and LYM. The AUC value of the prediction model was 0.75 (95% CI: 0.71–0.79) in the training set and 0.79 (95% CI: 0.73–0.85) in the testing set. The prediction model showed good performance in predicting CVDs in COPD patients.


Introduction
As a complex respiratory disorder, chronic obstructive pulmonary disease (COPD) is characterized by persistent airflow limitation associated with an abnormal inflammatory response due to exposure to noxious particles and gases [1,2]. The prevalence of COPD is 11%-26%, and this worrisome trend is expected to continue over the next 25 years [3]. Alarmingly, over 6 million deaths were estimated to result from COPD annually worldwide, and COPD was predicted to become the third leading cause of death worldwide by 2030 [4]. This prediction has already been fulfilled, and COPD has been reported to have caused 3.23 million deaths in 2019 [5]. The prevalence of COPD imposes a large burden on society, with an estimated cost of US$50 trillion per year [6]. COPD is expected to become the main economic burden among chronic human diseases in the future as air pollution increases and populations age worldwide [7]. Paying particular attention to COPD is therefore essential for society and patients.
Although COPD primarily affects the lungs, patients also suffer from concurrent comorbidities such as cardiovascular diseases (CVDs), lung cancers, and metabolic diseases [8]. CVDs are the most common comorbidities in COPD and increase the risk of hospitalization, length of stay, and death in COPD patients [9]. Previous studies have reported that the prevalence of CVDs in COPD patients is approximately 10%-38% [10] and that CVDs account for about 20%-50% of mortality in COPD patients [11]. Preventing the occurrence of CVDs in COPD patients is therefore of vital significance for improving their prognosis.
Research findings have accumulated over the past years, and risk factors for CVDs in COPD patients have been identified in various studies. Increased serum levels of factors associated with inflammation and oxidative stress, such as vascular cell adhesion molecule-1 [12] and human epididymis protein 4 [13], were reported to be correlated with an increased risk of CVDs in COPD patients. Chronic bronchial infection was also found to increase the incidence of CVDs in COPD patients [14]. Machine learning enables systems to automatically learn and build analytical models from experience, and various prediction models have been built to identify people at risk of certain diseases or for other clinical uses [15,16]. Such prediction models provide valuable tools that help clinicians and nurses guide treatment and care. Previously, a prediction model for the risk of CVDs in COPD patients was established based on the monocyte (MONO) level/HDL cholesterol ratio, with an area under the receiver operating characteristic curve (AUC) of 0.73 [17]. That model focused only on the MONO level/HDL cholesterol ratio and lacked important demographic and clinical variables associated with inflammation in COPD [18]. In this study, we collected the data of 1022 COPD patients from the National Health and Nutrition Examination Surveys (NHANES) between 2007 and 2018. The purpose of our study was to explore the predictors of CVDs in COPD patients and to construct a prediction model based on these predictors. A nomogram for predicting CVDs in COPD patients was also plotted to allow the probability of CVDs in a COPD patient to be estimated quickly.

Study Population.
The current cross-sectional study collected the data of 1199 COPD patients in the NHANES database from 2007 to 2018. The NHANES is an ongoing program conducted by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC) to evaluate the health and nutritional status of the civilian noninstitutionalized population of the United States [19]. Therefore, informed consent from the participants was waived. Every year, about 5000 nationally representative individuals are sampled through a multistage, stratified, clustered sampling method [19]. In our study, patients with answers of "unknown" or "refuse" were excluded, and 1022 patients were finally included. All participants were divided into the training set (n = 709) and the testing set (n = 313). The screening process of subjects is displayed in Figure 1.

Main Variables and Outcome Variables.
The main variables analyzed included age (years), gender, race (non-Hispanic White, non-Hispanic Black, or other race), education level (under 12th grade, high school graduate/general equivalency diploma or equivalent, or some college or above), marital status (married or other), annual family income (<$20,000 or ≥$20,000), smoking status (yes or no), overweight (yes or no), a history of blood transfusion, a history of heart disease in close relatives, white blood cell (WBC; 10^9/L), MONO (10^9/L), neutrophil (NEUT; 10^9/L), platelet (PLT; 10^9/L), lymphocyte (LYM; 10^9/L), NEUT/LYM ratio (NLR), and PLT/LYM ratio (PLR). The outcome variable was the presence of CVDs in COPD patients. COPD was defined by an answer of "Yes" to question MCQ160O in the MCQ series (Has a doctor or other health professional ever told you that you had COPD?) or was verified by spirometry with a postbronchodilator forced expiratory volume in 1 second (FEV1)/forced vital capacity (FVC) ratio (FEV1/FVC) ≤0.7 [20]. CVDs were defined as at least one of myocardial infarction, congestive heart failure, angina pectoris, or a history of coronary heart disease.

Statistical Analysis.
The Kolmogorov-Smirnov test was used to assess the normality of the measurement data. Normally distributed continuous variables were expressed as mean ± standard deviation (mean ± SD), and groups were compared by t-test. Nonnormally distributed measurement data were expressed as median and quartiles, M (Q1, Q3), and differences between groups were compared via the Wilcoxon rank sum test. Enumeration data were described as n (%), and the chi-square (χ²) test or Fisher's exact test was applied for comparisons between groups [21]. The differences before and after the handling of missing data were compared in the sensitivity analysis. All subjects were divided into the training set and the testing set at a ratio of 7 : 3. Univariate and multivariable analyses were employed to screen the predictors of CVDs in COPD patients. The prediction model was then constructed and the nomogram was plotted. The performance of the prediction model was evaluated via the area under the curve (AUC), accuracy, sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and calibration. Receiver operating characteristic (ROC) curves were drawn, and subgroup analysis was performed in the testing set according to COPD diagnosis method and smoking status. Statistical analysis was conducted with SAS 9.4, and R 4.0.2 was used to construct the model. P < 0.05 was considered statistically significant.
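The evaluation metrics named above can be illustrated with a minimal sketch. The authors used SAS and R; the Python below is only an illustration, and the function names, the 0.5 cutoff, and the toy labels and probabilities are assumptions that do not come from the study.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV, and NPV at a probability cutoff.

    `threshold` is an illustrative cutoff, not the one used in the study.
    Labels are assumed to be 0 (no CVD) or 1 (CVD).
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
metrics = confusion_metrics(y_true, y_prob)
print(metrics, auc(y_true, y_prob))
```

The pairwise AUC is quadratic in the sample size, which is acceptable for a testing set of a few hundred patients; a rank-based formula would be preferable for large data.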

The Proposed Architecture of Our Study.
In this study, the data of 1022 COPD patients from the NHANES between 2007 and 2018 were collected. The purpose of our study was to explore the predictors of CVDs in COPD patients and to construct a prediction model based on these predictors. All subjects were divided into the training set and the testing set at a ratio of 7 : 3. Univariate and multivariable analyses were employed to screen the predictors of CVDs in COPD patients. The prediction model was constructed, and a nomogram for predicting CVDs in COPD patients was plotted to allow the probability of CVDs to be estimated quickly. Our prediction model had good predictive performance, and the nomogram makes it easy for clinicians to quickly estimate the probability of CVDs in a COPD patient and to provide timely interventions for patients at high risk.

Predictors of CVDs in COPD Patients.
Variables with statistical significance in the baseline data of COPD patients with or without CVDs were included in the multivariate logistic regression analysis. The results showed that males had 1.060 times the risk of CVDs compared with females (odds ratio (OR) = 1.060, 95% confidence interval (CI): 0.752-1.494). Increased age in COPD patients was associated with a higher risk of CVDs (OR = 1.040, 95% CI: 1.025-1.055). A smoking history in COPD patients increased the risk of CVDs by 0.737-fold (OR = 1.737, 95% CI: 1.118-2.697). The risk of CVDs was increased by 0.987-fold in overweight patients (OR = 1.987, 95% CI: 1.449-2.725).
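As a reading aid for the odds ratios above, the following sketch converts an OR into a percentage change in odds and checks whether its 95% CI excludes 1, which is the usual criterion for significance at the 0.05 level. The helper name `describe_odds_ratio` is hypothetical; the numbers passed to it are the ones quoted in the text.

```python
from typing import Dict, Union


def describe_odds_ratio(or_value: float, ci_low: float,
                        ci_high: float) -> Dict[str, Union[float, bool]]:
    """Summarize an odds ratio and its 95% confidence interval."""
    return {
        # OR = 1.737 corresponds to odds that are about 73.7% higher.
        "percent_change_in_odds": (or_value - 1.0) * 100.0,
        # An interval that contains 1 is compatible with no association.
        "ci_excludes_1": not (ci_low <= 1.0 <= ci_high),
    }


male = describe_odds_ratio(1.060, 0.752, 1.494)
smoking = describe_odds_ratio(1.737, 1.118, 2.697)
print(male)
print(smoking)
```

Under this reading, the interval reported for male sex spans 1, while the interval reported for smoking history does not.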

The Equilibrium Test of the Training Set and Testing Set.
The participants were randomly divided into the training set and the testing set (7 : 3). The equilibrium analysis revealed no statistically significant differences in any variable between the training set and the testing set (all P > 0.05) (Table 3), indicating that the two sets were well balanced.
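A random 7 : 3 partition and a balance check on one binary variable can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the seed, the function names, and the 2x2 counts are invented, and a plain 70% cut of 1022 gives 715 and 307 subjects, so it does not reproduce the reported 709 and 313.

```python
import math
import random
from typing import List, Tuple


def split_indices(n: int, train_fraction: float = 0.7,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into training and testing parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows could be training/testing set, columns a binary characteristic.
    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


train_idx, test_idx = split_indices(1022)
# Hypothetical counts of a binary variable in two groups.
statistic, p_value = chi_square_2x2(30, 70, 20, 80)
print(len(train_idx), len(test_idx), statistic, p_value)
```

A P value above 0.05 for every compared variable is what the text describes as the two sets being balanced.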

Construction of the Logistic Prediction Model and Validation of the Predictive Value via the Testing Set.
Predictors with statistical difference in the multivariable regression analysis were included in the logistic prediction model. The results showed that males had 1.231 times the risk of CVDs compared with females (OR = 1.231, 95% CI: 0.817-1.856). Older age in COPD patients was associated with a 1.037-fold risk of CVDs (OR = 1.037, 95% CI: 1.019-1.054). A smoking history in COPD patients was associated with a 1.497-fold risk of CVDs (OR = 1.497, 95% CI: 0.891-2.513). Overweight was linked with a 1.575-fold risk of CVDs in COPD patients (OR = 1.575, 95% CI: 1.080-2.298). Patients with a history of blood transfusion had a 2.090-fold risk of CVDs (OR = 2.090, 95% CI: 1.387-3.148). A history of heart disease in close relatives increased the risk of CVDs in COPD patients (OR = 2.944, 95% CI: 1.923-4.506). The risk of CVDs in COPD patients was increased in patients with higher levels of WBC (OR = 1.197, 95% CI: 1.082-1.324) and MONO (OR = 1.115, 95% CI: 1.034-1.204). Higher levels of PLT (OR = 0.996, 95% CI: 0.993-0.999) and LYM (OR = 0.826, 95% CI: 0.625-1.091) were associated with a reduced risk of CVDs in COPD patients (Table 4). The formula of the prediction model was: Logit(P) = ln(P/(1 − P)) = 0.208 male + 0.036 age + 0.202 smoking + 0.227 overweight + 0.368 blood transfusion + 0.540 heart disease in close relatives + 0. ... (Table 5). The calibration curves of the model in the training set and testing set are shown in Figure 3. The predicted values in both sets deviated slightly from the ideal calibration line but remained close to it, indicating good agreement between the predicted and the actual probabilities. A nomogram was also established for predicting the occurrence of CVDs in COPD patients (Figure 4).
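The idea behind a calibration curve can be sketched by grouping patients by predicted probability and comparing the mean prediction with the observed event rate in each group. The sketch below uses equal-width bins, which is an assumption; the binning or smoothing used for Figure 3 is not described in the text, and the function name and toy data are invented.

```python
from typing import List, Sequence, Tuple


def calibration_table(y_true: Sequence[int], y_prob: Sequence[float],
                      n_bins: int = 5) -> List[Tuple[int, float, float]]:
    """Return (group size, mean predicted probability, observed rate) per bin.

    Bins are equal-width over [0, 1]; empty bins are skipped.
    """
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[k].append((y, p))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_predicted = sum(p for _, p in members) / len(members)
        observed_rate = sum(y for y, _ in members) / len(members)
        rows.append((len(members), mean_predicted, observed_rate))
    return rows


# Toy data, invented for illustration only.
rows = calibration_table([0, 0, 1, 1], [0.10, 0.15, 0.90, 0.95], n_bins=2)
for size, predicted, observed in rows:
    print(size, round(predicted, 3), round(observed, 3))
```

A well-calibrated model gives rows in which the mean predicted probability and the observed rate are close, which corresponds to points near the diagonal of a calibration plot.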
A sample was randomly selected from the training set. The patient was a female without a history of heart disease in close relatives. Her LYM level was 1.91 × 10^9/L, WBC level was 7.12 × 10^9/L, PLT level was 338 × 10^9/L, and MONO level was 6.09 × 10^9/L. The patient was 58 years old, with a history of smoking and blood transfusion, and was not overweight. The total score was 288 and the predicted probability of CVDs was 0.15, which was consistent with the actual outcome (Figure 4).
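The link between the logistic formula and a predicted probability can be sketched as follows. Only the six coefficients printed in the text are used; the intercept and the blood-count terms are not available there, so the sketch computes a partial linear predictor rather than the model's actual output. The dictionary keys and function names are hypothetical.

```python
import math
from typing import Dict

# Coefficients as printed in the model formula. The intercept and the
# WBC, MONO, PLT, and LYM terms are not available in the text.
KNOWN_COEFFICIENTS: Dict[str, float] = {
    "male": 0.208,
    "age": 0.036,
    "smoking": 0.202,
    "overweight": 0.227,
    "blood_transfusion": 0.368,
    "family_heart_disease": 0.540,
}


def partial_linear_predictor(patient: Dict[str, float]) -> float:
    """Sum of coefficient * value over the known terms only."""
    return sum(coef * patient[name] for name, coef in KNOWN_COEFFICIENTS.items())


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map log-odds back to a probability, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# The sample patient described above: female, 58 years old, smoker,
# not overweight, prior blood transfusion, no family history.
patient = {"male": 0, "age": 58, "smoking": 1, "overweight": 0,
           "blood_transfusion": 1, "family_heart_disease": 0}
partial = partial_linear_predictor(patient)
reported_log_odds = logit(0.15)  # log-odds matching the reported probability
print(partial, reported_log_odds, inverse_logit(reported_log_odds))
```

A nomogram encodes the same calculation graphically: each predictor's contribution to the linear predictor is rescaled to points, and the total points are mapped back to a probability through the inverse logit.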

Subgroup Analysis of Prediction Ability of the Prediction Model.
As there were two diagnosis methods for COPD patients in our study, subgroup analysis was performed in the testing set by COPD diagnosis method and by smoking status (Table 6). The prediction model showed better performance in patients with COPD diagnosed according to questionnaires and in patients with a history of smoking.

Discussion
This study collected the data of 1022 COPD patients to evaluate the factors associated with the occurrence of CVDs in COPD patients and to establish a prediction model based on these predictors. The data revealed that male sex, age, smoking history, overweight, history of blood transfusion or heart disease in close relatives, and levels of WBC, PLT, LYM, and MONO were predictors of CVDs in COPD patients. Additionally, we established a prediction model for the occurrence of CVDs in COPD patients based on these predictors. The AUC value of the prediction model was 0.75 in the training set and 0.79 in the testing set, which showed good predictive performance. Subgroup analysis revealed that the prediction model performed better in patients with COPD diagnosed according to questionnaires and in patients with a history of smoking.
Cigarette smoke is the major cause of COPD, accounting for about 95% of COPD cases in industrialized countries [22]. Smoking is also reported to be one of the most important risk factors for COPD with CVDs [23]. This may be due to the diverse inflammatory responses that smoking induces in COPD patients, which increase the risk of CVDs [24]. Austin et al. found that alveolar macrophages in bronchoalveolar lavage from the lungs of smokers might release more reactive oxygen species than those of nonsmokers [25]. Herein, a history of smoking in COPD patients was associated with a higher risk of CVDs. Several previous studies have indicated that age is associated with the incidence of CVDs in COPD patients [14,26]. These findings support the results of our study, which showed that increased age was correlated with a higher risk of CVDs in COPD patients. In the current study, a history of blood transfusion was also associated with an increased risk of CVDs in COPD patients. As reported, blood transfusion was a risk factor for major cardiovascular events in patients with acute myocardial infarction and anemia [27,28]. A family history of heart disease is widely proposed to be an essential marker for predicting cardiovascular events [29,30], which supports our finding that a history of heart disease in close relatives was associated with an increased risk of CVDs in COPD patients. Routine blood parameters are essential inflammatory markers in COPD, and some of these markers are elevated in patients with COPD [31,32]. MONO circulate in the blood, bone marrow, and spleen and are among the active mediators of inflammation in COPD [33]. An increased WBC count and a decreased LYM count were also identified in COPD patients compared with healthy subjects [34,35].
Inflammation is associated with changes in the structure, shape, and dynamics of PLT, which may further affect atherogenic and thrombotic events [36]. Another study showed that increased levels of inflammatory markers are associated with an increased incidence of atherosclerosis, coronary heart disease, congestive heart failure, and atrial fibrillation [37]. In our study, higher WBC and MONO levels were associated with an increased risk of CVDs in COPD patients, while higher levels of PLT and LYM were associated with a decreased risk. For COPD patients, routine blood tests should be conducted regularly, with close attention to the levels of WBC, MONO, PLT, and LYM, so that patients at high risk of CVDs can be identified in time. The current study assessed the predictors of CVDs in COPD patients and established a prediction model based on these predictors in the training set. The prediction model was validated in the testing set. The AUC values showed good predictive ability in both the training set and the testing set. The calibration curves also suggested good agreement between the predicted and the actual probabilities, indicating that our prediction model had good predictive performance. Previously, a prediction model for the occurrence of CVDs in COPD patients was constructed based on the MONO level/HDL cholesterol ratio, which showed an AUC value of 0.73; our prediction model had better predictive performance than that model. Additionally, a nomogram was plotted from our model, which allows clinicians to calculate the score directly from the graph and quickly estimate the probability of CVDs in a COPD patient. The nomogram might offer clinicians a tool for providing timely interventions to prevent the occurrence of CVDs in patients at high risk.
A strength of this study was that missing data were handled and a sensitivity analysis was conducted, which revealed no significant difference in patient characteristics before and after handling the missing data. This suggests that handling the missing data, rather than simply deleting incomplete records, might reduce bias and increase the reliability of our results. Internal validation was also performed to verify the results of the present study. Several limitations existed in the present study. First, the sample size was small and the statistical power was reduced. Second, external validation was not conducted. Third, some data collected from questionnaires in NHANES were self-reported, which might cause bias. In the future, studies with larger sample sizes are required to validate the findings of our study.

Conclusions
Herein, a prediction model was constructed for predicting CVDs in COPD patients based on predictors including gender, age, smoking history, overweight, history of blood transfusion or heart disease in close relatives, and levels of WBC, MONO, PLT, and LYM, using the data of 1022 patients from the NHANES database. The results showed that our prediction model had good performance in predicting CVDs in COPD patients, with AUC values of 0.75 in the training set and 0.79 in the testing set. A nomogram was also plotted for predicting the occurrence of CVDs in COPD patients. The findings might help identify COPD patients at high risk of CVDs and allow timely interventions and treatment to prevent the occurrence of CVDs. In future work, we plan to collect samples from COPD patients in our hospital and use these data to verify the predictive performance of the prediction model. Meanwhile, advanced deep learning and optimization approaches will be employed to help improve the prediction of CVDs in COPD patients.

Data Availability
The datasets generated and/or analyzed during the current study are available via https://www.cdc.gov/nchs/nhanes/index.htm.