Developing a Preliminary Clinical Prediction Model for Prognosis of Pneumonia Complicated with Heart Failure Based on Metagenomic Sequencing

Background. The predictive factors of prognosis in patients with pneumonia complicated with heart failure (HF) have not yet been fully investigated, especially with the use of next-generation sequencing (NGS) of the metagenome. Methods. Patients diagnosed with pneumonia complicated with HF were enrolled and divided into a control group and an NGS group. Univariate and multivariate logistic regression and LASSO regression analyses were conducted to screen predictive factors for prognosis, followed by nomogram construction, ROC curve plotting, and internal validation. Data analysis was conducted in SPSS and R. Results. NGS of the metagenome detected more microbial species. Univariate and multivariate logistic regression and LASSO regression analyses identified Enterococcus (χ2 = 7.449, P = 0.006), Hb (Wald = 6.289, P = 0.012), and ProBNP (Wald = 4.037, P = 0.045) as potential predictive factors for prognosis. A nomogram was constructed with these 3 parameters, and its performance was assessed with ROC curves (AUC = 0.772). The specificity and sensitivity of the model were 0.579 and 0.851, respectively, at a threshold of 0.630 on the ROC curve. Internal verification indicated that the constructed model was effective for prediction. Conclusion. This study developed a preliminary clinical prediction model for the prognosis of pneumonia complicated with HF based on NGS of the metagenome. More patients will be enrolled and tested to improve the predictive model in the near future.


Introduction
As the population ages, community-acquired pneumonia affects more than 5 million people and causes up to 100,000 deaths annually in the USA [1]. Evidence suggests that pneumonia is associated with long-term cardiovascular outcomes, especially heart failure (HF) [1][2][3]. Analyses of the PARADIGM-HF [4] and PARAGON-HF [5] clinical trials demonstrated that the incidence of pneumonia was high in patients with HF and was followed by 4-fold higher mortality [6]. However, the predictive factors of prognosis in patients with pneumonia complicated with HF have not yet been fully investigated [7][8][9][10], especially with the help of next-generation sequencing (NGS) of the metagenome.
This study enrolled patients with pneumonia complicated with HF and sought to screen predictive factors for the clinical prognosis of HF with pneumonia [11][12][13], using NGS of the metagenome to detect microbial pathogens. Then, univariate and multivariate logistic regression, least absolute shrinkage and selection operator (LASSO) regression analysis, nomogram construction, receiver operating characteristic (ROC) curve plotting, and internal validation were performed to build, visualize, and validate the model. Thus, we aimed to develop a preliminary clinical prediction model for the prognosis of pneumonia complicated with HF based on NGS of the metagenome, and we hope that this model will be helpful in assessing the prognosis of HF with pneumonia [8].

Patient Selection and Data Collection.
Initially, 66 patients diagnosed with pneumonia complicated with HF in Guangdong Hospital of Traditional Chinese Medicine from January 2021 to October 2022 were enrolled and divided into two groups (33 cases each), i.e., a sputum culture test group (control group) and a sputum culture combined with metagenomic NGS test group (NGS group). The samples for sputum culture were obtained from the nasopharynx or oropharynx, while the samples for NGS were obtained by fibre bronchoscopy examination.

Univariate and Multivariate Logistic Regression Analysis.
Regarding the prognosis (i.e., alive or dead) of patients, binary logistic regression with univariate and multivariate analysis was conducted to assess candidate predictive factors, including demographic information, complications, microbial species, and biochemistry parameters. Variables with P < 0.1 in univariate analysis were carried forward to multivariate analysis, and those with P < 0.05 in multivariate analysis were taken as potential prognostic predictors.
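As an illustration of how a reported test statistic relates to the P thresholds used here, the following minimal sketch converts the statistics given in the abstract for Enterococcus, Hb, and ProBNP into P values and applies the P < 0.05 rule. It assumes that each statistic is a chi-square value with 1 degree of freedom, which is an assumption and not something the text states; the function names are hypothetical.

```python
import math


def chi2_1df_p_value(statistic: float) -> float:
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > w) = erfc(sqrt(w / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


def screen(p_values: dict, alpha: float) -> list:
    """Return the names of the variables whose P value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


# Statistics as reported in the abstract (assumed here to have 1 df each).
reported = {"Enterococcus": 7.449, "Hb": 6.289, "ProBNP": 4.037}
p_values = {name: chi2_1df_p_value(w) for name, w in reported.items()}

# Multivariate threshold from the Methods: P < 0.05.
predictors = screen(p_values, alpha=0.05)

for name, p in p_values.items():
    print(f"{name}: P = {p:.3f}")
print("Retained:", predictors)
```

The same `screen` helper could be called with `alpha=0.1` for the univariate stage; the univariate statistics themselves are not listed in the text, so they are not reproduced here.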

Construction and Internal Verification of Nomogram.
Variables with P < 0.05 in multivariate logistic regression were selected for constructing the nomogram. Meanwhile, LASSO regression analysis was performed to select the best predictive model with the fewest variables by calculating the values of λ.min and λ.1se. Then, ROC curves were constructed to evaluate the predictive performance of the nomogram, with the area under the curve (AUC) indicating model performance. Internal verification of the nomogram was carried out on the whole dataset, and the predictive values for the prognosis (alive or dead) were clustered into 2 main categories by the median method.
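The AUC used above can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; it is not the R routine used in the study, and the scores and labels are made-up values for illustration only.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: 1 for the outcome of interest, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("Both outcome classes must be present.")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and outcomes (not study data).
example_scores = [0.10, 0.40, 0.35, 0.80]
example_labels = [0, 0, 1, 1]
print(roc_auc(example_scores, example_labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two outcome groups.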

Statistical Analysis.
Data were analyzed in SPSS (v26.0; SPSS Inc., Chicago, Illinois, USA) and R (v3.6.2, http://www.rproject.org). Continuous data were expressed as mean ± standard deviation and compared between the two groups with the independent-samples Student's t-test. Categorical variables were expressed as frequencies and proportions (%), and Chi-square tests were used for comparison between the groups. Univariate and multivariate logistic regression analyses were conducted with binary logistic regression in SPSS. The forest plot, violin plot, LASSO regression, nomogram, and ROC curves were produced in R. P < 0.05 was considered statistically significant.
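For readers who want to see what the two group-comparison tests compute, the sketch below implements the pooled-variance Student's t statistic and the Pearson Chi-square statistic for a 2 × 2 table. It is a simplified stand-in for the SPSS procedures, assuming equal variances and no continuity correction; the input values are made up for illustration.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / se, df


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value (1 df) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 df, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical inputs (not study data).
t_stat, t_df = student_t([1, 2, 3], [2, 3, 4])
chi2_stat, chi2_p = chi_square_2x2([[10, 20], [20, 10]])
print(t_stat, t_df, chi2_stat, chi2_p)
```

The P value for the t statistic requires the t distribution, which is not in the Python standard library, so only the statistic and its degrees of freedom are returned.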

Demographic Characteristics of the Selected Patients.
In total, 66 patients were enrolled based on the criteria. Demographic characteristics did not differ significantly between the control group (n = 33) and the NGS group (n = 33), except for the number of days in hospital (P = 0.033) (Figure 1 and Table 1).

Microbial Species Detected by NGS.
Compared with the sputum culture group, the NGS group detected more microbial species (Supplementary Table 1 and Figure 2(a)). The 3 most commonly detected microorganisms were Candida, Enterococcus, and Corynebacterium striatum (Figure 2(b)).

Univariate and Multivariate Logistic Regression of Parameters on Clinical Treatment and Prognosis.
First, logistic regression was used to analyze the influence of pathogen type on clinical treatment effects, such as those of antibiotics. Univariate logistic regression showed that Enterococcus may significantly affect clinical treatment effects (χ2 = 9.48, P = 0.009), but this significance disappeared in multivariable logistic regression (χ2 = 0.998, P = 0.32) (Table 2). Then, logistic regression was performed to analyze the influence of pathogen type on clinical prognosis (i.e., alive or dead). Both univariate and multivariable logistic regression showed that Enterococcus may significantly affect clinical prognosis (χ2 = 7.449, P = 0.006); thus, Enterococcus was selected as one of the factors affecting clinical prognosis (Table 3).

Construction and Visualization by Nomogram for Multivariate Logistic Regression.
LASSO regression analysis was used to find the most appropriate model with the fewest parameters. Because the coefficients of the screened variables shrank to zero as λ increased, the contributions of most variables could be eliminated from the model (Figure 4(a)). Then, cross-validation was conducted to choose the best-performing model with the fewest variables, and the partial-likelihood deviance curve showed that 3 variables were retained in the best model (Figure 4(b)).
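The shrinkage behaviour described above can be illustrated with the soft-thresholding operator, which gives the LASSO solution in the special case of orthonormal predictors. This is a simplified sketch and not the fitting procedure used in R for the logistic model; the variable names and coefficient values are hypothetical.

```python
import math


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become zero."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def n_selected(coefs: dict, lam: float) -> int:
    """Number of coefficients that remain nonzero at penalty lam."""
    return sum(1 for b in coefs.values() if soft_threshold(b, lam) != 0.0)


# Hypothetical unpenalized coefficients (not study estimates).
coefs = {"x1": 0.90, "x2": -0.60, "x3": 0.45, "x4": 0.10, "x5": -0.05}

for lam in (0.0, 0.2, 0.5, 1.0):
    print(f"lambda = {lam}: {n_selected(coefs, lam)} variables retained")
```

As λ grows, small coefficients are removed first, which is why a larger penalty yields a sparser model; cross-validation is then used to choose λ.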
Meanwhile, based on the multivariate logistic regression results for clinical treatment and prognosis described above, the 3 parameters Enterococcus, Hb, and ProBNP were finally selected to construct the nomogram for visualization (Figure 5(a)). To check the predictive efficacy of this model, ROC curves were plotted for Hb alone and for the combination of Hb, Enterococcus, and ProBNP. The AUC of the 3-parameter model (AUC = 0.772) was higher than that of Hb alone (AUC = 0.677) (Figure 5(b)). The specificity and sensitivity of the 3-parameter model were 0.579 and 0.851, respectively, at a threshold of 0.630 on the ROC curve.
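To make the threshold-based metrics concrete, the sketch below computes sensitivity and specificity at a given cut-off and the Youden index J = sensitivity + specificity − 1. Whether the 0.630 threshold was chosen by maximizing J, and which outcome was coded as positive, are not stated in the text, so both are assumptions here; the example scores and labels are made up.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden index, a common criterion for picking an ROC threshold."""
    return sensitivity + specificity - 1.0


# Hypothetical predicted probabilities and outcomes (not study data).
sens, spec = sensitivity_specificity(
    [0.20, 0.70, 0.65, 0.90, 0.40], [0, 0, 1, 1, 1], threshold=0.63
)
print(sens, spec)

# Youden index implied by the reported sensitivity and specificity.
print(round(youden_j(0.851, 0.579), 3))
```

With the reported sensitivity of 0.851 and specificity of 0.579, the Youden index is 0.430.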

Internal Verification of Nomogram Prediction.
The whole dataset was used for internal verification of the nomogram prediction. The predictive value was calculated for the prognosis (alive or dead), and the result was clustered by the median method, yielding 2 main categories in the dendrogram (Figure 6(a)). Then, the 3 parameters Hb, Enterococcus, and ProBNP were clustered and shown in a 3D scatter plot, in which the 2 main categories were clearly separated (Figure 6(b)). When Hb was examined alone, its levels differed significantly between the 2 categories (Figure 6(c)). These internal verification data indicated that the constructed model was effective for prediction.

Discussion
This study sought to develop a preliminary clinical prediction model for the prognosis of pneumonia complicated with HF based on NGS of the metagenome. As expected, the NGS group detected more microbial species. Univariate and multivariate logistic regression and LASSO regression analyses identified 3 parameters, i.e., Enterococcus, Hb, and ProBNP, as potential predictive factors for prognosis. The nomogram was constructed with these 3 parameters, and its performance was assessed with ROC curves and checked by internal verification.
Heart failure is a broad catch-all term for various conditions and etiologies that can lead to poor pump function and low perfusion. The diagnosis and etiology of heart failure can readily be obtained from chart review of the patients. Several factors are used for monitoring and prognosis of HF with pneumonia [14][15][16], among which ProBNP is a classic one [17]. Meanwhile, Hb is reported to be an independent predictor of survival in patients with chronic HF (CHF), with anaemic and polycythaemic patients having the worst survival in the ELITE II trial [18]. The EMPEROR-Reduced clinical trial also showed that anemia was associated with poor outcomes of HF, and that empagliflozin administration improved HF and kidney outcomes irrespective of anemia status at baseline [19]. A similar result was observed in HF patients with iron deficiency or abnormal red cell indices [20]. In addition, Hb is reported to be associated with the frailty score in community-acquired pneumonia, which may affect the prognosis of pneumonia [21]. Our data are consistent with these reports in that a low level of Hb was associated with poor outcomes of HF.
Bacteria are recognized as one of the predictive factors for the prognosis of HF. Bacterial infection can indirectly cause HF by inducing endocarditis, myocarditis, and infections in other organs, including pneumonia. In our study, myocarditis may partly explain the cause of heart failure. Endocarditis can be diagnosed when bacteremia is suspected and echocardiographic changes are present. Bacteria can also lodge on heart valves and cause infection of the endocardium [22,23]. Procalcitonin (PCT)-based indication of bacterial infection has been shown to identify high-risk acute HF (AHF) patients, and elevated PCT indicated probable bacterial infection with poorer in-hospital and postdischarge outcomes, despite similar severity of HF [24].
Metagenomic next-generation sequencing has been widely used for pathogen determination in patients with infectious diseases, especially pneumonia, and identification of specific pathogens can guide antimicrobial treatment [25]. Our results showed that NGS detected more microbial species than conventional sputum culture, and univariate and multivariate logistic regression analysis revealed that Enterococcus infection detected by NGS was significantly related to the clinical outcome. In addition, Enterococcus infection is reported in the literature to be associated with HF. Although the major source of Enterococcal endocarditis is genitourinary tract infection, which is more common than pneumonia, Enterococcal endocarditis is reported to be one of the subacute infections characterized by HF [26]. It is also recommended that, in patients with HF, pneumonia be treated similarly to Enterococcal infection, including Enterococcal infection in other organs [27]. Therefore, these reports support our data, which suggest that bacteria are associated with the prognosis of HF and that untreated Enterococcus infection in pneumonia may be a predictor of poor prognosis in hospitalized patients with HF.
In conclusion, our study developed a preliminary clinical prediction model, visualized as a nomogram, for the prognosis of pneumonia complicated with HF based on NGS of the metagenome. However, the main limitation of this study is the relatively small sample size. In future work, more patients will be enrolled and tested to improve the predictive model and its internal validation and to carry out external validation.

Data Availability
All the data used in this study are available from the corresponding author upon reasonable request.

Ethical Approval
This study was approved by the Ethics Committee of Guangdong Provincial Hospital of Traditional Chinese Medicine with the approval registration number BF2022-115.

Supplementary Materials
Supplementary