A Broad Learning System to Predict the 28-Day Mortality of Patients Hospitalized with Community-Acquired Pneumonia: A Case-Control Study

This study aimed to construct a model based on the broad learning system (BLS) for predicting the 28-day mortality of patients hospitalized with community-acquired pneumonia (CAP). A total of 1,210 eligible CAP cases from Chifeng Municipal Hospital were included in this retrospective case-control study. Random forest (RF) and eXtreme Gradient Boosting (XGB) were used to develop the prediction models. Features extracted by the BLS were fed into the RF and XGB models to predict the 28-day mortality of CAP patients, yielding two integrated models, BLS-RF and BLS-XGB. Our results showed that the integrated model BLS-XGB predicted the death risk of patients efficiently: it performed better than the two basic models, the integrated model BLS-RF, and two well-known deep learning systems, the deep neural network (DNN) and the convolutional neural network (CNN). In conclusion, BLS-XGB may be recommended as an efficient model for predicting the 28-day mortality of CAP patients after hospital admission.


Introduction
Pneumonia is the most common respiratory disease [1]. Before the advent of antibiotics, pneumonia was a major killer [2]. With the advances in modern medicine, many pneumonia patients have been cured with antibiotics and adjuvant therapy, but the mortality rate remains high among the very young, the elderly, and those with compromised immune function [3]. After the initial triage of patients with pneumonia, it is critical for emergency medical staff to assess whether these patients require hospitalization [4]. Unnecessary hospitalizations not only increase the risk of hospital-acquired infections but also drain health care resources [5]. Several pneumonia severity scales may be used to assess the severity of a patient's illness, but these scales are mainly used for inpatients and are not suitable for emergency patients [6]. Community-acquired pneumonia (CAP) is a common infectious disease of the respiratory system [7]. A deep insight into the factors that influence the quality of antibiotic use is necessary to develop effective and targeted interventions to improve care for patients with CAP [8]. Accurate disease assessment is of great value for initial treatment, clinical stability, and long-term prognosis [9]. Biomarkers are immune cells and immune proteins that increase significantly during the immune response to microbes and have auxiliary diagnostic value in the evaluation of CAP [10].
Artificial intelligence is now used to solve emerging problems in medical engineering and, in particular, to predict CAP [11]. To avoid the devastating effects of CAP on patients' daily lives and on healthcare systems, and to control the further spread of the infection, we need not only to diagnose infected patients early through effective screening but also to predict the risk of death in CAP patients [12,13]. A series of models and algorithms have been proposed to search for optimal hidden-layer architectures, connectivity, and training parameters for deep learning systems that predict the CAP risk among patients with respiratory complaints. However, the efficiency of these models and algorithms in predicting the death risk of patients hospitalized with CAP needs further investigation, and novel approaches are still needed [14,15].
Our objectives in the present study are (1) to develop an efficient model, building on previous models and algorithms, for predicting the risk of 28-day mortality in patients hospitalized with CAP, using the random forest (RF) and eXtreme Gradient Boosting (XGB) models [16]; (2) to use the broad learning system (BLS) to extract features and to evaluate the importance of the BLS features in predicting the 28-day mortality of patients [17]; and (3) to compare the performance of the proposed model with two well-known deep learning systems, the deep neural network (DNN) and the convolutional neural network (CNN).

Study Design and Population.
This was a retrospective case-control study. Information on a total of 1,397 CAP patients was collected from the Chifeng Municipal Hospital between August 2019 and December 2020. After excluding patients aged < 18 years (n = 58), patients who had recently received chemotherapy (n = 24), patients with advanced liver disease (n = 67), and patients with a serum creatinine level > 1.5 mg/dl (n = 38), 1,210 eligible patients were finally included in this study. This study was approved by the Institutional Review Board (IRB) of Chifeng Municipal Hospital (approval number: no. 2019_24).
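The exclusion counts can be checked against the final sample size. The check below assumes that each excluded patient is counted in exactly one exclusion category; the text does not state this explicitly, but the reported totals are consistent with it.

```latex
\[
1397 - (58 + 24 + 67 + 38) = 1397 - 187 = 1210
\]
```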
The inclusion criteria were as follows: (1) age ≥ 18 years, (2) a diagnosis of CAP according to the Chinese Guidelines for Diagnosis and Treatment of Adult Community-acquired Pneumonia, and (3) available information on 28-day mortality or survival after hospital admission.

Establishment and Validation of the Prediction Models.
All CAP patients were randomly divided into training and testing sets at a ratio of 6:4, and a balance test was carried out between the two sets. Six prediction models were built on the training set (Figure 1). Logistic regression, RF, DNN, and CNN were used to establish four models for predicting the risk of 28-day mortality in patients hospitalized with CAP. All study variables were also entered into the BLS to generate 106 features, and two further models (BLS-RF and BLS-XGB) were established on these 106 features using RF and XGB, respectively. Figure 2 displays the establishment of the BLS-RF model. The predictive performance of the six models was evaluated by the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Internal validation of the six prediction models was conducted using the testing set. Receiver operating characteristic (ROC) curves of the BLS-RF, BLS-XGB, CNN, and DNN models for predicting the 28-day mortality of CAP patients are shown in Figure 3.
Figure 1: Establishment and validation of the prediction models for the 28-day mortality of CAP patients.
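The 6:4 split and the threshold-based performance measures can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the function names, the random seed, the rounding rule for the split, and the example labels are all hypothetical, and the label 1 is assumed to mean death within 28 days.

```python
import random

def split_train_test(records, train_frac=0.6, seed=0):
    """Shuffle a copy of the records and split them train_frac : (1 - train_frac)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary, assumed value
    n_train = round(len(shuffled) * train_frac)
    return shuffled[:n_train], shuffled[n_train:]

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }

# Illustrative labels only, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# Splitting 1,210 record indices 6:4 under the assumed rounding rule.
train, test = split_train_test(range(1210))
```

With the rounding rule assumed here, 1,210 records would split into 726 training and 484 testing records; the actual set sizes are not stated in this section.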
The DNN consists of three types of layers: an input layer, hidden layers, and an output layer, with adjacent layers fully connected. The original data enter through the input layer, sample features are extracted progressively through the hidden layers, and the output layer produces the prediction. In this study, 30 hidden layers were used.
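A forward pass through a fully connected network with 30 hidden layers can be sketched as below. Only the full connectivity and the number of hidden layers come from the description above. The input size, hidden width, ReLU and sigmoid activations, and the weight initialization are assumptions made for illustration, and the network is untrained.

```python
import math
import random

def init_dense_network(n_inputs, hidden_width, n_hidden_layers, seed=0):
    """Return a list of (weights, biases) pairs: the hidden layers, then one output unit."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [hidden_width] * n_hidden_layers + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / fan_in)  # He-style scaling, an assumed choice
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Propagate one sample through the network and return a value between 0 and 1."""
    activation = list(x)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activation)) + b
               for row, b in zip(weights, biases)]
        if index < last:
            activation = [max(0.0, v) for v in pre]                  # ReLU in hidden layers
        else:
            activation = [1.0 / (1.0 + math.exp(-v)) for v in pre]   # sigmoid output
    return activation[0]

# Hypothetical sizes: 10 input variables, 16 units per hidden layer.
network = init_dense_network(n_inputs=10, hidden_width=16, n_hidden_layers=30)
risk = forward(network, [0.5] * 10)
```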
The convolutional neural network (CNN) comprises three types of layers: convolutional layers for feature extraction, max-pooling layers for downsampling, and a fully connected layer for classification. Features are extracted by the convolutional layers, uninformative features are discarded by the pooling layer, and the fully connected layer performs the final classification and prediction. In this study, four convolutional layers, one pooling layer, and one fully connected layer were adopted.
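The two operations that distinguish a CNN, convolution and max pooling, can be illustrated on a one-dimensional input. The sketch assumes 1-D convolution over a vector of variables, a ReLU activation, and a hand-picked kernel. None of these are specified in the text, and the numbers are illustrative only.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (cross-correlation)."""
    width = len(kernel)
    return [
        sum(signal[start + k] * kernel[k] for k in range(width))
        for start in range(len(signal) - width + 1)
    ]

def relu(values):
    """Replace negative responses with zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size):
    """Keep the largest value in each non-overlapping window; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Illustrative input and kernel, not study data.
signal = [1.0, 2.0, 4.0, 3.0, 0.0, 5.0]
kernel = [1.0, 0.0, -1.0]
feature_map = relu(conv1d_valid(signal, kernel))   # [0.0, 0.0, 4.0, 0.0]
pooled = max_pool1d(feature_map, 2)                # [0.0, 4.0]
```

In a full network of the kind described, several such convolutional stages would be stacked before the pooled output is passed to the fully connected layer.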

Statistical Analysis.
The normality of measurement data was assessed with the Shapiro test. Normally distributed continuous variables were analyzed using the t test and expressed as mean ± standard deviation (Mean ± SD). Nonnormally distributed measurement data were analyzed with the Mann-Whitney U test and expressed as median and quartiles (M [Q1, Q3]). Categorical data were compared using the χ² test or Fisher's exact probability method and expressed as the number of cases and the composition ratio (N (%)). All missing data were imputed by random forest analysis, and a sensitivity analysis was carried out. All statistical analyses were performed using Python software. P < 0.05 was considered statistically significant.
... Figures 3 and 4. Judging by the AUC, the two BLS-based models performed similarly in the testing set (P = 0.414). Likewise, there was no significant difference in the testing set between the AUCs of DNN and CNN (P = 0.270), nor between the AUCs of the two basic prediction models, logistic regression and random forest (P = 0.371). The testing-set AUC of the BLS-based random forest model (BLS-RF) was better than that of DNN (P = 0.047). The testing-set AUCs of the integrated models were also better than those of the basic RF and logistic models.
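The AUC used in these comparisons can be computed directly from predicted scores. It equals the probability that a randomly chosen patient who died receives a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below uses made-up scores. It does not reproduce the P values above, since the test used to compare AUCs is not named in this section.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Illustrative scores only, not study data (1 = died within 28 days).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.4, 0.2, 0.1]
auc = roc_auc(y_true, scores)  # 9.5 correctly ranked pairs out of 12
```

The pairwise loop is quadratic in the sample size, which is acceptable for a testing set of a few hundred patients; rank-based formulas give the same value more efficiently.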

Importance Diagram of the BLS-Based Features.
As stated in Section 2, the BLS was used to learn and output features, which were then passed to random forest for prediction. Among the BLS output features, the ten with the highest importance in the model were, in descending order, the 60th, 65th, 74th, 9th, 84th, 45th, 18th, 102nd, 75th, and 49th. BLS60 was the most important, followed by BLS65, and BLS49 ranked lowest among the top ten (see Figure 4 for details).
Machine learning analysis with text representation has been utilized in some previous studies, such as early detection of readmission risk for decision support based on clinical notes, discovery of the predictive value of clinical notes, deep learning approaches to chest radiographs, and deep learning techniques for chest X-ray and CT scans [18,19]. However, further investigation of applications in predicting the death risk of CAP among hospitalized patients with respiratory complaints is still required, and a novel approach to improving model performance is also needed [20][21][22][23].
In Section 3.2, we utilized the BLS to construct better hidden-layer architectures and connectivity to extract the data features. In this section, we further trained the parameters of the integrated broad learning system and compared the efficiency of the integrated models with that of previous algorithms in predicting the 28-day death risk of patients hospitalized with CAP.
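The feature-extraction step can be sketched as follows, assuming the usual BLS layout in which the inputs are mapped to feature nodes, the feature nodes are expanded into nonlinear enhancement nodes, and the two groups are concatenated. In the standard formulation the output weights are then fitted by ridge regression; here the concatenated nodes are instead treated as input features for RF or XGB, as described above. The 26 + 80 split of the 106 features, the uniform random weights, the tanh activation, and the 1-based names are hypothetical. Any fine-tuning of the feature-node weights is omitted.

```python
import math
import random

def random_matrix(rng, n_rows, n_cols):
    """Matrix of uniform random weights in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]

def affine(x, weights, biases):
    """Compute x @ weights + biases for a single sample x."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]

def bls_features(samples, n_feature_nodes, n_enhancement_nodes, seed=0):
    """Map each sample to [feature nodes | enhancement nodes] using fixed random weights."""
    rng = random.Random(seed)
    n_inputs = len(samples[0])
    w_feat = random_matrix(rng, n_inputs, n_feature_nodes)
    b_feat = [rng.uniform(-1.0, 1.0) for _ in range(n_feature_nodes)]
    w_enh = random_matrix(rng, n_feature_nodes, n_enhancement_nodes)
    b_enh = [rng.uniform(-1.0, 1.0) for _ in range(n_enhancement_nodes)]
    mapped = []
    for x in samples:
        feature_nodes = affine(x, w_feat, b_feat)  # linear mapping of the inputs
        enhancement_nodes = [math.tanh(v)          # nonlinear expansion
                             for v in affine(feature_nodes, w_enh, b_enh)]
        mapped.append(feature_nodes + enhancement_nodes)
    return mapped

# Hypothetical sizes: 2 samples, 5 input variables, 26 + 80 = 106 output features.
samples = [[0.1, 0.4, 0.0, 1.0, 0.3], [0.9, 0.2, 1.0, 0.0, 0.5]]
features = bls_features(samples, n_feature_nodes=26, n_enhancement_nodes=80)
names = ["BLS%d" % (i + 1) for i in range(len(features[0]))]  # BLS1 ... BLS106
```

Each row of `features` could then be passed to a tree-based classifier in place of the original variables.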
As shown in Table 2, experimental results show that the integrated model BLS-XGB (training accuracy = 95.9%, testing accuracy = 93.2%) as an efficient BLS for predicting

Computational and Mathematical Methods in Medicine
This study aimed to develop a prediction model for the risk of 28-day mortality in patients hospitalized with CAP, which is important for intelligent decision-making in the emergency treatment systems of modern hospitals [24][25][26][27][28]. The potential engineering applications of the proposed model are not limited to patients hospitalized with CAP [29][30][31][32]. We applied the RF and XGB methods after the BLS had learned the sample characteristics of the data [33][34][35][36][37][38]. This approach is novel compared with previous studies on predicting the risk of death among CAP cases [39][40][41][42][43]. The accuracy of the integrated model exceeded 90%, indicating robust prediction. The model also bases its predictions on a range of patient indicators. Compared with the basic models and the other competing models, the integrated model significantly improved predictive accuracy in practical applications. One unresolved issue, which is also a main challenge in treating pneumonia, is that a patient's condition can deteriorate suddenly; the subsequent emergency treatment of individual patients therefore requires further use of other methods in medicine and artificial intelligence.

Conclusion
BLS offers an alternative to learning in deep structures. In the present study, after the BLS was integrated with XGB, the experiments indicated robust prediction of the 28-day mortality risk of CAP patients after hospital admission. The integrated model BLS-XGB was selected as an efficient model for predicting the 28-day mortality of patients hospitalized with CAP. For subsequent studies, we encourage other researchers to extend the potential engineering applications of the proposed model beyond patients hospitalized with CAP. Another research priority is to find complementary methods in medicine and artificial intelligence for the emergency treatment of individual patients once the death risk has been predicted.

Data Availability
The data utilized to support the findings are available from the corresponding authors upon request.

Conflicts of Interest
The authors declare that they have no competing interests.