Study of Hospitalization Costs in Patients with Cerebral Ischemia Based on E-CHAID Algorithm

Background The aging of the population has led to a rapid increase in the prevalence of most neurological diseases between 1990 and 2016, with a growth rate of up to 117%, which has put enormous pressure on medical insurance funds. As one of the core diseases of disease diagnosis grouping, the hospitalization cost composition and grouping research of patients with cerebral ischemic disease can help to determine scientific payment standards and reduce the economic burden of patients. Aim We aimed to understand the cost composition and influencing factors of hospitalized patients with cerebral ischemic diseases and to identify a reasonable cost grouping scheme. Methods The data come from the homepage of medical records of inpatients with cerebral ischemia in a tertiary hospital in Sichuan Province from 2018 to 2020. After cleaning the data, a total of 5,204 pieces of data were obtained. Nonparametric tests and gamma regression models were used to explore the influencing factors of hospitalization costs. Taking the influencing factors as the predictor variables and the hospitalization cost as the target variable, the exhaustive Chi-squared automatic interaction detector (E-CHAID) algorithm was used to form the costs grouping, and the payment standard of the hospitalization cost for each group was determined. The rationality of cost grouping was evaluated by coefficient of variation (CV) and Kruskal–Wallis H test. Results From 2018 to 2020, the average hospital stay of 5,204 inpatients with cerebral ischemic disease was 10.70 days, and the average hospitalization cost was 17,206.09 RMB yuan. Among the hospitalization costs, diagnosis costs and drug costs accounted for the highest proportion, accounting for 41.18% and 22.38%, respectively, in 2020. Gender, age, admission route, comorbidities and complications, super length of stay (>30 days), and discharge mode had significant effects on hospitalization costs (P < 0.05). 
Patients were divided into 10 cost groups, and the grouping nodes included comorbidities and complications, discharge mode, age, gender, and admission route. The CV of 9 of the 10 cost groups is less than or equal to 1. The Kruskal–Wallis H test showed that the difference between groups was statistically significant (P < 0.05). Conclusion The cost grouping of patients with cerebral ischemic diseases based on the E-CHAID algorithm is reasonable. This study examined the effects of super length of stay (>30 days), comorbidities and complications, and age on hospitalization cost in patients with cerebral ischemic disease. This study can provide a theoretical basis for advancing the China Healthcare Security Diagnosis Related Groups (CHS-DRG) grouping program and medical expense payment, thereby reducing the disease burden of patients.


Introduction
Cerebral ischemic disease is brain tissue damage caused by vascular obstruction, including neuronal cell death and cerebral infarction, mainly manifested as ischemic stroke [1]. A survey of about 480,000 people in China showed that the prevalence and mortality of cerebral ischemia were 1114.8 and 114.8 per 100,000 people, respectively [2]. Studies have shown that the proportion of people discharged with neurological diseases in general hospitals in China is close to 11% [3], and cerebrovascular disease has become the leading cause of death among Chinese residents. Ischemic cerebrovascular disease accounts for 87% of new or recurring cerebrovascular diseases each year, and cerebral ischemic disease is also the fifth leading cause of death and disability in the United States [4].
In 2021, WHO released the Cross-departmental Global Action Plan for Epilepsy and Other Neurological Disorders [5], requiring countries to develop sustainable interventions for the prevention and management of neurological diseases based on local best practices, to ensure that patients with neurological diseases receive timely, affordable, and high-quality service. Hospital length of stay is one of the most direct indicators of efficiency and of bed turnover rate, which in turn affects patient access to service [6]. Hospitals have been using clinical management measures, such as organizational structure, discharge planning, and clinical pathways, to shorten hospital length of stay and reduce medical cost [7]. DRG is believed to have a positive impact on optimizing hospital management and shortening the length of hospitalization.
Appropriate bed management not only controls medical cost but also indirectly reduces the financial burden on patients and third-party payers. Therefore, DRG is widely used as a payment method in many medical systems to achieve these objectives. The reform of medical insurance payment includes two aspects: cost grouping and cost payment. The growing elderly population puts more and more pressure on the medical insurance fund pool [8,9], so the reform of medical insurance payment methods will be the focus of long-term research in the future. According to the United Nations definition, China entered an aging society at the beginning of the twenty-first century, and the proportion of the elderly population in China is expected to exceed 30% by 2050 [10].
Reasonable cost grouping is the premise of scientific cost payment. Based on BJ-DRG, CN-DRG, CR-DRG, and C-DRG, which are the most widely used and authoritative in China, CHS-DRG grouping program [11] was formulated and released by the National Healthcare Security Administration in 2019. CHS-DRG mainly uses the national health insurance code, including "Medical security disease diagnosis classification and code (ICD-10)" and "Medical security surgery classification and code (ICD-9-CM-3)." As one of the 376 adjacent diagnosis related groups (ADRG) in the CHS-DRG grouping scheme, the cerebral ischemic disease group has research value.
Because hospitalization cost has a skewed distribution, it usually has to be logarithmically transformed before multiple linear regression can be used [12]. Logistic regression models make no distributional assumption about the data, but cost must first be classified into categories, which reduces the integrity of the cost data [13]. The gamma regression model is a type of generalized linear model that can process the data through a logarithmic link, reducing data loss and preserving data integrity [14], so it is gradually being applied to medical cost analysis [15,16].
There have been previous studies on hospitalization cost [17,18,19], but not many on cerebral ischemic diseases. Based on the newly released CHS-DRG grouping scheme in China, we first calculated the CV of the cerebral ischemic disease ADRG group, which is the premise of cost grouping. We then analyzed the composition, influencing factors, and cost grouping of patients with cerebral ischemic disease, and finally tried to determine the cost standard of each cost group, so as to reduce the disease burden and economic pressure of patients.

Setting.
This study is a single-center retrospective study with data from a large tertiary hospital in Sichuan Province, China. The medical center has more than 4,000 acute care beds. In 2021, the hospital had 7.75 million outpatient visits (including emergency visits) and more than 283,000 admissions, with an average length of stay of 6.80 days. The hospital has been among the best for many years, so it can represent the medical level of China's general acute hospitals and has certain reference significance for other countries.

Data Source and Processing.
The data come from the information system of a tertiary hospital in Sichuan Province, including the basic information and cost data of patients with cerebral ischemic diseases. In the CHS-DRG grouping scheme (Figure 1), patients with cerebral ischemic disease were assigned to the BR2 ADRG group, and no DRG group subdivision was performed. A total of 6214 cases of inpatients with cerebral ischemic diseases from 2018 to 2020 were collected, and 5204 cases remained after data processing. Data exclusion criteria: (1) cases with blank or missing items; (2) cases in which the major diagnosis or major operation code is not included in the CHS-DRG grouping scheme; (3) cases in which the length of stay is longer than 60 days; (4) cases in which the hospitalization cost is below the 1st percentile or above the 99th percentile.
According to the MCC&CC inclusion and exclusion tables in the CHS-DRG grouping scheme, we sorted out the comorbidities and complications of patients with cerebral ischemic diseases. If a patient had any secondary diagnosis on the MCC&CC inclusion table and the primary diagnosis was not on the exclusion table, the case had MCC or CC. The comorbidities and complications of the patients were divided into three groups: the first group had major comorbidities and complications (MCC), the second group had comorbidities and complications (CC), and the third group had no comorbidities and complications (Non-CC).

Classification and Description of Hospitalization Expenses.
The classification of the total hospitalization cost is based on the medical record homepage of Chinese hospitals [20], of which the diagnosis cost and drug cost account for the highest proportion. Diagnosis cost items include pathological diagnosis, laboratory diagnosis, imaging diagnosis, and clinical diagnosis. Drug cost includes cost items for western medicine, Chinese patent medicine, and Chinese herbal medicine. Comprehensive medical service cost includes cost items for physician fees, nursing care, beds, and others.

Statistical Analysis.
Because the hospitalization cost does not conform to the normal distribution, the nonparametric test was used to carry out univariate analysis of the hospitalization cost, and the Gamma regression model was used to carry out the multivariate analysis of the hospitalization cost and calculate the cost ratio (CR). In univariate analysis, the Mann-Whitney U test was used for cost comparisons between two groups of variables, and the Kruskal-Wallis H test was used for cost comparisons among multiple groups. Taking the hospitalization cost as the dependent variable, and the factors that have a high degree of influence on the hospitalization cost as the grouping node, the E-CHAID algorithm is used to group the cost. CV [21] and Kruskal-Wallis H test [22] were used to evaluate the reasonableness of cost grouping. Excel 2019 software was used for data entry and SPSS 26.0 and SPSS Modeler 18.0 were used for statistical analysis.
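For orientation, a gamma regression with a logarithmic link is conventionally written as below. The notation is assumed here rather than taken from the source: Y_i is the hospitalization cost of patient i, x_{ij} are the predictor variables, and the cost ratio of predictor j is read as the exponentiated coefficient, which is the usual interpretation under a log link and may differ in detail from what the statistical software reports.

```latex
Y_i \sim \mathrm{Gamma}(\mu_i, \nu), \qquad
\log \mu_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}, \qquad
\mathrm{CR}_j = \exp(\beta_j)
```

Under this reading, CR_j > 1 means that the expected cost is multiplied by CR_j relative to the reference category of predictor j, with the other predictors held fixed.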

The Premise of Cost Grouping.
The document [23] shows that an ADRG group with a CV greater than 1 can be subdivided, and the CV of hospitalization cost in the cerebral ischemic disease ADRG group is calculated to be 1.18. The cost grouping process of this study is shown in Figure 2.
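The CV check described above can be sketched as follows. This is a minimal illustration, not the authors' SPSS procedure: the function names are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the source does not say which was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(costs):
    """Return the CV of a list of costs: sample standard deviation / mean.

    Assumption: sample (n - 1) standard deviation; the source does not specify.
    """
    return stdev(costs) / mean(costs)


def can_be_subdivided(costs, threshold=1.0):
    """Apply the rule quoted in the text: a group with CV > 1 can be subdivided."""
    return coefficient_of_variation(costs) > threshold


# Hypothetical costs (RMB yuan) for illustration only, not study data.
example_costs = [5000.0, 8000.0, 12000.0, 60000.0]
print(round(coefficient_of_variation(example_costs), 2))
print(can_be_subdivided(example_costs))
```

The same function can be applied to each of the final cost groups to check whether its CV is less than or equal to 1.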

General Information.
From 2018 to 2020, there were 5204 patients in the cerebral ischemic disease ADRG group, including 3215 male patients (61.78%) and 1989 female patients (38.22%). The age of the patients ranged from 0 to 99 years old, with an average age of 65.30 years. The length of hospital stay ranged from 1 to 60 days, with an average hospital stay of 10.70 days.

Composition of Inpatient Hospitalization Expenditure.
As shown in Table 1, the total hospitalization cost of the cerebral ischemic disease ADRG group from 2018 to 2020 was 89.54 million RMB yuan, and the average hospitalization cost was 17,206.09 RMB yuan, of which the diagnosis cost and drug cost accounted for the highest proportion, accounting for 41.18% and 22.38%, respectively, in 2020. Over the three years, the proportion of comprehensive medical service cost decreased year by year, and the proportions of treatment cost and of blood and blood products cost increased year by year. As shown in Table 2, in addition to allergy, gender, age, social insurance, admission route, comorbidities and complications, discharge mode, and super length of stay (>30 days) have statistically significant effects on hospitalization cost (P < 0.05).

Multivariate Analysis Using Gamma Model.
Results (Table 3) showed that gender, age, comorbidities and complications, admission route, super length of stay (>30 days), and discharge mode all had an impact on hospitalization cost. Through the cost ratio, it can be seen that the super length of stay (>30 days), age, and comorbidities and complications have a greater impact on the hospitalization cost. Compared with patients whose hospitalization days were less than or equal to 30 days, patients with super length of stay (>30 days) spent more medical costs (CR = 4.23); compared with patients aged 0-17 years old, patients older than 65 years old spent more medical costs (CR = 2.63).

Grouping and Verification of Inpatient Hospitalization Cost.
Since the length of hospital stay in China is affected by many factors, and there are large disparities between different hospitals, the super length of stay (>30 days) is not used as a grouping variable. The significant factors from the multivariate analysis were selected as grouping nodes, and the CART and E-CHAID algorithms were used to construct cost groups, forming 2 and 10 cost groups, respectively. As shown in Table 4, taking into account the number of groups and the mean absolute error, we finally chose the E-CHAID model for subsequent analysis. As shown in Table 5, 5204 patients were divided into 10 cost groups, and the grouping nodes included comorbidities and complications, discharge mode, age, gender, and admission route. The fourth group had the largest number of patients, with 1823 patients (35.03%); these patients had general comorbidities and complications and were older than 65 years. The seventh group had the smallest number of patients, with only 56 patients (2.09%), who had no comorbidities and complications, were admitted through outpatient and other routes, and were younger than 18 years old. Among the 10 cost groups, the CV of nine groups is less than or equal to 1, so the grouping is reasonable. The Kruskal-Wallis H test was performed on the cost groups, and the difference between the groups was statistically significant (P < 0.001). As shown in Table 5, the median medical cost of each group is taken as the standard cost. The 75th percentile of cost per group plus 1.5 times the interquartile range was used as the upper limit of medical costs for that group, and cases above the upper limit were defined as excess amount [24,25]. The second group (MCC, transferred to another hospital, death and others) had the highest standard cost, about 25,000 RMB yuan; the seventh group (Non-CC, outpatient and other admission, <18 years old) had the lowest standard cost, close to 1,500 RMB yuan. The fourth group (CC, ≤65 years old) had the highest excess rate at 10.43%, and the ninth group (Non-CC, outpatient and other admission, 18-65 years old, female) had the lowest excess rate at 0.56%.
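The payment rule above (median as standard cost, 75th percentile plus 1.5 times the interquartile range as upper limit) can be sketched for a single cost group as follows. The function name and the returned keys are hypothetical, and the "inclusive" quartile method is an assumption: the source does not state which percentile definition was used, and other definitions give slightly different limits.

```python
from statistics import median, quantiles


def payment_standard(costs):
    """Summarize one cost group under the rule described in the text.

    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, _, q3 = quantiles(costs, n=4, method="inclusive")
    upper_limit = q3 + 1.5 * (q3 - q1)  # 75th percentile + 1.5 * IQR
    excess_cases = [c for c in costs if c > upper_limit]
    return {
        "standard_cost": median(costs),  # median cost of the group
        "upper_limit": upper_limit,
        "excess_rate": len(excess_cases) / len(costs),
    }


# Hypothetical costs for illustration only, not study data.
group_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(payment_standard(group_costs))
```

Running the function once per cost group gives a table with one standard cost, one upper limit, and one excess rate per group.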

Discussion
According to the CHS-DRG grouping scheme, the CV of the cerebral ischemic disease ADRG group is 1.18, and the patient's personal characteristics and disease characteristics have a certain influence on the hospitalization cost, so it is reasonable to group the cost for the ADRG group. At the same time, it also shows that the applicability of the grouping scheme of CHS-DRG needs to be improved. The prevalence of most neurological diseases increased rapidly from 1990 to 2016, with a growth rate of 117% due to an aging population [26]. As a common neurological disease, cerebral ischemic disease has the characteristics of high mortality and high disability rate [5], which will cause a heavy economic and labor burden to patients, families, and society. The results of the study showed that from 2018 to 2020, the average length of hospital stay for patients with cerebral ischemic disease was 10.70 days, and the average hospitalization cost was 17,206.09 RMB yuan. Among the hospitalization costs, diagnosis cost and drug cost accounted for the highest proportion, accounting for 41.18% and 22.38% in 2020, respectively. Diagnosis of cerebral ischemic disease, selection of treatment options, and assessment of prognostic status involve imaging studies [27,28], which may result in high diagnostic costs. The high proportion of diagnostic costs is in line with the trend of medical reform, and early diagnosis is of great significance for the prevention and treatment of cerebral ischemic diseases. The use of imported drugs (the thrombolytic agent alteplase (rt-PA) [29]) and the nonreimbursement of certain drugs by medical insurance are possible reasons for the high cost of drugs [30,31]. The proportion of drug costs in the total cost in China is significantly higher than the international proportion [32,33], so reducing the drug costs of patients with cerebral ischemic disease has positive significance for reducing the disease burden of such patients [34].
Age has important implications for patient classification in many countries [35,36]. The research results show that the average age of patients with cerebral ischemia is 65.30 years old, and the hospitalization cost of patients over 65 years old is 2.63 times that of patients aged 0-17 years old, which is in line with the natural pathological characteristics of the disease. Meanwhile, the elderly tend to have more comorbidities and complications because of the decline of physical function and thus spend more medical expenses [36]. The homogeneity of patients in the DRG group is a prerequisite for reasonable reimbursement by medical insurance. In order to ensure the homogeneity of patients in each group, it is necessary to carefully select categorical variables [37]. Through the cost ratio, it can be found that super length of stay (>30 days), comorbidities and complications, and discharge mode have a greater impact on hospitalization cost. The hospitalization cost of patients with super length of stay (>30 days) is significantly higher than that of patients with hospitalization days less than or equal to 30 days (CR = 4.23), which requires hospitals to follow evidence-based guidelines, strengthen clinical pathway management, and standardize similar patients' treatments and costs, thereby reducing the disease and economic burden of individual patients. Comorbidities and complications are important influencing factors in hospitalization costs. It can be seen from the grouping results that the standard cost of the second group (MCC, transferred to another hospital, death and others) is the highest, about 25,000 RMB yuan, which may be related to the increased difficulty of treatment due to the higher severity of the patients included in this grouping [18]. Discharge mode is also influenced to some extent by the severity of the disease; for example, patients who are transferred to another hospital on advice or who die tend to have more severe disease.
Computational cost grouping usually includes three methods: CART, CHAID, and E-CHAID, of which E-CHAID is an improved method of CHAID [19]. We found that E-CHAID has smaller model error and higher grouping performance than CART, which is consistent with other research results [38]. The results of the CV and the Kruskal-Wallis H test show that the cost grouping in this study is reasonable, and the grouping scheme is meaningful, which can provide a feasible method for the medical insurance department to improve the disease grouping scheme and pay for diseases [39].

Conclusion
Based on the CHS-DRG grouping scheme, this study analyzed the composition and influencing factors of medical costs for inpatients with cerebral ischemic diseases, and used the E-CHAID algorithm to group costs. This study further verifies the applicability of the CHS-DRG grouping scheme and helps to optimize the DRG grouping system. This study also provides a theoretical basis for cost control of cerebral ischemic diseases, which is beneficial to reduce the economic burden of patients, and provides suggestions for other developing countries to improve the disease diagnosis grouping system.

Limitations of the Study.
The advantage of this study lies in the large sample size and the data from a representative general hospital. However, this study also has certain limitations. Due to the availability of data, the data in this study are only from one hospital, and multicenter studies need to be added in the future to make the findings more generalizable. In addition, the hospitalization cost measured in this article is only a part of the direct economic burden [39], so the actual cost to the patient may be higher.

Data Availability
The data come from the homepage of medical records of inpatients with cerebral ischemia in a tertiary hospital in Sichuan Province, and the use of the data needs approval from the relevant departments of the hospital.

Consent
Since the data obtained after each patient's written consent to treatment were anonymous, patient consent was not required.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
JG contributed to the study design, data analysis, drafting the major portion of the manuscript, and article revisions. HCC contributed to the conceptual framework and article revisions. YW contributed to data collection and analysis. STH contributed to writing the manuscript. All authors critically reviewed the manuscript and approved the final version.