Can Artificial Intelligence Be Applied to Diagnose Intracerebral Hemorrhage under the Background of the Fourth Industrial Revolution? A Novel Systemic Review and Meta-Analysis

Aim We aimed to provide clinical evidence that artificial intelligence (AI) can assist doctors in the diagnosis of intracerebral hemorrhage (ICH). Methods Studies published in 2021 were identified through a literature search of PubMed, Embase, and Cochrane. The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool was used to assess study quality. The extracted measures of diagnostic performance were accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), and Dice scores (Dices). The pooled effect with its 95% confidence interval (95%CI) was calculated using the random effects model. I-square (I²) was used to test heterogeneity. To check the stability of the overall results, sensitivity analysis was conducted either by recalculating the pooled effect of the remaining studies after omitting the study with the highest quality or by switching from the random effects model to the fixed effects model. Funnel plots were used to evaluate publication bias. To reduce heterogeneity, we either recalculated the pooled effect of the remaining studies after omitting the study with the lowest quality or performed subgroup analysis. Results Twenty-five diagnostic tests of ICH by AI and by doctors, of overall high quality, were included. Pooled ACC, SEN, SPE, PPV, NPV, AUC, and Dices were 0.88 (0.83∼0.93), 0.85 (0.81∼0.89), 0.90 (0.88∼0.92), 0.80 (0.75∼0.85), 0.93 (0.91∼0.95), 0.84 (0.80∼0.89), and 0.90 (0.85∼0.95), respectively. There was no publication bias. All results were stable in sensitivity analysis and consistent with the outcomes of subgroup analysis. Conclusion Under the background of the fourth industrial revolution, AI might be an effective and efficient tool to assist doctors in the clinical diagnosis of ICH.


Introduction
The fourth industrial revolution emerged from digitization and big data analysis [1]. Its typical representatives are artificial intelligence (AI) and blockchain [2]. Medicine is no exception: more and more AI technologies and software are being applied, especially in medical imaging [3]. Stroke is a major cause of death and disability globally; in particular, hemorrhagic strokes (including intracerebral and subarachnoid hemorrhage) have a relatively stable age-adjusted incidence in high-income countries but an incidence that increases each year in low-income and middle-income countries [4]. Of the 15 million strokes reported worldwide annually, intracerebral hemorrhage (ICH) accounts for approximately 10% to 15% of all stroke cases in the United States, Europe, and Australia and approximately 20% to 30% of strokes in Asia [5]. The median 30-day mortality rate after ICH is approximately 15-50%, and only 20% of patients regain functional independence within three months after the ictus [6]. Therefore, ICH, as a stroke subtype with high mortality and poor functional outcome in survivors, requires accurate and objective neuroimaging evidence for a definite diagnosis [7]. Using AI to diagnose ICH from neuroimaging has recently become a trend that promotes the development of intelligent medicine and the efficiency of clinicians [8]. Beyond the economic interest and the development of the AI industry, however, there was no evidence in diagnostics that AI could assist doctors in practical clinical work. Because the AI industry is developing rapidly, we performed a novel systemic review and meta-analysis of recent diagnostic tests, which represent state-of-the-art AI technologies, to test the hypothesis that AI might be an effective and efficient tool for diagnosing ICH.

Inclusion Criteria.
(1) Language and region of articles were not restricted; (2) articles were published in 2021; (3) studies were diagnostic tests; (4) true-positive participants were patients with ICH; (5) true-negative participants were people without abnormal neuroimaging findings; (6) the gold standard was a diagnosis of ICH or no ICH made by professional physicians, blinded to the tests, according to the International Classification of Diseases and recent international guidelines; (7) fully automatic or semiautomatic diagnostic conclusions from AI technologies were compared with fully manual diagnostic conclusions from professional physicians; (8) the analysis or assessment of diagnostic performance was complete.

Quality Assessment.
The quality of the included articles was assessed with the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool in Review Manager 5.3 before data extraction. Among studies with the same QUADAS-2 assessment, we considered the study with the larger number of included patients to be of higher quality.

Data Extraction.
All original data used to assess diagnostic performance were extracted, including accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), and Dice scores (Dices). In addition, potential confounders that might introduce error were adjusted for, including differing diagnostic purposes, AI technologies, and other factors.
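As a reminder of how the extracted measures relate to each other, the following minimal Python sketch computes ACC, SEN, SPE, PPV, and NPV from a 2x2 diagnostic table and a Dice score from two sets of voxel indices. The class and function names and the counts are hypothetical and chosen only for illustration; they are not taken from any included study. AUC is omitted because it requires continuous scores rather than a single 2x2 table.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a 2x2 diagnostic table (hypothetical container)."""
    tp: int  # AI says ICH, reference standard says ICH
    fp: int  # AI says ICH, reference standard says no ICH
    fn: int  # AI says no ICH, reference standard says ICH
    tn: int  # AI says no ICH, reference standard says no ICH


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the five table-based measures; assumes no denominator is zero."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "ACC": (c.tp + c.tn) / total,
        "SEN": c.tp / (c.tp + c.fn),
        "SPE": c.tn / (c.tn + c.fp),
        "PPV": c.tp / (c.tp + c.fp),
        "NPV": c.tn / (c.tn + c.fn),
    }


def dice_score(pred: set, truth: set) -> float:
    """Dice = 2|A & B| / (|A| + |B|) for two sets of voxel indices."""
    if not pred and not truth:
        return 1.0  # two empty segmentations agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


# Made-up counts, for illustration only.
example = ConfusionCounts(tp=85, fp=10, fn=15, tn=90)
metrics = diagnostic_metrics(example)
overlap = dice_score({1, 2, 3, 4}, {3, 4, 5, 6})
```

Because PPV and NPV depend on the proportion of ICH cases in the sample, two studies with the same SEN and SPE can still report different predictive values.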

Statistical Analysis.
Relative numbers and their 95% confidence intervals (95%CI) were used to describe count data. Meta-analysis was performed using the corresponding modules in Stata (Software for Statistics and Data Science, version 15.1; College Station, Texas 77845, USA). The pooled effect with its 95%CI was calculated using the random effects model. I-square (I²) was used to test heterogeneity. Sensitivity analysis was performed to evaluate the stability of the overall results, either by recalculating the pooled effect of the remaining studies after omitting the study with the highest quality or by switching from the random effects model to the fixed effects model. Funnel plot symmetry and Egger's regression were used to evaluate publication bias. To reduce heterogeneity, we either recalculated the pooled effect of the remaining studies after omitting the study with the lowest quality or performed subgroup analysis. All p values were two-sided, with a significance level of 0.05.
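The pooling, heterogeneity, and sensitivity steps described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes inverse-variance weighting with the DerSimonian-Laird estimator of between-study variance, which is one common random effects method; the paper does not state which estimator the Stata modules used. The function names and the example numbers are hypothetical.

```python
import math


def pool(effects, variances, model="random"):
    """Inverse-variance pooling.

    Returns (estimate, ci_low, ci_high, i2_percent).
    model="fixed" ignores between-study variance; model="random" adds
    the DerSimonian-Laird estimate of it (an assumed choice).
    """
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    # Cochran's Q and I^2 describe between-study heterogeneity.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    tau2 = 0.0
    if model == "random":
        # DerSimonian-Laird moment estimator, truncated at zero.
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    est = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return est, est - 1.96 * se, est + 1.96 * se, i2


def leave_one_out(effects, variances, drop_index, model="random"):
    """Re-pool after omitting one study (e.g. the highest-quality one)."""
    kept = [i for i in range(len(effects)) if i != drop_index]
    return pool([effects[i] for i in kept],
                [variances[i] for i in kept], model)


# Made-up study-level sensitivities and variances, for illustration only.
effects = [0.60, 0.90]
variances = [0.001, 0.001]
random_result = pool(effects, variances, model="random")
fixed_result = pool(effects, variances, model="fixed")
```

With heterogeneous inputs such as these, the random effects interval is much wider than the fixed effects interval, which is why switching models is a meaningful stability check.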

Literature Search and Study Characteristics.
In total, 142 articles were retrieved from 3 databases according to the search strategy. After screening against the inclusion and exclusion criteria, 25 articles of diagnostic tests were ultimately enrolled (Figure 1). A total of 23071 ICH patients participated across all tests; they were manually diagnosed by professional physicians according to the gold standard of ICH diagnosis in the latest international clinical guidelines. Several studies (… Katsuki) included two independent data extractions. Lu Li's study separated hematoma volume into "big" and "small" groups and studied them independently. Yu Lei's study examined the risk of ICH and the occurrence of ICH independently. Stefan Pszczolkowski's study had two independent aims: detection of ICH and prediction of prognosis in ICH patients. Masahito Katsuki was the first author of 2 different articles.
Figure 1: Process of literature search (flowchart; one node recoverable from the source: articles from Embase, n=116).
Table 1: Characteristics of the included studies ("a" indicates that 2 categories of hematoma volume were studied independently in one study; "b" indicates that 2 ICH outcomes were studied independently in one study; "c" indicates that 2 aims were studied independently in one study; "*" indicates that the same first author performed another study).

Quality Assessment of Studies.
The assessment of article quality via QUADAS-2 is shown in Figure 2.

Publication Bias and Sensitivity Analysis.
There was a symmetrical distribution in the funnel plots (Figure 4). In sensitivity analysis, after the study with the highest quality … (Table 2).
However, heterogeneity was still high. We considered that the different aims of the studies might be another source. Therefore, we performed subgroup analysis of ICH detection, ICH segmentation, ICH prediction, and hematoma enlargement (…).

Discussion
We performed a novel systemic review and meta-analysis based on studies of generally high quality. In the overall meta-analysis, the diagnostic performance of AI was ACC > 0.83, Dices > 0.85, AUC > 0.80, SEN > 0.81, SPE > 0.88, PPV > 0.75, and NPV > 0.91 (the lower bounds of the pooled 95%CIs), and these results were stable in sensitivity analysis. This may indicate relatively high agreement and similarity with fully manual diagnostic conclusions (ACC, Dices), relatively high authenticity of the diagnostic conclusions (AUC), relatively low rates of missed diagnosis and misdiagnosis (SEN, SPE), relatively high accuracy in identifying true ICH patients among people at risk of ICH (PPV), and high accuracy in confirming the absence of ICH in healthy people (NPV). In the subgroup analysis by study aim, however, although the great majority of outcomes agreed with the overall pooled effects, some outcomes were invalid. The AUC of ICH detection was in the range of 0.64 to 1.10, which meant that AI might lack authenticity in detecting ICH. The ACC of ICH segmentation was in the range of 0.37 to 1.33, which meant that its agreement with fully manual diagnostic conclusions might be controversial. For these two purposes, we considered that the identification of hematoma lesions by AI might be affected by the fuzzy boundary between edema and hematoma during hematoma absorption or in the neuroimaging of small hematoma lesions. The NPV of ICH prediction was in the range of 0.62 to 1.08, which meant that AI might not reliably confirm true ICH patients without certain prognostic outcomes. In this case, we considered that subjectivity, which is unique to humans, might be a confounding factor, because AI operates on the binary system or other algorithmic languages and is therefore absolutely objective. Classification is usually involved in assessing prognosis in clinical work. Hence, when dealing with the boundary between two grades, AI might not make decisions as flexibly as humans, which might be an inherent defect of AI. Overall, however, our results resembled the conclusion of a published meta-analysis that AI was effective in detecting brain metastasis [34].
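The three subgroup ranges that extend above 1 can be unpacked with simple arithmetic. The sketch below assumes that each range is a symmetric 95% interval of the form estimate ± 1.96 × SE on the untransformed scale; this is an assumption, since the interval formula is not stated in the text. Under that assumption, the midpoint gives the implied point estimate and the half-width gives the implied standard error. The function name is hypothetical.

```python
def implied_estimate_and_se(lower, upper, z=1.96):
    """Back out the point estimate and SE from a symmetric interval.

    Assumes the interval was built as estimate +/- z * SE (an assumption).
    """
    estimate = (lower + upper) / 2
    se = (upper - lower) / (2 * z)
    return estimate, se


# Ranges quoted in the Discussion for the three subgroup outcomes.
ranges = {
    "AUC, ICH detection": (0.64, 1.10),
    "ACC, ICH segmentation": (0.37, 1.33),
    "NPV, ICH prediction": (0.62, 1.08),
}
implied = {name: implied_estimate_and_se(lo, hi)
           for name, (lo, hi) in ranges.items()}
for name, (est, se) in implied.items():
    print(f"{name}: estimate ~ {est:.2f}, SE ~ {se:.3f}")
```

Read this way, the implied point estimates (about 0.85 to 0.87) are close to the overall pooled values, and the upper bounds above 1 would reflect large standard errors in these subgroups rather than estimates that themselves exceed 1.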
Our meta-analysis also has limitations. We selected only articles published in 2021 because we considered that recent AI technologies might have remedied earlier defects, which would reduce heterogeneity; this restriction might have influenced the results. As in the published meta-analysis of AI used in the prevalence and diagnosis of neurological disorders [35], significant heterogeneity was noted in our study, possibly for the following reasons: (1) The AI models used in the included studies differed, as did their operating mechanisms and databases. (2) The research objectives differed, including detection of ICH, segmentation of ICH in neuroimaging, prediction of prognosis, and hematoma enlargement in ICH patients. (3) In a few of the included studies, the ICH patients had not only intraparenchymal hemorrhage but also intraventricular, subdural, or subarachnoid hemorrhage. (4) The measures used to assess diagnostic performance could influence one another. (5) Sample sizes differed starkly across studies.
In our opinion, although AI as a medical tool will bring great commercial profit to its designers and make the clinical work of doctors more efficient, whether AI systems can be used to diagnose ICH still requires more evidence from cross-regional, multicenter studies with large sample sizes. The objective and accurate delineation of hematoma, perihematomal edema, infarct focus, and normal tissue, especially during hematoma absorption and the development of perihematomal edema, is the key for AI to analyze neuroimaging data of ICH. Moreover, when designers and researchers construct databases for machine learning, potential problems may arise: the etiological classification of ICH may be ambiguous, and the choice of research indicators or dependent variables may not be comprehensive enough. Addressing these defects is closely tied to the continuous optimization of clinical guidelines for ICH. Therefore, while AI continues to be updated, more evidence from high-quality and authoritative clinical research is the real basis for developing its clinical applications.

Conclusion
Under the background of the fourth industrial revolution, AI might be an effective and efficient tool to assist doctors in the clinical diagnosis of ICH.

Data Availability
All data analyzed during this study are included in this published article.

Consent
Not applicable.