Rheumatologic diseases manifest themselves in varying combinations of symptoms and signs, particularly at early stages, and therefore make differential diagnosis a challenge, especially for nonrheumatologists such as general practitioners. Since diagnosis at an early stage and adequate treatment improve prognosis, assistance in establishing the diagnosis is desirable. Given the substantial progress in computer science in recent years, the idea of computers taking on the role of diagnostic support is not far-fetched. Software applications have already affected decision processes in clinical routine, for example, in controlling depth of anesthesia [
Common methodologies for expert systems.
Typical structure of a knowledge-based expert system. Based on Buchanan [
Pandey and Mishra distinguished between knowledge-based systems and intelligent computing systems [
The approaches to intelligent computing systems are artificial neural networks, genetic algorithms, and fuzzy systems. Artificial neural networks are modeled on biological nervous systems and are regarded as capable of learning [
Bayes’ theorem is a statistical method: the probability of a diagnosis is calculated from the accuracy of a test or a clinical finding and the prevalence of the disease [
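As a brief illustration of the principle (a generic sketch with made-up numbers, not drawn from any of the reviewed systems), the post-test probability of a disease given a positive finding can be computed from the finding's sensitivity and specificity together with the disease prevalence:

```python
def post_test_probability(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(disease | positive finding).

    P(D|+) = P(+|D) * P(D) / [P(+|D) * P(D) + P(+|not D) * P(not D)],
    where P(+|D) = sensitivity and P(+|not D) = 1 - specificity.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative numbers only: a finding with 80% sensitivity and 95%
# specificity for a disease with 1% prevalence.
print(round(post_test_probability(0.01, 0.80, 0.95), 3))  # 0.139
```

The example shows why prevalence matters: despite good test characteristics, a rare disease still yields a modest post-test probability.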
Different methodologies are often combined, which are then called hybrid expert systems [
As early as 1959, Ledley and Lusted anticipated the use of computers in supporting decisions and proposed different mathematical models to emulate the reasoning in medical diagnosis [
Several somewhat outdated review articles explored the development and application of expert systems in medicine in general [
Besides providing an overview of the characteristics, comprehensiveness, and validation of existing diagnostic expert systems in rheumatology, this systematic review seeks to determine whether current expert systems fulfill the expectations of clinicians in daily practice and, finally, what the characteristics of an optimal system would be.
The systematic literature review was carried out following the PRISMA statement [
In the optimal scenario we anticipated finding comprehensive reports on each individual diagnostic expert system, including information on the precise diagnostic algorithm, the targeted diseases, a well-described validation cohort, and predictive values for diagnostic performance. Such data would allow for a statistical comparison of the expert systems. In a suboptimal scenario, only descriptive reports of expert systems would be found, allowing for a comprehensive overview of past developments without statistical comparability.
Medline, Embase, and the Cochrane Library were searched using the following Medical Subject Headings (MeSH) terms: “rheumatic diseases,” “rheumatology,” “arthritis,” “computer assisted diagnosis,” and “expert systems.” No restrictions were placed on publication date. Only literature in English or German was considered. The last search was run on February 10, 2014.
The literature was screened based on the title and abstract of the records. All publications referring to diagnostic expert systems in rheumatology or in a rheumatic subfield were included. Reviews, editorials, and literature describing an expert system used only for the education of healthcare providers and therefore not for diagnostics were excluded. Also excluded was literature referring to an expert system used solely for identifying the stage of a disease rather than diagnosing the disease itself. Records describing an expert system applied only to image analysis were not considered either. Literature referring to data-mining strategies using index diagnoses or solely epidemiological variables was excluded as well. Figure
Selection of publications.
Year of the last update of the system, number of considered rheumatic diseases, targeted diseases, input information for the expert systems (history, clinical exam, laboratory analyses, and imaging studies), methodology of the inference mechanism, and embedding of accepted disease criteria sets such as those of the American College of Rheumatology (ACR) or the European League Against Rheumatism (EULAR) were extracted using standard forms.
For the description of the validation method and the performance, the following information was extracted from the articles: number of cases used for the validation, determination of the resulting diagnosis, identification of the correct diagnosis, the reference diagnosis, percentage of correctly identified cases, sensitivity and specificity, positive predictive values, negative predictive values, positive likelihood ratio, and negative likelihood ratio.
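The performance measures listed above all derive from a 2×2 contingency table. As a hedged illustration with made-up counts (not data from any reviewed system), they can be computed as follows:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic performance measures from a 2x2 contingency
    table of true/false positives and negatives."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),  # % correctly identified
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,  # negative likelihood ratio
    }

# Made-up counts for illustration: 45 true positives, 10 false positives,
# 5 false negatives, 90 true negatives.
m = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
print(m["sensitivity"], m["specificity"])  # 0.9 0.9
```

Note that predictive values, unlike sensitivity and specificity, depend on the case mix of the validation cohort, which is one reason why results from differently composed cohorts are hard to compare.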
Only descriptive statistics are reported. Statistical analyses could not be performed due to the lack of information.
A total of 10,282 references were identified using the search strategy. Seventy-three articles related to diagnostic expert systems in rheumatology were included, and nine duplicates were excluded. The remaining 64 full-text articles were then assessed. One record describing an expert system developed solely for education [
Table
Characteristics of the identified expert systems.
Name of ESa or first author | Year of last update | Number of diseases | Targeted diseases | Input for ESa | Methodology | Reference |
---|---|---|---|---|---|---|
Romano | 2009 | 2 | Prosthesis infection | Lb, Ic | Calculation tool | [ |
Watt | 2008 | 1 | Knee osteoarthritis | Hd, Ee, Ic | Bayesian belief network | [ |
Provenzano | 2007 | 3 | Chronic pain | Hd | Discriminant analysis | [ |
Binder | 2005 | 5 | Connective tissue diseases | Lb | Case based reasoning | [ |
Liuf | 2004 | 1 | RAg | Hd, Lb | Algorithm | [ |
Lim | 2002 | 24 | Arthritic diseases | | Hierarchical fuzzy inference | [ |
CADIAGf | 2001 | 170 | Rheumatic diseases | Hd, Ee, Lb, Ic | Rule based, fuzzy sets | [ |
RENOIRf | 2001 | 37 | Rheumatic diseases | Hd, Ee, Lb, Ic | Rule based, fuzzy sets | [ |
RHEUMexpert | 1999 | | Rheumatic diseases | Hd, Ee, Lb, Ic | Rule based | [ |
Zupan | 1998 | 8 | Rheumatic diseases | Hd | Rule based | [ |
AI/RHEUM | 1998 | 59 | Rheumatic diseases | Hd, Ee, Lb, Ic | Rule based | [ |
Dzeroski | 1996 | 8 | Rheumatic diseases | Hd | Rule based and statistical | [ |
Hellerf | 1995 | 6 | Vasculitis | Hd, Ee, Lb | Bayesian classifier | [ |
Astion | 1994 | 1 | Giant cell arteritis | Hd, Ee, Lb | Neural networks | [ |
Barreto | 1993 | 2 | RAg and SLEh | Hd, Ee, Lb, Ic | Neural networks, fuzzy sets | [ |
MESICAR | 1993 | | Rheumatic diseases | | Model based | [ |
RHEUMA | 1993 | 67 | Rheumatic diseases | Hd, Ee, Lb, Ic | Rule based | [ |
Bernelot Moens | 1992 | 15 | Rheumatic diseases | Hd, Ee, Lb, Ic | Bayes’ Theorem | [ |
Sereni | 1991 | 1 | Temporal arteritis | Hd, Ee, Lb | Bayes’ Theorem, decision tree | [ |
Rigby | 1991 | 1 | RAg | Hd, Ee | Bayesian and logistic regression | [ |
Schewef | 1990 | 32 | Knee pain | Hd | Rule based | [ |
Prust | 1986 | 2 | Ankylosing spondylitis and SLEh | Hd, Ee | Scoring tool | [ |
Gini | 1980 | 7 | Arthritic diseases | Hd | Rule based | [ |
Dostál | 1972 | 1 | RAg | Hd | Bayes’ Theorem | [ |
Fries | 1970 | 35 | Arthritic diseases | Hd | Statistical | [ |
Table
Validation of the identified expert systems.
Name of ESa or first author | Number of cases used for validation | Percentage of diagnoses correct | Sensitivity | Specificity | Reference |
---|---|---|---|---|---|
Romano | 32 | | | | [
Watt | 200 | 100% | | | [
Provenzano | 511 | 22.9–69.7%b | | | [
Binder | 325 | 82.6% | 93.2% | | [
Liu | 90 | 95% | 100% | 88% | [
Lim | No validation | | | | [
CADIAGd | 54 | 48%e | | | [
RENOIRd | 32 | 75% | | | [
RHEUMexpert | 252 | 32–77%f | 70–73%f | | [
Zupan | 462 | 46.8% | | | [
AI/RHEUMd | 94 | 80% | | | [
Dzeroski | 462 | 47.2–50.9%b | | | [
Heller | 12,000 computer-simulated cases | 84.15–99.9%f | | | [
Astion | 807 | 94.4% | 91.9% | | [
Barreto | No validation | | | | [
MESICAR | No validation | | | | [
RHEUMA | 51 | 89%e | | | [
Bernelot Moensd | 570 | 76%/80%b | 62% | 98% | [
Sereni | 341 | | | | [
Rigby | No validation | | | | [
Schewe | 358 | 74.4% | | | [
Prust | No validation | | | | [
Gini | No validation | | | | [
Dostál | 553 | 80% | | | [
Fries | 190 | 76% | | | [
The chosen reference standards varied: diagnoses according to established criteria, consensus diagnoses, discharge diagnoses, and diagnoses provided by a rheumatologist were used most often as reference. Three expert systems specified criteria for determining the resulting diagnosis when several diagnoses were returned or when a probability value was attached to a diagnosis. Table
Reference diagnoses and the determinations of the resulting diagnoses.
Name of ES or first author | Reference diagnosis | Determination of the resulting diagnosis | Reference |
---|---|---|---|
Watt | NIH Osteoarthritis Initiative database | | [
Binder | Diagnosis according to established criteria | | [
Liu | Consensus of rheumatologists | | [
CADIAG | Discharge diagnosis | Among first 5 hypotheses | [
RENOIR | Discharge diagnosis | | [
RHEUMexpert | Discharge diagnosis | | [
AI/RHEUM | Initial diagnosis of a rheumatologist | At the possible level | [
Astion | Vasculitis database of the American College of Rheumatology | | [
RHEUMA | Discharge diagnosis | | [
Bernelot Moens | Outcome over time and consensus of rheumatologists | | [
Sereni | Biopsy | | [
Schewe | | In the hypotheses list | [
Dostál | Diagnosis provided by a rheumatologist | | [
Fries | Diagnosis provided by a rheumatologist | | [
No article reporting on the applicability of a rheumatological expert system in clinical routine could be identified in the published literature.
The main result of this systematic review is threefold. First, it provides an overview of 25 different diagnostic expert systems designed for rheumatology. Second, it shows that the differing designs and validation methods of the expert systems hinder a comparison of their performances. Third, we found no publications reporting on the routine application of an expert system in rheumatology.
Artificial intelligence has made enormous progress, and computers have outclassed human beings in various fields, as in chess or IBM’s Watson winning the quiz show “Jeopardy!” Given this technological progress and the period of more than forty years covered by this systematic review, the low number of identified expert systems is surprising. The reason may be either low interest in supportive software or, more likely, the difficulty of simulating the complex human diagnostic process. Spreckelsen et al. [
Nevertheless, the growing understanding of diseases and the corresponding findings or symptoms will facilitate the representation of medical knowledge and decision processes in the future.
As a consequence of the variation in validation methods, the reported validation results could not be compared with each other. This variability probably stems from two factors.
First, the result of the expert system to be compared with the reference diagnosis was presented in different ways. Some expert systems indicated a probability value for the calculated diagnosis, while others presented a list of hypotheses. Final diagnoses in rheumatology often remain descriptive or incomplete and evolve over time, as many rheumatic disorders present atypically and do not completely fulfill a diagnostic criteria set at the beginning. Presenting the results as a hypotheses list or as probability values accommodates this and can, as an important advantage, broaden the user’s own differential diagnosis and lead to more focused testing. Yet this method causes difficulties in the validation and comparison of expert systems. For example, the diagnostic accuracy is erroneously high if a diagnosis at a low position in the hypotheses list, or a diagnosis with a low probability value, is accepted as correct during the validation process.
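This inflation effect can be sketched with a small, entirely hypothetical example: counting a case as correct whenever the reference diagnosis appears anywhere in the ranked hypotheses list yields a higher accuracy than counting only top-ranked matches.

```python
def accuracy_at_k(hypothesis_lists, reference_diagnoses, k):
    """Fraction of cases whose reference diagnosis appears among the
    first k entries of the system's ranked hypotheses list."""
    hits = sum(ref in hyps[:k]
               for hyps, ref in zip(hypothesis_lists, reference_diagnoses))
    return hits / len(reference_diagnoses)

# Hypothetical validation cases: ranked hypotheses vs. reference diagnosis.
cases = [
    (["RA", "SLE", "OA"], "RA"),
    (["OA", "RA", "gout"], "RA"),
    (["SLE", "RA", "OA"], "gout"),
    (["gout", "OA", "RA"], "gout"),
]
hyps, refs = zip(*cases)
print(accuracy_at_k(hyps, refs, 1))  # 0.5  (strict: top hypothesis only)
print(accuracy_at_k(hyps, refs, 3))  # 0.75 (lenient: anywhere in the list)
```

Two validations reporting "percentage correct" can therefore differ substantially depending solely on the acceptance rule, not on system quality.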
Second, there is a lack of widely accepted reference standards for the correct diagnosis against which to compare the resulting diagnosis. Some authors used diagnoses in medical records or discharge diagnoses as a comparator, assuming the correctness of their peers; some chose the consensus of rheumatologists; and others used diagnoses according to official diagnostic criteria sets. The latter is probably the most reliable approach; however, even where international consensus criteria exist, there are still many different criteria sets, especially for rare diseases, where the superiority of one set over another and in particular the threshold for a diagnosis remain a matter of debate. In addition, many of these criteria sets were established to obtain homogeneous cohorts in clinical trials, leading to a low sensitivity in early or mild disease.
Another approach was the assessment of the interobserver variability by Hernandez et al. [
The transferability of expert systems to the general population (the external validity) can be tested with a validation in a developer-independent clinical setting. Only AI/RHEUM, CADIAG, and RHEUMA [
Besides the internal and external validity, the following features are, according to Kawamoto et al., highly associated with an expert system’s ability to improve clinical practice: the availability at the time and location of decision making, the integration into clinical workflow, and the provision of recommendations rather than a pure assessment [
Kolarz et al. [
The reasons for the absence of expert systems from clinical use have been discussed in detail in the literature. Mandl and Kohane argued that health information technology in general lagged behind other industries. They also regarded health information technology products as too specialized and incompatible with each other [
Even with computerized assistance, the user of an expert system needs rheumatologic fundamentals for the detection and correct description of rheumatologic findings. CADIAG, AI/RHEUM, RENOIR, RHEUMexpert, and MESICAR were specifically developed to assist nonrheumatologists [
The integration of widely accepted diagnostic criteria sets such as the ACR or EULAR criteria into the diagnostic process would increase the acceptance and credibility of an expert system. It would also reduce the influence of the developers’ individual diagnostic strategies. Nevertheless, only six of the identified expert systems reported the integration of such criteria sets into their expert database. The downside of diagnostic criteria originating primarily from classification criteria for inclusion in clinical trials, however, is their generally low sensitivity in early disease. This insensitivity of some criteria, such as the 1987 ARA criteria for rheumatoid arthritis, led Leitich et al. to modify the criteria using fuzzy sets to obtain different levels of sensitivity [
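The principle behind such a fuzzy modification can be sketched as follows (a hypothetical illustration, not the actual formulation by Leitich et al.): instead of a crisp yes/no threshold, a criterion receives a graded membership value between 0 and 1.

```python
def fuzzy_membership(value, low, high):
    """Piecewise-linear membership function: 0 below `low`, 1 above
    `high`, linear in between -- replacing a crisp yes/no cutoff."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

# Hypothetical grading of morning stiffness duration in minutes: a crisp
# criterion with a >= 60 min cutoff scores 45 min as 0; the fuzzy version
# assigns it partial fulfillment.
for minutes in (20, 45, 90):
    print(minutes, fuzzy_membership(minutes, 30, 60))
```

Summing such graded values across criteria, rather than counting binary hits, is what allows the sensitivity of a criteria set to be tuned for early or mild disease.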
Although a thorough systematic search was performed in the most relevant databases, some reports may have been missed if written in languages other than English or German. As most of the current literature is published in English at least in abstract form, we are confident that we did not miss relevant articles on diagnostic expert systems in rheumatology. The number of expert systems that have remained unpublished because of expected commercial use or abandonment at an early stage is hard to estimate.
The reported expert systems showed great variety in disease spectrum, methodology, and validation status, which made a statistical comparison of the systems impossible.
Finally, patient-reported outcomes, which are of increasing importance not only in clinical trials and patient follow-up but also in the diagnostic process, were beyond the scope of this review.
In conclusion, this systematic review shows that the many attempts at an ideal expert system in rheumatology over the past decades have not yet resulted in convincingly validated tools allowing reliable application in daily practice. Nevertheless, the demand for support by expert systems is pressing, as knowledge about rheumatic diseases increases and the therapeutic options, especially in early disease stages, grow constantly. An ideal diagnostic expert system in rheumatology would have the following characteristics. The expert system would allow universal integration into the clinical workflow as well as rapid and intuitive data input. Since rheumatologic diagnoses cannot always be definite, the resulting diagnosis would carry a probabilistic grade to indicate uncertainty. The system would also have an educational component to improve the nonexpert’s ability to recognize pathological findings. Finally, accepted diagnostic criteria sets would be applied to increase the general validity of the system’s diagnostic process.
Given the demand for such a tool and the progress made so far, it seems to be only a matter of time until new and promising expert systems enter clinical practice.
The authors declare that there is no conflict of interests regarding the publication of this paper.
All authors have made substantial contribution to the analysis and interpretation of the data, have been involved in the revising of the manuscript, and have given final approval of the version to be published. Hannes Alder and Lukas M. Wildi were responsible for conception, design, and drafting of the manuscript, and Hannes Alder performed the data acquisition.
This study was supported by the University Hospital of Zurich, Zurich, Switzerland.