Accuracy of Ultrasound Diagnosis of Benign and Malignant Thyroid Nodules: A Systematic Review and Meta-Analysis

Background Distinguishing between benign and malignant thyroid nodules remains difficult. Ultrasound is an established, non-invasive, and relatively simple imaging technique for thyroid nodules. This study aimed to assess the diagnostic accuracy of conventional ultrasound and ultrasound elastography for differentiating benign from malignant thyroid nodules by meta-analyzing published studies. Methods Literature was retrieved from the PubMed and Embase databases from inception to May 31, 2022, and screened using predefined inclusion and exclusion criteria. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) scale was used to assess the quality of the included literature. Publication bias was assessed using Deeks' funnel plot. Heterogeneity was tested using Cochran's Q statistic and the I² statistic. Results Nine articles were included. The meta-analysis showed that the pooled sensitivity and specificity of ultrasound for the diagnosis of thyroid nodules were 0.88 [95% CI (0.83–0.91)] and 0.86 [95% CI (0.79–0.90)], respectively. The area under the curve (AUC) of the summary receiver operating characteristic (SROC) curve was 0.92 [95% CI (0.90–0.94)]. There was no significant publication bias. Discussion Existing evidence indicates that ultrasound has good accuracy in differentiating benign from malignant thyroid nodules, providing a scientific basis for thyroid assessment and diagnosis.


Introduction
Thyroid nodules are cystic or solid lumps that are most frequently asymptomatic. Nonetheless, large thyroid nodules have been shown to interfere with normal cardiovascular and respiratory function [1,2]. Pathologically, thyroid nodules are dichotomized into benign and malignant nodules. In general, most benign thyroid nodules are small, cause mild symptoms, and have favorable treatment outcomes. Therefore, accurate and effective determination of the nature of a nodule is beneficial for clinical treatment planning and assessment of outcomes [3][4][5][6]. Currently, the clinical techniques used to distinguish benign from malignant thyroid nodules mainly include ultrasound, computed tomography, and nuclear imaging. Traditional ultrasonography is widely used in clinical practice because it is safe, inexpensive, readily available, and free of radiation exposure [7][8][9]. On ultrasound, malignant thyroid nodules typically show an irregular shape, ill-defined margins, heterogeneous echotexture, calcification, hypoechogenicity, and an aspect ratio greater than 1. However, conventional ultrasound is limited in the diagnosis of malignancy in small thyroid cancers, multiple nodules, and cystic nodules with internal hemorrhage. In addition, some thyroid nodules are not obvious on ultrasound imaging. Thus, several studies have concluded that traditional ultrasound imaging techniques cannot fully meet the needs of current clinical practice [10][11][12].
Ultrasound elastography, a newly developed dynamic imaging technique, was first proposed by Ophir et al. in 1991 [13] and first applied to thyroid clinical practice by Lyshchik et al. in 2005 [14]. Subsequently, in 2010, Sebag et al. first reported the use of shear-wave elastography (SWE) to diagnose thyroid nodules [15]. In recent years, emerging studies have shown that ultrasound elastography is highly sensitive for differentiating benign from malignant thyroid nodules and should serve as the first-line imaging modality for patients with thyroid nodules [16,17]. Therefore, this study evaluated the diagnostic accuracy of conventional ultrasound and ultrasound elastography for the differentiation between benign and malignant thyroid nodules by meta-analyzing published studies.

Literature Source.
Electronic databases, including PubMed and Embase, were searched from inception to May 31, 2022. Keywords used for searching included ultrasonography and thyroid nodule. A combination of medical subject headings and free words was used to search for relevant publications. The retrieved literature was checked manually and managed with EndNote X9.

Inclusion and Exclusion Criteria of Literature.
Studies that met the following criteria were included: (1) the study evaluated the diagnostic utility of conventional ultrasound or ultrasound elastography in patients with thyroid nodules; (2) pathological biopsy was used as the "gold standard" for determining whether the thyroid nodule was benign or malignant; and (3) the numbers of true positives, false positives, false negatives, and true negatives could be obtained directly or indirectly from the study. The exclusion criteria were as follows: (1) guidelines, reviews, meeting reports, meta-analyses, and other non-original articles; (2) repeated publication; and (3) incomplete data.

Literature Screening, Data Extraction, and Quality Evaluation.
Literature retrieval, screening, and data extraction were completed by two researchers independently. The two researchers prepared standardized tables to extract data from the included literature, including author, study period, country, and study type. Patient data were recorded, including the total number of cases, the diagnostic reference standard, and the number of thyroid nodules. The numbers of true positives, false positives, true negatives, and false negatives were also extracted from the included studies. The quality of the included literature was evaluated with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) scale. The two researchers cross-checked the quality assessment results. Any disagreement was resolved by consultation and discussion, and the joint judgment prevailed.

Statistical Analysis.
Stata V 15.0 software was used for statistical analysis. The pooled effect sizes, including sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio, were obtained. Diagnostic capability was evaluated by drawing the summary receiver operating characteristic (SROC) curve. A larger area under the curve (AUC) indicates higher diagnostic accuracy. Heterogeneity was tested using the I² statistic. P < 0.05 or I² > 50% indicated high heterogeneity; P > 0.1 or I² < 25% indicated low heterogeneity; and 25% ≤ I² ≤ 50% indicated moderate heterogeneity. If inter-study heterogeneity was high, a random-effects model was used for meta-analysis; otherwise, a fixed-effect model was used. Publication bias was assessed using Deeks' funnel plot. A two-sided P value < 0.05 denoted statistical significance.
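To make the effect sizes listed above concrete, the following minimal Python sketch computes them for a single study from its 2 × 2 table. It is illustrative only and is not the Stata procedure used in this study: the class and function names and the example counts are hypothetical, and the 0.5 continuity correction for zero cells is a common convention assumed here rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical container)."""
    tp: float  # true positives
    fp: float  # false positives
    fn: float  # false negatives
    tn: float  # true negatives


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, LR+, LR-, and DOR for one 2x2 table."""
    tp, fp, fn, tn = t.tp, t.fp, t.fn, t.tn
    # Assumed convention: add 0.5 to every cell if any cell is zero,
    # so that the ratios below stay finite.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = tp + 0.5, fp + 0.5, fn + 0.5, tn + 0.5

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = diagnostic_metrics(TwoByTwo(tp=88, fp=14, fn=12, tn=86))
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Pooled estimates in a meta-analysis are not simple averages of these per-study values; they come from a model fitted across studies, which this sketch does not attempt.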

Literature Search Results.
After the preliminary search, 480 studies were retrieved. According to the inclusion and exclusion criteria, 43 duplicate studies were excluded. After reading the title and abstract, 323 obviously unrelated studies were excluded. A total of 32 publications were downloaded and read for the full text. Finally, 9 studies were included, as shown in Figure 1.

Basic Characteristics of Included Articles.
All 9 included articles were English-language publications, comprising 7 prospective single-center studies, 1 prospective multicenter study, and 1 retrospective study. A total of 1436 nodules were included, of which 1006 were benign and 430 were malignant, as shown in Table 1.

Quality Evaluation of Included Studies.
The QUADAS-2 scale was used to evaluate the quality of the 9 included articles (Figure 2). All included articles were at low risk of bias.

Heterogeneity Test.
All included studies were tested for heterogeneity.
There was significant inter-study heterogeneity (I² = 70%) (Figure 3), so the random-effects model was used for pooled analysis.
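As a worked illustration of the heterogeneity rule, the sketch below computes Cochran's Q and I² from per-study effect estimates and applies the I² thresholds given in the statistical analysis section. It is a simplified univariate sketch, not the Stata routine used here; the function names are hypothetical, and the inputs are assumed to be effect estimates on a suitable scale (for example, log diagnostic odds ratios) with their variances.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, I2 in percent) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of total variation attributed to heterogeneity,
    # truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def classify_heterogeneity(i2):
    """Apply the I2 cut-offs stated in the statistical analysis section."""
    if i2 > 50:
        return "high"
    if i2 < 25:
        return "low"
    return "moderate"


def choose_model(i2):
    """High heterogeneity selects the random-effects model."""
    return "random-effects" if classify_heterogeneity(i2) == "high" else "fixed-effect"


# The reported I2 of 70% falls in the "high" band.
print(classify_heterogeneity(70), choose_model(70))
```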

Consolidation Analysis.
The effect quantities of all included studies were statistically analyzed. The combined sensitivity and specificity were 0.

Fagan Nomogram Analysis.
A pre-test probability of 50% was assumed to simulate the clinical situation. The results showed that the post-test probability after a positive test result was 86%, while the negative likelihood ratio was 0.14 and the post-test probability after a negative test result was 1% (Figure 9).
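The post-test probabilities read from a Fagan nomogram follow from Bayes' theorem in odds form: post-test odds = pre-test odds × likelihood ratio. A minimal Python sketch of this conversion is given here for illustration. The function name is hypothetical, and the positive likelihood ratio is derived from the pooled sensitivity (0.88) and specificity (0.86) reported in the abstract, which is an assumption about the nomogram input rather than a value stated in this section.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumed input: LR+ = sensitivity / (1 - specificity), from the pooled estimates.
lr_positive = 0.88 / (1.0 - 0.86)
p_positive = post_test_probability(0.50, lr_positive)
print(f"Post-test probability after a positive test: {p_positive:.0%}")
```

The same function accepts a negative likelihood ratio to obtain the probability of malignancy after a negative test.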

Meta-Regression and Subgroup Analysis.
There was no significant difference in specificity between articles from China and those from other countries (P = 0.28). Sensitivity was significantly different between studies in the Chinese group at 0. , respectively. There were significant differences in both sensitivity and specificity (P < 0.05). The diagnostic method was a potential source of heterogeneity. The results are shown in Table 2 and Figure 10.

Publication Bias.
The results of publication bias detection are shown in Figure 11. The P value for the slope coefficient of Deeks' funnel plot was 0.17, indicating no significant publication bias in the included studies.
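For readers unfamiliar with the test behind the funnel plot, the sketch below outlines Deeks' asymmetry regression: the log diagnostic odds ratio of each study is regressed on 1/√ESS, weighted by the effective sample size (ESS). This is a minimal pure-Python sketch, not the Stata implementation used in this study; the function names and the 2 × 2 counts are hypothetical and do not come from the included articles, and the 0.5 correction for zero cells is an assumed convention. The P value would be obtained from a t distribution with k − 2 degrees of freedom (k studies), which is left to a statistics package.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio; 0.5 is added to all cells if any is zero."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = tp + 0.5, fp + 0.5, fn + 0.5, tn + 0.5
    return math.log((tp * tn) / (fp * fn))


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (slope, intercept, se_slope)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    se_slope = math.sqrt(rss / (len(x) - 2) / sxx)
    return slope, intercept, se_slope


def deeks_regression(studies):
    """studies: iterable of (tp, fp, fn, tn) tuples, one per study."""
    x, y, w = [], [], []
    for tp, fp, fn, tn in studies:
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1.0 / math.sqrt(ess))
        y.append(log_dor(tp, fp, fn, tn))
        w.append(ess)
    return weighted_linear_fit(x, y, w)


# Hypothetical counts for illustration only.
studies = [(40, 10, 5, 45), (30, 8, 6, 56), (25, 5, 3, 67), (60, 20, 10, 110)]
slope, intercept, se = deeks_regression(studies)
print(f"slope = {slope:.2f}, t = {slope / se:.2f}")
```

A slope that does not differ significantly from zero is read as no evidence of funnel plot asymmetry.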

Discussion
According to the inclusion criteria, 9 research articles with 1436 thyroid nodules from 1378 patients were selected to analyze the ultrasonic differentiation of benign and malignant thyroid nodules. Since high heterogeneity was observed, the random-effects model was applied in the data analysis. The sensitivity and specificity of ultrasound diagnosis were 0.88 [95% CI (0.83–0.91)] and 0.86, respectively. In addition, there were significant differences in diagnostic sensitivity and specificity between ultrasound elastography and conventional ultrasound, indicating that the diagnostic method may be a potential source of heterogeneity. Because it is non-invasive, widely available, and inexpensive, ultrasonography is still the preferred method for clinical examination of thyroid nodules. In recent years, the Thyroid Imaging Reporting and Data System (TIRADS) risk score has been introduced clinically to standardize the risk assessment of ultrasonographic diagnosis of malignant thyroid nodules [27][28][29][30].
The main advantage of the TIRADS score is its high accuracy in identifying suspicious thyroid nodules that warrant cytological examination, thereby achieving early detection while avoiding unnecessary biopsies [31,32]. However, TIRADS has also shown some limitations in practical application. For instance, thyroid nodules of different classifications may have the same TIRADS score. A study from Italy in 2017 showed that the accuracy of the TIRADS score was approximately 27.2% [33]. In contrast, studies have shown that the specificity and sensitivity of fine-needle aspiration (FNA) in identifying malignant thyroid nodules were about 60%-98% and 54%-90%, respectively. FNA remains one of the gold standards for identifying malignant thyroid nodules [34][35][36][37].
Ultrasound evaluation of the lateral neck during the early assessment is helpful in determining the scope of the final operation [38]. Some studies have found that preoperative neck ultrasound changed the surgical approach in 40% of patients [38][39][40]. At this stage, it is recommended that all patients with suspected thyroid nodules undergo an ultrasound examination [16]. Hyperechoic/isoechoic nodules (brighter than normal thyroid tissue or with the same echogenicity) are usually benign. Meanwhile, markedly hypoechoic nodules carry an increased risk of malignancy [41,42]. Nodules with mixed cystic and solid components are less likely to be malignant than completely solid nodules [43,44]. A "taller-than-wide" appearance also increases the risk of malignancy [45,46]. Intranodular calcification has also been reported to increase the likelihood of malignancy [47,48]. A study of nearly 700 thyroid tumors found that more than half of malignant nodules (63%) lacked intranodular vessels on preoperative imaging [49].
Various cancerous processes alter the physical characteristics of affected tissues. Ultrasound elastography is a novel imaging technique that can provide information about tissue hardness [14,[50][51][52][53][54][55]. With the emergence of commercial ultrasound systems, ultrasound elastography has been increasingly applied in various fields to verify its clinical applicability [51,52,[56][57][58][59]. In one study of 40 patients examined with ultrasound elastography, 35 of the 40 benign nodules and 9 of the 11 malignant nodules were correctly classified, with pathological examination as the reference standard [60].
The assessment and management of patients with thyroid nodules is no longer a one-size-fits-all proposition. The main challenge in the management of thyroid nodules is to identify malignant nodules while avoiding excessive use of aspiration and surgery.
Therefore, advanced diagnostic methods that can accurately distinguish benign from malignant thyroid nodules are desirable. A customized approach is advocated, which requires careful evaluation of each nodule to determine the possibility of malignancy [1]. Ultrasound can maximize the detection of clinically relevant thyroid lesions and reduce fine-needle aspiration of benign nodules, thereby reducing over-diagnosis and over-treatment of benign nodules, achieving the best prognosis for patients, and minimizing the cost of medical treatment [61]. The 9 studies included in this analysis showed heterogeneity, which might affect the reliability of the conclusions to a certain extent. We suspect that the high inter-study heterogeneity was related to the small sample size and to incomplete inclusion of publications, since databases other than PubMed and Embase were not searched. In addition, the experience of ultrasound operators would also affect the study results.
In conclusion, ultrasound is still an ideal way to detect thyroid nodules. In the future, additional research is required to improve ultrasonic diagnosis. Meanwhile, ultrasound can be combined with other relevant imaging technologies to improve the sensitivity and specificity of diagnosis and reduce unnecessary aspirations. Furthermore, diagnostic accuracy can be improved by fine-needle aspiration biopsy and other imaging examinations if the nature of a lesion cannot be determined by routine ultrasound.

Data Availability
The data used and analyzed during the current study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Mei Shi and Dandan Nong contributed equally to this work.

International Journal of Clinical Practice