Comparison of Fine Needle Aspiration and Fine Needle Nonaspiration Cytology of Thyroid Nodules: A Meta-Analysis

Background. Fine needle aspiration cytology (FNAC) and fine needle nonaspiration cytology (FNNAC) are useful cost-effective techniques for preoperatively assessing thyroid lesions. Both techniques have advantages and disadvantages, and there is controversy over which method is superior. This meta-analysis was performed to evaluate the differences between FNAC and FNNAC for diagnosis of thyroid nodules. Methods. Primary publications were independently collected by two reviewers from PubMed, Web of Science, Google Scholar, EBSCO, OALib, and the Cochrane Library databases. The following search terms were used: fine needle, aspiration, capillary, nonaspiration, sampling without aspiration, thyroid, and cytology. The last search was performed on February 1, 2015. Results. Sixteen studies comprising 1,842 patients and 2,221 samples were included in this study. No statistically significant difference was observed between FNAC and FNNAC groups with respect to diagnostically inadequate smears, diagnostically superior smears, diagnostic performance (accuracy, sensitivity, specificity, negative predictive value, and positive predictive value), area under the summary receiver operating characteristic curve, average score of each parameter (background blood or clot, amount of cellular material, degree of cellular degeneration, degree of cellular trauma, and retention of appropriate architecture), and total score of five parameters. Conclusion. FNAC and FNNAC are equally useful in assessing thyroid nodules.


Introduction
Thyroid nodules are a common clinical problem, and 1-10% are malignant [1]. The incidence of thyroid cancer nearly tripled from 1975 to 2009, primarily as a result of an increase in papillary thyroid carcinoma [2]. Therefore, early diagnosis and treatment have become increasingly important in curing malignant thyroid carcinoma.
Fine needle aspiration cytology (FNAC) has been routinely used as the baseline investigation for diagnosis of nodular thyroid disease. Its advantages include minimal invasiveness and high sensitivity, specificity, and accuracy [3]. However, it also has disadvantages: the bloody smears caused by negative pressure during aspiration are detrimental to both the cell concentration and the cell morphology of the specimen, leading to unsatisfactory specimens and improper cytological interpretation [4-6].
In an attempt to overcome these problems, fine needle nonaspiration cytology (FNNAC) was developed in France in 1982 by Briffod et al. [7] and described by Santos and Leiman in 1988 [6]. FNNAC avoids active aspiration and relies on capillary tension to suck the tissue sample into the needle bore; this reduces bleeding and minimizes trauma to thyroid tissue [8,9].
There are many conflicting studies regarding the superiority of FNNAC over FNAC [10-18]. Some studies have reported that FNNAC reduced bleeding and obtained higher quality samples [11-13]; other reports have indicated that the diagnostic adequacy of FNAC was higher than that of FNNAC [17,18] or that both methods were equally efficient [10,14,15]. Studies on the accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) of both techniques based on histopathology have also been inconclusive [10,15,19-21]. Hence, we conducted a systematic review and meta-analysis to evaluate the performance of FNAC and FNNAC in diagnosing nodular thyroid disease. We also aimed to clarify the diagnostic performance of both techniques, which will provide physicians with a theoretical reference and guidance for selecting between these two techniques.

Methods
Electronic Library Search.
Relevant publications were collected from PubMed, EBSCO, Google Scholar, OALib, and Cochrane databases. The search keywords used were fine needle, aspiration, capillary, nonaspiration, sampling without aspiration, thyroid, and cytology. There was no restriction on the publication date or language. We removed duplicated publications that were identified in multiple databases.

Study Inclusion/Exclusion Criteria.
All relevant titles, abstracts, and full papers identified by the prespecified search strategy were independently screened by two authors (Hongming Song and Chuankui Wei), and irrelevant articles were excluded. Search results were compared, and disagreements were resolved by discussion with the third reviewer (Kaiyao Hua).
The included studies reported comparison of performance between FNAC and FNNAC. Studies that did not refer to thyroid nodules and those that did not compare the cytological findings with histological results were excluded from this study. Letters, reviews, abstracts, editorial materials, and animal trials were also excluded from this study.
We also calculated the diagnostic performance of FNAC and FNNAC by comparing the cytological diagnosis of thyroid nodules with the histological results, regardless of whether the included studies adopted the Mair et al. scoring system.

Data Extraction.
We extracted the following data from the included studies: the number of Category 1 (diagnostically inadequate) and Category 3 (diagnostically superior) smears, the average score (mean ± SD) of each of the five objective parameters (background blood or clot, amount of cellular material, degree of cellular degeneration, degree of cellular trauma, and retention of appropriate architecture), and the average total score of the five parameters (mean ± SD). The numbers of true positive, false positive, false negative, and true negative results were recorded. The diagnostic performance (accuracy, sensitivity, specificity, NPV, and PPV) of both techniques was extracted. The name of the first author, year of publication, study design, number of patients, number of lesions, and needle gauge were also reviewed and recorded (Table 2).
The risk of bias of the included studies was evaluated on the following six aspects: random sequence generation, allocation concealment, blinding, incomplete outcome data, selective reporting, and other bias. All studies were classified as "unclear," "yes," or "no" to indicate "uncertain bias," "low-risk bias," or "high-risk bias," respectively. The assessment of risk of bias is described in Table 3.
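The diagnostic performance measures listed above follow directly from the four counts of a 2 × 2 table in which histology is the reference standard. The following is a minimal sketch of that arithmetic; the class name, function name, and example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Cytology result cross-tabulated against histology (the reference)."""
    tp: int  # cytology positive, histology positive
    fp: int  # cytology positive, histology negative
    fn: int  # cytology negative, histology positive
    tn: int  # cytology negative, histology negative


def diagnostic_performance(c: Confusion) -> dict:
    """Return accuracy, sensitivity, specificity, PPV, and NPV as fractions."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts for illustration only.
example = Confusion(tp=18, fp=2, fn=3, tn=77)
metrics = diagnostic_performance(example)
```

The sketch does not guard against empty rows or columns (for example, a study with no histologically malignant nodules), which would need separate handling.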

Statistical Analysis.
The data from included studies were analyzed using Review Manager software (RevMan, version 5.3, Copenhagen, The Nordic Cochrane Centre, The Cochrane Collaboration, 2014). Each study was weighted by its sample size. For dichotomous variables such as the smear quality and accuracy of FNAC and FNNAC, odds ratios (OR) and 95% confidence intervals (95% CI) were calculated. The weighted mean difference and standardized mean difference were computed for continuous variables that had the same or different units in the assessing system, respectively. The mean difference (MD) and 95% CI were computed for the average score of each parameter and the average total score of the five parameters. Heterogeneity among the studies was assessed using the χ² test and the I² statistic. If the heterogeneity test did not reveal statistical significance (I² < 50%, P > 0.1), the fixed-effects model was adopted; otherwise, the random-effects model was used. If the P value was less than 0.05 and the 95% CI did not contain the value 1 for OR or the value 0 for MD, the OR and MD were considered to be statistically significant. Publication bias was assessed by the funnel plot. The sensitivity analysis of the results was performed using the leave-one-out approach. The summary receiver operating characteristic (SROC) curve analysis was performed using Meta-Disc version 1.4 software. The corresponding area under the curve (AUC) was calculated as a global measurement of test performance; the closer the AUC to 1, the better the test performance.
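The pooling workflow described above (odds ratios, heterogeneity assessment, choice between fixed-effects and random-effects models, and leave-one-out sensitivity analysis) can be illustrated with a short sketch. It is not a reproduction of RevMan's computation: it assumes inverse-variance weighting on the log odds ratio and a DerSimonian-Laird random-effects estimate, selects the model from I² alone (the P > 0.1 criterion of the χ² test is omitted to stay within the standard library), and uses hypothetical 2 × 2 tables. All function names are illustrative.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) for one study's 2x2 table.

    a, b: events and non-events with FNAC; c, d: the same with FNNAC.
    """
    if 0 in (a, b, c, d):
        # Continuity correction (an assumption of this sketch) to avoid
        # division by zero when a cell is empty.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(tables, i2_cutoff=50.0):
    """Pool odds ratios; fixed effect unless I^2 reaches the cutoff."""
    effects = [log_odds_ratio(*t) for t in tables]
    y = [e for e, _ in effects]
    v = [var for _, var in effects]
    w = [1 / var for var in v]  # inverse-variance weights

    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if df > 0 and q > 0 else 0.0

    model = "fixed"
    if i2 >= i2_cutoff:
        # DerSimonian-Laird estimate of the between-study variance.
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w = [1 / (var + tau2) for var in v]
        model = "random"

    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1 / sum(w))
    lo, hi = math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)
    return {"or": math.exp(est), "ci": (lo, hi), "q": q, "i2": i2,
            "model": model, "significant": not (lo <= 1.0 <= hi)}


def leave_one_out(tables):
    """Re-pool the data with each study removed in turn."""
    return [pool(tables[:i] + tables[i + 1:]) for i in range(len(tables))]


# Hypothetical 2x2 tables (a, b, c, d); not data from the included studies.
demo = [(10, 90, 20, 80), (12, 88, 18, 82), (40, 60, 10, 90)]
result = pool(demo)
sensitivity = leave_one_out(demo)
```

In this hypothetical example the third table pulls strongly in the opposite direction, so the full pool is heterogeneous and the random-effects model is chosen; removing that table leaves two consistent studies, for which the fixed-effects model is used. This is the kind of shift the leave-one-out approach is meant to expose.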

Results
Search Results.
A total of 527 records were identified from the databases. Among them, 30 full-text articles were assessed for potential eligibility. Seven articles were excluded because they did not use the Mair et al. scoring system or did not report the diagnostic performance of FNAC and FNNAC [3-6, 9, 30, 31]. Four articles that used a modified version of the Mair et al. scoring system (with 1-3 parameters excluded) were excluded [16, 32-34]. One article was excluded owing to lack of assessment of smear quality and the diagnostic performance of FNAC and FNNAC [35]. Two articles that did not have available data for meta-analysis were excluded [36,37]. A final total of 16 articles met the inclusion criteria [8, 10-15, 19-21, 23-28]. The steps taken in selecting eligible articles are shown in Figure 1.

Characteristics of the Included Studies.
In this meta-analysis, the 16 included studies involved 1,842 individual patients and 2,221 samples collected by FNAC and FNNAC. Of these studies, 15 were prospective and only one was retrospective in design. The studies differed considerably in the number of patients and samples, needle gauge, sex ratio, and mean age of patients. The results included diagnostically inadequate and superior smears, diagnostic performance (accuracy, sensitivity, specificity, NPV, and PPV), average scores of each parameter, and average total scores of the five parameters of the Mair et al. scoring system. Diagnostically inadequate smears collected using both techniques were reported in 12 studies [8, 10-14, 20, 23-27], while superior smears collected using both techniques were reported in 11 studies [8, 10-14, 23-27]. The accuracy of both techniques as confirmed by histopathology was assessed in five studies [10, 15, 19-21]; among these, sensitivity, specificity, NPV, and PPV were extracted from four studies [10,15,19,21]. The average score of each of the five parameters was measured for both techniques in five studies [8,10,13,24,28], and the average of the total scores was calculated in five studies [8,10,13,14,24]. The characteristics of the included studies are described in Table 2.

Comparison of the Quality of Smears Collected by FNAC versus FNNAC.
The number of diagnostically superior smears collected via FNAC compared with FNNAC was assessed in 11 studies [8, 10-14, 23-27]. The proportion of diagnostically superior smears in these studies ranged from 14.6 to 78.8% in the FNAC group and from 12.3 to 79.6% in the FNNAC group. Smears unsuitable for diagnosis were reported for both techniques in 12 studies [8, 10-14, 20, 23-27]. The proportion of smears unsuitable for diagnosis ranged from 8.1 to 34.0% and from 8.1 to 38.0% in the FNAC and FNNAC groups, respectively. The pooled proportions of diagnostically superior smears were 891/1,844 (48.3%) and 951/1,844 (51.6%) in the FNAC and FNNAC groups, respectively; there was no statistically significant difference between the groups (OR 0.81, 95% CI 0.60-1.09, P = 0.16) (Figure 2(b)). Similarly, the pooled proportions of smears unsuitable for diagnosis were 316/1,912 (16.5%) and 296/1,912 (15.5%) in the FNAC and FNNAC groups, respectively; no statistically significant difference was observed between the groups (OR 1.09, 95% CI 0.91-1.30, P = 0.36) (Figure 2(a)).
The diagnostic accuracy of both techniques, as confirmed by histopathology, was assessed in five studies [10, 15, 19-21], with no statistically significant difference observed between FNAC and FNNAC (Figure 3). The sensitivity, specificity, NPV, and PPV were extracted from four studies [10,15,19,21], with no statistically significant difference observed between FNAC and FNNAC (Table 4). To analyze the SROC, the performances of the four diagnostic studies are shown in Table 5. The areas under the SROC curves were 0.9273 ± 0.0350 for FNAC and 0.9047 ± 0.0458 for FNNAC. No significant difference was observed between the AUCs of FNAC and FNNAC (Figure 4).
Five studies measured the average score of each of the five parameters [8,10,13,24,28], and five studies calculated the mean of the total scores of each sample [8,10,13,14,24]. There was no statistically significant difference in the average scores of the five parameters or the mean of the total scores between the FNAC and FNNAC groups (Table 4).
Forest plots show the average scores of the five parameters (Figure 5) and the mean of the total scores (Figure 6) for the FNNAC and FNAC techniques.
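As a quick arithmetic check on the pooled counts reported above, the percentages and a crude (unweighted) odds ratio can be recomputed from the raw numbers. The sketch below is illustrative only: the helper names are hypothetical, and a crude odds ratio collapses all studies into a single 2 × 2 table, so it is not expected to equal the study-weighted estimates reported in the text.

```python
def percent(events, total):
    """Proportion as a percentage, rounded to one decimal place."""
    return round(100 * events / total, 1)


def crude_odds_ratio(e1, n1, e2, n2):
    """Unweighted OR of group 1 versus group 2 from events and totals."""
    return (e1 * (n2 - e2)) / ((n1 - e1) * e2)


# Pooled counts as reported in the text: (events, total smears).
superior = {"FNAC": (891, 1844), "FNNAC": (951, 1844)}
inadequate = {"FNAC": (316, 1912), "FNNAC": (296, 1912)}

superior_pct = {k: percent(*v) for k, v in superior.items()}
inadequate_pct = {k: percent(*v) for k, v in inadequate.items()}

crude_superior = crude_odds_ratio(*superior["FNAC"], *superior["FNNAC"])
crude_inadequate = crude_odds_ratio(*inadequate["FNAC"], *inadequate["FNNAC"])
```

The recomputed percentages agree with the reported 48.3%, 51.6%, 16.5%, and 15.5%. The crude odds ratios come out at roughly 0.88 for superior smears and 1.08 for unsuitable smears, pointing in the same direction as the reported pooled estimates.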

Discussion
Although many studies have compared the efficiency of FNAC and FNNAC techniques in evaluating thyroid nodules, there is no clear agreement as to which method performs better. To the best of our knowledge, this is the first meta-analysis to evaluate the smear quality and diagnostic performance of FNAC and FNNAC. The five parameters used for performance evaluation may interfere with each other; hence, if a scoring system that excludes one or more parameters is used, the average score and total score may not accurately reflect each parameter and the smear quality, respectively. Therefore, we strictly selected studies that used the scoring system of Mair et al. to assess the quality of smears obtained by FNAC and FNNAC.
It is well known that the smear quality may affect the cytological diagnosis of thyroid nodules. In this meta-analysis, we compared the quality of smears collected by FNAC and FNNAC using the Mair et al. scoring system and found no statistically significant difference between the quality of smears obtained by FNNAC and FNAC. A larger number of smears collected by FNNAC tended to be superior smears compared with those collected by FNAC; however, this was not statistically significant. We also observed a similar rate of smears unsuitable for diagnosis between FNNAC and FNAC groups.
"Background blood or clot" and "amount of cellular material" are two important criteria in assessing the quality of smears [1]. In theory, FNAC may cause more hemorrhage than FNNAC, and FNNAC may produce better cellular material than FNAC. Considering that the thyroid is a vascular organ, hemorrhage is also an important factor that can seriously affect the interpretation of results and thus lead to inaccurate diagnosis. In this meta-analysis, we did not find any difference in the background blood or clot, amount of cellular material, degree of cellular degeneration, degree of cellular trauma, retention of appropriate architecture, or mean score of the five parameters between FNNAC and FNAC groups.
The objective of fine needle biopsy is to investigate thyroid nodules. The diagnostic accuracy is important in determining whether patients with suspicious thyroid nodules need surgery. Five included studies reported the diagnostic accuracies of both techniques [10, 15, 19-21]. We compared the diagnosis using both techniques with the histological results and found that the diagnostic accuracy was not significantly different between FNAC and FNNAC. There was also no statistical difference between FNAC and FNNAC regarding sensitivity, specificity, NPV, or PPV of diagnosis [10,15,19,21]. As a global measurement of diagnostic performance in a meta-analysis, the SROC curve summarized the joint distribution of sensitivity and specificity; the AUCs of FNAC and FNNAC were close to 1, with no significant difference observed between them, suggesting that both techniques are useful in diagnosing thyroid nodules.
Some studies reported that the execution order of FNNAC and FNAC techniques plays an important role in affecting the quality of smears. Although the order of FNNAC and FNAC sampling was preplanned in most of the included studies (FNAC followed by FNNAC was performed on patients in group A, FNNAC followed by FNAC was conducted in group B, or the technique used for biopsy was alternated sequentially for each patient), three studies had a high risk of bias based on low-quality data. One study conducted FNAC followed by FNNAC sampling for all cases, and two studies reversed the order of FNNAC and FNAC techniques for all patients. This might have led to the differences in results caused by the order of FNNAC and FNAC sampling. However, when we excluded these three studies, the execution order of FNNAC and FNAC made no difference to the quality of smears.
This meta-analysis had some potential limitations. First, numerous factors may have affected the consistency of results, as the included studies used various fine needle biopsy protocols (such as varying needle gauge and syringe volume). Moreover, there were differences in the level of suction pressure applied and the insertion depth of fine needles. These factors might have introduced a small risk of bias. Second, the sample size of the included studies was small, especially for comparing the diagnostic accuracy of both techniques with the histological results; this might lead to a small-study effect, so the results should be interpreted with caution. Third, we did not assess other complications such as nerve damage, tissue trauma, tumor seeding, or vascular injury associated with both techniques, owing to a lack of data in the included studies. Finally, some studies reported that FNNAC combined with FNAC can obtain better quality cellular material [8,9], while other studies reported that a better diagnostic accuracy can be achieved by combining both techniques [13,23,26]. This suggests that a combination of both techniques may be more suitable for the investigation of patients with thyroid nodules. However, because of a lack of adequate evidence, we could not conduct a meta-analysis to compare the performance of a combination of both techniques with FNNAC or FNAC alone.

Conclusion
FNNAC and FNAC techniques are equally useful in the assessment of thyroid nodules. The choice of technique may depend on the personal preference of the operator.