Diagnostic accuracy of Afirma gene expression classifier, Afirma gene sequencing classifier, ThyroSeq v2 and ThyroSeq v3 for indeterminate (Bethesda III and IV) thyroid nodules: a meta-analysis

in Endocrine Connections
Authors:
Irfan Vardarli 5th Medical Department, Division of Endocrinology and Diabetes, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany

https://orcid.org/0009-0009-2554-0971
,
Susanne Tan Department of Endocrinology, Diabetes and Metabolism, Clinical Chemistry – Division of Laboratory Research Endocrine Tumor Center at WTZ/Comprehensive Cancer Center, University Hospital Essen, University of Duisburg-Essen, Essen, Germany

,
Rainer Görges Department of Nuclear Medicine, University Hospital Essen, University of Duisburg-Essen, Essen, Germany

,
Bernhard K Krämer 5th Medical Department, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany

,
Ken Herrmann Department of Nuclear Medicine, University Hospital Essen, University of Duisburg-Essen, Essen, Germany

, and
Christoph Brochhausen Institute of Pathology, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany


Correspondence should be addressed to I Vardarli: irfan.vardarli@medma.uni-heidelberg.de
Open access

Abstract

Objective

The management of thyroid nodules with indeterminate cytology (ITN) is still a challenge. To evaluate the performance of commercial molecular tests for ITN, we performed this comprehensive meta-analysis.

Methods

We performed an electronic search using PubMed/Medline, Embase, and the Cochrane Library. Studies assessing the diagnostic accuracy of the Afirma gene expression classifier (GEC), Afirma gene sequencing classifier (GSC), ThyroSeq v2 (TSv2), or ThyroSeq v3 (TSv3) in patients with ITN (only Bethesda category III or IV) were selected. Statistical analyses were performed using Stata.

Results

Seventy-one samples (GEC, n = 38; GSC, n = 16; TSv2, n = 9; TSv3, n = 8) in 53 studies, involving 6490 fine needle aspirations (FNAs) with ITN cytology and molecular diagnostics (GEC, GSC, TSv2, or TSv3), were included in the study. The meta-analysis showed the following pooled estimates: sensitivity 0.95 (95% CI: 0.94–0.97), specificity 0.35 (0.28–0.43), positive likelihood ratio (LR+) 1.5 (1.3–1.6), and negative likelihood ratio (LR−) 0.13 (0.09–0.19). The best overall performance was observed for TSv3 (area under the ROC curve 0.95 (0.93–0.96)), followed by TSv2 (0.90 (0.87–0.92)), GSC (0.86 (0.82–0.88)), and GEC (0.82 (0.78–0.85)). The best rule-out property was observed for GSC (LR−, 0.07 (0.02–0.19)), followed by TSv3 (0.11 (0.05–0.24)) and GEC (0.16 (0.10–0.28)), and the best rule-in property was observed for TSv2 (LR+, 2.6 (1.4–4.6)), followed by GSC (1.9 (1.6–2.4)). A meta-regression analysis revealed that study design, Bethesda category, and type of molecular test were independent factors.

Conclusion

We showed that in patients with ITN, TSv3 has the best molecular diagnostic performance, followed by TSv2, GSC, and GEC. For ruling out malignancy, GSC is superior to the other tests, and for ruling in malignancy, TSv2 is superior.

Introduction

The most common endocrine cancer is thyroid carcinoma, which accounts for approximately 1.0–1.5% of all newly diagnosed cancers each year in the USA (1). Thyroid nodules are common, but only approximately 10–15% of them are malignant (2, 3). Fine needle aspiration (FNA) is a frequently used tool to evaluate thyroid nodules. However, up to 30% of FNAs are classified as indeterminate (4), i.e. as Bethesda category III or IV according to the Bethesda System for Reporting Thyroid Cytopathology (TBSRTC), first introduced in 2009 (4). The Bethesda categories are defined as follows: category I (nondiagnostic), II (benign), III (atypia of undetermined significance/follicular lesion of undetermined significance; AUS/FLUS), IV (follicular neoplasm/suspicious for a follicular neoplasm; FN/SFN), V (suspicious for malignancy; SM), and VI (malignant). For categories III and IV, only 5–30% of the patients referred for surgery had malignant histopathological results (5). Therefore, there is a need to avoid unnecessary thyroid surgery. The guidelines recommend the use of molecular tests for further management of thyroid nodules with indeterminate cytology (ITN); however, for the choice of a suitable method and the interpretation of results, the prevalence of thyroid carcinoma must be considered (6), as the positive predictive value (PPV) and negative predictive value (NPV) depend on the prevalence of a disease (7). There are various commercial molecular diagnostic tools for risk stratification in patients with ITN, e.g. (i) the Afirma gene expression classifier (GEC) (Veracyte Inc., South San Francisco, CA, USA), released in 2011 as the first commercial test, is a microarray-based test that measures the mRNA expression of 167 genes; with high sensitivity and negative predictive value but low specificity, it is a rule-out test for thyroid cancer, with ‘benign’ or ‘suspicious’ test results (8); (ii) ThyroSeq v2 (ThyroSeq; University of Pittsburgh Medical Center, Pittsburgh, PA, USA and Sonic Healthcare, Austin, TX, USA), released in 2014, is a next-generation sequencing (NGS) test of RNA and DNA, designed as a ‘rule-in’ test, with ‘positive’ or ‘negative’ test results, where a ‘positive’ result indicates malignant potential (9); (iii) the Afirma gene sequencing classifier (GSC), with which Veracyte Inc. replaced GEC in 2017, uses next-generation RNA sequencing with whole-transcriptome analysis (10, 11); and (iv) ThyroSeq v3 (ThyroSeq; University of Pittsburgh Medical Center and Sonic Healthcare), released in 2017, is a targeted NGS test that evaluates copy number alterations, fusions, abnormal gene expression, and point mutations in 112 genes associated with thyroid cancer (11).

Concerning the performance of commercial molecular diagnostic tools for ITN, nine meta-analyses have been published (12, 13, 14, 15, 16, 17, 18, 19, 20). GEC was investigated in three of them (12, 14, 15). Vuong et al. compared GEC with Afirma gene sequencing classifier (GSC) (17). Vargas-Salas et al. compared GEC with ThyroSeq v2 (13). Borowczyk et al. compared GEC with ThyroSeq v2 (16). GSC and TSv3 were compared in a meta-analysis by Lee et al. (19). DiGennaro et al. performed a meta-analysis comparing GEC, GSC, and TSv3 (20). Meta-analyses for GSC, ThyroSeq v2, and ThyroSeq v3 alone, respectively, are lacking. Only in the meta-analysis performed by Silaghi et al. were GEC, GSC, ThyroSeq v2, and ThyroSeq v3 all analyzed together (18). However, they included studies with ITN in Bethesda categories III, IV, and V.

A head-to-head comparison of commercial tools was performed in only one retrospective analysis, published by Walts et al. (21), which was excluded from our study because it included ITN with Bethesda category V. In all other studies, the molecular diagnostic tools were not compared in the same FNA.

In some studies, Bethesda category V was also considered indeterminate. In some studies, noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) was considered benign, while in others, it was considered malignant. In our meta-analysis, NIFTP was considered malignant.

The last meta-analyses on this topic were published in 2022 by Lee et al. (19) and DiGennaro et al. (20). Thereafter, between 2021 and 2023, 19 studies were published (for GEC, nine samples in eight studies; for GSC, eight samples in seven studies between 2021 and 2023; for TSv2, three studies between 2021 and 2023; and for TSv3, two studies between 2022 and 2023).

Therefore, we performed the present comprehensive meta-analysis comparing all relevant commercial molecular diagnostic tools, considering NIFTP as malignant and excluding studies with ITN with Bethesda category V.

Patients and methods

We performed the meta-analysis according to the updated Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline, using the Synthesizing Evidence from Diagnostic Accuracy Tests (SEDATE) guideline for reporting a diagnostic test accuracy meta-analysis (22, 23) (Supplementary Table 1, see section on supplementary materials given at the end of this article). We created a predefined study protocol, which was not registered. Informed consent or ethical approval was not required for this study.

Data search and study selection

We searched the electronic databases of PubMed/Medline, Embase, and the Cochrane Library systematically (updated on February 3, 2024). The search strategies are given in Supplementary Table 2, without language and time restrictions. In addition, we performed a ‘manual search’ in the references of included studies. Studies meeting the following inclusion criteria were included: patients with indeterminate thyroid nodule cytology after FNA (Bethesda classification category III or IV) with molecular testing using commercial tests (GEC, GSC, TSv2, or TSv3) in all ITN and histopathologic diagnosis as the gold standard. Exclusion criteria were as follows: data for the 2 × 2 table (true-positive (tp), false-positive (fp), false-negative (fn), and true-negative (tn)) not provided; inclusion of ITN with Bethesda category V; not published as a full-text article; case reports or case series; and duplication of a study or overlapping of patient samples (in this case, inclusion of the study with the longest follow-up or the larger sample).

Data extraction and quality assessment

All eligible articles were independently reviewed by two authors (IV and RG), who extracted the relevant data. In case of discrepancies, a third author (ST) resolved concerns regarding eligibility. Using Review Manager (RevMan) version 5.3 (Nordic Cochrane Center), we applied the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool (24, 25) to assess the quality of the included studies in terms of risk of bias and applicability concerns in four domains: patient selection, index test, reference standard, and flow and timing. Two authors (IV and RG) evaluated each of the items. The index test was defined as FNA cytology (Bethesda category III or IV). Histological observation was considered the reference standard (gold standard). However, in ITN with a ‘benign’ (with GEC or GSC) or ‘negative’ (with TSv2 or TSv3) molecular diagnostic result, not all patients underwent surgical therapy; therefore, as an alternative, we used clinical follow-up. The outcome was defined as the histological diagnosis of thyroid cancer, with NIFTP considered malignant.

Statistical analysis

The hierarchical summary receiver operating characteristic (HSROC) and bivariate methods are the most appropriate methodological approaches for the meta-analysis of diagnostic test accuracy studies (https://eunethta.eu/wp-content/uploads/2018/01/2014-05-19_meta-a_diagn_draft-gl_2nd_revision_clear_0.pdf). Therefore, we performed the meta-analysis by hierarchical logistic regression modeling, using the bivariate random effects model (26) to calculate summary estimates of sensitivity, specificity, diagnostic odds ratio, and likelihood ratios, and the HSROC model to estimate the parameters of the ROC curves (27, 28, 29). The analyses were run in Stata, version 17 (Stata Corp, College Station, TX, USA) with the metandi, metandiplot, and midas commands (30). Funnel plots, meta-regression analysis, and the evaluation of funnel plot asymmetry were performed with the midas command (30). We assessed publication bias (bias across studies) by Deeks’ funnel plot asymmetry test (30, 31), where P < 0.1 indicated publication bias. As the continuity correction, 0.5 was used for cells containing zero (this is the default for the metandi command in Stata) (32). Negative likelihood ratios of less than 0.5 with 95% CIs not including 1.0 or positive likelihood ratios of greater than 2.0 were considered statistically significant (33, 34). Sensitivity and specificity were defined as the primary endpoints. Predefined secondary endpoints were the positive predictive value (PPV = tp/(tp + fp)), the negative predictive value (NPV = tn/(fn + tn)), the positive likelihood ratio (LR+ = sensitivity/(1 − specificity)), and the negative likelihood ratio (LR− = (1 − sensitivity)/specificity), where tp = true positives, fp = false positives, tn = true negatives, and fn = false negatives.
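The endpoint definitions above can be illustrated for a single study with a minimal Python sketch. It is not the Stata workflow used in the meta-analysis: the class and function names are hypothetical, the example counts are invented, and adding 0.5 to all four cells whenever any cell is zero is an assumed convention for the continuity correction.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cells of a 2 x 2 diagnostic accuracy table for one study."""
    tp: float  # true positives
    fp: float  # false positives
    fn: float  # false negatives
    tn: float  # true negatives


def continuity_corrected(table: TwoByTwo, correction: float = 0.5) -> TwoByTwo:
    """Add `correction` to all four cells if any cell is zero (assumed convention)."""
    cells = (table.tp, table.fp, table.fn, table.tn)
    if 0 in cells:
        return TwoByTwo(*(c + correction for c in cells))
    return table


def accuracy_measures(table: TwoByTwo) -> dict:
    """Per-study accuracy measures, using the formulas given in the text."""
    t = continuity_corrected(table)
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.fn + t.tn),
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,  # diagnostic odds ratio
    }


# Hypothetical study counts, not taken from the meta-analysis.
example = TwoByTwo(tp=38, fp=40, fn=2, tn=20)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

These per-study values are only the inputs to a bivariate or HSROC model; the pooled estimates reported in this article come from the hierarchical model, not from averaging such numbers.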

Sensitivity analyses were performed by restricting the meta-analysis to subgroups and by excluding studies that are considered outliers in a statistical sense (35). The following subgroup analyses for the primary endpoint were predefined: Afirma (GEC + GSC) vs ThyroSeq (TSv2 + TSv3), GEC vs GSC, GSC vs TSv2, GSC vs TSv3, and TSv2 vs TSv3. To explore heterogeneity, a meta-regression analysis (https://methods.cochrane.org/sdt/handbook-dta-reviews) with the following predefined study-level covariates (potential confounders) was intended: (1) study design (prospective (yes) vs retrospective (no)); (2) study setting (multi-center (yes) vs single center (no)); (3) Bethesda category (Bethesda category III + IV (yes) vs Bethesda category III (no)); (4) molecular test (ThyroSeq (TSv2 or TSv3) (yes) vs Afirma (GEC or GSC) (no), GSC (yes) vs GEC (no), TSv2 (yes) vs GSC (no), TSv3 (yes) vs GSC (no), and TSv3 (yes) vs TSv2 (no)); (5) number of FNA before molecular testing (repeat FNA (yes) vs initial FNA (no)); and (6) conflict of interest (with conflict of interest (yes) vs without conflict of interest (no)).

Results

Study selection and characteristics

The literature search identified 2819 records with potentially relevant studies. As shown in Fig. 1, 53 studies with 71 samples met the inclusion and exclusion criteria and were included in the meta-analysis. The included studies had a total of 6490 FNAs with ITN cytology and molecular diagnostics (GEC, GSC, TSv2, or TSv3). The detailed characteristics of the included studies (9, 10, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86) are given in Supplementary Table 3. According to the QUADAS-2 tool (24, 25), the methodological quality of the included studies was acceptable (Fig. 2A and B).

Figure 1

Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) flow diagram for the inclusion and exclusion of studies investigating the accuracy of Afirma gene expression classifier (GEC), Afirma gene sequencing classifier (GSC), ThyroSeq v2 (TSv2), or ThyroSeq v3 (TSv3) in patients with indeterminate thyroid nodules (only Bethesda category III or IV).

Citation: Endocrine Connections 13, 7; 10.1530/EC-24-0170

Figure 2

(A) Risk of bias and applicability concerns graph on each domain presented as a percentage across all included studies (with GEC, GSC, TSv2, or TSv3). n = 71 samples in 53 studies. GEC, Afirma gene expression classifier; GSC, Afirma gene sequencing classifier; TSv2, ThyroSeq v2; TSv3, ThyroSeq v3. (B) Risk of bias and applicability concerns summary for each included study. n = 71 samples in 53 studies.

In 61 samples, the study design was retrospective, and in ten samples, it was prospective. Fifty samples had a single-center setting, whereas 21 samples were in a multicenter setting. In 12 samples, only Bethesda III nodules were included, while 59 samples included Bethesda III or Bethesda IV. ThyroSeq (TSv2 and TSv3) was used in 17 samples, and Afirma (GEC and GSC) was used in 54 samples. In 58 samples, the initial FNA was performed before molecular testing, and in 13 samples, repeated FNAs were performed. In only one sample (37) out of 71, the authors had a conflict of interest. In none of the samples was more than one molecular test used for the same ITN, and in none of the samples was a head-to-head comparison of the molecular tests performed.

Risk of bias and publication bias

No significant evidence for publication bias was suggested by the Deeks’ funnel plot asymmetry test (Fig. 3).

Figure 3

Deeks’ funnel plot asymmetry test for all included studies (with GEC, GSC, TSv2, and TSv3). P < 0.1 indicates asymmetry and potential publication bias. n = 71 samples in 53 studies.

Meta-analysis

In our meta-analysis, we included 71 samples in 53 studies, totaling 6490 FNAs with ITN cytology and molecular diagnostics (GEC, GSC, TSv2, or TSv3). Among these, 2041 patients had histopathologically confirmed thyroid carcinoma.

Regarding all included studies (n = 71 samples in 53 studies), the summary estimates of sensitivity and specificity for all analyzed molecular tests (GEC, GSC, TSv2, and TSv3) were 0.95 (95% CI: 0.94–0.97) and 0.35 (95% CI: 0.28–0.43), respectively (Fig. 4); the pooled estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 1.5 (95% CI: 1.3–1.6) and 0.13 (95% CI: 0.09–0.19), respectively. The pooled estimate for the area under the ROC curve (AUC) was 0.89 (95% CI: 0.86–0.92) (Supplementary Table 4). Posttest probabilities are shown in Supplementary Fig. 1 and Table 1. The HSROC plot for all included studies is depicted in Fig. 5, in which study no. 4 by Chen et al. (72), no. 5 by Gortakowski et al. (78), no. 9 by Nikiforov et al. (38), and no. 32 by Loncar et al. (83) are outliers (Supplementary Fig. 2).

Figure 4

Forest plot illustrating sensitivity and specificity for all included studies (with GEC, GSC, TSv2, and TSv3), n = 71 samples in 53 studies. Pooled sensitivity: 0.95 (95% CI: 0.94–0.97), pooled specificity: 0.35 (0.28–0.43), pooled positive likelihood ratio (LR+): 1.5 (1.3–1.6), pooled negative likelihood ratio (LR−): 0.13 (0.09–0.19).

Figure 5

Hierarchical summary receiver-operating characteristics (HSROC) plot for all included studies (with GEC, GSC, TSv2, and TSv3), n = 71 samples in 53 studies. GEC, Afirma gene expression classifier; GSC, Afirma gene sequencing classifier; TSv2, ThyroSeq v2; TSv3, ThyroSeq v3.

Table 1

Pooled summary data and performance estimates, n = 71 samples in 53 studies.

| Parameter | All tests (GEC, GSC, TSv2, TSv3; n = 71 samples) | GEC (n = 38 samples) | GSC (n = 16 samples) | TSv2 (n = 9 samples) | TSv3 (n = 8 samples) |
| --- | --- | --- | --- | --- | --- |
| Sensitivity | 0.95 (0.94–0.97) | 0.96 (0.94–0.98) | 0.97 (0.91–0.99) | 0.89 (0.83–0.95) | 0.95 (0.91–0.97) |
| Specificity | 0.35 (0.28–0.43) | 0.23 (0.17–0.30) | 0.50 (0.40–0.61) | 0.66 (0.45–0.82) | 0.47 (0.22–0.73) |
| LR+ | 1.5 (1.3–1.6) | 1.2 (1.1–1.4) | 1.9 (1.6–2.4) | 2.6 (1.4–4.6) | 1.8 (1.0–3.0) |
| LR− | 0.13 (0.09–0.19) | 0.16 (0.10–0.28) | 0.07 (0.02–0.19) | 0.17 (0.10–0.31) | 0.11 (0.05–0.24) |
| DOR | 11.2 (7.3–17.2) | 7.6 (4.3–13.7) | 29.1 (9.7–88.8) | 15.2 (5.1–45.2) | 16 (5–57) |
| AUC | 0.89 (0.86–0.92) | 0.82 (0.78–0.85) | 0.86 (0.82–0.88) | 0.90 (0.87–0.92) | 0.95 (0.93–0.96) |
| Post_Prob_Pos (%) | 27 | 24 | 33 | 39 | 31 |
| Post_Prob_Neg (%) | 3 | 4 | 2 | 4 | 3 |

AUC, area under the ROC curve; DOR, diagnostic odds ratio; GEC, Afirma gene expression classifier; GSC, Afirma gene sequencing classifier; LR+, positive likelihood ratio; LR−, negative likelihood ratio; Post_Prob_Pos, positive posttest probability; Post_Prob_Neg, negative posttest probability; TSv2, ThyroSeq v2; TSv3, ThyroSeq v3.
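The relation between likelihood ratios and the posttest probabilities in Table 1 can be sketched with the odds form of Bayes' theorem. The function name is hypothetical, and the pretest probability of 20% is an assumption chosen for illustration; this section does not state the prior used for the posttest probabilities.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability into a posttest probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Assumed pretest probability of malignancy (illustrative, not from the text).
PRETEST = 0.20

# Pooled likelihood ratios for all tests, as reported in Table 1.
LR_POS_ALL = 1.5
LR_NEG_ALL = 0.13

positive = post_test_probability(PRETEST, LR_POS_ALL)
negative = post_test_probability(PRETEST, LR_NEG_ALL)
print(f"posttest probability after a positive test: {positive:.1%}")
print(f"posttest probability after a negative test: {negative:.1%}")
```

Under this assumed prior, the pooled likelihood ratios for all tests give values that round to the 27% and 3% shown in Table 1. Because the likelihood ratios in the table are themselves rounded, recomputed values for individual tests can differ from the tabulated ones by about one percentage point.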

In the sensitivity analysis, exploring the possible reasons for between-study heterogeneity, summary estimates remained unchanged after omitting the mentioned outlier samples: Chen et al. 2020 (72), Gortakowski et al. 2021 (78), Nikiforov et al. 2014 (38) and Loncar et al. 2023 (83). The summary estimates of sensitivity and specificity were 0.95 (95% CI: 0.94–0.97) and 0.34 (95% CI: 0.28–0.42), respectively (Supplementary Fig. 3). The pooled estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 1.5 (95% CI: 1.3–1.6) and 0.13 (95% CI: 0.09–0.19), respectively (Supplementary Table 5). The summary receiver-operating curve (SROC) for this analysis is depicted in Supplementary Fig. 4.

In subgroup analyses, within the subgroup with GEC (n = 38 samples), the pooled estimates of sensitivity and specificity were 0.96 (95% CI: 0.94–0.98) and 0.23 (95% CI: 0.17–0.30), respectively (Fig. 6). The summary estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 1.2 (95% CI: 1.1–1.4) and 0.16 (95% CI: 0.10–0.28), respectively. The AUC was 0.82 (95% CI: 0.78–0.85) (Supplementary Table 6). Posttest probabilities are shown in Supplementary Fig. 5 and Table 1. The hierarchical SROC (HSROC) curve for the subgroup GEC is depicted in Supplementary Fig. 6.

Figure 6

Forest plot illustrating sensitivity and specificity for the subgroup of studies with the Afirma gene expression classifier (GEC), n = 38 samples. Pooled sensitivity: 0.96 (95% CI: 0.94–0.98), pooled specificity: 0.23 (0.17–0.30), pooled positive likelihood ratio (LR+): 1.2 (1.1–1.4), pooled negative likelihood ratio (LR−): 0.16 (0.10–0.28).

In the subgroup with GSC (n = 16 samples), the pooled estimates of sensitivity and specificity were 0.97 (95% CI: 0.91–0.99) and 0.50 (95% CI: 0.40–0.61), respectively (Fig. 7); the summary estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 1.9 (95% CI: 1.6–2.4) and 0.07 (95% CI: 0.02–0.19), respectively. The AUC was 0.86 (95% CI: 0.82–0.88) (Supplementary Table 7). Posttest probabilities are shown in Supplementary Fig. 7 and Table 1. The hierarchical SROC (HSROC) curve for the subgroup GSC is depicted in Supplementary Fig. 8.

Figure 7

Forest plot illustrating sensitivity and specificity for the subgroup of studies with the Afirma gene sequencing classifier (GSC), n = 16 samples. Pooled sensitivity: 0.97 (95% CI: 0.91–0.99), pooled specificity: 0.50 (0.40–0.61), pooled positive likelihood ratio (LR+): 1.9 (1.6–2.4), pooled negative likelihood ratio (LR−): 0.07 (0.02–0.19).

In the subgroup with TSv2 (n = 9 samples), the pooled estimates of sensitivity and specificity were 0.89 (95% CI: 0.83–0.95) and 0.66 (95% CI: 0.45–0.82), respectively (Fig. 8). The summary estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 2.6 (95% CI: 1.4–4.6) and 0.17 (95% CI: 0.10–0.31), respectively. The AUC was 0.90 (95% CI: 0.87–0.92) (Supplementary Table 8). Posttest probabilities are shown in Supplementary Fig. 9 and Table 1. The hierarchical SROC (HSROC) curve for the subgroup TSv2 is depicted in Supplementary Fig. 10.

Figure 8

Forest plot illustrating sensitivity and specificity for the subgroup of studies with ThyroSeq v2 (TSv2), n = 9 samples. Pooled sensitivity: 0.89 (95% CI: 0.83–0.95), pooled specificity: 0.66 (0.45–0.82), pooled positive likelihood ratio (LR+): 2.6 (1.4–4.6), pooled negative likelihood ratio (LR−): 0.17 (0.10–0.31).

In the subgroup with TSv3 (n = 8 samples), the pooled estimates of sensitivity and specificity were 0.95 (95% CI: 0.91–0.97) and 0.47 (95% CI: 0.22–0.73), respectively (Fig. 9). The summary estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 1.8 (95% CI: 1.0–3.0) and 0.11 (95% CI: 0.05–0.24), respectively. The AUC was 0.95 (95% CI: 0.93–0.96) (Supplementary Table 9). Posttest probabilities are shown in Supplementary Fig. 11 and Table 1. The hierarchical SROC (HSROC) curve for the subgroup TSv3 is depicted in Supplementary Fig. 12.

Figure 9

Forest plot illustrating sensitivity and specificity for the subgroup of studies with ThyroSeq v3 (TSv3), n = 8 samples. Pooled sensitivity: 0.95 (95% CI: 0.91–0.97), pooled specificity: 0.47 (0.22–0.73), pooled positive likelihood ratio (LR+): 1.8 (1.0–3.0), pooled negative likelihood ratio (LR−): 0.11 (0.05–0.24).

In the subgroup with TSv2, the positive posttest probability was higher than in the subgroup with TSv3 (39% vs 31%; Supplementary Figs 9 and 11), favoring TSv2.

The meta-regression analysis showed that the covariates ‘study design’ (prospective vs retrospective), ‘Bethesda category’ (category III or IV vs category III), and ‘type of molecular test’ (TSv2 or TSv3 vs GEC or GSC) were independent influencing factors, whereas the covariates ‘study setting’ (multi-center vs single center), ‘number of FNA before molecular testing’ (repeat FNA vs initial FNA), and ‘conflict of interest’ (yes vs no) were not (Fig. 10 and Supplementary Table 10). Furthermore, the meta-regression showed the following results: (i) GSC is superior to GEC (Supplementary Fig. 13, Supplementary Table 11, Table 1); (ii) TSv2 and TSv3 have significantly different diagnostic accuracy (Supplementary Fig. 14, Supplementary Table 12, and Table 1); (iii) TSv2 is superior to GSC (Supplementary Fig. 15, Supplementary Table 13); (iv) TSv3 and GSC are not significantly different (Supplementary Fig. 16, Supplementary Table 14).

Figure 10

Meta-regression analysis in all included studies (GEC, GSC, TSv2, or TSv3), n = 71 samples in 53 studies, for the following covariates: (1) Study design (prospective (yes) vs retrospective (no)), (2) study setting (multi-center (yes) vs single center (no)), (3) molecular test (ThyroSeq (TSv2 or TSv3) (yes) vs Afirma (GEC or GSC) (no)), (4) Bethesda category (Bethesda III or IV (yes) vs Bethesda III (no)), (5) number of FNA before molecular testing (repeat FNA (yes) vs initial FNA (no)), and (6) conflict of interest ((yes) vs without conflict of interest (no)).

Discussion

In this study, we performed a meta-analysis of the diagnostic accuracy of molecular tests (GEC, GSC, TSv2, and TSv3) for the detection of thyroid carcinoma in patients with ITN. We included 71 samples in 53 studies, comprising a total of 6490 FNAs with ITN cytology and molecular diagnostics (GEC, GSC, TSv2, or TSv3). Concerning all included studies (GEC, GSC, TSv2, and TSv3), the summary estimates of sensitivity and specificity for detecting thyroid cancer were 0.95 (95% CI: 0.94–0.97) and 0.35 (95% CI: 0.28–0.43), respectively. The pooled estimates of positive likelihood ratio (LR+) and negative likelihood ratio (LR−) were 1.5 (95% CI: 1.3–1.6) and 0.13 (95% CI: 0.09–0.19), respectively. There was some degree of between-study heterogeneity, but no indication of publication bias. Sensitivity analysis showed no influence of particular studies on the summary estimates. The meta-regression analysis revealed that study design, Bethesda category, and type of molecular test are independent factors for the estimates of sensitivity and specificity. The diagnostic performance decreases in the following order: TSv3 > TSv2 > GSC > GEC.

In subgroup analyses (for GEC, GSC, TSv2, and TSv3, respectively), as shown in Table 1, the summary estimates of sensitivity, specificity, and positive and negative likelihood ratios were different. Interestingly, in the subgroup with TSv2, the positive posttest probability was higher (39% vs 31%) than in the subgroup with TSv3, favoring TSv2.

A head-to-head comparison of commercial tools was performed in only one retrospective analysis, published by Walts et al. (21), which was excluded from our study because it included Bethesda V cytology. In all other studies, the molecular diagnostic tools were not compared in the same FNA.

Nine meta-analyses have been published (12, 13, 14, 15, 16, 17, 18, 19, 20) concerning the performance of commercial molecular diagnostic tools for ITN. GEC was investigated in three of them (12, 14, 15). Vuong et al. compared GEC with GSC (17). Vargas-Salas et al. compared GEC with ThyroSeq v2 (13). Borowczyk et al. compared GEC with ThyroSeq v2 (16). GSC was compared with ThyroSeq v3 in two meta-analyses (19, 20). Meta-analyses for GSC, ThyroSeq v2, and ThyroSeq v3 alone, respectively, are lacking. Only in the meta-analysis performed by Silaghi et al. were GEC, GSC, ThyroSeq v2, and ThyroSeq v3 all analyzed together (18).

Santhanam et al. (12) included only six studies analyzing GEC; they showed a high pooled sensitivity (0.96 (95% CI: 0.92–0.98)) and low specificity (0.31 (95% CI: 0.26–0.35)), which we could confirm; they stated that this makes it an excellent test to rule out malignancy.

To avoid unnecessary surgery and to answer the question of whether the patient can be followed up, a rule-out test aimed at predicting benign nodules should be applied. If the intention is to predict malignancy, a rule-in test must be used (87); in case of a positive test, surgery would be recommended. A minimal posttest probability for malignancy of 50–75% could be considered appropriate for a rule-in test (13). Predictive values, e.g. posttest probabilities, depend on the prevalence of the disease (7). With a disease prevalence of 20–40%, a PPV of 60% (an accepted limit for rule-in testing) requires a specificity of 80% or more, while a rule-out test requires a sensitivity of 90% or more (13).
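The dependence of predictive values on prevalence can be made concrete with a short Python sketch. The function name is hypothetical; the sensitivity of 0.90 and specificity of 0.80 are the threshold values quoted above, and the three prevalence values are illustrative points within the 20–40% range.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected proportions of each 2 x 2 cell in the tested population.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Threshold test characteristics quoted in the text.
SENSITIVITY = 0.90
SPECIFICITY = 0.80

results = {}
for prevalence in (0.20, 0.30, 0.40):  # illustrative prevalence values
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    results[prevalence] = (ppv, npv)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With these inputs the PPV rises and the NPV falls as prevalence increases, which is why the same test can qualify as a rule-in test in one setting and not in another.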

We could also confirm the findings of Vargas-Salas et al. (13); they compared GEC and TSv2 and found high sensitivity for both GEC and TSv2, as well as low specificity for GEC and high specificity for TSv2, where both qualified as good to intermediate quality evidence.

Valderrabano et al. included 19 trials to compare post-marketing findings with the initial clinical validation findings of a thyroid nodule GEC. They suggested that the initial study cohort was not representative, questioning the diagnostic performance of GEC (14).

Liu et al. included 18 trials investigating GEC in ITN (15). They described a high sensitivity (95%) but a low specificity (22.1%), making GEC a rule-out test. They stated that over half of GEC-suspicious ITN still require further validation, which limits its use in clinical practice. We could confirm these findings.

Vuong et al. included seven studies (17) to compare (not head-to-head) the clinical impact and diagnostic performance of GEC and GSC in terms of benign call rate (BCR), resection rate (RR), risk of malignancy (ROM), sensitivity, and specificity. In our study, we analyzed 54 samples for the comparison of GEC with GSC. Vuong et al. reported high sensitivity for both GEC and GSC (0.93 (95% CI: 0.90–0.98) and 0.94 (95% CI: 0.91–0.98), respectively) and low specificity for both (0.25 (95% CI: 0.09–0.40) and 0.43 (95% CI: 0.24–0.62), respectively); we could confirm these findings. Compared with GEC, they found for GSC a higher BCR (65% vs 43.8%; P<0.001), a lower RR (26.8% vs 50.1%; P<0.001), and a higher ROM (60.1% vs 37.6%; P<0.001); we did not analyze these parameters.

Borowczyk et al. compared GEC (16 studies) with ThyroSeq v2 (five studies) (16). They reported a high sensitivity (0.98 (95% CI: 0.96–0.99)) and negative predictive value (NPV) (0.91 (95% CI: 0.85–0.96)) for GEC, which is helpful for ruling out malignancy in ITN. They interpreted ThyroSeq v2 (TSv2), with its higher specificity and acceptable sensitivity, as having the ‘potential for use as an all-round test of malignancy of thyroid nodules’ (16). We could confirm the findings regarding the sensitivity and specificity of both tests as well as the potential of GEC as a rule-out test, but we could not confirm the potential for an ‘all-round’ test.

Silaghi et al. included 40 samples in their meta-analysis (18) comparing GEC, GSC, TSv2, and TSv3. In contrast to our study, they also included studies of ITN with Bethesda V category, which we excluded. They showed the best performance for TSv3, with an AUC of 0.95 (95% CI: 0.93–0.97), followed by GSC (AUC, 0.90 (95% CI: 0.87–0.97)) and TSv2 (AUC, 0.88 (95% CI: 0.85–0.90)). Concerning the rule-out potential, TSv3 (LR−, 0.02 (95% CI: 0.0–2.69)) was superior to GEC (LR−, 0.18 (95% CI: 0.10–0.33)). Compared with GSC (LR+, 1.9 (95% CI: 1.3–2.8)) and TSv3 (LR+, 2.8 (95% CI: 1.2–6.3)), TSv2 (LR+, 3.5 (95% CI: 2.2–5.5)) showed better ‘rule-in’ properties. They concluded that GSC and TSv3 have been shown to perform best in ruling out malignancy and that TSv2 still ranks as the best rule-in molecular test. As in our study, they considered NIFTP as malignant. Except for the results regarding the AUC, we could confirm the findings and conclusions of Silaghi et al. (18), although we included 71 samples in 53 trials and excluded trials of ITN with Bethesda V category. In our meta-analysis, TSv3 also showed the best performance (AUC, 0.95 (95% CI: 0.93–0.96)), but it was followed by TSv2 (0.90 (95% CI: 0.87–0.92)), GSC (0.86 (95% CI: 0.82–0.88)), and GEC (0.82 (95% CI: 0.78–0.85)).
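
Likelihood ratios such as those quoted above translate into posttest probabilities through the pretest odds. The following minimal Python sketch shows the conversion; the function names and the pretest probability of 25% are assumptions chosen for illustration only, and the LR values are the point estimates from Silaghi et al. (18) cited in the preceding paragraph, without their confidence intervals.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) computed from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


PRETEST = 0.25  # assumed risk of malignancy before testing (illustrative)

# Point estimates quoted from Silaghi et al. (18).
quoted_lrs = {
    "TSv2 positive (LR+ 3.5)": 3.5,
    "TSv3 positive (LR+ 2.8)": 2.8,
    "GSC positive (LR+ 1.9)": 1.9,
    "GEC negative (LR- 0.18)": 0.18,
    "TSv3 negative (LR- 0.02)": 0.02,
}

for label, lr in quoted_lrs.items():
    print(f"{label}: posttest probability = {posttest_probability(PRETEST, lr):.3f}")
```

A larger LR+ moves a positive result further toward malignancy (rule-in), and a smaller LR− moves a negative result further toward benignity (rule-out); the absolute posttest values depend on the assumed pretest probability.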

Lee et al. (19) included 13 samples in their meta-analysis and compared GSC (seven samples) with TSv3 (six samples). They also considered NIFTP as malignant but included only samples in which all patients with ITN underwent surgery, whether the molecular test was GSC suspicious or TSv3 positive, or GSC benign or TSv3 negative. They found high sensitivity (0.97 (95% CI: 0.90–0.99) and 0.95 (95% CI: 0.91–0.97), respectively) and high NPV (0.96 (95% CI: 0.94–0.98) and 0.92 (95% CI: 0.86–0.97), respectively) for GSC and TSv3, and no difference in diagnostic performance between the two tests. This suggested that both tests ‘have the potential’ to rule out malignancy in ITN and that both are appropriate tests to determine malignancy in ITN. We could confirm their findings. As in our study, they also included, for example, the trials by Steward et al. (63) (for TSv3) and Desai et al. (9), in which all patients underwent surgery.

DiGennaro et al. performed a meta-analysis including 48 samples for GEC (n = 38), GSC (n = 10), and TSv3 (n = 6) (20). They found high diagnostic accuracy of molecular tests for the assessment of malignancy in ITN; however, they suggested that limitations and their potential clinical impact must be addressed. Except for the AUC results (best AUC for GSC (0.91 (95% CI: 0.62–0.92)), followed by TSv3 (0.90 (95% CI: 0.63–0.92)) and GEC (0.83 (95% CI: 0.74–0.89))), their results for sensitivity, specificity, LR+, and LR− were similar to ours.

This study has a few limitations. First, the included studies differ in their sample properties (study design, setting, thyroid cancer prevalence, blinding, ITN determination frequency, conflicts of interest, number of FNAs, patient selection criteria for molecular testing). Most of the studies were performed retrospectively, and statistical adjustments are not possible because patient-level data are lacking. Second, not all ITNs with benign or negative molecular tests underwent surgery; thus, the true false-negative rate is unknown. Third, the comparisons of the molecular tests (GEC, GSC, TSv2, and TSv3) in the included studies are not head-to-head, that is, the different molecular tests were not compared on the same cytological sample. The only published study with a head-to-head comparison (21) was excluded from our study because it included ITN with Bethesda V category.

In conclusion, our results indicate that molecular diagnostics in patients with ITN are valuable for avoiding unnecessary thyroid surgery. For ruling out malignancy, GSC is superior to TSv3, GEC, and TSv2; for ruling in malignancy, TSv2 is superior to GSC and TSv3. Diagnostic performance decreases in the following order: TSv3, TSv2, GSC, and GEC. Further studies with prospective design, blinding, and surgery (with histopathological diagnosis of ITNs) in all patients undergoing molecular diagnostics would be useful to determine the true false-negative rate and prevalence.

Supplementary materials

This is linked to the online version of the paper at https://doi.org/10.1530/EC-24-0170.

Declaration of interest

The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the study reported.

Funding

This study did not receive any specific grant from any funding agency in the public, commercial, or not-for-profit sector.

References

  • 1

    Pellegriti G, Frasca F, Regalbuto C, Squatrito S, & Vigneri R. Worldwide increasing incidence of thyroid cancer: update on epidemiology and risk factors. Journal of Cancer Epidemiology 2013 2013 965212. (https://doi.org/10.1155/2013/965212)

  • 2

    Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, Pacini F, Randolph GW, Sawka AM, Schlumberger M, et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association guidelines task force on thyroid nodules and differentiated thyroid cancer. Thyroid 2016 26 1–133. (https://doi.org/10.1089/thy.2015.0020)

  • 3

    Vuong HG, Ngo HTT, Bychkov A, Jung CK, Vu TH, Lu KB, Kakudo K, & Kondo T. Differences in surgical resection rate and risk of malignancy in thyroid cytopathology practice between Western and Asian countries: a systematic review and meta-analysis. Cancer Cytopathology 2020 128 238–249. (https://doi.org/10.1002/cncy.22228)

  • 4

    Cibas ES, & Ali SZ. The 2017 Bethesda system for reporting thyroid cytopathology. Thyroid 2017 27 1341–1346. (https://doi.org/10.1089/thy.2017.0500)

  • 5

    Wang CCC, Friedman L, Kennedy GC, Wang H, Kebebew E, Steward DL, Zeiger MA, Westra WH, Wang Y, Khanafshar E, et al. A large multicenter correlation study of thyroid nodule cytopathology and histopathology. Thyroid 2011 21 243–251. (https://doi.org/10.1089/thy.2010.0243)

  • 6

    Haugen BR. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: what is new and what has changed? Cancer 2017 123 372–381. (https://doi.org/10.1002/cncr.30360)

  • 7

    Altman DG, & Bland JM. Diagnostic tests 2: predictive values. BMJ 1994 309 102. (https://doi.org/10.1136/bmj.309.6947.102)

  • 8

    Alexander EK, Kennedy GC, Baloch ZW, Cibas ES, Chudova D, Diggans J, Friedman L, Kloos RT, LiVolsi VA, Mandel SJ, et al. Preoperative diagnosis of benign thyroid nodules with indeterminate cytology. New England Journal of Medicine 2012 367 705–715. (https://doi.org/10.1056/NEJMoa1203208)

  • 9

    Desai D, Lepe M, Baloch ZW, & Mandel SJ. ThyroSeq v3 for Bethesda III and IV: an institutional experience. Cancer Cytopathology 2021 129 164–170. (https://doi.org/10.1002/cncy.22362)

  • 10

    Patel KN, Angell TE, Babiarz J, Barth NM, Blevins T, Duh QY, Ghossein RA, Harrell RM, Huang J, Kennedy GC, et al. Performance of a genomic sequencing classifier for the preoperative diagnosis of cytologically indeterminate thyroid nodules. JAMA Surgery 2018 153 817–824. (https://doi.org/10.1001/jamasurg.2018.1153)

  • 11

    Patel J, Klopper J, & Cottrill EE. Molecular diagnostics in the evaluation of thyroid nodules: current use and prospective opportunities. Frontiers in Endocrinology (Lausanne) 2023 14 1101410. (https://doi.org/10.3389/fendo.2023.1101410)

  • 12

    Santhanam P, Khthir R, Gress T, Elkadry A, Olajide O, Yaqub A, & Driscoll H. Gene expression classifier for the diagnosis of indeterminate thyroid nodules: a meta-analysis. Medical Oncology 2016 33 14. (https://doi.org/10.1007/s12032-015-0727-3)

  • 13

    Vargas-Salas S, Martinez JR, Urra S, Dominguez JM, Mena N, Uslar T, Lagos M, Henriquez M, & Gonzalez HE. Genetic testing for indeterminate thyroid cytology: review and meta-analysis. Endocrine-Related Cancer 2018 25 R163–R177. (https://doi.org/10.1530/ERC-17-0405)

  • 14

    Valderrabano P, Hallanger-Johnson JE, Thapa R, Wang X, & McIver B. Comparison of postmarketing findings vs the initial clinical validation findings of a thyroid nodule gene expression classifier: a systematic review and meta-analysis. JAMA Otolaryngology – Head and Neck Surgery 2019 145 783–792. (https://doi.org/10.1001/jamaoto.2019.1449)

  • 15

    Liu Y, Pan B, Xu L, Fang D, Ma X, & Lu H. The diagnostic performance of Afirma gene expression classifier for the indeterminate thyroid nodules: a meta-analysis. BioMed Research International 2019 2019 7150527. (https://doi.org/10.1155/2019/7150527)

  • 16

    Borowczyk M, Szczepanek-Parulska E, Olejarz M, Wieckowska B, Verburg FA, Debicki S, Budny B, Janicka-Jedynska M, Ziemnicka K, & Ruchala M. Evaluation of 167 gene expression classifier (GEC) and ThyroSeq v2 diagnostic accuracy in the preoperative assessment of indeterminate thyroid nodules: bivariate/HROC meta-analysis. Endocrine Pathology 2019 30 8–15. (https://doi.org/10.1007/s12022-018-9560-5)

  • 17

    Vuong HG, Nguyen TPX, Hassell LA, & Jung CK. Diagnostic performances of the Afirma gene sequencing classifier in comparison with the gene expression classifier: a meta-analysis. Cancer Cytopathology 2021 129 182–189. (https://doi.org/10.1002/cncy.22332)

  • 18

    Silaghi CA, Lozovanu V, Georgescu CE, Georgescu RD, Susman S, Nasui BA, Dobrean A, & Silaghi H. Thyroseq v3, Afirma GSC, and microRNA panels versus previous molecular tests in the preoperative diagnosis of indeterminate thyroid nodules: a systematic review and meta-analysis. Frontiers in Endocrinology (Lausanne) 2021 12 649522. (https://doi.org/10.3389/fendo.2021.649522)

  • 19

    Lee E, Terhaar S, McDaniel L, Gorelik D, Gerhard E, Chen C, Ma Y, Joshi AS, Goodman JF, & Thakkar PG. Diagnostic performance of the second-generation molecular tests in the assessment of indeterminate thyroid nodules: a systematic review and meta-analysis. American Journal of Otolaryngology 2022 43 103394. (https://doi.org/10.1016/j.amjoto.2022.103394)

  • 20

    DiGennaro C, Vahdatzad V, Jalali MS, Toumi A, Watson T, Gazelle GS, Mercaldo N, & Lubitz CC. Assessing bias and limitations of clinical validation studies of molecular diagnostic tests for indeterminate thyroid nodules: systematic review and meta-analysis. Thyroid 2022 32 1144–1157. (https://doi.org/10.1089/thy.2022.0269)

  • 21

    Walts AE, Sacks WL, Wu HH, Randolph ML, & Bose S. A retrospective analysis of the performance of the RosettaGX® Reveal thyroid miRNA and the Afirma gene expression classifiers in a cohort of cytologically indeterminate thyroid nodules. Diagnostic Cytopathology 2018 46 901–907. (https://doi.org/10.1002/dc.23980)

  • 22

    Moher D, Liberati A, Tetzlaff J, Altman DG & PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: the PRISMA statement. PLoS Medicine 2009 6 e1000097. (https://doi.org/10.1371/journal.pmed.1000097)

  • 23

    Sotiriadis A, Papatheodorou SI, & Martins WP. Synthesizing evidence from diagnostic accuracy tests: the SEDATE guideline. Ultrasound in Obstetrics and Gynecology 2016 47 386–395. (https://doi.org/10.1002/uog.15762)

  • 24

    Whiting P, Rutjes AWS, Reitsma JB, Bossuyt PMM, & Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology 2003 3 25. (https://doi.org/10.1186/1471-2288-3-25)

  • 25

    Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MMG, Sterne JAC, Bossuyt PMM & QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Annals of Internal Medicine 2011 155 529–536. (https://doi.org/10.7326/0003-4819-155-8-201110180-00009)

  • 26

    Reitsma JB, Glas AS, Rutjes AWS, Scholten RJPM, Bossuyt PM, & Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of Clinical Epidemiology 2005 58 982–990. (https://doi.org/10.1016/j.jclinepi.2005.02.022)

  • 27

    Rutter CM, & Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Statistics in Medicine 2001 20 2865–2884. (https://doi.org/10.1002/sim.942)

  • 28

    Harbord RM, Deeks JJ, Egger M, Whiting P, & Sterne JAC. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2007 8 239–251. (https://doi.org/10.1093/biostatistics/kxl004)

  • 29

    Gatsonis C, & Paliwal P. Meta-analysis of diagnostic and screening test accuracy evaluations: methodologic primer. American Journal of Roentgenology 2006 187 271–281. (https://doi.org/10.2214/AJR.06.0226)

  • 30

    Deeks JJ, Macaskill P, & Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. Journal of Clinical Epidemiology 2005 58 882–893. (https://doi.org/10.1016/j.jclinepi.2005.01.016)

  • 31

    van Enst WA, Ochodo E, Scholten RJPM, Hooft L, & Leeflang MM. Investigation of publication bias in meta-analyses of diagnostic test accuracy: a meta-epidemiological study. BMC Medical Research Methodology 2014 14 70. (https://doi.org/10.1186/1471-2288-14-70)

  • 32

    Sweeting MJ, Sutton AJ, & Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Statistics in Medicine 2004 23 1351–1375. (https://doi.org/10.1002/sim.1761)

  • 33

    Wilson MC, Henderson MC, & Smetana GW. Chapter 5. Evidence-based clinical decision making. In The Patient History; an Evidence-Based Approach to Differential Diagnosis, 2nd ed. Eds. Wilson MC, Henderson MC, & Smetana GW. New York, NY, USA: McGraw-Hill, 2012.

  • 34

    McGee S. Simplifying likelihood ratios. Journal of General Internal Medicine 2002 17 646–649. (https://doi.org/10.1046/j.1525-1497.2002.10750.x)

  • 35

    Harbord RM, & Whiting P. metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression. In Meta-Analysis in Stata: an Updated Collection from the Stata Journal, 2nd ed., pp. 211–229. Eds. Palmer TM, & Sterne JAC. College Station, Texas, USA: Stata Press, 2016.

  • 36

    Harrell RM, & Bimston DN. Surgical utility of Afirma: effects of high cancer prevalence and oncocytic cell types in patients with indeterminate thyroid cytology. Endocrine Practice 2014 20 364–369. (https://doi.org/10.4158/EP13330.OR)

  • 37

    McIver B, Castro MR, Morris JC, Bernet V, Smallridge R, Henry M, Kosok L, & Reddi H. An independent study of a gene expression classifier (Afirma) in the evaluation of cytologically indeterminate thyroid nodules. Journal of Clinical Endocrinology and Metabolism 2014 99 4069–4077. (https://doi.org/10.1210/jc.2013-3584)

  • 38

    Nikiforov YE, Carty SE, Chiosea SI, Coyne C, Duvvuri U, Ferris RL, Gooding WE, Hodak SP, LeBeau SO, Ohori NP, et al. Highly accurate diagnosis of cancer in thyroid nodules with follicular neoplasm/suspicious for a follicular neoplasm cytology by ThyroSeq v2 next-generation sequencing assay. Cancer 2014 120 3627–3634. (https://doi.org/10.1002/cncr.29038)

  • 39

    Brauner E, Holmes BJ, Krane JF, Nishino M, Zurakowski D, Hennessey JV, Faquin WC, & Parangi S. Performance of the Afirma gene expression classifier in Hurthle cell thyroid nodules differs from other indeterminate thyroid nodules. Thyroid 2015 25 789–796. (https://doi.org/10.1089/thy.2015.0049)

  • 40

    Noureldine SI, Olson MT, Agrawal N, Prescott JD, Zeiger MA, & Tufano RP. Effect of gene expression classifier molecular testing on the surgical decision-making process for patients with thyroid nodules. JAMA Otolaryngology – Head and Neck Surgery 2015 141 1082–1088. (https://doi.org/10.1001/jamaoto.2015.2708)

  • 41

    Marti JL, Avadhani V, Donatelli LA, Niyogi S, Wang B, Wong RJ, Shaha AR, Ghossein RA, Lin O, Morris LGT, et al. Wide inter-institutional variation in performance of a molecular classifier for indeterminate thyroid nodules. Annals of Surgical Oncology 2015 22 3996–4001. (https://doi.org/10.1245/s10434-015-4486-3)

  • 42

    Yang SE, Sullivan PS, Zhang J, Govind R, Levin MR, Rao JY, & Moatamed NA. Has Afirma gene expression classifier testing refined the indeterminate thyroid category in cytology? Cancer Cytopathology 2016 124 100–109. (https://doi.org/10.1002/cncy.21624)

  • 43

    Nikiforov YE, Carty SE, Chiosea SI, Coyne C, Duvvuri U, Ferris RL, Gooding WE, LeBeau SO, Ohori NP, Seethala RR, et al. Impact of the multi-gene ThyroSeq next-generation sequencing assay on cancer diagnosis in thyroid nodules with atypia of undetermined significance/follicular lesion of undetermined significance cytology. Thyroid 2015 25 1217–1223. (https://doi.org/10.1089/thy.2015.0305)

  • 44

    Chaudhary S, Hou Y, Shen R, Hooda S, & Li Z. Impact of the Afirma gene expression classifier result on the surgical management of thyroid nodules with category III/IV cytology and its correlation with surgical outcome. Acta Cytologica 2016 60 205–210. (https://doi.org/10.1159/000446797)

  • 45

    Sacks WL, Bose S, Zumsteg ZS, Wong R, Shiao SL, Braunstein GD, & Ho AS. Impact of Afirma gene expression classifier on cytopathology diagnosis and rate of thyroidectomy. Cancer Cytopathology 2016 124 722–728. (https://doi.org/10.1002/cncy.21749)

  • 46

    Samulski TD, LiVolsi VA, Wong LQ, & Baloch Z. Usage trends and performance characteristics of a "gene expression classifier" in the management of thyroid nodules: an institutional experience. Diagnostic Cytopathology 2016 44 867–873. (https://doi.org/10.1002/dc.23559)

  • 47

    Villabona CV, Mohan V, Arce KM, Diacovo J, Aggarwal A, Betancourt J, Amer H, Jose T, DeSantis P, & Cabral J. Utility of ultrasound versus gene expression classifier in thyroid nodules with atypia of undetermined significance. Endocrine Practice 2016 22 1199–1203. (https://doi.org/10.4158/EP161231.OR)

  • 48

    Witt RL. Outcome of thyroid gene expression classifier testing in clinical practice. Laryngoscope 2016 126 524–527. (https://doi.org/10.1002/lary.25607)

  • 49

    Wu JX, Young S, Hung ML, Li N, Yang SE, Cheung DS, Yeh MW, & Livhits MJ. Clinical factors influencing the performance of gene expression classifier testing in indeterminate thyroid nodules. Thyroid 2016 26 916–922. (https://doi.org/10.1089/thy.2015.0505)

  • 50

    Al-Qurayshi Z, Deniwar A, Thethi T, Mallik T, Srivastav S, Murad F, Bhatia P, Moroz K, Sholl AB, & Kandil E. Association of malignancy prevalence with test properties and performance of the gene expression classifier in indeterminate thyroid nodules. JAMA Otolaryngology – Head and Neck Surgery 2017 143 403–408. (https://doi.org/10.1001/jamaoto.2016.3526)

  • 51

    Baca SC, Wong KS, Strickland KC, Heller HT, Kim MI, Barletta JA, Cibas ES, Krane JF, Marqusee E, & Angell TE. Qualifiers of atypia in the cytologic diagnosis of thyroid nodules are associated with different Afirma gene expression classifier results and clinical outcomes. Cancer Cytopathology 2017 125 313–322. (https://doi.org/10.1002/cncy.21827)

  • 52

    Hang JF, Westra WH, Cooper DS, & Ali SZ. The impact of noninvasive follicular thyroid neoplasm with papillary-like nuclear features on the performance of the Afirma gene expression classifier. Cancer Cytopathology 2017 125 683–691. (https://doi.org/10.1002/cncy.21879)

  • 53

    Azizi G, Keller JM, Mayo ML, Piper K, Puett D, Earp KM, & Malchoff CD. Shear wave elastography and Afirma gene expression classifier in thyroid nodules with indeterminate cytology: a comparison study. Endocrine 2018 59 573–584. (https://doi.org/10.1007/s12020-017-1509-9)

  • 54

    Deaver KE, Haugen BR, Pozdeyev N, & Marshall CB. Outcomes of Bethesda categories III and IV thyroid nodules over 5 years and performance of the Afirma gene expression classifier: a single-institution study. Clinical Endocrinology 2018 89 226–232. (https://doi.org/10.1111/cen.13747)

  • 55

    Livhits MJ, Kuo EJ, Leung AM, Rao J, Levin M, Douek ML, Beckett KR, Zanocco KA, Cheung DS, Gofnung YA, et al. Gene expression classifier vs targeted next-generation sequencing in the management of indeterminate thyroid nodules. Journal of Clinical Endocrinology and Metabolism 2018 103 2261–2268. (https://doi.org/10.1210/jc.2017-02754)

  • 56

    Taye A, Gurciullo D, Miles BA, Gupta A, Owen RP, Inabnet 3rd WB, Beyda JN, & Marti JL. Clinical performance of a next-generation sequencing assay (ThyroSeq v2) in the evaluation of indeterminate thyroid nodules. Surgery 2018 163 97–103. (https://doi.org/10.1016/j.surg.2017.07.032)

  • 57

    Angell TE, Heller HT, Cibas ES, Barletta JA, Kim MI, Krane JF, & Marqusee E. Independent comparison of the Afirma genomic sequencing classifier and gene expression classifier for cytologically indeterminate thyroid nodules. Thyroid 2019 29 650–656. (https://doi.org/10.1089/thy.2018.0726)

  • 58

    Jug R, Parajuli S, Ahmadi S, & Jiang XS. Negative results on thyroid molecular testing decrease rates of surgery for indeterminate thyroid nodules. Endocrine Pathology 2019 30 134–137. (https://doi.org/10.1007/s12022-019-9571-x)

  • 59

    Endo M, Nabhan F, Porter K, Roll K, Shirley LA, Azaryan I, Tonkovich D, Perlick J, Ryan LE, Khawaja R, et al. Afirma gene sequencing classifier compared with gene expression classifier in indeterminate thyroid nodules. Thyroid 2019 29 1115–1124. (https://doi.org/10.1089/thy.2018.0733)

  • 60

    Harrell RM, Eyerly-Webb SA, Golding AC, Edwards CM, & Bimston DN. Statistical comparison of Afirma GSC and Afirma GEC outcomes in a community endocrine surgical practice: early findings. Endocrine Practice 2019 25 161–164. (https://doi.org/10.4158/EP-2018-0395)

  • 61

    San Martin VT, Lawrence L, Bena J, Madhun NZ, Berber E, Elsheikh TM, & Nasr CE. Real-world comparison of Afirma GEC and GSC for the assessment of cytologically indeterminate thyroid nodules. Journal of Clinical Endocrinology and Metabolism 2020 105 dgz099. (https://doi.org/10.1210/clinem/dgz099)

  • 62

    Marcadis AR, Valderrabano P, Ho AS, Tepe J, Swartzwelder CE, Byrd S, Sacks WL, Untch BR, Shaha AR, Xu B, et al. Interinstitutional variation in predictive value of the ThyroSeq v2 genomic classifier for cytologically indeterminate thyroid nodules. Surgery 2019 165 17–24. (https://doi.org/10.1016/j.surg.2018.04.062)

  • 63

    Steward DL, Carty SE, Sippel RS, Yang SP, Sosa JA, Sipos JA, Figge JJ, Mandel S, Haugen BR, Burman KD, et al. Performance of a multigene genomic classifier in thyroid nodules with indeterminate cytology: a prospective blinded multicenter study. JAMA Oncology 2019 5 204–212. (https://doi.org/10.1001/jamaoncol.2018.4616)

  • 64

    Geng Y, Aguilar-Jakthong JS, & Moatamed NA. Comparison of Afirma gene expression classifier with gene sequencing classifier in indeterminate thyroid nodules: a single-institutional experience. Cytopathology 2021 32 187–191. (https://doi.org/10.1111/cyt.12920)

  • 65

    Wang MM, Beckett K, Douek M, Masamed R, Patel M, Tseng CH, Yeh MW, Leung AM, & Livhits MJ. Diagnostic value of molecular testing in sonographically suspicious thyroid nodules. Journal of the Endocrine Society 2020 4 bvaa081. (https://doi.org/10.1210/jendso/bvaa081)

  • 66

    Vora A, Holt S, Haque W, & Lingvay I. Long-term outcomes of thyroid nodule AFIRMA GEC testing and literature review: an institutional experience. Otolaryngology–Head and Neck Surgery 2020 162 634–640. (https://doi.org/10.1177/0194599820911718)

  • 67

    Papoian V, Rosen JE, Lee W, Wartofsky L, & Felger EA. Differentiated thyroid cancer and Hashimoto thyroiditis: utility of the Afirma gene expression classifier. Journal of Surgical Oncology 2020 121 1053–1057. (https://doi.org/10.1002/jso.25875)

  • 68

    Arosemena M, Thekkumkattil A, Valderrama ML, Kuker R, Castillo RP, Sidani C, Gonzalez ML, Casula S, & Kargi AY. American thyroid association sonographic risk and Afirma gene expression classifier alone and in combination for the diagnosis of thyroid nodules with Bethesda category III cytology. Thyroid 2020 30 1613–1619. (https://doi.org/10.1089/thy.2019.0673)

  • 69

    Sultan R, Levy S, Sulanc E, Honasoge M, & Rao SD. Utility of Afirma gene expression classifier for evaluation of indeterminate thyroid nodules and correlation with ultrasound risk assessment: single institutional experience. Endocrine Practice 2020 26 543–551. (https://doi.org/10.4158/EP-2019-0350)

  • 70

    Jug R, Foo WC, Jones C, Ahmadi S, & Jiang XS. High-risk and intermediate-high-risk results from the ThyroSeq v2 and v3 thyroid genomic classifier are associated with neoplasia: independent performance assessment at an academic institution. Cancer Cytopathology 2020 128 563–569. (https://doi.org/10.1002/cncy.22283)

  • 71

    Carty SE, Ohori NP, Hilko DA, McCoy KL, French EK, Manroa P, Morariu E, Sridharan S, Seethala RR, & Yip L. The clinical utility of molecular testing in the management of thyroid follicular neoplasms (Bethesda IV nodules). Annals of Surgery 2020 272 621–627. (https://doi.org/10.1097/SLA.0000000000004130)

  • 72

    Chen T, Gilfix BM, Rivera J, Sadeghi N, Richardson K, Hier MP, Forest VI, Fishman D, Caglar D, Pusztaszeri M, et al. The role of the ThyroSeq v3 molecular test in the surgical management of thyroid nodules in the Canadian public health care setting. Thyroid 2020 30 1280–1287. (https://doi.org/10.1089/thy.2019.0539)

  • 73

    Polavarapu P, Fingeret A, Yuil-Valdes A, Olson D, Patel A, Shivaswamy V, Matthias TD, & Goldner W. Comparison of Afirma GEC and GSC to nodules without molecular testing in cytologically indeterminate thyroid nodules. Journal of the Endocrine Society 2021 5 bvab148. (https://doi.org/10.1210/jendso/bvab148)

  • 74

    Scola WH, Linhares SM, Handelsman RS, Picado O, Khan ZF, Farra JC, & Lew JI. Molecular testing has limited utility in the surgical evaluation of Bethesda III thyroid nodules. Journal of Surgical Research 2021 268 209–213. (https://doi.org/10.1016/j.jss.2021.06.026)

  • 75

    Onken AM, VanderLaan PA, Hennessey JV, Hartzband P, & Nishino M. Combined molecular and histologic end points inform cancer risk estimates for thyroid nodules classified as atypia of undetermined significance. Cancer Cytopathology 2021 129 947–955. (https://doi.org/10.1002/cncy.22489)

  • 76

    Zhang L, Smola B, Lew M, Pang J, Cantley R, Pantanowitz L, Heider A, & Jing X. Performance of Afirma genomic sequencing classifier vs gene expression classifier in Bethesda category III thyroid nodules: an institutional experience. Diagnostic Cytopathology 2021 49 921–927. (https://doi.org/10.1002/dc.24765)

  • 77

    Nishino M, Mateo R, Kilim H, Feldman A, Elliott A, Shen C, Hasselgren PO, Wang H, Hartzband P, & Hennessey JV. Repeat fine needle aspiration cytology refines the selection of thyroid nodules for Afirma gene expression classifier testing. Thyroid 2021 31 1253–1263. (https://doi.org/10.1089/thy.2020.0969)

  • 78

    Gortakowski M, Feghali K, & Osakwe I. Single institution experience with Afirma and Thyroseq testing in indeterminate thyroid nodules. Thyroid 2021 31 1376–1382. (https://doi.org/10.1089/thy.2020.0801)

  • 79

    Yang Z, Zhang T, Layfield L, & Esebua M. Performance of Afirma gene sequencing classifier versus gene expression classifier in thyroid nodules with indeterminate cytology. Journal of the American Society of Cytopathology 2022 11 74–78. (https://doi.org/10.1016/j.jasc.2021.07.002)

  • 80

    Jin X, Lew M, Pantanowitz L, Smola B, & Jing X. Performance of Afirma genomic sequencing classifier and histopathological outcome are associated with patterns of atypia in Bethesda category III thyroid nodules. Cancer Cytopathology 2022 130 891–898. (https://doi.org/10.1002/cncy.22625)

  • 81

    Steinmetz D, Kim M, Choi JH, Yeager T, Samuel K, Khajoueinejad N, Buseck A, Imtiaz S, Fernandez-Ranvier G, Lee D, et al. How effective is the use of molecular testing in preoperative decision making for management of indeterminate thyroid nodules? World Journal of Surgery 2022 46 3043–3050. (https://doi.org/10.1007/s00268-022-06744-1)

  • 82

    Gajzer DC, Tjendra Y, Kerr DA, Algashaamy K, Zuo Y, Menendez SG, Jorda M, Garcia-Buitrago M, Gomez-Fernandez C, & Velez Torres JM. Probability of malignancy as determined by ThyroSeq v3 genomic classifier varies according to the subtype of atypia. Cancer Cytopathology 2022 130 881–890. (https://doi.org/10.1002/cncy.22617)

  • 83

    Loncar I, van Velsen EFS, Massolt ET, van Kemenade FJ, van Engen-van Grunsven ACH, van Hemel BM, van Nederveen FH, Netea-Maier R, Links TP, Peeters RP, et al. European experience with the Afirma gene expression classifier for indeterminate thyroid nodules: a clinical utility study in the Netherlands. Head and Neck 2023 45 2227–2236. (https://doi.org/10.1002/hed.27472)

  • 84

    Jin X, Lew M, Pantanowitz L, Iyengar JJ, Haymart MR, Papaleontiou M, Broome D, Sandouk Z, Raja SS, Hughes DT, et al. Performance of Afirma genomic sequencing classifier and histopathological outcome in Bethesda category III thyroid nodules: initial versus repeat fine-needle aspiration. Diagnostic Cytopathology 2023 51 698–704. (https://doi.org/10.1002/dc.25203)

  • 85

    Kim NE, Raghunathan RS, Hughes EG, Longstaff XR, Tseng CH, Li S, Cheung DS, Gofnung YA, Famini P, Wu JX, et al. Bethesda III and IV thyroid nodules managed nonoperatively after molecular testing with Afirma GSC or Thyroseq v3. Journal of Clinical Endocrinology and Metabolism 2023 108 e698–e703. (https://doi.org/10.1210/clinem/dgad181)

  • 86

    Tjendra Y, Kerr DA, Zuo Y, Menendez SG, Jorda M, Gomez-Fernandez C, & Velez Torres JM. Probability of malignancy and molecular alterations as determined by ThyroSeq v3 genomic classifier in Bethesda category IV. Cancer Cytopathology 2023 131 586–595. (https://doi.org/10.1002/cncy.22737)

  • 87

    Deeks JJ, & Altman DG. Diagnostic tests 4: likelihood ratios. BMJ 2004 329 168–169. (https://doi.org/10.1136/bmj.329.7458.168)

Supplementary Materials

 

  • Figure 1

    Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram for the inclusion and exclusion of studies investigating the accuracy of the Afirma gene expression classifier (GEC), Afirma gene sequencing classifier (GSC), ThyroSeq v2 (TSv2), or ThyroSeq v3 (TSv3) in patients with indeterminate thyroid nodules (Bethesda category III or IV only).

  • Figure 2

    (A) Risk of bias and applicability concerns graph on each domain presented as a percentage across all included studies (with GEC, GSC, TSv2, or TSv3). n = 71 samples in 53 studies. GEC, Afirma gene expression classifier; GSC, Afirma gene sequencing classifier; TSv2, ThyroSeq v2; TSv3, ThyroSeq v3. (B) Risk of bias and applicability concerns summary for each included study. n = 71 samples in 53 studies.

  • Figure 3

    Deeks' funnel plot asymmetry test for all included studies (with GEC, GSC, TSv2, and TSv3). P < 0.1 indicates asymmetry and potential publication bias. n = 71 samples in 53 studies.

  • Figure 4

    Forest plot illustrating sensitivity and specificity for all included trials (with GEC, GSC, TSv2, and TSv3), n = 71 samples in 53 studies. Pooled sensitivity: 0.95 (95% CI: 0.94–0.97), pooled specificity: 0.35 (0.28–0.43), pooled positive likelihood ratio (LR+): 1.5 (1.3–1.6), pooled negative likelihood ratio (LR−): 0.13 (0.09–0.19).

  • Figure 5

    Hierarchical summary receiver-operating characteristics (HSROC) plot for all included studies (with GEC, GSC, TSv2, and TSv3), n = 71 samples in 53 studies. GEC, Afirma gene expression classifier; GSC, Afirma gene sequencing classifier; TSv2, ThyroSeq v2; TSv3, ThyroSeq v3.

  • Figure 6

    Forest plot illustrating sensitivity and specificity for the subgroup of studies with the Afirma gene expression classifier (GEC), n = 38 samples. Pooled sensitivity: 0.95 (95% CI: 0.94–0.97), pooled specificity: 0.35 (0.28–0.43), pooled positive likelihood ratio (LR+): 1.5 (1.3–1.6), pooled negative likelihood ratio (LR−): 0.13 (0.09–0.19).

  • Figure 7

    Forest plot illustrating sensitivity and specificity for the subgroup of studies with the Afirma gene sequencing classifier (GSC), n = 16 samples. Pooled sensitivity: 0.95 (95% CI: 0.94–0.97), pooled specificity: 0.35 (0.28–0.43), pooled positive likelihood ratio (LR+): 1.5 (1.3–1.6), pooled negative likelihood ratio (LR−): 0.13 (0.09–0.19).

  • Figure 8

    Forest plot illustrating sensitivity and specificity for the subgroup of studies with ThyroSeq v2 (TSv2), n = 9 samples. Pooled sensitivity: 0.89 (95% CI: 0.83–0.95), pooled specificity: 0.66 (0.45–0.82), pooled positive likelihood ratio (LR+): 2.6 (1.4–4.6), pooled negative likelihood ratio (LR−): 0.17 (0.1–0.31).

  • Figure 9

    Forest plot illustrating sensitivity and specificity for the subgroup of studies with ThyroSeq v3 (TSv3), n = 8 samples. Pooled sensitivity: 0.95 (95% CI: 0.91–0.97), pooled specificity: 0.47 (0.22–0.73), pooled positive likelihood ratio (LR+): 1.8 (1.0–3.0), pooled negative likelihood ratio (LR−): 0.11 (0.05–0.24).

  • Figure 10

    Meta-regression analysis in all included studies (GEC, GSC, TSv2, or TSv3), n = 71 samples in 53 studies, for the following covariates: (1) Study design (prospective (yes) vs retrospective (no)), (2) study setting (multi-center (yes) vs single center (no)), (3) molecular test (ThyroSeq (TSv2 or TSv3) (yes) vs Afirma (GEC or GSC) (no)), (4) Bethesda category (Bethesda III or IV (yes) vs Bethesda III (no)), (5) number of FNA before molecular testing (repeat FNA (yes) vs initial FNA (no)), and (6) conflict of interest ((yes) vs without conflict of interest (no)).

  • 1

    Pellegriti G, Frasca F, Regalbuto C, Squatrito S, & Vigneri R. Worldwide increasing incidence of thyroid cancer: update on epidemiology and risk factors. Journal of Cancer Epidemiology 2013 2013 965212. (https://doi.org/10.1155/2013/965212)

  • 2

    Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, Pacini F, Randolph GW, Sawka AM, Schlumberger M, et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association guidelines task force on thyroid nodules and differentiated thyroid cancer. Thyroid 2016 26 1–133. (https://doi.org/10.1089/thy.2015.0020)

  • 3

    Vuong HG, Ngo HTT, Bychkov A, Jung CK, Vu TH, Lu KB, Kakudo K, & Kondo T. Differences in surgical resection rate and risk of malignancy in thyroid cytopathology practice between Western and Asian countries: a systematic review and meta-analysis. Cancer Cytopathology 2020 128 238–249. (https://doi.org/10.1002/cncy.22228)

  • 4

    Cibas ES, & Ali SZ. The 2017 Bethesda system for reporting thyroid cytopathology. Thyroid 2017 27 1341–1346. (https://doi.org/10.1089/thy.2017.0500)

  • 5

    Wang CCC, Friedman L, Kennedy GC, Wang H, Kebebew E, Steward DL, Zeiger MA, Westra WH, Wang Y, Khanafshar E, et al. A large multicenter correlation study of thyroid nodule cytopathology and histopathology. Thyroid 2011 21 243–251. (https://doi.org/10.1089/thy.2010.0243)

  • 6

    Haugen BR. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: what is new and what has changed? Cancer 2017 123 372–381. (https://doi.org/10.1002/cncr.30360)

  • 7

    Altman DG, & Bland JM. Diagnostic tests 2: predictive values. BMJ 1994 309 102. (https://doi.org/10.1136/bmj.309.6947.102)

  • 8

    Alexander EK, Kennedy GC, Baloch ZW, Cibas ES, Chudova D, Diggans J, Friedman L, Kloos RT, LiVolsi VA, Mandel SJ, et al. Preoperative diagnosis of benign thyroid nodules with indeterminate cytology. New England Journal of Medicine 2012 367 705–715. (https://doi.org/10.1056/NEJMoa1203208)

  • 9

    Desai D, Lepe M, Baloch ZW, & Mandel SJ. ThyroSeq v3 for Bethesda III and IV: an institutional experience. Cancer Cytopathology 2021 129 164–170. (https://doi.org/10.1002/cncy.22362)

  • 10

    Patel KN, Angell TE, Babiarz J, Barth NM, Blevins T, Duh QY, Ghossein RA, Harrell RM, Huang J, Kennedy GC, et al. Performance of a genomic sequencing classifier for the preoperative diagnosis of cytologically indeterminate thyroid nodules. JAMA Surgery 2018 153 817–824. (https://doi.org/10.1001/jamasurg.2018.1153)

  • 11

    Patel J, Klopper J, & Cottrill EE. Molecular diagnostics in the evaluation of thyroid nodules: current use and prospective opportunities. Frontiers in Endocrinology (Lausanne) 2023 14 1101410. (https://doi.org/10.3389/fendo.2023.1101410)

  • 12

    Santhanam P, Khthir R, Gress T, Elkadry A, Olajide O, Yaqub A, & Driscoll H. Gene expression classifier for the diagnosis of indeterminate thyroid nodules: a meta-analysis. Medical Oncology 2016 33 14. (https://doi.org/10.1007/s12032-015-0727-3)

  • 13

    Vargas-Salas S, Martinez JR, Urra S, Dominguez JM, Mena N, Uslar T, Lagos M, Henriquez M, & Gonzalez HE. Genetic testing for indeterminate thyroid cytology: review and meta-analysis. Endocrine-Related Cancer 2018 25 R163–R177. (https://doi.org/10.1530/ERC-17-0405)

  • 14

    Valderrabano P, Hallanger-Johnson JE, Thapa R, Wang X, & McIver B. Comparison of postmarketing findings vs the initial clinical validation findings of a thyroid nodule gene expression classifier: a systematic review and meta-analysis. JAMA Otolaryngology – Head and Neck Surgery 2019 145 783–792. (https://doi.org/10.1001/jamaoto.2019.1449)

  • 15

    Liu Y, Pan B, Xu L, Fang D, Ma X, & Lu H. The diagnostic performance of Afirma gene expression classifier for the indeterminate thyroid nodules: a meta-analysis. BioMed Research International 2019 2019 7150527. (https://doi.org/10.1155/2019/7150527)

  • 16

    Borowczyk M, Szczepanek-Parulska E, Olejarz M, Wieckowska B, Verburg FA, Debicki S, Budny B, Janicka-Jedynska M, Ziemnicka K, & Ruchala M. Evaluation of 167 gene expression classifier (GEC) and ThyroSeq v2 diagnostic accuracy in the preoperative assessment of indeterminate thyroid nodules: bivariate/HROC meta-analysis. Endocrine Pathology 2019 30 8–15. (https://doi.org/10.1007/s12022-018-9560-5)

  • 17

    Vuong HG, Nguyen TPX, Hassell LA, & Jung CK. Diagnostic performances of the Afirma gene sequencing classifier in comparison with the gene expression classifier: a meta-analysis. Cancer Cytopathology 2021 129 182–189. (https://doi.org/10.1002/cncy.22332)

  • 18

    Silaghi CA, Lozovanu V, Georgescu CE, Georgescu RD, Susman S, Nasui BA, Dobrean A, & Silaghi H. Thyroseq v3, Afirma GSC, and microRNA panels versus previous molecular tests in the preoperative diagnosis of indeterminate thyroid nodules: a systematic review and meta-analysis. Frontiers in Endocrinology (Lausanne) 2021 12 649522. (https://doi.org/10.3389/fendo.2021.649522)

  • 19

    Lee E, Terhaar S, McDaniel L, Gorelik D, Gerhard E, Chen C, Ma Y, Joshi AS, Goodman JF, & Thakkar PG. Diagnostic performance of the second-generation molecular tests in the assessment of indeterminate thyroid nodules: a systematic review and meta-analysis. American Journal of Otolaryngology 2022 43 103394. (https://doi.org/10.1016/j.amjoto.2022.103394)

  • 20

    DiGennaro C, Vahdatzad V, Jalali MS, Toumi A, Watson T, Gazelle GS, Mercaldo N, & Lubitz CC. Assessing bias and limitations of clinical validation studies of molecular diagnostic tests for indeterminate thyroid nodules: systematic review and meta-analysis. Thyroid 2022 32 1144–1157. (https://doi.org/10.1089/thy.2022.0269)

  • 21

    Walts AE, Sacks WL, Wu HH, Randolph ML, & Bose S. A retrospective analysis of the performance of the RosettaGX® Reveal thyroid miRNA and the Afirma gene expression classifiers in a cohort of cytologically indeterminate thyroid nodules. Diagnostic Cytopathology 2018 46 901–907. (https://doi.org/10.1002/dc.23980)

  • 22

    Moher D, Liberati A, Tetzlaff J, Altman DG & PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: the PRISMA statement. PLoS Medicine 2009 6 e1000097. (https://doi.org/10.1371/journal.pmed.1000097)

  • 23

    Sotiriadis A, Papatheodorou SI, & Martins WP. Synthesizing evidence from diagnostic accuracy tests: the SEDATE guideline. Ultrasound in Obstetrics and Gynecology 2016 47 386–395. (https://doi.org/10.1002/uog.15762)

  • 24

    Whiting P, Rutjes AWS, Reitsma JB, Bossuyt PMM, & Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology 2003 3 25. (https://doi.org/10.1186/1471-2288-3-25)

  • 25

    Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MMG, Sterne JAC, Bossuyt PMM & QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Annals of Internal Medicine 2011 155 529–536. (https://doi.org/10.7326/0003-4819-155-8-201110180-00009)

  • 26

    Reitsma JB, Glas AS, Rutjes AWS, Scholten RJPM, Bossuyt PM, & Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of Clinical Epidemiology 2005 58 982–990. (https://doi.org/10.1016/j.jclinepi.2005.02.022)

  • 27

    Rutter CM, & Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Statistics in Medicine 2001 20 2865–2884. (https://doi.org/10.1002/sim.942)

  • 28

    Harbord RM, Deeks JJ, Egger M, Whiting P, & Sterne JAC. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2007 8 239–251. (https://doi.org/10.1093/biostatistics/kxl004)

  • 29

    Gatsonis C, & Paliwal P. Meta-analysis of diagnostic and screening test accuracy evaluations: methodologic primer. American Journal of Roentgenology 2006 187 271–281. (https://doi.org/10.2214/AJR.06.0226)

  • 30

    Deeks JJ, Macaskill P, & Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. Journal of Clinical Epidemiology 2005 58 882–893. (https://doi.org/10.1016/j.jclinepi.2005.01.016)

  • 31

    van Enst WA, Ochodo E, Scholten RJPM, Hooft L, & Leeflang MM. Investigation of publication bias in meta-analyses of diagnostic test accuracy: a meta-epidemiological study. BMC Medical Research Methodology 2014 14 70. (https://doi.org/10.1186/1471-2288-14-70)

  • 32

    Sweeting MJ, Sutton AJ, & Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Statistics in Medicine 2004 23 1351–1375. (https://doi.org/10.1002/sim.1761)

  • 33

    Wilson MC, Henderson MC, & Smetana GW. Chapter 5. Evidence-based clinical decision making. In The Patient History; an Evidence-Based Approach to Differential Diagnosis, 2nd ed. Eds. Wilson MC, Henderson MC, & Smetana GW. New York, NY, USA: McGraw-Hill, 2012.

  • 34

    McGee S. Simplifying likelihood ratios. Journal of General Internal Medicine 2002 17 646–649. (https://doi.org/10.1046/j.1525-1497.2002.10750.x)

  • 35

    Harbord RM, & Whiting P. metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression. In Meta-Analysis in Stata: an Updated Collection from the Stata Journal, 2nd ed., pp. 211–229. Eds. Palmer TM, & Sterne JAC. College Station, Texas, USA: Stata Press, 2016.

  • 36

    Harrell RM, & Bimston DN. Surgical utility of Afirma: effects of high cancer prevalence and oncocytic cell types in patients with indeterminate thyroid cytology. Endocrine Practice 2014 20 364–369. (https://doi.org/10.4158/EP13330.OR)

  • 37

    McIver B, Castro MR, Morris JC, Bernet V, Smallridge R, Henry M, Kosok L, & Reddi H. An independent study of a gene expression classifier (Afirma) in the evaluation of cytologically indeterminate thyroid nodules. Journal of Clinical Endocrinology and Metabolism 2014 99 4069–4077. (https://doi.org/10.1210/jc.2013-3584)

  • 38

    Nikiforov YE, Carty SE, Chiosea SI, Coyne C, Duvvuri U, Ferris RL, Gooding WE, Hodak SP, LeBeau SO, Ohori NP, et al. Highly accurate diagnosis of cancer in thyroid nodules with follicular neoplasm/suspicious for a follicular neoplasm cytology by ThyroSeq v2 next-generation sequencing assay. Cancer 2014 120 3627–3634. (https://doi.org/10.1002/cncr.29038)

  • 39

    Brauner E, Holmes BJ, Krane JF, Nishino M, Zurakowski D, Hennessey JV, Faquin WC, & Parangi S. Performance of the Afirma gene expression classifier in Hurthle cell thyroid nodules differs from other indeterminate thyroid nodules. Thyroid 2015 25 789–796. (https://doi.org/10.1089/thy.2015.0049)

  • 40

    Noureldine SI, Olson MT, Agrawal N, Prescott JD, Zeiger MA, & Tufano RP. Effect of gene expression classifier molecular testing on the surgical decision-making process for patients with thyroid nodules. JAMA Otolaryngology – Head and Neck Surgery 2015 141 1082–1088. (https://doi.org/10.1001/jamaoto.2015.2708)

  • 41

    Marti JL, Avadhani V, Donatelli LA, Niyogi S, Wang B, Wong RJ, Shaha AR, Ghossein RA, Lin O, Morris LGT, et al. Wide inter-institutional variation in performance of a molecular classifier for indeterminate thyroid nodules. Annals of Surgical Oncology 2015 22 3996–4001. (https://doi.org/10.1245/s10434-015-4486-3)

  • 42

    Yang SE, Sullivan PS, Zhang J, Govind R, Levin MR, Rao JY, & Moatamed NA. Has Afirma gene expression classifier testing refined the indeterminate thyroid category in cytology? Cancer Cytopathology 2016 124 100–109. (https://doi.org/10.1002/cncy.21624)

  • 43

    Nikiforov YE, Carty SE, Chiosea SI, Coyne C, Duvvuri U, Ferris RL, Gooding WE, LeBeau SO, Ohori NP, Seethala RR, et al. Impact of the multi-gene ThyroSeq next-generation sequencing assay on cancer diagnosis in thyroid nodules with atypia of undetermined significance/follicular lesion of undetermined significance cytology. Thyroid 2015 25 1217–1223. (https://doi.org/10.1089/thy.2015.0305)

  • 44

    Chaudhary S, Hou Y, Shen R, Hooda S, & Li Z. Impact of the Afirma gene expression classifier result on the surgical management of thyroid nodules with category III/IV cytology and its correlation with surgical outcome. Acta Cytologica 2016 60 205–210. (https://doi.org/10.1159/000446797)

  • 45

    Sacks WL, Bose S, Zumsteg ZS, Wong R, Shiao SL, Braunstein GD, & Ho AS. Impact of Afirma gene expression classifier on cytopathology diagnosis and rate of thyroidectomy. Cancer Cytopathology 2016 124 722–728. (https://doi.org/10.1002/cncy.21749)

  • 46

    Samulski TD, LiVolsi VA, Wong LQ, & Baloch Z. Usage trends and performance characteristics of a "gene expression classifier" in the management of thyroid nodules: an institutional experience. Diagnostic Cytopathology 2016 44 867–873. (https://doi.org/10.1002/dc.23559)

  • 47

    Villabona CV, Mohan V, Arce KM, Diacovo J, Aggarwal A, Betancourt J, Amer H, Jose T, DeSantis P, & Cabral J. Utility of ultrasound versus gene expression classifier in thyroid nodules with atypia of undetermined significance. Endocrine Practice 2016 22 1199–1203. (https://doi.org/10.4158/EP161231.OR)

  • 48

    Witt RL. Outcome of thyroid gene expression classifier testing in clinical practice. Laryngoscope 2016 126 524–527. (https://doi.org/10.1002/lary.25607)

  • 49

    Wu JX, Young S, Hung