Clinical Accuracy of Instrument-Read SARS-CoV-2 Antigen Rapid Diagnostic Tests (Ag-IRRDTs)

This systematic review (PROSPERO registration number: CRD42021282476) collects and analyses current evidence on the real-world clinical accuracy of instrument-read rapid antigen diagnostic tests (Ag-IRRDTs) for SARS-CoV-2 identification. We followed the PRISMA checklist and searched PubMed, Web of Science Core Collection and FIND for publications evaluating the accuracy of SARS-CoV-2 Ag-IRRDTs as of 30 September 2021. We included 40 independent clinical studies, yielding 48 Ag-IRRDT datasets with 137,770 samples. Across all datasets, pooled Ag-IRRDT sensitivity was 67.1% (95% CI: 65.9%–68.3%) and specificity was 99.4% with a tight CI. Pooled sensitivity and specificity of SARS-CoV-2 Ag-IRRDTs did not demonstrate a significant superiority over SARS-CoV-2 rapid antigen tests that do not require a reader instrument, even when surveillance and screening datasets were excluded from the analysis. Nevertheless, Ag-IRRDTs provide connectivity advantages and remove operator-dependent result-reading issues. The lower sensitivity of certain brands of Ag-IRRDTs can be overcome in high-prevalence areas with a high frequency of testing. New SARS-CoV-2 variants are a major concern for the current and future diagnostic performance of these tests.


Introduction
Since the World Health Organization declared COVID-19 a pandemic in March 2020, rapid and accurate testing for SARS-CoV-2 has become essential for the clinical management and effective isolation of COVID-19 patients. While qRT-PCR instruments detect viral nucleic acid within a few hours and are considered the gold standard for detecting COVID-19, they have high purchase and running costs and require dedicated staff to operate. Both antigen rapid diagnostic tests (Ag-RDTs) and Ag-IRRDTs are rapid, low-cost, portable and simple-to-operate devices that can be used at the point of care (POC) as well as in hospitals, schools and sports communities. Fluorescence immunoassays (FIAs) constitute a subset of Ag-IRRDTs. An Ag-IRRDT device provides user-independent test results. Because an Ag-IRRDT has an electro-optical reader, it can be connected to a hospital laboratory information system, which eases documentation and archiving, while remaining amenable to point-of-care testing. Ag-IRRDT assays consist of lateral flow cartridges onto which specimens are manually loaded. Results are read on a small portable electronic reader. These devices are considered easy to use without much training. However, they are not suitable for batch testing, as in most cases only a single sample can be analyzed at a time and the device requires a 3-20 min processing period, in addition to a disinfection and drying procedure of about 5 min [1]. It should be noted that, although FIAs currently dominate the world Ag-IRRDT market, the definition of an Ag-IRRDT does not imply that tests operate on the fluorescence principle alone, since a reader instrument may also sense the lines on the sensor cassette by visible reflectance. A comparison of reported SARS-CoV-2 probes can be found elsewhere [2,3]. However, only a few of them are currently employed in commercially available instruments.
Reports and guidelines of regulatory agencies [11][12][13] and healthcare authorities [14,15] also take the performance of Ag-IRRDTs into consideration, in varying depth.
This SR attempts to give a current overview of manufacturer-independent studies for an objective assessment of Ag-IRRDTs, applying specific inclusion criteria, as of 30 September 2021. To the best of our knowledge, the present study is the only systematic review in the literature that concentrates specifically on Ag-IRRDTs. Such independent reviews can help to differentiate test devices eligible for reimbursement in healthcare systems worldwide.

Survey Methodology.
The PRISMA flow diagram [16] and standard guidelines for systematic reviews were followed, as shown in Figure 1. Additionally, the systematic review was registered on PROSPERO (registration number: CRD42021282476).

Search Strategy.
The PubMed and Web of Science Core Collection databases, as well as the Foundation for Innovative New Diagnostics (FIND) website, were searched using the terms SARS-CoV-2, COVID-19, coronavirus, evaluation, accuracy, point of care testing, POC tests, fluorescence immunoassay, fluorescence, FIA and rapid antigen test. Two authors (A.E. and A.U.K.) performed the search. Disagreements were resolved by continued discussion in a session with all authors, with the third author (P.C.) acting as referee, until a unanimous decision was reached.

Inclusion Criteria.
Only peer-reviewed publications and reports were included; preprints were not included in the analysis. No language restrictions were applied. Publications with a tested sample size of fewer than 30 were excluded.
Studies based on saliva samples were excluded because the evidence on saliva as an Ag-RDT specimen type is conflicting [12].
As new brands of rapid antigen test devices for SARS-CoV-2 have entered the market, their performance has been reported to be markedly lower than the manufacturers' specifications [17][18][19]. Hence, independent analyses are essential for an accurate judgement of device performance. Independence from manufacturers was assessed on the basis of whether a study received financial support from a test manufacturer or whether any study author was affiliated with a test manufacturer. Only independent (non-manufacturer-sponsored) Ag-IRRDT studies were included here. Donations of reagents, devices and other consumables did not lead to exclusion.
Only those studies which clearly report sample size, sensitivity and specificity of their measurements were included in this analysis with qRT-PCR as the reference standard.
This SR assumes that qRT-PCR testing is the most appropriate comparator for the diagnosis of COVID-19. While viral culture might provide better measurements, it suffers from other implementation issues. Nevertheless, some studies reporting their results in reference to viral culture were also included in the SR.
Descriptive analyses of all studies were performed to estimate pooled sensitivity and specificity in comparison to qRT-PCR testing.

Data Extraction and Analysis.
Studies were screened and their characteristics were extracted independently by each reviewer. Each reviewer worked blinded during this process. Two reviewers (A.U.K. and A.E., A.U.K. and P.C., or A.E. and P.C.) independently reviewed the titles and abstracts of all publications, followed by a full-text review of those eligible, to select the articles for inclusion in this study. Any disagreements were resolved in joint discussions with the participation of the third reviewer. For each study, the following items were recorded: the last name of the first author, the country where testing took place, the manufacturer and model names of the Ag-IRRDT kits, the total number of subjects, sample condition (fresh or not fresh), sample types (NP, MT, OP, AN), compliance with the manufacturer's instructions for use (IFU), the number of qRT-PCR-positive samples, reported sensitivities and specificities, and the ranges of Ct values of the reference standard. The results were tabulated using a reference number for each dataset.
Pooled results were also given. Sensitivity and specificity for each test were presented with 95% confidence intervals (CIs). Data were extracted independently into 2-by-2 contingency tables of the numbers of true positives, false positives, false negatives and true negatives, and data according to viral load (high or low, according to the Ct cut-offs defined within studies) were extracted separately. The results were presented as forest plots of sensitivity and specificity in each case. Pooled sensitivities and specificities were computed by test manufacturer.
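As an illustration of the extraction step, the following Python sketch derives sensitivity and specificity from a 2-by-2 table and pools several datasets by summing their cells. The `TwoByTwo` container, the function names and the counts are hypothetical and are not taken from the included studies; simple cell-wise pooling is an assumption made here for clarity and is not necessarily the pooling model used in this SR.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of one dataset against the qRT-PCR reference (hypothetical container)."""
    tp: int  # antigen positive, qRT-PCR positive
    fp: int  # antigen positive, qRT-PCR negative
    fn: int  # antigen negative, qRT-PCR positive
    tn: int  # antigen negative, qRT-PCR negative


def sensitivity(t: TwoByTwo) -> float:
    """Share of qRT-PCR-positive samples that the antigen test detected."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Share of qRT-PCR-negative samples that the antigen test cleared."""
    return t.tn / (t.tn + t.fp)


def pool(tables: Iterable[TwoByTwo]) -> TwoByTwo:
    """Pool datasets by summing each cell (an assumption, see lead-in)."""
    tables = list(tables)
    return TwoByTwo(
        tp=sum(t.tp for t in tables),
        fp=sum(t.fp for t in tables),
        fn=sum(t.fn for t in tables),
        tn=sum(t.tn for t in tables),
    )


# Made-up counts for two datasets, for illustration only.
datasets = [
    TwoByTwo(tp=40, fp=1, fn=10, tn=149),
    TwoByTwo(tp=30, fp=2, fn=20, tn=448),
]
pooled = pool(datasets)
print(f"pooled sensitivity = {sensitivity(pooled):.3f}")
print(f"pooled specificity = {specificity(pooled):.3f}")
```

Cell-wise pooling gives large datasets more weight, which is one reason why a few large surveillance datasets can dominate a pooled estimate.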

Statistical Analysis and Data Synthesis.
Raw data were extracted from the studies, and accuracy parameters and their CIs were recalculated. Forest plots showing the sensitivity and specificity of each test with their CIs, as well as the pooled sensitivity and specificity with their CIs, were plotted. The heterogeneity between studies was then evaluated visually. To assess the uncertainty introduced by sample size, the 95% CIs were calculated using Wilson's method.
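For reference, the Wilson score interval for a proportion can be sketched in Python as below. The function name and the example counts are illustrative assumptions; the analyses in this SR were carried out in R.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z defaults to the normal quantile for a two-sided 95% interval.
    Returns (lower, upper), clamped to [0, 1] against rounding error.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Illustrative counts: 50 detected out of 100 reference-positive samples.
low, high = wilson_interval(50, 100)
print(f"95% CI: {low:.4f} to {high:.4f}")
```

Unlike the simple normal approximation, the Wilson interval stays within [0, 1] and behaves well for proportions close to 0 or 1, which matters for specificities near 100%.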
A group-analysis was performed for a test group if three or more datasets were available under its title, otherwise only a descriptive analysis was performed, and sensitivity-specificity ranges were reported.
Point estimates of accuracy parameters for SARS-CoV-2 detection were reported relative to the qRT-PCR results with 95% confidence intervals (CIs). The meta-analyses and relevant plots were produced using the "metafor" package and the bivariate-model package "mada" in R 4.0.1 (R Foundation for Statistical Computing, Vienna, Austria) and RStudio version 1.3 (RStudio, Inc., Boston, MA, USA).

Methodological Quality Assessment and Publication Bias.
The quality of the included studies was assessed independently by two authors (A.E. and P.C.) using the diagnostic test accuracy quality assessment tool of the Joanna Briggs Institute (https://jbi.global/criticalappraisal-tools). Discrepancies were resolved in a discussion session with the participation of all authors. Quality (risk of bias) was graded on the total score out of 100 as follows: total score ≤49, low quality (high risk of bias); total score 50-69, moderate quality (moderate risk of bias); total score ≥70, high quality (low risk of bias). Funnel plots were constructed to detect publication bias.
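The grading rule can be written as a small Python function. This is only a sketch: the function name is hypothetical, and because the stated bands leave fractional scores between 49 and 50 undefined, the sketch assumes that any score below 50 is graded as low quality.

```python
def grade_quality(total_score: float) -> str:
    """Map a total quality score (0-100) to a quality grade.

    Bands follow the text: >=70 high, 50-69 moderate, <=49 low.
    Assumption: fractional scores are assigned by lower bound, so a
    score of 69.5 counts as moderate and 49.5 counts as low.
    """
    if not 0 <= total_score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if total_score >= 70:
        return "high quality (low risk of bias)"
    if total_score >= 50:
        return "moderate quality (moderate risk of bias)"
    return "low quality (high risk of bias)"


for score in (88.9, 55.6, 49.0):
    print(score, "->", grade_quality(score))
```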

Sensitivity Analysis.
A sensitivity analysis was planned in which sensitivity and specificity were re-estimated after excluding surveillance and screening studies. The results of each sensitivity analysis were compared against the overall results to assess the potential bias introduced by including surveillance and screening studies.

Analytical Comparisons.
This study design was confined to clinical diagnostic studies; a comparison with analytical studies was therefore beyond the scope of this SR.

Comparing Performances of SARS-CoV-2 Ag-RDTs against SARS-CoV-2 Ag-IRRDTs.
We searched for earlier papers that present an overview of commercial SARS-CoV-2 Ag-RDTs not requiring a reading instrument. We then compared the performance results of studies dealing with SARS-CoV-2 Ag-IRRDTs against earlier SRs that report the performance of commercial SARS-CoV-2 Ag-RDTs not requiring a reading instrument.

Comparing Performances of SARS-CoV-2 Ag-IRRDTs against Combination of SARS-CoV-2 Ag-RDTs and Ag-IRRDTs.
As a further benchmark, the overall sensitivity reported in other SRs that include both SARS-CoV-2 Ag-RDTs and Ag-IRRDTs was compared with the overall sensitivity reported in this SR (which includes only SARS-CoV-2 Ag-IRRDTs).

Summary of Studies.
This SR included 48 clinical accuracy datasets reported in 40 sources, with a total of 137,770 samples, of which 5,925 were confirmed SARS-CoV-2 positive by qRT-PCR.

Overall Performance of Ag-IRRDTs.
Across all analysed samples, the pooled Ag-IRRDT sensitivity and specificity were 67.1% (95% CI 66.7% to 69.1%) and 99.4% (95% CI 99.4% to 99.4%), respectively. Table 1 displays all 48 datasets gathered on the Ag-IRRDT based studies that were eligible in this SR. Figure 2 shows forest plots of these 48 tests included in this SR, as well as their pooled result (Accuracy estimates with 95% confidence interval were calculated using the Wilson score method).
Diagnostic odds ratio (for all 48

Methodological Quality Assessment.
The diagnostic test accuracy checklist of the Joanna Briggs Institute was used (480 entries, resulting in 48 scores) to examine the quality of each study included in this SR. The highest quality score among the included studies was 88.9/100 (12 studies). The lowest quality score was 55.6/100 (4 studies).
Overall, there were no low-quality studies; 62.5% of the studies were of high quality and 37.5% of moderate quality.

Publication Bias.
For the publication bias assessment, a funnel plot including all datasets of this SR is shown in Figure 3, along with the results of Egger's test. In this plot, the effect size was taken as the logarithm of the odds ratio.

International Journal of Microbiology
The regression test for funnel plot asymmetry (using weighted regression with a multiplicative dispersion model and standard error as predictor) yields t = 5.7391, p < 0.0001, and the limit estimate of the intercept is b = −5.3701 (CI: −6.4775, −4.2626). The intercept differs significantly from zero, indicating funnel plot asymmetry and hence possible publication bias.

Prevalence of SARS-CoV-2.
The prevalence rate (the proportion of qRT-PCR-positive samples within the study population) varied between 0.4% and 78.7%. The pooled prevalence rate was 4.3%. However, the prevalence of SARS-CoV-2 in most of these studies did not reflect the prevalence in the local populations, which introduces a bias in the studies.

Symptomatic and Asymptomatic COVID-19 Population.
Although most of the datasets reported in the studies included in this SR related to symptomatic COVID-19 cases, the majority of samples were collected from asymptomatic individuals. This is because three datasets alone [1,40,48] were surveillance studies with a total of 107,514 Ag-IRRDT samples, comprising 78% of the overall sample count and including 638 qRT-PCR-verified positive cases in total.
In the sub-group of Ag-IRRDT studies that reported conformity to the manufacturers' instructions, the test for equality of sensitivities yields χ² = 325.36 and that for specificities χ² = 925.33, both with df = 21 and p < 0.01, indicating heterogeneity. In this sub-group, the correlation between sensitivities and false positive rates was weak (ρ = 0.183; 95% CI: −0.259 to 0.561).
The accuracy values of the non-conforming sub-group were lower than the overall accuracies of Ag-IRRDTs. In this sub-group, the test for equality of sensitivities yields χ² = 526.4 and the test for equality of specificities yields χ² = 570.9, both with df = 26 and p < 0.01, indicating substantial heterogeneity. Figure 4(a) displays the forest plots related to conformity to the manufacturers' instructions for use of Ag-IRRDTs.

Analysis by Sample Type.
Nasopharyngeal (NP), oropharyngeal (OP), anterior nasal (AN) and mid-turbinate (MT) swab samples, and their combinations, were assessed to categorize tests by sample type. Note that saliva tests were excluded from this SR. The most common sample type evaluated was NP swabs (in 32 studies, 66.7%), followed by AN swabs (in 7 studies, 14.6%). Hence, the accuracy of NP swab samples was analysed separately from that of other sample types. Figure 4(b) displays forest plots related to sample types. NP swab samples achieved a pooled sensitivity of 0.651 (95% CI: 0.635-0.665). The DOR of 300.9 (95% CI: 271.3-333.9) and the test for equality of sensitivities in pooled NP swabs (χ² = 512.4, p < 0.01) demonstrate heterogeneity in the sensitivity values of tests done using NP swabs.

Ag-IRRDT Sensitivity by Ct Value.
The Ct value is used as a surrogate for viral load to estimate the limit of detection of antigen tests. A single threshold of Ct = 30 was selected, rather than multiple Ct values, and the sensitivities of the available datasets were investigated according to this threshold. As expected, all Ag-IRRDTs showed higher sensitivity in samples with high viral loads, and sensitivity dropped for Ct > 30 (Table 2).

Meta-Regression.
A meta-regression was not performed due to substantial heterogeneity in reporting subgroups.

Manufacturer Based Accuracies.
The overall pooled sensitivity of the five Ag-IRRDT brands with at least three available datasets, altogether comprising 40 clinical accuracy datasets with 135,624 samples, was 68.3%. The pooled specificity of the same sub-group was 99.4%. In this sub-group, the correlation coefficient of sensitivities and false positive rates is ρ = 0.843. Figure 5 displays forest plots of these Ag-IRRDT brands. Visual inspection of the forest plots, the pooled diagnostic odds ratio (DOR = 353.097; 95% CI: 322.423-386.688), the positive likelihood ratio (LR+ = 112.514; 95% CI: 104.723-120.884), the negative likelihood ratio (LR− = 0.319; 95% CI: 0.306-0.332), the test for equality of sensitivities (χ² = 278.83, p < 0.01) and the test for equality of specificities (χ² = 867.72, p < 0.01) show that heterogeneity in the datasets for the five major Ag-IRRDT manufacturers is high.
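The likelihood ratios and the diagnostic odds ratio follow from sensitivity and specificity through the standard definitions LR+ = sensitivity/(1 − specificity), LR− = (1 − sensitivity)/specificity and DOR = LR+/LR−. A minimal Python sketch is given below; the function name is hypothetical, and because it is fed the rounded pooled values (68.3% and 99.4%), its output is only close to, not identical with, the model-based estimates reported above.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-, DOR) from sensitivity and specificity.

    Both inputs must lie strictly between 0 and 1, otherwise
    one of the ratios is undefined.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must be in (0, 1)")
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    dor = lr_pos / lr_neg                        # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Rounded pooled values of the five-brand sub-group.
lr_pos, lr_neg, dor = likelihood_ratios(0.683, 0.994)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}, DOR = {dor:.1f}")
```

Because 1 − specificity is very small here, LR+ and DOR are highly sensitive to rounding of the specificity, whereas LR− is stable.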
This SR highlights the top performance of the LumiraDX, which included 10 studies with a pooled sensitivity of 81.8%, a sample size of 4,697 and relatively narrow CIs for both sensitivity and specificity. Although the Shenzhen Bioeasy FIA demonstrated the highest sensitivity (87.2%), its number of studies (3) and sample size (410) were low, and its 95% CIs were the widest for both sensitivity and specificity. The SD Biosensor Standard F group had the highest number of test samples (79,030). Removing surveillance studies from the SD Biosensor Standard F group did not change the pooled sensitivity (54.4%) but reduced the pooled specificity from 99.4% to 98.5%. On the other hand, removing surveillance studies from the Quidel Sofia group increased the pooled sensitivity from 68.7% to 74.6% (and decreased the pooled specificity from 99.7% to 98.5%) for 11,500 samples, placing Quidel Sofia among the good performers.

Results of Comparing Performances of SARS-CoV-2 Ag-RDTs against SARS-CoV-2 Ag-IRRDTs.
Hayer et al. [5] present an overview of commercial SARS-CoV-2 Ag-RDTs not requiring a reading instrument, covering 19 studies that investigated five different Ag-RDTs and presented detailed population characteristics and Ct values. Only three commercial Ag-RDTs were assessed in multiple studies, and of these, only two brands had adequate levels of performance, with sensitivity estimates of around 80%. These two Ag-RDTs, with an available database of more than eight studies, reported a specificity of 97% in the majority of the trials.
On the other hand, the present SR includes more than 12 times the number of samples, 2.5 times the number of studies (all peer-reviewed) and more than twice the number of different brands compared with the earlier study [5], which did not include mass-surveillance reports, as shown in Table 3. The top performers in our SR include one brand with 10 datasets, a pooled sensitivity of 81.8%, a sample size of 4,697 and relatively narrow CIs for both sensitivity and specificity. Another good performer in our SR shows the highest sensitivity, 87.2%, with 3 datasets and 410 samples.

Results of Comparing Performances of SARS-CoV-2 Ag-IRRDTs against Combination of SARS-CoV-2 Ag-RDTs and SARS-CoV-2 Ag-IRRDTs.
The pooled sensitivity reported in another SR [10] was compared with the pooled sensitivity reported in this SR after the datasets from preprints (about 37% of their dataset count) were excluded. In this case, the new sensitivity value was reported as 0.672 (95% CI: 0.629-0.713), which came close to the pooled sensitivity of 67.1% obtained in this SR.
Table 2: Sensitivities extracted from 13 different Ag-IRRDT performance evaluation studies, related to 15 cases, for reference qRT-PCR values of Ct <30 (waived between 29 and 33), which yields an increase in the pooled sensitivity of the group from 0.73 to 0.85.

Discussion
The lower sensitivities of Ag-IRRDTs are due to false-negative results in some patients. Therefore, any negative result for a symptomatic patient should be confirmed by a qRT-PCR test. This reduces the clinical utility of rapid antigen tests in low-prevalence areas. Nevertheless, Ag-IRRDTs can be useful in areas where molecular testing is unavailable or overloaded.
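The dependence of clinical utility on prevalence can be made concrete with Bayes' rule. The Python sketch below computes positive and negative predictive values from sensitivity, specificity and prevalence; the function name is hypothetical, and the inputs are the pooled estimates of this SR (67.1%, 99.4%) combined with prevalence values chosen purely for illustration.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given prevalence (Bayes' rule)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be in (0, 1)")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)   # P(infected | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(not infected | negative test)
    return ppv, npv


# Illustrative prevalence values; 0.043 equals the pooled prevalence of this SR.
for prevalence in (0.005, 0.043, 0.30):
    ppv, npv = predictive_values(0.671, 0.994, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these inputs, the positive predictive value falls as prevalence falls, while the negative predictive value falls as prevalence rises, which is why negative results in symptomatic patients warrant confirmation.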
It should be noted here that it is currently unclear how test positivity (by any test) translates into clinical infectiousness and person-to-person spread [52].
Ag-IRRDTs may vary in analytical sensitivity, which is one reason for the differing clinical sensitivities of these tests. It has been shown [23] that the relationship between Ct and viral load is poor for samples with Ct values >33. The large variation in clinical sensitivity between different brands of Ag-IRRDTs could also be due to individual study design, operator competence and the quality of the Ag-IRRDT itself. The lower sensitivity demonstrated by certain brands of Ag-IRRDTs can be overcome in high-prevalence areas by a high frequency of testing, which may partly relieve some concerns around sensitivity [7,57].
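The effect of testing frequency can be illustrated with a deliberately simple model: if each of n repeated tests detected an infected person independently with the same sensitivity s, the probability of at least one positive result would be 1 − (1 − s)^n. Independence is a strong assumption made here only for illustration (false negatives tend to cluster in low viral load samples), and the function name is hypothetical.

```python
def cumulative_sensitivity(single_test_sensitivity: float, n_tests: int) -> float:
    """Probability of at least one positive result in n repeated tests.

    Assumes independent tests with identical sensitivity, which real
    serial testing does not strictly satisfy.
    """
    if not 0.0 <= single_test_sensitivity <= 1.0:
        raise ValueError("sensitivity must be in [0, 1]")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - single_test_sensitivity) ** n_tests


# Pooled sensitivity of this SR as the single-test value.
for n in (1, 2, 3):
    print(n, "test(s):", round(cumulative_sensitivity(0.671, n), 3))
```

Under the independence assumption the gain from each additional test shrinks quickly; in practice the gain would be smaller still for persistently low viral loads.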
With qRT-PCR as the reference, the ideal Ag-IRRDT sensitivity as a function of Ct value would be a flat curve. In practice, however, sensitivity decreases as the Ct value increases, and the decrease accelerates beyond a certain Ct level. Thus, false-negative antigen test results become more likely at lower viral loads. While some studies detected no difference in mean Ct values between symptomatic and asymptomatic patients [41], others reported that symptomatic patients displayed lower Ct values than asymptomatic COVID-19 patients and that a Ct value of 30 is the threshold for SARS-CoV-2 infectivity [22,38]. Moreover, it was shown [37] that different sensitivity versus Ct value patterns prevail in symptomatic and asymptomatic patient groups.
It should be noted that measurement conditions cannot be expected to be the same in every study. For example, measurement temperature may affect Ag-IRRDT sensitivity and specificity [58], but only a few reports state their measuring temperature ranges. Similarly, a lack of evidence to guide optimal nasal swab testing can increase the risk of false-negative test results [59]. Whether SARS-CoV-2 antigen detection with a rapid test differs between self-collected nasal swabs and professionally collected nasopharyngeal swabs is another open issue [60]. Cross-reactivity with other conditions (such as dengue, syphilis, hepatitis B and rheumatoid factor) is usually not considered by researchers. Currently, the most concerning factor is the emergence of new SARS-CoV-2 variants [61], which may adversely affect rapid antigen test performance [62]. The sensitivity of COVID-19 tests to new SARS-CoV-2 variants was not considered in the studies included in this review.
As research on specific problems [63][64][65] related to COVID-19 grows exponentially, reliable, cost-effective and fast means of diagnosing the disease become very valuable. To meet this need, numerous non-molecular tests such as SARS-CoV-2 Ag-RDTs and SARS-CoV-2 Ag-IRRDTs have been introduced to the worldwide market by different manufacturers. The SR presented in this paper has shown that, contrary to expectations, the overall pooled sensitivity and specificity of SARS-CoV-2 Ag-IRRDTs did not demonstrate a significant superiority over SARS-CoV-2 Ag-RDTs that do not require a reader instrument, even when surveillance and screening datasets were excluded from the analysis. Nevertheless, Ag-IRRDTs provide connectivity advantages and reduce operator-dependent reading issues.
One possible limitation of the present SR design is the assumption (as in previously published SRs) that qRT-PCR testing is the reference standard. Viral culture might provide a better comparator; however, it suffers from considerable implementation problems in practice. In addition, the present SR did not assess the influence of age, gender, symptom duration or sample collector (a swab obtained by a trained professional versus a self-collected swab) on the accuracy of Ag-IRRDTs.

Conclusions
Most manufacturers of Ag-IRRDTs can produce high-specificity tests, but sensitivities are low and differ significantly between tests (15%-99%). The lower sensitivity of certain brands of Ag-IRRDTs can be overcome in high-prevalence areas with a high frequency of testing. Conformity to the manufacturers' instructions for use in the testing procedure improves the accuracy of these tests. New SARS-CoV-2 variants are a major concern and should be evaluated in future studies.

Data Availability
Data available from corresponding author upon reasonable request.

Ethical Approval
This study did not require ethical approval because the systematic review was based on published research.

Conflicts of Interest
The authors declare that there are no conflicts of interest.

Authors' Contributions
AUK conceived the study and coordinated contributions from the co-authors. PC and AET drafted the work and prepared the figures and tables. All authors took part equally in designing the study, searching the databases, screening papers against eligibility criteria, assessing the methodological quality of the included studies and analysing the data. All authors participated in the preparation of the final manuscript, gave final approval of the version to be published and agree to be accountable for all aspects of the work.