Confirming Multiplex RT-qPCR Use in COVID-19 with Next-Generation Sequencing: Strategies for Epidemiological Advantage

Rapid identification and tracking of emerging SARS-CoV-2 variants are critical for understanding the transmission dynamics and developing strategies for interrupting the transmission chain. Next-Generation Sequencing (NGS) is an exceptional tool for whole-genome analysis and deciphering new mutations. The technique has been instrumental in identifying the variants of concern (VOC) and tracking this pandemic. However, NGS is complex and expensive for large-scale adoption, and epidemiological monitoring with NGS alone could be unattainable in limited-resource settings. In this study, we explored the application of RT-qPCR-based detection of the variant identified by NGS. We analyzed a total of 78 deidentified samples that screened positive for SARS-CoV-2 from two timeframes, August 2020 and July 2021. All 78 samples were classified into WHO lineages by whole-genome sequencing and then compared with two commercially available RT-qPCR assays for spike protein mutation(s). The data showed good concordance between RT-qPCR and NGS analysis for specific SARS-CoV-2 lineages and characteristic mutations. RT-qPCR assays are quick and cost-effective and thus can be implemented in synergy with NGS for screening NGS-identified mutations of SARS-CoV-2 for clinical and epidemiological interest. Strategic use of NGS and RT-qPCR can offer several COVID-19 epidemiological advantages.


Introduction
The coronavirus (COVID-19) pandemic started in December 2019 in Wuhan, China. It has been considered one of the deadliest infectious disease outbreaks in recent world history.
The causative agent of COVID-19 is the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). SARS-CoV-2 is a positive-sense RNA virus belonging to the Coronaviridae family, genus Betacoronavirus, and subgenus Sarbecovirus [1,2]. The coronavirus has had devastating effects on the human population and to date is estimated to have caused over 5 million deaths worldwide [3]. Rapid, accurate diagnosis of SARS-CoV-2 is the most crucial step in the management of COVID-19 and is mostly achieved with reverse transcription-quantitative polymerase chain reaction (RT-qPCR). The assays detect highly conserved regions in the open reading frame (ORF) 1a or 1b and the nucleocapsid (N) gene of SARS-CoV-2 [4-6].
Currently, the virus continues to be a global agent of infection. The highly mutagenic nature of SARS-CoV-2 has subjected many countries to second or third waves of the outbreak [7,8]. Variants with higher transmissibility, more severe disease, and a reduced response to vaccines or treatments have been classified by the World Health Organization (WHO) as variants of concern (VOC; Table 1). Recent epidemiological reports released by WHO indicated five VOCs (Cascella et al. [7]). Genomic changes in the receptor-binding domain (RBD), a region of the spike protein that binds SARS-CoV-2 to the host cell surface, are linked to the virus's increased capacity to cause successive outbreak phases in different parts of the world [9]. More recently, South Africa reported a new SARS-CoV-2 variant to the WHO. Omicron (B.1.1.529) was first detected in specimens collected in Botswana. On November 26, 2021, the Technical Advisory Group on SARS-CoV-2 Virus Evolution (TAG-VE) advised WHO to designate B.1.1.529 as the fifth VOC [10].
There continues to be a need for swift and cost-effective SARS-CoV-2 variant detection and monitoring. Genomic sequencing is the gold standard and most reliable method for the detection of such changes in the viral genome. The standard Sanger sequencing method is highly accurate, but it can only sequence a small fraction of the genome [12]. Sanger sequencing is also laborious, time-consuming, and expensive for large-scale sequencing projects that require rapid turnaround times. These attributes make Sanger sequencing less attractive for SARS-CoV-2 variant identification and monitoring.
Targeted Next-Generation Sequencing (NGS) is also a reliable method to identify variant strains of pathogens, including viruses [13]. The principal advantage of NGS over other techniques like Sanger sequencing or RT-qPCR is that scientists and laboratorians do not require prior knowledge of existing nucleotide sequences. Moreover, NGS has higher discovery power and higher throughput [13]. In the current pandemic, NGS has been widely employed to detect and identify novel mutated variants of SARS-CoV-2 [14]. Although widespread adoption of NGS in clinical laboratories offers effective variant discovery, several challenges impede the routine use of NGS in these settings. Besides the need for multifaceted NGS validation studies [15], NGS testing is complicated by the high level of human expertise required and the higher cost of scaling it for routine pathogen/variant detection. Moreover, the interpretation of results generated by NGS can be complex, and their applicability to clinical decision-making is a separate challenge. These complexities create a need to develop practical methodologies that identify SARS-CoV-2 variants quickly and cost-effectively.
PCR is the gold standard for the detection of preidentified genomic sequences and variations. Several RT-qPCR tests were developed and commercialized very quickly after the first SARS-CoV-2 genome was sequenced. The technology has proven to be the most reliable tool for the diagnosis of COVID-19 and was adopted globally because of its lower cost and complexity. Likewise, RT-qPCR can be deployed for mass-scale detection of a new mutation after it is discovered by NGS.
This approach has been widely deployed in environmental surveillance of known SARS-CoV-2 variants, and RT-qPCR is reported as a gold standard test for COVID-19 surveillance in wastewater. Droplet digital PCR (ddPCR) is a modified PCR method in which amplification is performed in submicroliter droplets, and the number of positive droplets is counted for absolute quantification of the targets or the genomic variations. ddPCR is also reported to be more sensitive for the detection of known SARS-CoV-2 mutations [16].
Various variant-specific RT-qPCR assays have been developed. For example, the spike protein mutation L452R is a characteristic mutation of the Delta variant, and several multiplex mutation-specific RT-qPCR assays have been developed to detect VOCs via NGS-identified mutations. RT-qPCR assays are widely adopted because of their short turnaround time (<24 hrs). The capital investment and operational cost of RT-qPCR are also significantly lower than those of NGS [17]. In the United States, RT-qPCR-based COVID-19 testing is reimbursed at ∼$100/sample, compared to $300-$1000/sample for COVID-19 sequencing. NGS cost is highly dependent on the sample volume and instrument throughput. To achieve the lowest published cost, a laboratory has to sequence ∼30,000 samples in a single batch on a million-dollar instrument.
Recent advances in RT-qPCR instruments have led to the development of smaller, cheaper, and portable instruments, and the COVID-19 pandemic has boosted the adoption of such instrumentation. Bechtold et al. have reported the development of RT-qPCR assays for the detection of VOCs on the basis of single nucleotide polymorphisms (N501Y, E484K, and deletion HV69/70) in spike protein.
This assay was also validated for field application using a portable peakPCR instrument [18].
According to a technical report on variant detection methods published by the European Centre for Disease Prevention and Control and the WHO Regional Office for Europe, NGS should be performed to confirm newly emerged VOCs rather than to detect variants or calculate their prevalence [19].
As soon as new mutations are discovered by NGS, academic and commercial researchers have rushed to design qPCR assays for the detection of the same mutation [20].
Thermo Fisher Scientific has developed mutation-specific assays as mutations are discovered. Several other companies (GT Molecular, PerkinElmer, Promega, and Twist Biosciences) developed multiplex RT-qPCR panels targeting mutations characteristic of the variant. Combinations of such reactions are available in kit format for the detection of known mutations defining the variants. These kits are widely used for SARS-CoV-2 variant surveillance in environmental samples, such as wastewater [21].
We have compared variant detection by two commercially available RT-qPCR-based solutions to whole-genome sequencing. The adoption of these kits in clinical surveillance has been restricted because of the limited clinical utility of variant identification for individual patients.
RT-qPCR-based variant detection is based on limited known mutations, compared to a whole-genome analysis by NGS which can identify known mutations as well as discover new mutations.
With the acknowledgment of these limitations, the current study proposes that RT-qPCR could be utilized to extend the mass scale detection of the mutation(s) discovered by NGS. Strategic deployment of NGS for discovering new mutations followed by mass surveillance by RT-qPCR could improve the epidemiological surveillance of this pandemic. Rapid detection of known variants could also potentially have a clinical application if future variants with different clinical manifestations and treatment needs are discovered.

Sample Collection.
This study used 78 deidentified sample remnants from nasopharyngeal or oropharyngeal swabs (catalog# 202003, Nest Biotechnology, Jiangsu, China) collected from patients who screened positive for SARS-CoV-2 following RNA extraction and RT-qPCR at Advanta Genetics (https://aalabs.com/) in Tyler, Texas. All clinical samples were collected from Texas residents who tested positive for SARS-CoV-2 with established protocols targeting the N1 and N2 genes with established primer and probe designs. Eleven samples were collected and archived early in the pandemic (August 2020). The remaining 67 samples were collected in July 2021 following the global outbreak of the Delta (B.1.617.2) variant. We qualified samples with moderate to high viral load by cycle threshold (Ct) values ≤30 for the N1 and N2 genes by RT-qPCR testing on the LightCycler® 480 System (Roche). We also included 4 samples with low viral amplification (Ct = 30-35; samples 210-213) to evaluate the applicability of RT-qPCR and NGS at low viral load (Ct values are inversely related to the amount of target nucleic acid; Table 2). Written consent was obtained from the patients for participating in the study, and only residual diagnostic samples were used.
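The Ct-based qualification described above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs, not the laboratory's actual software: the function name, the choice to judge a sample by the higher of its two Ct values, and the "excluded" label for Ct above 35 are assumptions made for the example.

```python
def classify_viral_load(ct_n1: float, ct_n2: float) -> str:
    """Bin a sample by its N1/N2 Ct values using the cutoffs in the text.

    Assumption: both targets must meet the cutoff, so the higher
    (weaker) of the two Ct values decides the bin.
    """
    worst_ct = max(ct_n1, ct_n2)
    if worst_ct <= 30:
        return "moderate-to-high"  # qualified samples (Ct <= 30)
    if worst_ct <= 35:
        return "low"               # the 4 low-amplification samples (Ct 30-35)
    return "excluded"              # assumed handling; not stated in the text


# Hypothetical Ct values, for illustration only.
for n1, n2 in [(22.1, 23.4), (31.0, 29.5), (36.0, 34.0)]:
    print(n1, n2, classify_viral_load(n1, n2))
```

Because Ct falls as the amount of target rises, the "low" bin corresponds to the samples with the least template.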

RNA Extraction.
Total RNA was extracted from nasopharyngeal or oropharyngeal swabs collected and transported to the lab in MANTACC Transport Medium or Viral Transport Medium (VTM) purchased from Criterion Clinical (https://criterionclinical.com/). RNA extraction was carried out in a preamplification environment within a Biosafety Level 2 (BSL-2) facility using the Roche MagNA Pure 96 System and Viral NA Small Volume Kits. Briefly, samples were lysed with 340 µL of lysis buffer and 10 µL of proteinase K at 55°C for 10 minutes, followed by extraction on the Roche MagNA Pure 96 instrument. Extracted nucleic acids were immediately sealed with a PCR clean sealing film (Cat# T329-1, Simport Scientific Inc., QC J3G 4S5, Canada) and frozen at −80°C until sequencing.

Library Preparation and Sequencing.
Samples were sequenced in two laboratories using the Illumina sequencing platform: 67 samples were sequenced at Fulgent Genetics (https://www.fulgentgenetics.com), and the remaining 21 samples were sequenced at Advanta Genetics. Sequencing libraries were prepared using the Illumina COVIDSeq protocol (Illumina Inc, USA). Total RNA was primed with random hexamers, and first-strand cDNA was synthesized using reverse transcriptase. The SARS-CoV-2 genome was amplified using two sets of primers to produce amplicons spanning the entire genome. The amplified product was then processed for tagmentation and adapter ligation using 24 IDT for Illumina Nextera UD Indexes Set A. Further cleanup and pooling were performed as per protocols provided by the manufacturer (Illumina Inc, USA). A COVIDSeq positive control (Wuhan-Hu-1) and one no template control (NTC) were processed with each library batch. Representative libraries were quantified using a Qubit 2.0 fluorometer (Invitrogen, Inc.), and fragment sizes were analyzed on an Agilent 5200 Fragment Analyzer. Libraries were pooled at equimolar concentration, and the pool was further normalized to 1 nM. The final library pool was denatured and neutralized with 0.2 N NaOH and 400 mM Tris-HCl (pH 8), respectively. Denatured libraries were further diluted to a 2 pM loading concentration. Dual-indexed paired-end sequencing with 75 bp read length was carried out using the HO flow cell (150 cycles) on the Illumina MiniSeq® instrument.
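The normalization and dilution arithmetic in this workflow can be illustrated with a short Python sketch. It uses the common conversion that assumes an average mass of 660 g/mol per base pair of double-stranded DNA, plus the C1V1 = C2V2 dilution rule. The function names and the example concentration, fragment size, and final volume are hypothetical, and the sketch ignores the extra volume added during denaturation and neutralization.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration (ng/uL) to nM.

    Assumes an average mass of 660 g/mol per base pair.
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul / (660.0 * mean_fragment_bp) * 1e6


def stock_volume_ul(stock_pm: float, target_pm: float, final_volume_ul: float) -> float:
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    if target_pm > stock_pm:
        raise ValueError("target concentration exceeds stock concentration")
    return target_pm * final_volume_ul / stock_pm


# Hypothetical Qubit and Fragment Analyzer readings for one library.
print(round(library_molarity_nm(2.0, 400), 2), "nM")

# Diluting a 1 nM (1000 pM) pool to the 2 pM loading concentration
# in a hypothetical 500 uL final volume.
print(stock_volume_ul(1000.0, 2.0, 500.0), "uL of pool")
```

Going from 1 nM to 2 pM is a 500-fold dilution, which is why the sketch returns 1 µL of pool for a 500 µL final volume.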

NGS Data Analysis.
The Illumina BaseSpace (https://basespace.illumina.com) bioinformatics pipeline was used for sequencing QC, FASTQ generation, genome assembly, and identification of SARS-CoV-2 variants. Briefly, the Binary Base Call (BCL) raw sequencing files generated by the Illumina MiniSeq® sequencing platform were uploaded to the Illumina BaseSpace online portal and demultiplexed to FASTQ format using the FASTQ Generation (Version: 1.0.0.) application. The raw FASTQ files were trimmed, sorted, and checked for quality (Q > 30) using the FASTQ-QC application within BaseSpace. QC-passed FASTQ files were aligned against the SARS-CoV-2 reference genome (NCBI RefSeq NC_045512.2) using Bio-IT Processor (Version: 0x04261818). Then, the DRAGEN COVID Lineage (Version: 3.5.4) application in BaseSpace was used to generate a single consensus FASTA file for all the samples sequenced on a single flow cell. Finally, the single consensus FASTA was also analyzed for lineage assignment using the web version of the Phylogenetic Assignment of Named Global Outbreak Lineages (PANGOLIN) software (https://pangolin.cog-uk.io). Only the consensus variants identified by both applications were used for further analysis.
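The final filtering step, keeping only lineage calls on which both applications agree, can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes each tool's output has already been reduced to a mapping from sample ID to lineage, and the sample IDs and lineage calls shown are hypothetical.

```python
from typing import Dict


def consensus_lineages(dragen_calls: Dict[str, str],
                       pangolin_calls: Dict[str, str]) -> Dict[str, str]:
    """Keep only samples whose lineage is identical in both call sets.

    Samples missing from either tool, or with conflicting calls,
    are dropped.
    """
    return {
        sample: lineage
        for sample, lineage in dragen_calls.items()
        if pangolin_calls.get(sample) == lineage
    }


# Hypothetical calls for three samples.
dragen = {"S1": "B.1.617.2", "S2": "AY.3", "S3": "B.1.2"}
pangolin = {"S1": "B.1.617.2", "S2": "AY.25", "S3": "B.1.2"}

print(consensus_lineages(dragen, pangolin))
```

In this example S2 is excluded because the two tools disagree, which mirrors the conservative choice described in the text.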

Phylogenetic Analysis.
The FASTQ sequence file was analyzed and visualized for evolutionary relationships through the open-source toolkit Nextstrain (https://clades.nextstrain.org/). The GISAID database for global SARS-CoV-2 sequence analysis, available from the Nextstrain server, was used to retrieve representative variant sequences [22]. The NCBI databank was used to retrieve the original Wuhan strain SARS-CoV-2 sequence. All the individual consensus genome sequence files were aligned by using the Clustal-W multiple sequence alignment tool [23]. The phylogenetic analysis was carried out utilizing the Clustal Omega server, and the phylogenetic tree was constructed using the MEGA X tool [24] with default parameters of the maximum likelihood method. Further analysis aimed at investigating the conservation of the spike protein in reference sequences versus clinical strains of SARS-CoV-2 from our study using bioinformatics tools. The protein sequences for different ORFs were determined by annotation with the IBM Functional Genomics Platform [25]. T-COFFEE and PRALINE software [26,27] were used for the alignment of spike proteins from different isolates and mutation position analysis (Figure 1).
GT Molecular assays were provided in two different kits containing the variant-specific reference standard and mutation-specific primer-probe sets. Amplifications were performed according to the manufacturer's instructions in separate master mix preparations as described in Table 3. Briefly, RNA was reverse transcribed for 10 minutes at 53°C, followed by enzyme activation for 2 minutes at 95°C and 40 cycles of 15 seconds at 95°C for denaturation and 60 seconds at 52°C for annealing/extension. Reactions were performed by using qScript 1-Step Virus ToughMix

Results
The 78 randomly selected positive SARS-CoV-2 samples were from two separate periods in the pandemic. NGS of the 11 samples from August 2020 revealed eight different lineages, but none of the lineages were VOC according to the WHO classification (Figure 1). All samples (100% [67/67]) sequenced from July 2021 revealed the SARS-CoV-2 Delta VOC (B.1.617.2, AY.3, and AY.25) with sublineages AY.1 to AY.3. Incidentally, the six samples concurrently sequenced at both laboratories were identified as Delta (B.1.617.2) VOC. Unfortunately, raw data (FASTQ files) were not available for the samples sequenced at Fulgent Genetics. However, the raw data from the 21 samples sequenced at Advanta Genetics were analyzed for phylogenetic relationship and mutation discovery (Table 2). These data revealed novel mutations belonging to existing prominent lineages along with convergent mutations of different lineages and one unique mutation (Figure 1).

Note.
No Brazilian or United Kingdom lineages were identified. Two groups of samples (F & D) lacked the omnipresent mutation (614: D->G), which is present in most variants of concern. Group F is of particular interest as it had L981F. *Detected in some sequences but not all.

We then turned our focus to testing the 67 Delta (B.1.617.2) samples by using RT-qPCR methodology targeting three characteristic mutations (L452R, T478K, and P681R) identified through sequencing. We tested each sample using two different commercially available assays (Thermo Fisher Scientific and GT Molecular) and compared the results.
The Delta (B.1.617.2) classifying mutation (L452R) was correctly identified by the GT Molecular RT-qPCR-based assay, and the test showed 100% concordance for all 67 samples that were sequenced as Delta (B.1.617.2). However, the Thermo Fisher Scientific assay for the same target (L452R) did not amplify in 4 out of 67 samples otherwise identified as Delta variant by NGS (Table 3; Supplementary Table S1). All four samples had a relatively low viral load (Ct > 25), with overall higher Ct values for all the RT-qPCR assays. Considering the relatively low sensitivity of mutation-specific RT-qPCR compared to target detection RT-qPCR in general and the lower sensitivity of the Thermo Fisher Scientific assays, this slight discrepancy is not alarming. Overall, the GT Molecular assay targeting the L452R mutation had a Ct value lower by 4.21 ± 2.3 when the same RNA template was tested with both assays, suggesting higher sensitivity of the GT Molecular assay. Moreover, 5 of 67 samples were negative for T478K (GT Molecular), and 12 of 67 were negative for P681R (Thermo Fisher Scientific) by RT-qPCR (Table 3). Unfortunately, we could not verify the absence of these mutations because raw NGS data were not available for the 67 samples sequenced at Fulgent Genetics. Thus, the L452R mutation remained the most informative marker for RT-qPCR-based detection of the Delta variant. All 11 samples sequenced as nonvariants of concern were negative for all three Delta variant-specific mutations (Table 3). Interestingly, a Beta and Gamma variant classifying mutation (E484K) was identified (by both RT-qPCR assays and NGS) in one sample that is otherwise classified as a Delta variant by NGS and carries an L452R mutation. This mutation combination should be monitored and further investigated for its clinical significance.
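The detection rates reported above follow directly from the counts in the text, as the short Python sketch below shows. The counts of positive samples are derived from the stated numbers (for example, 4 of 67 samples not amplified leaves 63 detected); the variable and function names are illustrative only.

```python
# Delta samples (by NGS) and the number in which each mutation-specific
# assay amplified, derived from the counts reported in the text.
DELTA_SAMPLES = 67
DETECTED = {
    ("L452R", "GT Molecular"): 67,
    ("L452R", "Thermo Fisher Scientific"): 63,  # 4 of 67 did not amplify
    ("T478K", "GT Molecular"): 62,              # 5 of 67 negative
    ("P681R", "Thermo Fisher Scientific"): 55,  # 12 of 67 negative
}


def detection_rate(detected: int, total: int) -> float:
    """Percentage of NGS-classified samples in which the assay was positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * detected / total


RATES = {
    key: round(detection_rate(count, DELTA_SAMPLES), 1)
    for key, count in DETECTED.items()
}

for (mutation, vendor), rate in RATES.items():
    print(f"{mutation} ({vendor}): {rate}%")
```

Only the L452R figures can be read as concordance with NGS here, since the absence of T478K and P681R could not be verified against raw sequencing data.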
Of note, the 4 samples with lower viral amplification (Ct 30-35) that were included in this study could be characterized by NGS and both RT-qPCR assays. Two of the four samples were identified as Delta (B.1.617.2) variants, and the remaining two were identified as non-VOC. Therefore, NGS and RT-qPCR methodologies can potentially be used for SARS-CoV-2 variant detection from samples with lower viral amplification (10-1000 copies).
In addition to NGS and RT-qPCR variant concordance of the 78 samples, the results of this study reveal markedly reduced diversity of SARS-CoV-2 variants from August 2020 to July 2021. We detected 8 lineages among the 11 samples tested from August 2020 compared to a single Delta variant lineage with three sublineages among 67 samples collected in July 2021 (Figure 2). These findings are important for understanding the evolution of SARS-CoV-2 variants in Texas (Figure 2) and support other studies showing the predominance and infectivity of the Delta variant [28].

Discussion
The emergence of new SARS-CoV-2 variants with higher infection rates and morbidity continues to concern the global scientific community. To manage further transmission and control infection, genomic surveillance is important for the identification and tracking of novel variants. NGS is a very useful tool for identifying new strains of SARS-CoV-2 and other infectious pathogens. NGS can be used to detect novel pathogenic mutations and can also be used to determine the rate of pathogen evolution.
Although NGS is the most reliable method for detecting mutations in SARS-CoV-2, the methodology is not practically applicable for large-scale surveillance, particularly in resource-limited settings. Factors like continuous validation studies, logistic challenges, database validity, cost-benefit analysis, and high technical expertise make the implementation of NGS in routine clinical settings difficult. Comparatively, RT-qPCR, a gold standard for diagnosing SARS-CoV-2, is a method that can be extended for variant detection and monitoring in clinical settings. Although the cost of sequencing has plummeted in the last decade, and the $1000 human genome is indeed a reality, the capital investment for the instrument (Illumina NovaSeq) alone is ∼1 million USD prior to any sequencing application. COVID-19 genome sequencing cost ranges from $100 to $400 per sample (COGS [Cost of Goods] only). However, the laboratory must batch thousands of samples to achieve the lowest cost. The fastest practically attainable turnaround time is ∼24 hrs for low-throughput sequencing platforms, which have the highest per-sample cost. In contrast, RT-qPCR can be performed within a few hours, and per-sample costs (COGS: $5-$10) are a fraction of those of NGS. RT-qPCR also yields an easy-to-interpret numerical value (Ct value) compared to the complex NGS output (FASTQ files), which requires additional resources and time for analysis. Accordingly, this study examined two commercially available RT-qPCR assays for the detection of SARS-CoV-2 variants and mutations and compared the results with NGS. Both assays were able to detect the L452R mutation, with 100% (67/67; GT Molecular) and 94% (63/67; Thermo Fisher Scientific) accuracy when compared to NGS. While NGS is an essential tool for sequencing the entire genome and identifying new mutations, this study suggests that RT-qPCR can aptly serve as an easy-to-deploy, cost-effective, and rapid solution for the detection of known mutations for mass surveillance.
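To make the cost argument concrete, the following Python sketch compares a sequencing-only strategy with a hybrid one in which every sample is screened by RT-qPCR and only a fraction is sequenced. The per-sample costs come from the COGS ranges quoted above. The cost model itself, the batch of 1000 samples, and the 10% sequencing fraction are assumptions chosen for illustration, and capital and labor costs are ignored.

```python
def surveillance_cost(n_samples: int, ngs_fraction: float,
                      ngs_cost: float, pcr_cost: float) -> float:
    """Reagent cost when all samples get RT-qPCR and a fraction also gets NGS."""
    if not 0.0 <= ngs_fraction <= 1.0:
        raise ValueError("ngs_fraction must be between 0 and 1")
    return n_samples * pcr_cost + n_samples * ngs_fraction * ngs_cost


N = 1000          # hypothetical batch size
NGS_COST = 100.0  # lower end of the quoted $100-$400 range
PCR_COST = 10.0   # upper end of the quoted $5-$10 range

ngs_only = N * NGS_COST
hybrid = surveillance_cost(N, 0.10, NGS_COST, PCR_COST)  # assumed 10% sequenced

print(f"NGS only: ${ngs_only:,.0f}")
print(f"Hybrid:   ${hybrid:,.0f}")
```

Even with the inputs least favorable to RT-qPCR within the quoted ranges, the hybrid strategy costs a fifth of sequencing every sample under these assumptions.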
Likewise, this approach has been previously applied for surveillance of leprosy and identification of zoonotic transmission in the United States [29,30]. The authors used NGS data to develop an algorithm for the classification of global variants and deployed RT-qPCR to understand the local transmission dynamics. The results in this study are promising because the RT-qPCR lineage classification showed no mismatches when compared with the 21 sequenced samples that had raw data. Although these results are encouraging because of the low cost of scaling SARS-CoV-2 mutation detection with RT-qPCR, other research studies using similar virus sequencing comparison methods have been less successful. Khan and Cheung [31] noted the presence of mismatches when comparing SARS-CoV-2 RT-qPCR and sequencing data. Elaswad and Fawzy [32] also found this to be the case when comparing RT-qPCR assays with available SARS-CoV-2 genomes isolated from animals. Similarly, Hoang et al. [33] noted missed detection with RT-qPCR assays for influenza A (H1) when compared with sequencing. Although these studies add some concern, it does appear that strategic deployment of both NGS and RT-qPCR technologies for the discovery and monitoring of emerging SARS-CoV-2 mutations is likely to advance better strategies for epidemiological characterization.
Although the unavailability of raw data (FASTQ files) for the 67 samples sequenced at Fulgent Genetics remains a limitation of the study, the 21 samples sequenced at Advanta Genetics clustered as expected in the phylogenetic analysis; all the VOC and non-VOC samples were grouped appropriately. This suggests good potential for the use of RT-qPCR approaches in the detection of preidentified mutations and possibly in low-cost surveillance of known variants. Importantly, this study does not suggest RT-qPCR as a replacement for NGS, because the RT-qPCR assays utilized in this study were designed to target only a few amino acid motifs, whereas NGS covers a wider breadth of the virus genome.

Conclusion
There are two important takeaways from this study. First, the NGS data provided further evidence of the rapid evolution of SARS-CoV-2 lineages, including the highly transmissible Delta variant, in the East Texas region and suggest the continued threat of COVID-19. This finding is consistent with other research and further supports the need for rapid, cost-effective monitoring of variant mutations. Second, the current study endorses the potential of RT-qPCR assays as a solution for more accessible variant monitoring. The data showed concordance between RT-qPCR and NGS analysis for specific SARS-CoV-2 lineages and characteristic mutations.
Thus, the deployment of RT-qPCR testing for the detection of known SARS-CoV-2 variants may be extremely beneficial.
The key differences between NGS and RT-qPCR are discovery power, scalability, and throughput. Both technologies are reliable and highly sensitive. RT-qPCR can detect only known sequences with the help of specific probes and primers. In contrast, NGS does not need prior information about the sequence, but it is less cost-effective for low target numbers and is time-consuming. NGS can detect thousands of targeted regions with single-base resolution. RT-qPCR is cost-effective, and its familiar workflow makes the detection of a limited set of variants and low target numbers easy [34]. Accordingly, it is suggested that RT-qPCR is a quick and cost-effective alternative to sequencing for screening known mutations of SARS-CoV-2 for clinical and epidemiological interest, especially in developing countries where COVID-19 diagnostic centers are limited by the availability of regional sequencing laboratories for screening mutations in SARS-CoV-2 clinical samples. The findings in this study show great potential for RT-qPCR to be an effective strategy offering several epidemiological advantages.
Global Health, Epidemiology and Genomics