Full Genomic Sequences of H5N1 Highly Pathogenic Avian Influenza Virus in Human Autopsy Specimens Reveal Genetic Variability and Adaptive Changes for Growth in MDCK Cell Cultures

The entire H5N1 highly pathogenic avian influenza viral genomes were identified in the frozen autopsy specimens: the trachea, lung, colon, and intestinal feces from a patient who died of the disease in 2006. Phylogenetic analysis of the viral genomes showed that these viruses belonged to clade 1 and were the reassortants generated from the reassortment of the viruses within the same clade. The sequencing data from the autopsy specimens revealed at least 8 quasispecies of the H5N1 viruses across all 4 specimen types. These sequences were compared to those derived from the virus isolates grown in Madin Darby canine kidney (MDCK) cells. The virus isolates from the trachea, lung, and fecal specimens showed 27 nucleotide substitutions, leading to the changes of 18 amino acid residues. However, there was no change in the amino acid residues that determined the viral virulence. The changes were more commonly observed in the lung, particularly in the HA and NA genes. Our study suggested that the adaptation changes for the viral fitness to survive in a new host species (MDCK cells) might involve many genes, for example, the amino acid substitution 177G or 177W adjacent to the receptor-binding residues in the HA1 globular head and the substitution M315I in PB2. However, a mutation changes near the receptor binding domain may play an important role in determining the cell tropism and is needed to be further explored.


Introduction
Presently, the influenza A virus consists of 18 hemagglutinin (HA) and 11 neuraminidase (NA) subtypes. Viruses of various HA and NA combinations, i.e., between H1-H16 and N1-N9, are found in avian species [1,2], while H17N10 and H18N11 are batlike viruses [2][3][4]. The subtypes that infect humans comprise H1N1, H2N2 (already extinct), and H3N2 viruses. The receptor preference for avian influenza (AI) viruses and human influenza viruses is different. Avian viruses prefer epithelial cell receptors containing α2,3-galactose linked-sialic acid (α2,3 gal-SA). In contrast, human viruses prefer epithelial cells containing α 2,6 galactose linked-sialic acid (α 2,6 gal-SA) [5]. Most AI viruses are of low pathogenicity and cause asymptomatic or mild infection in avian species, but some strains of H5 and H7 are highly pathogenic. Occasionally, some avian viruses cross the species barrier to infect human. The fatality rate of seasonal influenza viruses in humans is generally minimal, while that of avian viruses in humans is as high as 33% for the H5N1 highly pathogenic avian influenza (HPAI) viruses that first emerged in Hong Kong in 1997 [6, 7], 53% worldwide for H5N1 HPAI viruses that have reemerged since 2003 [8], and 39% for H7N9 virus [9]. Thailand first reported an H5N1 AI outbreak in poultry and humans in January 2004. The last human case was reported in 2006, while it was 2008 for the last poultry outbreak. There were a total of 25 human cases with 17 deaths or a fatality rate of 68% [8].
The pathology of human influenza and H5N1 HPAI is markedly different. Human influenza virus infections are confined mainly to the respiratory tract, while H5N1 HPAI virus infections spread beyond the respiratory tract to distal end organs [10]. Previous investigators demonstrated the presence of both genomic RNA and antigenomic RNA in various organs, including the lung, intestine, spleen, trachea, brain, heart, liver, kidneys, and lymph nodes [11][12][13][14][15]. The H5N1 virus has never been isolated from human specimens outside of the respiratory tract, except those in our study. Moreover, we also announced the discovery of full-genome sequences of H5N1 HPAI virus in archival organ tissues of a dead case by next-generation sequencing (NGS) technique [16], but the sequence analysis has not been performed yet. In this study, the amino acid sequences of various viral genes from the autopsy specimens were aligned with those from the virus isolates, and the substitutions that may be associated with the viral adaptation for replication in a new host species were determined.

Materials and Methods
2.1. Ethical Issues. This study was approved by the Institutional Review Board, Faculty of Medicine Siriraj Hospital, Mahidol University, Thailand.

2.
2. An H5N1 Patient and Clinical Specimens. In August 2006, a 59-year-old man who lived in Nong Bua Lam Phu Province, Northeast of Thailand, was hospitalized with severe pneumonia which later progressed to acute respiratory distress syndrome and multiorgan failure. Several attempts to diagnose H5N1 avian influenza by genome detection failed. Nevertheless, he was treated with a standard regimen of oseltamivir on day 16 after the onset of symptoms when a history of contact to sick and dead chickens was obtained. He died of the disease at day 28 after the onset. Autopsy tissues and intestinal fecal samples were collected for disease diagnosis by viral genome detection, virus isolation in MDCK cells, and indirect immunofluorescence assay (IFA). The remaining fresh tissues were stored frozen at -80°C.

Laboratory Investigation.
Each tissue (the trachea, lung, colon, spleen, and liver) was washed and processed individu-ally with separate sets of instruments. Conventional reverse transcription-polymerase chain reaction was performed using the protocol described by Poddar [17] and the World Health Organization [18]. However, the results were inconclusive with all tissue samples investigated, including the intestinal fecal sample. An indirect immunofluorescence assay (IFA) was carried out by staining the impression smears of tissues with a monoclonal antibody to the influenza A nucleoprotein (Chemicon), and positive results were obtained with the trachea and lung epithelial cells.
Tissues samples (the trachea, lung, colon, spleen, and liver) were ground in viral growth media (VGM) containing Earle's Minimal Essential Medium (EMEM, ThermoFisher) without fetal bovine serum supplementation, and the intestinal fecal samples was suspended in VGM. The specimen suspensions were centrifuged, and the supernatants were inoculated onto MDCK cell monolayers maintained in VGM. Three serial blind passages were carried out, and virus isolation was successful with samples from the trachea, lung, and fecal samples. These H5N1 virus isolates were named A/Thailand/NBL1/2006-L, A/Thailand/NBL1/2006-T, and A/Thailand/NBL1/2006-F, for the lung, trachea, and fecal specimen origin, respectively. The virus isolates at the 4 th passage were subjected to full genome sequencing by the Sanger method, and the sequencing data was deposited in the GenBank database (Table 1).

Next-Generation Sequencing of Direct Autopsy
Specimens. NGS was conducted on tissue samples after long-term storage for almost 10 years at -80°C. Total RNA was extracted from the trachea, lung, colon, and fecal samples using TRIzol Reagent (Invitrogen) and were cleaned using the RNeasy Mini Kit (Qiagen). The extracted RNA was reverse transcribed into cDNA using the SuperScript III First-Strand Synthesis System (Invitrogen). The cDNA was then used as a template to amplify all 8 segments of the influenza viral genome using Platinum Taq HiFi (Invitrogen). The entire genomic segment of PB2, PB1, PA, HA, NP, and NA (>1 kb) were amplified as overlapping fragments, namely, PB2a, PB2b, PB1a, PB1b, PAa, PAb, HAa, HAb, NPa, NPb, NAa, and NAb, respectively. The M and NS genomic segments were amplified as a single segment. The nucleotide sequences of these sequencing primers can be obtained on request. A 50 μl volume PCR reaction consisted of 5 μl of 10x PCR buffer, 2 μl of 50 mM MgSO 4 , 1.5 μl of 10mM dNTPs, 2 μl of each 10 μM forward and reverse primers, 0.2 μl of Platinum Taq HiFi, 5 μl of cDNA, and distilled water. The PCR was carried out in a GeneAmp PCR system 2400 thermal cycler (Applied Biosystems). The reaction was composed of 1 cycle of holding at 94°C for 5 minutes, followed by 35 cycles of amplification (94°C for 15 seconds, 55°C for 30 seconds, and 68°C for 2 minutes), and a final extension step at 68°C for 10 minutes. The amplified PCR products were then purified using the QIAquick Gel Extraction kit (Qiagen). Full-genome sequencing was performed in three steps: library preparation, emulsion PCR (emPCR), and high-throughput sequencing using the GS Junior platform (Roche Diagnostics). Briefly, 500 ng of each amplified DNA fragments including PB2a, PB2b, PB1a, PB1b, PAa, PAb, 2 BioMed Research International HAa, HAb, NPa, NPb, NAa, Nab, M, and NS were pooled together and ligated to the adapters followed by emulsion PCR and sequencing. Four DNA libraries with different barcodes obtained from the trachea, lung, colon, and fecal samples were generated by using the GS DNA Rapid Library Preparation kit (Roche) according to the manufacturer's protocol.
2.5. NGS Data Analysis. Raw sequencing data were classified into each sample based on the barcode sequences within the DNA libraries. Secondary data analysis was performed using the CLC Genomics Workbench version 8.0.1 (http:// www.clcbio.com/). Low-quality data (Q score < 30) and the adaptor sequences were trimmed and removed from the sequencing reads. The passed filter (PF) reads (Q score > 30) were used for mapping and alignment with the influenza reference genome of A/Thailand/1(KAN-1)/2004 (H5N1) (GenBank accession no. AY555144-AY555151). Each specimen's total reads ranged between 100,000 and 200,000, and the number of mapped reads ranged between 98% and 99%. The average read length for all samples ranged between 241 and 309 base pairs. The viral sequences obtained from each tissue were deposited in GenBank under the accession numbers MG668904 to MG668911 for the colon, MG668920 to MG668927 for the trachea, MG668928 to MG668935 for the lung, and MG668912 to MG668919 for the fecal specimen as shown in Table 1. The raw reads were deposited in the Sequence Read Archive under the BioProject accession number PRJNA494792 [16]. Nucleotide variations were examined and calculated for percentages of minor mutations in each gene. The genome signature corresponding to the relative frequency of amino acid residues for each gene was graphical generated using WebLogo (https://weblogo.berkeley .edu/logo.cgi) [19]. The prediction of mutational positions on the HA protein structure was investigated using a model of PDB code: 3FKU [20] and analyzed using UCSF Chimera program version 1.13.1 [21].
2.6. Phylogenetic Tree Construction. The reference strains with known genetically clades were retrieved from the WHO (https://www.who.int/influenza/gisrs_laboratory/ h5n1_nomenclature/en/) [22] and NCBI GenBank databases (https://www.ncbi.nlm.nih.gov). The concatenated entire coding sequence (CDS) from 8 segments (PB2, PB1, PA, HA, NP, NA, M, and NS) and individual segments of H5N1 viruses isolated in Thailand were phylogenetically analyzed. The sequences were aligned using ClustalW multiple sequence alignment in BioEdit program version 7.0.4.1, and a phylogenetic tree was constructed using MEGA 7.0 software (http://megasoftware.net/) [23]. The evolutionary distances were estimated by using the neighbor-joining method and maximum composite likelihood algorithm. The reliability of the neighbor-joining tree was estimated by bootstrapping analysis using 1,000 replicate datasets. The supporting bootstrap value of greater than 80% was shown at the nodes of each cluster.

Selection Pressure Analysis. Human H5N1 and H5N6
viruses isolated between 1997 and 2018 were analyzed across clades for the ratio of nonsynonymous (dN) to synonymous (dS) substitution (dN/dS) on a codon-by-codon basis using the estimate selection for each codon (HyPhy) application under the model of Hasegawa-Kishino-Yano method in MEGA 7.0 software [23].

Heterogeneity of H5N1 Influenza Viral Genomes in
Various Tissue Organs. This study explored the H5N1 influenza viral genomes in four kinds of autopsy specimens: the trachea, lung, colon, and intestinal fecal specimen by NGS. The results showed that the H5N1 viral genome was detected in all of the specimens investigated. Deep sequencing analyses of the entire genome in every tissue demonstrated the virus quasispecies. An alignment of the viral nucleotide sequences from these 4 specimens showed 8 quasispecies in 3 BioMed Research International 6 gene segments, i.e., PB2, PB1, PA, HA, NP, and NA, but not in M and NS. Four positions of mixed nucleotide residues were present in the PB2 segment. Mixing A and G was found at 7 positions, while mixing T and G was found at 1 position (Table 2 and Fig. S1).

Virus Adaptation for Growth in MDCK Cells.
Three influenza virus isolates derived from the trachea, lung, and fecal samples at the 4 th passage in MDCK cells were sequenced by the Sanger method. The sequencing data have been deposited in the GenBank database since 2009 with the accession numbers shown in Table 1 The genome signature of the viruses found in the autopsy specimens demonstrated the frequency or proportion of amino acid variations corresponding to the point mutations found in the virus isolates (Figure 1(b)). Most variations were more frequent with the lung isolate, particularly in the HA and NA gene segments. Four amino acid changes were observed in the HA1 domain, which contained the receptor binding site (RBS) for viral attachment to the host cell surface, determining the host cell tropism or specificity. The alignment of HA1 amino acid sequences derived from the autopsy specimens, clinical isolates, and A/Thailand/1(-KAN-1)/2004 (H5N1) clade 1 virus showed that the point mutations 177G or 177W were adjacent to the receptor binding residues 179H, 186E, 190L, and 191Y (Figure 1(c)). The amino acid position 177 was in the globular head closed to the RBS, as demonstrated by the 3D-crystalized structure (Figure 1(d)). Sequence variation or genetic drift on the HA1 domain might influence the receptor binding, adaptive change for host cell tropism, and affected the viral antigenic epitopes. We also found the nucleotide 945G in PB2 of the trachea tissue, while 945A was found in the lung, colon, and fecal specimens (Fig. S1). This led to the nonsynonymous mutation of M315I in the virus isolate from the trachea. This M315I amino acid change may be necessary for adaptation to grow in MDCK cells since the 315I was found in virus isolates from all kinds of clinical specimens.  [24,25]. The phylogenetic analysis demonstrated that our viruses belonged to genetic clade 1 and were closely related to other H5N1 isolates in Thailand, Vietnam, and Cambodia  (Figure 3). The finding led to the speculation that our viruses were reassortants. We then further phylogenetically analyzed each gene segment individually. The result confirmed that these H5N1 viruses were reassortants, which obtained 7 gene segments: PB2, PB, PA, NP, NA, M, and NS from A/Tree sparrow/Thailand/VSMU-14-KR/2005-like virus and the HA gene segment from A/Chicken/Kohn Kaen/NIAH330/2004like virus (Fig. S3).

Genetic Variability across H5 Virus
Clades. The HA amino acid sequences across clades of human H5N1 and H5N6 viruses circulating from 1997 to 2018 were compared. The result showed the genetic variability at positions 47, 56,      BioMed Research International 177, and 293 in our lung virus isolate. In contrast, the HA sequences from all 4 autopsy specimens and the virus isolates from tracheal and fecal specimens were highly conserved. However, the estimating ratio of the nonsynonymous (dN) to synonymous (dS) substitutions (dN/dS) of <1 indicated the negative or purifying selective pressure of these positions across the genetic clades on a codon-by-codon basis ( Table 3).
3.5. Analysis for H5 Virulence Determinants. Amino acid residues in the autopsy specimens and the virus isolates were analyzed following the H5N1 genetic change inventory [26], as demonstrated in Table 4. The result showed that the amino acid sequences in all autopsy specimens and the virus isolates contained PQRERRRKKRG at the HA cleavage site, indicating their high virulence. The residues 190E, 225G, 226Q, 227S, and 228G (H3 numbering), indicated that all viruses preferred the α2,3-galactose linked-sialic acid avian type receptor. The H274Y substitution, which indicated the oseltamivir and peramivir resistance, was not present in both the autopsy specimens and the virus isolates, even though the patient received a full course of oseltamivir treatment before his death. On the other hand, both the autopsy specimens and the virus isolates contained the S31N substitution in the M2 protein, indicating the amantadine and rimantadine resistance.

Discussion
H5 HA, in combination with various NA subtypes, were the most frequent HPAI viruses that cause human infections. The most common subtype, the H5N1 virus, has caused human infections in 16 countries, resulting in 862 cases with 455 deaths or a fatality rate of 52.8% after its reemergence in 2003 until 9 December 2020. The latest human H5N1 virusinfected case occurred in the Lao People's Democratic Republic (Lao PDR) and was reported to the WHO on 31 October 2020. The human H5N6 virus infection was less common with the latest case reported on 1 December 2020 from China [8]. Recently, the first human H5N8-infected case was reported in Russia in December 2020 [27]. Even though the information on the pathology of H5N1 fatal cases is quite limited, the concordant results suggested that the H5N1 HPAI virus could disseminate and replicate in organs beyond the respiratory tract [13,28,29]. Furthermore, the level of natural intrinsic heterogeneity, mutation frequencies, and single-nucleotide polymorphisms (SNPs) of influenza A viruses was previously reported by several investigators [30,31]. With the NGS technique available lately, this study could demonstrate the complete H5N1 genomic sequences in all 4 kinds of autopsy specimens: the trachea, lung, colon, and intestinal feces. The nucleotide sequence alignment showed 8 viral quasispecies with 6 gene segments, i.e., PB2, PB1, PA, HA, NP, and NA, but not in M and NS. The NGS technique provided in-depth information on the natural amino acid residues present in the viral genomes in each autopsy specimen.
Our previous attempts to isolate the virus from various autopsy tissues of a few H5N1 patients did not succeed.  BioMed Research International Failure to isolate the virus from tissues that contain influenza viral genomes is not well understood. The infectious viruses might be destroyed by the tissue reactions resulting from a cytokine storm at the end stage of the disease. However, our attempt to isolate the virus was successful in the study patient who died of H5N1 HPAI virus infection in 2006. The H5N1 viruses were isolated from the trachea, lung, and intestinal fecal samples, but not the other organs. The nucleotide/amino acid sequences of these virus isolates were aligned against those of the autopsy specimens. A total of 27 nucleotide changes resulting in 18 amino acid substitutions were found. Mutational change(s) found in our virus isolates occurred during passaging in MDCK cells which had been going on until the 4 th passage before sequencing. Our study showed that the mutational change was more frequent with the lung isolate and, in particular, in the HA and NA genes.
This finding goes along well with basic influenza virology that HA and NA genes are highly variable compared to the other genes. There were 4 amino acid changes in the HA protein of our lung virus isolates that were not present in the other viruses in this study and other virus clades. However, the determination for natural selection showed the dN/dS of <1, which suggested the negative selection pressure of these residues. Our study found the point mutations 177G in viruses from the trachea and fecal specimens and 177W from the lung virus. According to the 3D structure, the amino acid position 177 did not locate on the external surface of HA1 globular head; nevertheless, it was adjacent to the receptorbinding residues 179H, 186E, 190L, and 191Y. We suggested that the amino acid change at position 177 may not directly responsible for binding to sialic acid host cell receptors.    Nevertheless, we cannot exclude the possibility of the steric hindrance phenomenon arose from the mutational change at amino acid position 177 that might affect its neighboring amino acid residues in binding to the receptor biding site. Moreover, we showed that the M315I amino acid change may be essential for the trachea virus to adapt for growth in MDCK cells since the 315I was found in virus isolates from all types of clinical specimens. Our study suggested that the changes of amino acid residues in multiple genes might be required for viral adaption to grow in a new host species. All virus isolates of different origins could grow in MDCK cells and yielded comparable titers to the other H5N1 viruses in our laboratory. Phylogenetic analyses relying on the concatenated PB2, PB1, PA, HA, NP, NA, M, and NS genes or each gene suggested that our H5N1 viruses were the reassortants generated from two clade 1 viruses: A/Tree sparrow/Thailand/VSMU-14-KR/2005-like virus and A/Chicken/Kohn Kaen/-NIAH330/2004-like virus. Our group and the other Thai investigators reported the reassortants generated from two H5N1 clade 1 viruses in avian hosts [32,33]. In this study, we reported the reassortant H5N1 virus infection in a Thai patient. Nevertheless, the reassortant H5N1 virus infection in humans had been previously reported in Vietnam [34,35] and Cambodia [36].

BioMed Research International
Furthermore, the amino acid substitution S31N in the M2 protein of our viruses suggested that the virus was resistant to amantadine and rimantidine [37], and the 274H residue in the NA protein suggested that the viruses were susceptible to oseltamivir. This patient died because the oseltamivir was prescribed late, and additionally, he was superimposed to Acinetobacter baumannii infection. This study also analyzed for virulence determinants present in the H5N1 HPAI viral genomes based on the Center for Disease Prevention and Control guidelines [26]. Our viral genomes contained amino acid residues 89V, 309D, 339K, 477G, 495V, 676T, and 627K in PB2 and residues 30D and 215A in M1, all of which are related to enhance viral virulence, increased polymerase activity, or increased replication efficiency. In contrast, our viral genomes also contained the residues 598L in PB1 and 149S and 357I in PA that are related to decreased polymerase activity and replication efficiency. Nevertheless, there was no change in the amino acid residues that determine the H5N1 HPAI viral virulence, particularly, the presence of multiple basic amino acids at the HA cleavage site and the 20 amino acid deletion in NA. The adaptation change for survival in a new host species might involve many genes. Still, the HA and NA genes that determine cell tropism and virulence determinants are likely to be more important than the others.

Conclusion
H5N1 avian influenza virus disseminated and infected the organs beyond the respiratory tract. This study was the first to report the complete viral genomes in the archival specimens from an autopsy: the lung, trachea, colon, and intestinal fecal samples. The H5N1 virus was isolated from the lung, trachea, and fecal samples using MDCK cells. The phylogenetic analysis demonstrated that this virus was a reassortant in which the HA segment from one avian species reassorted with the 7 segments of another virus from different avian species. The study also found novel amino acid substitutions.

Data Availability
All data generated or analyzed are included within the manuscript and supplementary materials.