Full-Length Genomic Sequence of Subgenotype IIIA Hepatitis A Virus Isolate in Republic of Korea

Hepatitis A virus is known to cause acute hepatitis and has significant implications for public health throughout the world. In the Republic of Korea, the number of patients with hepatitis A virus infection has been increasing rapidly since 2006. In this study, the Kor-HAV-F strain was identified as subgenotype IIIA by RT-PCR, and its identity was confirmed by nucleotide sequencing and alignment analysis. Moreover, detailed phylogenetic analysis indicated that the Kor-HAV-F strain clustered into subgenotype IIIA, including strains isolated in Japan, Norway, and India. The entire amino acid sequence of the VP1 and 2A regions was compared with that of the reference strains isolated in various countries. We found 2 amino acid changes (T168A and L96P, respectively) in the VP1 and 2A regions, which had not been found in any other hepatitis A virus strain. To our knowledge, this study is the first to report the full-length sequence of a hepatitis A virus isolated in the Republic of Korea.


Introduction
Hepatitis A virus (HAV), known to cause acute hepatitis, has significant implications for public health worldwide. HAV infection is endemic in developing countries, including Thailand, India, and Mexico. In contrast, industrialized countries have a decreasing exposure rate to HAV, due to improvements in hygiene and sanitation conditions [1]. Direct person-to-person spread by the fecal/oral route is the most important means of transmission of hepatitis A, and infection with HAV can cause sporadic and epidemic acute hepatitis in humans [2,3].
Thus far, HAV strains have been classified into 6 human and simian genotypes (I-VI), of which genotypes I, II, and III are found in humans; these genotypes are further divided into subgenotypes IA and IB, IIA and IIB, and IIIA and IIIB, respectively [7,8]. Most of the human HAV strains belong to genotypes I and III [8][9][10]. An HAV genotype is defined as a group of viruses with >85% nucleotide sequence identity. The HAV genotypes are further classified into subgenotypes with sequence variability of <7.5% [11].
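The identity thresholds above can be illustrated with a minimal Python sketch. It assumes two sequences that have already been aligned to equal length (gaps written as "-") and reads the thresholds as follows: divergence below 7.5% suggests the same subgenotype, and identity above 85% suggests the same genotype. The function names and the three-way labels are illustrative assumptions, not part of the classification scheme in [11], and the genomic region to which the thresholds apply is not specified here.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def relationship(identity_pct: float) -> str:
    """Apply the >85% identity and <7.5% variability thresholds (assumed reading)."""
    divergence = 100.0 - identity_pct
    if divergence < 7.5:
        return "same subgenotype"
    if identity_pct > 85.0:
        return "same genotype, different subgenotype"
    return "different genotype"
```

For example, `relationship(98.8)` returns "same subgenotype" and `relationship(80.6)` returns "different genotype", which agrees with the identity values reported for Kor-HAV-F in the Results.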
In this study, the whole genome sequence of a South Korean HAV subgenotype IIIA isolate was analyzed and compared with that of available reference strains to determine the genetic relationship along the entire genome in the Republic of Korea.

Stool Sample Collection.
An HAV-positive stool sample was collected from a 35-year-old female patient with fever and myalgia, in Seoul, the Republic of Korea, in October 2011. The sample was obtained from the Waterborne Virus Bank (Seoul, the Republic of Korea). The stool sample was stored at −70 °C.

Viral RNA Extraction.
The stool sample was diluted to a ratio of 1 : 10 in phosphate buffered saline (PBS), mixed, and centrifuged. From 140 µL of this diluted stool sample, viral RNA was extracted using a QIAamp viral RNA mini kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. A 50-µL volume of eluate was obtained and stored at −70 °C until analysis.

Reverse Transcription-PCR (RT-PCR).
For the detection of HAV, reverse transcription polymerase chain reaction (RT-PCR) was performed with a OneStep RT-PCR Kit (Qiagen, Hilden, Germany), with HAV-F and HAV-R primers based on the sequence of the VP1-2A junction region (Table 1). To facilitate sequencing of the entire genome of the detected HAV strain, RT-PCR was performed with the OneStep RT-PCR Kit (Qiagen, Hilden, Germany), with 13 pairs of newly designed primer sets (Table 1). We used 5 µL of viral RNA as template and 20 µL of the premixed kit solution. The PCR was carried out in a PCR System S1000 thermal cycler (BIO-RAD, CA, USA) according to the following protocol: an initial RT step at 50 °C for 30 min, followed by PCR activation at 95 °C for 15 min and 40 cycles of amplification, each consisting of 1 min at 95 °C, 1 min at 52 °C to 54 °C, and 1 min at 72 °C, with a final extension step of 10 min at 72 °C. The PCR products were then electrophoresed on a 2% ethidium bromide-stained agarose gel.
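As a compact restatement of the cycling protocol, the following Python sketch encodes each step as data and sums the programmed hold times. The `Step` class and all names are illustrative assumptions. Ramp times between temperatures are ignored, and 52 °C is used as a placeholder for the 52-54 °C annealing range given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


REVERSE_TRANSCRIPTION = Step("reverse transcription", 50, 30)
ACTIVATION = Step("PCR activation", 95, 15)
CYCLE = (
    Step("denaturation", 95, 1),
    Step("annealing", 52, 1),  # placeholder: the text gives a 52-54 degC range
    Step("extension", 72, 1),
)
N_CYCLES = 40
FINAL_EXTENSION = Step("final extension", 72, 10)


def total_hold_minutes() -> float:
    """Sum of programmed hold times, excluding temperature ramping."""
    cycling = N_CYCLES * sum(step.minutes for step in CYCLE)
    return (
        REVERSE_TRANSCRIPTION.minutes
        + ACTIVATION.minutes
        + cycling
        + FINAL_EXTENSION.minutes
    )
```

Under these assumptions the programmed holds add up to 30 + 15 + 40 × 3 + 10 = 175 min per run.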

Cloning and Sequencing of RT-PCR Products.
The amplified fragments were purified from the gel using the HiYield Gel/PCR DNA Extraction Kit (RBC, Taipei, Taiwan). These products were then cloned into the pGEM-T Easy Vector (Promega, Madison, WI, USA) according to the manufacturer's recommendations and transformed into competent E. coli DH5α cells (RBC, Taipei, Taiwan). Transformants were selected on Luria-Bertani (LB) agar media (Duchefa, Haarlem, The Netherlands) containing 50 µg/mL ampicillin. Clones were expanded overnight at 37 °C in 2 mL LB media containing 50 µg/mL ampicillin, centrifuged at 4 °C for 10 min at 800 × g, resuspended in 600 µL fresh LB media with 10% glycerol, and stored at −80 °C until required for further use. Plasmid DNA was purified using the HiYield Plasmid Mini Kit (RBC, Taipei, Taiwan) according to the manufacturer's recommendations. DNA was sequenced by Cosmo Genetech (Seoul, the Republic of Korea).

Sequence and Phylogenetic Analysis.
The sequence data of the composite sequences of the 13 plasmids were aligned using the Clustal W method with the DNASTAR software (DNAStar, Inc., Madison, WI, USA) and CLC Main Workbench Program version 6.7.1 (CLC Bio, Katrinebjerg, Denmark) to obtain the entire genome sequence. Dendrograms were constructed using the neighbor-joining method with MEGA software version 4.0.

Nucleotide Sequence Accession Number.
The nucleotide sequence of the HAV-positive stool sample isolate was submitted to the GenBank database under the following accession no.: JQ655151.

Results and Discussion
Globally, approximately 1.4 million cases of HAV infection are reported annually. The prevalence of HAV in different countries varies with income and hygiene levels. According to data from the Korea Centers for Disease Control and Prevention (KCDC), HAV infection in the Republic of Korea has been increasing consistently since 2001. Moreover, the number of patients with HAV infection has increased more rapidly since 2006. In 2009, the number of patients with HAV infection was approximately 143 times higher than that in 2001.
HAV IA was generally regarded as the major subgenotype in the Republic of Korea. In 1994, 100% of HAV strains were subgenotype IA; this prevalence then decreased significantly from year to year, reaching 0% in 2008, whereas HAV IIIA was first detected in 2005 and reached a peak prevalence of 100% in the Republic of Korea in 2006 [22,25-28]. However, the 2 prevalent subgenotypes IA and IIIA have cocirculated in the Republic of Korea since 2005 [25]. The circulating subgenotypes differ depending on the region and period. Furthermore, 5 outbreaks associated with HAV infection have occurred since 2005, 3 of which were associated with subgenotype IIIA [29]. Despite its clear importance, a complete genome analysis of subgenotype IIIA has not yet been reported in the Republic of Korea.
In this study, we examined nucleotide and amino acid similarities and performed phylogenetic analysis using partial and complete sequences of HAV reference strains. The Kor-HAV-F isolate had a genomic length of 7386 nucleotides (nt), excluding the poly(A) tract at the 3′ terminus, which is similar to that of the reported HAV isolates for which the entire genomic sequence is known. Moreover, the isolate possessed a single long ORF of 6684 nt that encoded a polyprotein of 2228 amino acids. The single ORF was divided into 3 functional regions termed P1 (2373 nt), P2 (1893 nt), and P3 (2418 nt).
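The reported lengths are internally consistent, as the following short Python check illustrates. The variable names are illustrative; the numbers are those given in the preceding paragraph.

```python
# Lengths reported for the Kor-HAV-F isolate (nucleotides unless noted).
REGION_LENGTHS_NT = {"P1": 2373, "P2": 1893, "P3": 2418}
ORF_LENGTH_NT = 6684
POLYPROTEIN_LENGTH_AA = 2228
GENOME_LENGTH_NT = 7386  # excludes the 3' poly(A) tract

# The three functional regions should add up to the full ORF.
orf_total = sum(REGION_LENGTHS_NT.values())

# The ORF should divide into whole codons matching the polyprotein length.
codons, remainder = divmod(ORF_LENGTH_NT, 3)

# Nucleotides that lie outside the ORF (5' and 3' ends combined).
outside_orf_nt = GENOME_LENGTH_NT - ORF_LENGTH_NT

print(orf_total, codons, remainder, outside_orf_nt)
```

The regions sum to 6684 nt, which corresponds to 2228 codons with no remainder, leaving 702 nt of the 7386-nt genome outside the ORF.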
To assess the genetic relationship between the Kor-HAV-F strain and the other reference strains isolated worldwide, the sequences of the ORFs and the whole genomes were subjected to multiple sequence alignment analysis and phylogenetic analysis (Table 2). Sequence comparison with 33 known HAV isolates revealed that the Kor-HAV-F strain shares the greatest identity with the NOR-21 HAV strain (AJ299464), which was isolated in Norway, with 98.8% identity at the nucleotide level and 99.9% identity at the amino acid level. The Kor-HAV-F strain clustered with the NOR-21, HA-JNG04-90F, and CP-IND strains in a monophyletic branch. The 3 strains isolated from India and the 2 strains isolated from Japan belonged to 2 distinct clusters within subgenotype IIIA. The identities of the Kor-HAV-F strain with other subgenotypes (IA, IB, IIA, IIB, IIIB, and V) were within the range of 80.6-88.6% (nucleotide identity) and 92-98.5% (amino acid identity), respectively (Figure 1(a)). Similarly, in the phylogenetic analysis of the P1 and P2 regions, the Kor-HAV-F strain clustered with those strains (Figures 1(b) and 1(c)). However, in the case of the P3 region, the Kor-HAV-F strain clustered with the NOR-21 and HA-JNG04-90F strains in a monophyletic branch (Figure 1(d)). Partial phylogenetic analysis of P3A indicated that the Kor-HAV-F strain clustered with only the NOR-21 strain in a monophyletic branch (Figure 1(e)).
Additionally, to assess the genetic relationship between Kor-HAV-F and the Korean strains, the sequences of the VP3/VP1 and VP1/2A junctions were subjected to multiple sequence alignment and phylogenetic analysis. In the case of VP3/VP1, sequence comparisons revealed that the Kor-HAV-F strain shares the greatest identity with the NOR-21 strain (96.7% nucleotide identity), whereas sequence identity with the Korean strains was relatively low (92.9-93.4% nucleotide identity). These Korean strains belonged to a distinct cluster within the subgenotype IIIA (Figure 2(a)). In the case of VP1/2A, sequence comparisons revealed that the Kor-HAV-F strain shares the greatest identity with the 21 (FJ372963) strain isolated in 2005 (98.8% nucleotide identity). The Kor-HAV-F strain clustered with the Korean strains, other than the Korean and Japan strains, in a monophyletic branch (Figure 2(b)).
Nucleotide and amino acid identities between Kor-HAV-F and representatives of each genotype are shown in Table 2. Kor-HAV-F sequence identity with the other genotypes was in the range of 79%-89.8%, except for subgenotype IIIA in the coding region P1-P3, whereas amino acid sequences differed to a greater extent (89.4-99.1% identity). Compared with subgenotype IA, Kor-HAV-F showed higher identity (82. .3% at nucleotide level; 92.3-97% at amino acid level) with the H2 strain (isolated in 2007, subgenotype IA) than with the GBM strain, isolated in 1976 (81.7-83.1% at nucleotide level; 91.9-96.7% at amino acid level), at coding regions P1 and P3. However, Kor-HAV-F showed higher identity with the older strains than with the recent strains belonging to the subgenotype IB. Among the P1-P3 coding regions, the P3A site had the highest variability at nucleotide and amino acid levels.

Table 2: Percent identity between the Kor-HAV-F and the strains belonging to other genotypes, at the nucleotide and amino acid levels.
Amino acid sequences of the VP1 region (300 amino acids) were compared with diverse subgenotype strains reported from various countries, including the Republic of Korea. The Kor-HAV-F strain showed a distinct substitution of Asn instead of Thr at position 168, which was not found in any other subgenotype. The Kor-HAV-F strain showed 92-99.7% amino acid variation within the P1 region. Only subgenotype IIIA strains, including the Kor-HAV-F strain, showed 2 amino acid changes (K34R and V42I) in the P1 region. Furthermore, genotype III commonly showed 10 amino acid changes (E15K, I/M28L/V, R37Q, S266T, L270M, T272S, S274T, S277D, A281L, and R298K; Figure 2(a)). In contrast to previous reports, we found 12 amino acid changes in the VP1 region; however, consistent with previous reports for other HAV subgenotype IIIA strains, our results also showed C-terminal cleavage sites at Leu 264/265 Asn, Glu 273/274 Thr, and Glu 285/286 Ser in the HAV VP1 protein (Figure 3(a)).
In the case of the 2A region (189 amino acids), amino acid sequences were compared with diverse subgenotypes reported from various countries. The Kor-HAV-F strain showed a distinct substitution of Phe for Leu at position 96, which was not found in any other subgenotype. The Kor-HAV-F strain showed 87.3-99.5% amino acid variation within the P1 region. Only subgenotype IIIA strains, including the Kor-HAV-F strain, showed a single amino acid change (S/C148A) in the 2A region. Furthermore, genotype III commonly showed 8 amino acid changes (K/M/V39I, L42V, E55D, R64K, L/V66I, D150E, V183I, and Q189K). In particular, genotypes II and III showed a single amino acid change at N/Y128H (Figure 3(b)).
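Substitutions written in the form used above (reference residue, position, query residue; for example K34R) can be listed from a pairwise protein alignment with a short routine such as the following Python sketch. The function name is an illustrative assumption, positions are numbered along the ungapped reference sequence, and the example sequences in the usage note are toy strings, not HAV data.

```python
def list_substitutions(reference: str, query: str) -> list[str]:
    """List substitutions such as 'K2R' between two aligned protein sequences.

    Both sequences must come from the same alignment (equal length, gaps as '-').
    Positions are counted along the reference, skipping its gap columns.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    ref_position = 0
    for ref_residue, query_residue in zip(reference.upper(), query.upper()):
        if ref_residue != "-":
            ref_position += 1
        if ref_residue == "-" or query_residue == "-":
            continue  # insertions and deletions are not substitutions
        if ref_residue != query_residue:
            substitutions.append(f"{ref_residue}{ref_position}{query_residue}")
    return substitutions
```

With the toy alignment `list_substitutions("MKTLV", "MRTLI")`, the routine reports two changes, K2R and V5I. Comparing one isolate against each reference strain in turn and keeping only the changes absent from every other strain is one way to flag candidate strain-specific substitutions.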
According to a previous study, 4 patients with HAV subgenotype IIIA in the Republic of Korea had travelled overseas before the onset of symptoms. Similarly, the HAV patient in this study had visited Taiwan before the onset of symptoms. Moreover, we found 2 novel amino acid changes that had not been reported in Korea earlier. Hence, this patient probably acquired the virus in Taiwan.
Until the 1980s, Korea was classified as a highly endemic country, and most infections occurred in childhood. However, opportunities for exposure to infection decreased with improvements in sanitation and socioeconomic conditions; therefore, the population susceptible to infection has shifted to adolescents and young adults [18,30,31]. Since 2001, the incidence of acute hepatitis has been officially reported through the national sentinel surveillance system of the Korea Centers for Disease Control and Prevention; since then, the incidence of acute hepatitis has increased steadily, rising sharply from 2006 [32]. This is the first study reporting the full-length sequence of an HAV isolated in the Republic of Korea. This sequence will be useful for comparison with the full-length HAV sequences of other strains identified globally, both now and in the future. The whole-genome sequence data derived in this study may prove useful not only for more accurate diagnoses of HAV, but also for basic research relating to the elucidation of genetic functions. Furthermore, it may prove useful for the prediction of newly appearing variants via comparison with HAV strains worldwide, in fundamental research for vaccine development, and eventually, in the field of public health, with identification of newly emerging strains of HAV.

Conclusions
This study, the first to report the full-length sequence of an HAV isolated in the Republic of Korea, is meaningful as it provides a full-length HAV sequence standard for future evolutionary studies. It may also prove useful in the field of public health by facilitating diagnosis and the prediction of newly emerging variants. Further characterization of full-length sequences of diverse HAV strains circulating worldwide is needed.