Evolving Trends in the Hepatitis C Virus Molecular Epidemiology Studies: From the Viral Sequences to the Human Genome

Hepatitis C virus (HCV) represents a major worldwide public health problem. The search for key molecular biomarkers that may provide insight into the basis of the differences in disease progression, severity, and response to therapy is crucial for understanding the natural history of HCV, for estimating the burden of infection, and for developing preventive interventions. Initially, molecular epidemiology studies focused on viral genetic diversity (genotypes, genetic variants, specific nucleotide and amino acid substitutions). However, the clinical heterogeneity of HCV infection and the imperfect predictability of the response to treatment suggested the need to search for host genetic biomarkers. This led to the discovery of genetic polymorphisms playing a major role in the evolution of infection, as well as in treatment response and adverse effects, such as IL-28B, ITPA, and IP-10. As a consequence, the focus of molecular epidemiology studies has now turned from the viral to the human genome. This paper covers recent reports on the subject, describing the most relevant viral as well as host genetic risk factors analyzed by past and current HCV molecular epidemiology studies.


Introduction
HCV represents a major health problem: approximately 3% of the world population (more than 170 million people) is infected. While only 20-30% of individuals exposed to HCV recover spontaneously, the remaining 70-80% develop chronic HCV infection (CHC) [1]. Moreover, 3-11% of those people will develop liver cirrhosis (LC) within 20 years [2], with associated risks of liver failure and hepatocellular carcinoma (HCC) [3], which are the leading indications for liver transplantation in industrialized countries [4]. The socioeconomic impact of HCV infection is therefore tremendous, and the burden of the disease is expected to increase around the world as the disease progresses in patients who contracted HCV years ago.
Since the discovery of HCV more than 20 years ago [5], epidemiological studies have described complex patterns of infection concerning not only the worldwide prevalence of this virus but also its clinical presentation and its therapeutic response.
HCV prevalence rates vary widely both between and within countries [6]; for example, in Argentina the overall prevalence of HCV infection is close to 2%, but higher rates (4.9-5.7%) have been reported in several small rural communities [7,8].
The outcome of HCV infection is, as previously stated, heterogeneous, ranging from an asymptomatic self-limiting infection to LC and HCC. Recent studies have concluded that this difference appears to depend on the route of transmission and on other host- and virus-related characteristics [9][10][11][12].
The current standard of care (SOC) for CHC is based on the combination of pegylated interferon-alfa (PEG-IFN) and ribavirin (RBV) for 24 or 48 weeks. A sustained virological response (SVR), defined as undetectable serum HCV RNA 24 weeks after cessation of treatment, is associated with permanent cure in more than 99% of cases [13]. However, therapy is expensive and is associated with numerous side effects [14], which reduce its effectiveness in many cases (e.g., dialysis and HIV-infected patients, transplant recipients, etc.) [15] and sometimes require dose reduction or premature treatment discontinuation, thus decreasing the rate of success. In addition to its limited efficacy, response to therapy is also variable, and viral and host characteristics can influence whether patients achieve an SVR.
The search for key molecular biomarkers that may provide insight into the basis of the differences in disease progression, severity, and response to therapy is crucial for understanding the natural history of HCV, for estimating the burden of infection, and for developing preventive interventions.
In this regard, molecular epidemiology studies first focused on identifying and measuring viral risk factors by analysis of HCV genetic diversity. As these failed to explain a large proportion of the variability, the existence of host genetic risk factors for HCV infection was strongly suggested [10,11]. However, little progress could be made in their identification.
Recently, the completion of the Human Genome Project has led to the beginning of a new era of scientific research, including a revolutionary approach: the genome-wide association study (GWAS), which uses high-throughput genotyping technology to assay, typically, 300,000 to 900,000 SNPs in each sample. Through these studies, genetic factors strongly associated with disease susceptibility and drug response among HCV-infected patients were finally detected [16]. As a consequence, the focus of molecular epidemiology studies has now turned from the viral to the human genome.
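The per-SNP comparison that underlies a GWAS can be illustrated with a minimal Python sketch. It assumes a simple allelic chi-square test on a 2x2 table of allele counts and a Bonferroni correction for the number of SNPs tested; the function names and all counts are hypothetical, and the studies cited here may have used other statistical models (for example, regression with covariates).

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are the two patient groups (e.g., responders and non-responders),
    columns are the two alleles. Returns (chi2, p_value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column must contain at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(n_snps, alpha=0.05):
    """Per-SNP significance threshold when n_snps independent tests are run."""
    return alpha / n_snps


# Hypothetical allele counts for one SNP (not taken from any cited study).
chi2, p = allelic_chi_square(case_alt=30, case_ref=10, ctrl_alt=10, ctrl_ref=30)
genome_wide = bonferroni_threshold(500_000)
print(f"chi2={chi2:.1f}, p={p:.2e}, significant genome-wide: {p < genome_wide}")
```

The correction step shows why genome-wide studies need strong effects or large samples: with several hundred thousand SNPs tested, only very small p-values survive.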
The aim of this paper is to describe the most relevant viral as well as host genetic risk factors analyzed by past and current HCV molecular epidemiology studies.

Molecular Epidemiology of HCV Genetic Diversity and Its Clinical and Therapeutic Implications
Hepatitis C virus (HCV) is an enveloped RNA virus which contains a single-stranded, positive-strand RNA molecule of approximately 9600 nucleotides [17]. Following the determination of the viral nucleotide and amino acid sequences [18], it was reported that different isolates of HCV, as well as sequences isolated from each individual, show substantial nucleotide sequence variation distributed throughout the genome [19,20]. The HCV genome contains both highly conserved and highly variable regions; for example, regions encoding the envelope proteins are the most variable, whereas the 5′ noncoding region (NCR) is the most conserved, with minor heterogeneity [21]. For these reasons, several researchers have considered the 5′ NCR to be the region of choice for viral detection. Sequence analysis performed on isolates from different geographical areas around the world has revealed that it is possible to classify HCV into six different genotypes, labelled with numbers (1 to 6) [21]. Moreover, a seventh genotype was recently reported [22]. The degree of sequence variation between genotypes of HCV is similar to that observed between variants of other viruses, such as the serotypes of the flavivirus dengue virus, or between poliovirus types 1, 2, and 3. HCV genotypes are further divided into multiple epidemiologically distinct subtypes, named with letters, on the basis of differences in the nucleotide sequence of subgenomic regions such as core/E1 and NS5B (nonstructural 5B) [23].
Molecular epidemiological studies have shown that HCV genotypes display significant differences in their global distribution and prevalence. Genotypes 1, 2, and 3 are widely distributed throughout the USA, Europe, Australia, and East Asia (Japan, Taiwan, Thailand, and China), whereas geographical distributions of other genotypes are more restricted [24][25][26]. Genotype 4 is largely confined to the Middle East, Egypt, and Central Africa. Genotypes 5 and 6 prevail in South Africa and Southeast Asia, respectively [24,25,27], and genotype 7 is found predominantly in Central Africa [22]. Thus, genotyping has become a useful method to determine the source of HCV transmission in an infected localized population.
Many risk factors for disease progression and development of HCC have been reported, such as male gender, age at infection, diabetes, hepatic fibrosis (particularly cirrhosis), greater degrees of hepatic inflammation, iron overload, steatosis, coinfection with HBV, alcohol abuse, smoking, and obesity [28][29][30][31]. On the basis of the observation that, for most RNA viruses, considerable sequence differences between serotypes have remarkably little effect on the phenotype of a virus, there would be no logical reason to suspect major differences between genotypes of HCV in their clinical course or disease associations. However, several cross-sectional studies (in which the frequencies of infection with different genotypes are compared among patients with different disease outcomes) have concluded that HCV genotypes may be related to disease progression.
Although these studies have frequently produced contradictory results, it is generally agreed that genotype 1b may be associated with more severe liver disease than infection with other genotypes [32][33][34][35][36]. There is also a greater consensus that infection with genotype 1b predisposes towards the development of HCC [32,35,37,38,39], with only a few negative or contrary reports [40,41]. Regarding liver transplantation in HCV-infected patients, genotype 1b is also associated with a higher rate of active disease after transplantation and graft destruction [42,43].
Of the estimated 170 million HCV cases in the world, over 50% occur among injection drug users (IDUs) [44]. Thus, IDUs are considered to be the main risk group for HCV infection and act as a reservoir for this bloodborne virus. Several reports have demonstrated a statistically significant relationship between HCV genotype 3, injecting drug use, and a younger age at infection [45][46][47][48][49]. Moreover, by comparing patients infected with genotype 3 and those infected with other genotypes, numerous groups have revealed an association between this genotype and hepatic steatosis [50][51][52][53][54][55], a severe histopathological manifestation of CHC which can improve after SVR is achieved with antiviral treatment [56][57][58].
In addition to their clinical importance as predictors of disease progression, HCV genotypes also offer essential information to those providing or receiving treatment. Present data strongly indicate that HCV genotype is the key determinant of response to IFN-alpha-based treatment regimens [59][60][61][62].
The current SOC therapy yields a sustained response in approximately 55% of patients with hepatitis C. Patients with genotypes 1 and 4 generally exhibit a poorer response to IFN-based therapy than those with genotypes 2 and 3, probably due to slower viral kinetics [63]. HCV genotype 5 appears to be an easily treatable virus, with response rates comparable to those of genotypes 2 and 3 after a 48-week course of therapy [61,62]. Treatment response in genotype 6 HCV patients may be at an intermediate level between that observed in genotype 1 and that observed in genotypes 2 and 3.
Interestingly, several studies have demonstrated that the chance of responding to IFN treatment is also related to the baseline viral load. While patients with a high viral load (>800,000 IU/mL) are less sensitive to the treatment, patients with a low viral load (<800,000 IU/mL) respond better to therapy [69]. Thus, patients with genotype 1, low baseline viral load, and rapid virological response (HCV RNA negative in serum after 4 weeks of treatment) may be treated for 24 weeks, while patients with genotype 3, high baseline viral load, and without rapid virological response may require 48 weeks of treatment.
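The two examples in the preceding paragraph can be written as a small decision rule. The Python sketch below encodes only those two stated combinations and returns `None` for every other case; the function and constant names are hypothetical, and the sketch is an illustration of the text rather than clinical guidance.

```python
# Viral-load cut-off quoted in the text, in IU/mL.
VIRAL_LOAD_CUTOFF_IU_PER_ML = 800_000


def suggested_duration_weeks(genotype, viral_load_iu_per_ml, rapid_virological_response):
    """Return 24 or 48 weeks for the two scenarios described in the text.

    rapid_virological_response: True if serum HCV RNA is negative after
    4 weeks of treatment. Any combination the text does not address
    returns None instead of guessing.
    """
    low_load = viral_load_iu_per_ml < VIRAL_LOAD_CUTOFF_IU_PER_ML
    high_load = viral_load_iu_per_ml > VIRAL_LOAD_CUTOFF_IU_PER_ML

    if genotype == 1 and low_load and rapid_virological_response:
        return 24
    if genotype == 3 and high_load and not rapid_virological_response:
        return 48
    return None


# Hypothetical patients.
print(suggested_duration_weeks(1, 400_000, True))
print(suggested_duration_weeks(3, 2_000_000, False))
print(suggested_duration_weeks(1, 2_000_000, False))
```

Returning `None` for unlisted combinations keeps the sketch from implying recommendations that the source does not make.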
During its replication, HCV characteristically generates genetic variants, which exhibit 10% nucleotide divergence among them. Thus, sensitivity to HCV therapy can vary owing to the emergence of variants carrying mutations that confer a different sensitivity to the treatment.
In this regard, specific nucleotide and amino acid substitutions in the viral genome have been reported to be correlated with the effect of both IFN therapy and PEG-IFN plus ribavirin combination therapy. Two amino acid regions of NS5A have been described and are thought to play a role in response to IFN treatment: (1) the IFN sensitivity-determining region (ISDR) [70,71] and (2) the IFN/RBV resistance-determining region (IRRDR) [72]. The outcome of IFN therapy is related to the total number of amino acid substitutions in these regions. Recently, other mutations, within core (amino acids 70 and 91), E2 (PePHD), and NS5A (PKRBD and variable region 3), have been implicated in influencing the response to IFN therapy in patients infected with HCV genotype 1 [73][74][75][76].

Molecular Epidemiology of HCV-Related Single-Nucleotide Polymorphisms (SNPs) in the Human Genome and Their Clinical and Therapeutic Implications
After the beginning of the genomic era, studies of human genetics have been expected to alter clinical management for many diseases, including infectious diseases. Yet, to date, there are few examples of the use of such information in routine clinical practice. One of the most promising examples is the case of HCV.
Responsiveness to HCV therapy depends not only on viral factors but also on host factors. Older age, male sex, cirrhosis, steatosis, insulin resistance, diabetes, African American ethnicity, and weight (BMI) are all factors associated with poor response to PEG-IFN plus RBV treatment [77,78]. Comorbidities such as HIV and/or HBV coinfection, excess alcohol intake, and drug use are generally associated with lower SVR rates [79]. It seems that cannabis receptor stimulation is associated with lower response to IFN treatment [80]. Moreover, it has recently been reported that patients with a history of depression who were not receiving antidepressants, as well as active intravenous drug users, are more likely to fail treatment for HCV genotype 2 or 3 and will need additional support [81].
Initially, candidate gene approaches were adopted to identify host factors related to HCV therapy response, such as SNPs, copy number variations (CNVs), or insertions/deletions of genes [82][83][84]. However, because only one or a limited number of SNPs or gene loci in candidate genes are examined, these approaches are restricted in the associations they can detect.
In 2009-2010, on the basis of GWAS, four independent groups assessed the role of genetic variation in the response to PEG-IFN plus RBV combination therapy in CHC patients infected with genotype 1 [85][86][87][88]. Although different ethnicities (European, African American, Hispanic, Australian, and Japanese) were compared in these studies, the conclusive finding in all cases was that polymorphisms in or near the IL-28B gene, also known as IFN-λ3, on chromosome 19 strongly determined the outcome of HCV therapy.
Ge et al. identified a genetic polymorphism (rs12979860) in the IL-28B gene [86]. The CC genotype was significantly associated with an approximately twofold change in response to PEG-IFN plus RBV treatment compared with the TT genotype, both among patients of European ancestry and among African Americans. An important finding of this study is the strong correlation between carriage of this variant and SVR rates in diverse ethnic groups; the variant is significantly more frequent in European (53-85%) and Asian (90%) populations than in African Americans (23-55%). This SNP could finally explain much of the recognized ethnic disparity between African Americans and Europeans in treatment response rates.
On the other hand, Suppiah et al. and Tanaka et al. revealed a strong association of particular haplotypes of SNP rs8099917 (8 kb upstream of IL-28B) with SVR in European and Japanese populations infected with HCV genotype 1 and treated with PEG-IFN plus RBV therapy [85,87]. Homozygotes for the risk allele (rs8099917 G allele) showed a 2-fold higher risk of treatment failure than major allele homozygotes.
In the fourth GWAS published on the response to HCV therapy, Rauch et al. studied Swiss patients infected with HCV genotype 1, 2, 3, or 4 [88]. The strongest association with treatment failure was also found with rs8099917. Interestingly, this SNP was not associated with the response to PEG-IFN plus RBV therapy in genotype 2 or 3 patients. The contribution of host factors to genotype 2 or 3 clearance would be low because HCV genotype 2 or 3 is more likely to be eliminated by the standard therapy than genotype 1.
In contrast, Kawaoka et al. revealed that, for Japanese patients treated with PEG-IFN plus RBV, rs8099917 and viral load were independent predictive factors for SVR in genotype 2b but not in genotype 2a. Conversely, in patients treated with interferon monotherapy, viral load and rs8099917 were independent predictive factors for SVR in genotype 2a but not in genotype 2b [89]. Moreover, Sarrazin et al. reported a significant association of the CC genotype of rs12979860 with SVR in European patients infected with HCV genotypes 2 and 3 [90].
For HCV-infected patients with end-stage chronic liver disease, orthotopic liver transplantation (OLT) is currently the treatment of choice [91]. Several reports have shown that post-OLT patient and graft survival are significantly negatively affected by HCV recurrence after OLT [92,93]. This can be mitigated by achievement of an SVR with PEG-IFN plus RBV therapy [94]. However, many patients cannot tolerate curative doses or do not respond to therapy [93][94][95]. It would therefore be ideal to be able to predict which patients would benefit from PEG-IFN plus RBV therapy for recurrent HCV. In this regard, it was recently reported that variants of the SNPs in or around the IL-28B gene from liver donors are also strongly associated with the degree of graft inflammation and the response to therapy of HCV-infected liver recipients [96][97][98].
In addition to IL-28B SNPs, other host molecular predictors of response to PEG-IFN plus RBV therapy have been documented. Human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptors (KIRs) are highly polymorphic genetic loci whose gene products interact with each other. HLA-C molecules serve as ligands for KIR2DL receptors, with a functionally relevant dimorphism determining KIR specificity: for example, HLA-C group 1 (HLA-C1) alleles, identified by Ser77/Asn80 of the HLA-C alpha 1 domain, are ligands for the inhibitory receptors KIR2DL2 and KIR2DL3 and the activating receptor KIR2DS2 [99,100]. KIR2DL3 and its ligand, HLA-C1, have been associated with an increased likelihood of spontaneous [101][102][103] and treatment-induced HCV clearance [102,103]. This association is attributed to differential natural killer (NK) cell activation and function in the context of this KIR/HLA interaction [104]. In a recent cross-sectional study, Suppiah et al. concluded that IL-28B, HLA-C, and KIR variants additively predict response to therapy in European CHC patients [105].
Furthermore, interferon-gamma inducible protein 10 kDa (IP-10 or CXCL10) is a chemotactic CXC chemokine produced by a variety of cells, including hepatocytes [106,107]. IP-10 targets the CXCR3 receptor and attracts T lymphocytes, NK cells, and monocytes to sites of infection [107][108][109]. Levels of IP-10 at onset of therapy are reportedly elevated in patients infected with HCV genotype 1 or 4 who do not achieve SVR [110]. In difficult-to-treat genotype-1-infected HCV patients, cut-off levels of IP-10 in plasma of 150 pg/mL (approximately equal to 2 standard deviations above the mean IP-10 level of HCV-seronegative blood donors) and 600 pg/mL have yielded positive and negative predictive values for SVR of 71% and 100%, respectively [111]. IP-10 in plasma is mirrored by intrahepatic IP-10 mRNA and strongly predicts the HCV RNA decline during the first days ("first phase decline") of IFN/RBV therapy for all HCV genotypes [112]. According to recent reports, the assessment of both pretreatment IP-10 and IL-28B SNPs improves the prediction of the first phase decline in HCV RNA and, therefore, of the final therapeutic outcome [113,114].
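How predictive values are derived from a plasma cut-off can be sketched in Python. The example assumes that a baseline IP-10 level below the lower cut-off is read as predicting SVR and a level above the upper cut-off as predicting failure, which is consistent with the direction described above but is an assumption of this sketch; the function names and the patient data are hypothetical and are not taken from the cited study.

```python
def ppv_below_cutoff(patients, cutoff_pg_per_ml):
    """Positive predictive value for SVR among patients with IP-10 below the cut-off.

    patients: list of (ip10_pg_per_ml, achieved_svr) tuples.
    Returns None if no patient falls below the cut-off.
    """
    outcomes = [svr for ip10, svr in patients if ip10 < cutoff_pg_per_ml]
    if not outcomes:
        return None
    return sum(1 for svr in outcomes if svr) / len(outcomes)


def npv_above_cutoff(patients, cutoff_pg_per_ml):
    """Negative predictive value among patients with IP-10 above the cut-off.

    This is the fraction of those patients who did NOT achieve SVR.
    Returns None if no patient falls above the cut-off.
    """
    outcomes = [svr for ip10, svr in patients if ip10 > cutoff_pg_per_ml]
    if not outcomes:
        return None
    return sum(1 for svr in outcomes if not svr) / len(outcomes)


# Hypothetical cohort: (baseline plasma IP-10 in pg/mL, achieved SVR).
cohort = [
    (100, True), (120, True), (140, False),
    (300, True), (650, False), (900, False),
]
print(ppv_below_cutoff(cohort, 150))
print(npv_above_cutoff(cohort, 600))
```

Patients whose level lies between the two cut-offs contribute to neither value, which is why two thresholds can give a high positive and a high negative predictive value at the same time.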
As previously stated, in addition to its limited efficacy, SOC therapy is expensive and is associated with numerous side effects. In particular, anemia is a very common adverse effect of HCV combination treatment, caused by RBV-induced hemolysis and IFN-related bone marrow toxicity. RBV-induced hemolytic anemia (HA) is usually reversible and dose related [115,116], but may require significant dose reductions possibly affecting efficacy, and is a cause of withdrawal from therapy in 10-14% of patients [64,[117][118][119][120]. Several risk factors for RBV-induced HA have been identified, for example, age, female gender, dose and plasma concentration of RBV, baseline hemoglobin and platelets, and haptoglobin phenotype [121][122][123][124]. However, the severity of RBV-induced HA shows great variability among individuals, suggesting that the genetic background may exert a profound influence on the clinical expression of this adverse effect.
Predictive biomarkers could help weigh the risks against the benefits of currently available treatment, sparing side effects in patients who will not be helped by the treatment and reducing its substantial cost. With this aim, recent studies indicated that genetic variants leading to inosine triphosphatase (ITPase) deficiency, a benign red cell enzymopathy [125], protect against hemolytic anemia in CHC patients receiving RBV. In an American GWAS, a strong association was shown between hemoglobin reduction after 4 weeks of treatment and SNP rs6051702 [126]. The association was explained by two known functional variants in the ITPA gene, which is located on chromosome 20 and encodes ITPase. The two variants, a missense polymorphism in exon 2 (g.3141842C>A, P32T; rs1127354) and a splice-altering SNP located in the second intron (g.8838A>C, rs7270101), result in reduced enzyme activity: homozygosity for the P32T mutation leads to undetectable ITPase activity, accumulation of its substrate ITP in erythrocytes, and increased toxicity of purine analogue drugs [125,[127][128][129][130][131]. Conversely, reduced ITPase activity may protect against RBV-induced hemolysis through the competition of ITP with RBV-TP [127,132].
The same results obtained by Fellay et al. [126] were reported by Thompson et al. [133], who documented as well a strong association between ITPase deficiency and lower frequency of RBV-induced HA over the complete 48-week therapeutic course for patients infected with HCV genotype 1. Of note, ITPA variants did not affect treatment response.
In a more recent study, the same group analyzed patients infected with HCV genotypes 2 and 3, showing that ITPA variants are protective against treatment-related anemia but are not related to the rate of SVR [134]. Among Japanese patients, only the SNP rs1127354 was strongly associated with the incidence and severity of RBV-induced HA [135,136].
In addition to their role as predictors of PEG-IFN plus RBV treatment response, the SNPs in or around the IL-28B gene are also associated with spontaneous clearance of HCV infection. Thomas et al. reported a strong association of rs12979860 with spontaneous recovery in HCV-infected European and African American individuals [137]. This association was independent of coinfection with HIV, type of HCV transmission, and history of HBV infection. Moreover, Rauch et al. revealed that the host factor associated with spontaneous clearance of HCV was rs8099917, independently of HIV coinfection [88].
Because an inappropriate ratio of proinflammatory to anti-inflammatory cytokines may determine the outcome of HCV infection (viral clearance or persistence), polymorphisms in regulatory regions of cytokine genes have been studied. As a consequence, it was recently reported that genetic polymorphisms in the promoter region of interleukin-10 (IL-10) are possible predictors not only of a spontaneous favourable outcome of HCV infection but also of the progression of liver fibrosis [138,139].
Of note, a relationship between different levels of hepatic steatosis in patients infected with genotype 3 and host SNPs was identified, suggesting that a small difference in host genetic factors may result in different outcomes of the disease caused by the same pathogen. Zampino et al. revealed that the presence of the T allele in the −493G/T polymorphism of the microsomal triglyceride transfer protein (MTP) gene reduces the activity of this key enzyme of lipoprotein assembly and secretion and predisposes patients infected with HCV genotype 3 to develop a higher degree of hepatic fat accumulation [140]. Moreover, the SNP rs738409 of adiponutrin/patatin-like phospholipase domain-containing 3 (PNPLA3), which encodes the I148M protein variant, has been recognized as a determinant of liver fat content. In HCV infection, it also influences steatosis development and is independently associated with cirrhosis and other steatosis-related clinical outcomes, such as lack of response to antiviral treatment and possibly HCC [141].

Future Perspectives
Over the past few years, great progress has been made in understanding the heterogeneous disease progression and treatment response in HCV infection. In clinical practice, physicians will soon be able to offer infected patients tailor-made medicine by combining the screening of both viral and host molecular biomarkers.
The unprecedented increase in the spread of HCV documented during the 20th century has resulted in the wave of increased HCV-related morbidity and mortality that we are now facing. Moreover, over the next 10 years the incidence of complications of CHC will not decline because most patients remain undiagnosed.
Decisions on public health issues such as HCV screening, prevention measures, and early treatment have the potential to reduce the overall morbidity and mortality. However, these depend on reliable epidemiological data, which are still scarce.
Therefore, local and/or regional molecular epidemiology studies concerning the viral and the newly reported host-related aspects of infection are urgently needed.