HBV Genomic Integration and Hepatocellular Carcinoma

Hepatitis B virus (HBV) infection is one of the most important risk factors for liver carcinogenesis. The infection influences several aspects of host liver cells, including liver inflammation that results in chronic hepatitis, cirrhosis, and even hepatocellular carcinoma (HCC). Among these effects, HBV DNA integration into the host genome can change host gene expression profiles, genomic stability, and HBV gene expression. The present review focuses on the effects of the integration of HBV genomic DNA into the host genome during the carcinogenesis of HCC.


Introduction
Hepatocellular carcinoma (HCC) is currently one of the most malignant cancers in the world, ranking sixth in incidence and fourth in mortality among cancers. According to WHO estimates, HCC-related deaths will reach 1 million per year worldwide by 2030 [1]. The 5-year survival rate of HCC patients is only about 18%, making HCC the second most lethal malignancy after pancreatic cancer [2]. Most HCC patients have an underlying liver disease or exposure, mainly viral hepatitis, alcoholism, metabolic liver disease, or exposure to aflatoxin or aristolochic acid [3,4]. Among these, chronic hepatitis B or C virus infection is the most important risk factor for HCC, accounting for 80% of HCC patients globally, with an even higher share in East Asian and African countries [4,5]. Clinical data show that the influence of HCV (an RNA virus) infection on HCC is decreasing, because direct-acting antivirals (DAAs) currently cure more than 90% of HCV infections [6,7]. In contrast, effective drugs against HBV (a DNA virus) are lacking: there are approximately 0.3 billion HBV-infected cases globally, and approximately 880,000 individuals die each year from HBV-induced cirrhosis and HCC [8,9]. Persistent HBV infection of liver cells may induce chronic hepatitis B (CHB), liver fibrosis, liver cirrhosis, or even HCC [10].
There are several HBV-related risk factors for HCC, including the HBV DNA load, the HBV genotype, and full-length or truncated HBV proteins. The level of HBV DNA in HBsAg-positive patients is positively correlated with the incidence of HCC in a dose-dependent manner [11]. For example, the incidence of HCC in patients with 1 million copies/mL of HBV DNA is ten times higher than that in patients with fewer than 300 copies/mL [11]. Among the ten genotypes of HBV (A-J), the most common in East Asia are genotypes B and C, and patients infected with genotype C have a higher risk of HCC [12]. In Europe, North America, and sub-Saharan Africa, genotypes A and D are common, and the risk of HCC does not differ between them [13]. Additionally, genomic mutations in preS1/S2, preC/C, Pol, and HBX also change the risk of HCC in HBV patients [14].
Apart from these risk factors, the integration of HBV DNA into the host genome is another critical cause of HBV-related HCC [15,16]. Integration can affect host genome stability, change the expression level of proto-oncogenes, and increase the accumulation of HBV proteins. In the following sections, we discuss the relationship between the genomic integration of HBV DNA and the carcinogenesis of HCC in detail (Figure 1).

HBV Life Cycle
A brief introduction to the HBV life cycle is needed first. The HBV virus particle (Dane particle) is spherical, with a diameter of 42 nm [17]. The particles are composed of surface antigens (HBsAg), core antigens (HBcAg), and a lipid membrane [17]. The HBV surface antigen has three forms, the large, middle, and small surface antigens (LHBs, MHBs, and SHBs), which help the virus bind to and enter liver cells [18]. In the Dane particle, core antigens, like bricks, build an icosahedral viral capsid that wraps a 3.2 kb, partially double-stranded relaxed circular HBV genomic DNA (rcDNA), to which the HBV polymerase is covalently bound [19]. After entering the nucleus, rcDNA is repaired by the host cell into covalently closed circular DNA (cccDNA), the template for HBV transcription [20-22].
During the life cycle, HBV enters hepatocytes via the binding of SHBs and LHBs to heparan sulfate proteoglycans (HSPG) and the bile acid transporter sodium taurocholate cotransporting polypeptide (NTCP) [23,24]. The epidermal growth factor receptor (EGFR) also has a role in this process [25,26]. Next, protein-mediated endocytosis helps HBV particles enter hepatocytes; after the HBV envelope fuses with the endosome, the nucleocapsid is released into the cytoplasm of the target cell [27,28]. The nucleocapsid then interacts with microtubules, and the genomic rcDNA of HBV is released into the nucleus via the nuclear pore complex (NPC) [29,30]. In the nucleus of hepatocytes, with the help of several host proteins, namely proliferating cell nuclear antigen (PCNA), the replication factor C (RFC) complex, DNA polymerase delta (POLδ), flap endonuclease 1 (FEN-1), and DNA ligase 1 (LIG1), rcDNA is repaired to cccDNA after removal of the short RNA primers and the polymerase protein and repair of the single-stranded DNA gap [31-33]. In the nucleus, cccDNA can form a dynamic pool of several to tens of copies [28]. This pool acts as the template for the expression of the structural proteins, the polymerase, and the X protein, and is the source of progeny viruses [28]. The persistence of cccDNA is the main obstacle to completely eradicating HBV infection [28,30-32,34].
In hepatocytes, the carcinogenesis of HBV-related HCC can be attributed to the following factors: (1) massive expression and accumulation of HBsAg; (2) sustained expression of HBX protein; (3) activation and overexpression of proto-oncogenes induced by HBV infection; and (4) other factors such as preC/C and Pol proteins. Emerging evidence has shown that the primary source of HBsAg may be subgenomic HBV DNA integrated into the host genome [35]. The integrated fragment can express all HBV proteins or truncated proteins and affect the host genes near the integration sites, and it is ultimately involved in carcinogenesis.

HBV DNA Integration
The linearized HBV genome is the source of the DNA that integrates into the host genome via homologous recombination [36-40]. Integration mostly occurs in the early stage of infection, or during reinfection of regenerated liver tissue. At this stage, linear viral DNA enters the nucleus and recombines with the end of the replicating DNA to achieve integration [40]. At present, there are at least three known sources of linearized HBV genomes: (1) failure of sense strand repair initiation during genomic DNA replication [41,42]; (2) linear intermediates of abnormal genome replication [38,43]; and (3) products of rcDNA recombined with terminal repeat DNA, supported by evidence from the woodchuck hepatitis virus model [43]. Because integration occurs early in HBV infection of cells, interferon-α, nucleos(t)ide analogs (NAs), and GLS4 (a nucleocapsid inhibitor) cannot inhibit HBV genome integration; only Myrcludex B, an inhibitor of HBV entry into cells, can block it [44].
There are many HBV genome integration sites. A variety of techniques have been applied to identify the integration sites in HBV-positive liver cancer samples. Among these techniques, PCR was the first to be widely used, mainly including restriction PCR, long-range PCR, nested PCR, and the more accurate Alu-PCR [45,46]. Because PCR cannot be used to scan the whole genome, next-generation sequencing has expanded the scope of identification and made whole-genome scanning of tissue possible [16].
In a Singaporean cohort study, a total of 399 HBV integration sites were found in HCC samples, most of which (209) were located near or within coding genes, including exons or promoter regions [37]. HBV integration was present in 92.6% of cases (75 cases), in both tumor tissue and normal tissue [37]. In another study of 426 Chinese patients, researchers identified 4225 HBV integration sites, which were present in 76.9% of tumor samples and 37.6% of nontumor samples [16]. There were 826 genes and 303 genes located near the integration sites in tumor tissues and nontumor tissues, respectively; 64 of these genes were shared between the two groups [15]. A European study also showed that, in 177 HBV-positive patients, 84% (143) of the specimens had integration sites (6610) located in active chromosome regions [36]. Investigators also found that the integrations were usually in promoter regions of genes in tumor tissues, but in introns in nontumor tissues [16]. Apart from affecting the genes close to integration sites, integration can produce fusion genes in the infected cells, for example, HBs-ESPL1 and HBs-FN1 in HCC tissues [36,51]. Hu et al. found that HBs-ESPL1 was positively associated with the risk of HCC in chronic hepatitis B patients (average of 4.8 years) compared with HCC patients without the fusion gene in tissue (average of 7.8 years) [51].

HBV Integration Increases the Expression of HBV Genes.
HBV integrations in the host cell genome are usually subgenomic fragments, which nevertheless can express all open reading frames of HBV proteins, including full-length or truncated Pol, pre-S1, pre-S2, S, HBX, and pre-Core/Core [16,49,50]. In a cohort study of 35 patients with occult HBV, investigators using Alu-PCR found that the Pol and pre-S1 fragments had the most copies in the host genome in HCC tissues, followed by pre-Core/Core, Core promoter/Enhancer II, S, and HBX [49]. However, another study of HCC with occult HBV infection showed that HBX had the most copies, followed by S, Core, and Pol [50].
The characteristics of HBV integration sites also include their chromosomal distribution and differences by sex. Integrations occur most frequently on chromosomes 1, 2, 5, 16, 17, and 19, especially chromosomes 2 and 17 in men [16,49,50]. Integrations in chromosome 17 were frequently identified in tumor tissues but were rare in nontumor tissues [16,49].

HBV Integration Influences the Instability of Chromosomes.
Although HBV integration exists in both tumor and nontumor tissues in HBV-positive HCC patients, there are many differences between them. In both tumor and nontumor tissue, the integrated fragments of the HBV genome are mostly located at the 3′ end of the HBX gene and the 5′ end of preC/C, at approximately positions 1400-1900 of HBV DNA [16]. In tumors, however, there is an additional peak of integrations at positions 300-500 of HBV DNA, in the S gene [16]. The HBV DNA fragments in the host genome are usually not full-length ORFs; for example, few complete HBX proteins are found in HCC cells [36]. Evidence also shows that the breakpoints usually occur around 1800 bp, near the enhancer and replication initiation sites of HBV DNA [39].
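The positional pattern of breakpoints along the HBV genome can be illustrated by binning breakpoint coordinates into fixed windows. The following is only a minimal sketch and not the method used in the cited studies: the function names, the 100-nt bin size, the 3200-nt genome length (rounded from the 3.2 kb stated earlier; the exact length varies by genotype), and the example breakpoint positions are all hypothetical.

```python
from collections import Counter

def bin_breakpoints(positions, genome_length=3200, bin_size=100):
    """Count breakpoints per window along the HBV genome.

    positions: 1-based HBV genome coordinates of integration breakpoints.
    genome_length and bin_size are illustrative assumptions.
    Returns a Counter mapping the 1-based start of each window to its count.
    """
    counts = Counter()
    for pos in positions:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} is outside 1..{genome_length}")
        window_start = ((pos - 1) // bin_size) * bin_size + 1
        counts[window_start] += 1
    return counts

def peak_windows(counts, top=2):
    """Return the starts of the `top` most populated windows (ties by position)."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [start for start, _ in ranked[:top]]

# Hypothetical breakpoint positions, invented for illustration only.
example_positions = [1795, 1810, 1820, 1650, 410, 455, 2900]
example_counts = bin_breakpoints(example_positions)
example_peaks = peak_windows(example_counts)
```

With real sequencing data, comparing such histograms between tumor and nontumor samples is one simple way to look for a tumor-specific peak of the kind described for the S gene region.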

Figure 1. Panel labels: upregulating cell growth and proliferation genes; changing the instability of the chromosomes; increasing the levels of HBV transcripts; CNV region; CpG islands; truncated Pol, pre-S1, pre-S2, S, HBX, pre-Core/Core.

Advanced Gut & Microbiome Research
Further analysis of the sequencing data revealed that copy number variations (CNVs) lie close to the integration sites and affect genomic stability [39]. Sung et al. identified 648 CNV sites and 344 HBV integration sites in HBV-positive HCC; 29% of the integration sites were located inside a CNV region and 16% were within 1 Mb of a CNV region [39]. In a similar study, the authors identified 504 insertions of HBV DNA in 121 HCC samples, and 36% of the insertions were related to CNV regions [36]. These findings suggest that HBV DNA integration is related to the architecture of the chromosomes.
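The proximity analysis described above (sites inside a CNV region versus sites within 1 Mb of one) can be sketched as a simple interval comparison. This is a hedged illustration under stated assumptions, not the pipeline of the cited studies: the function name, the data layout, and all coordinates below are hypothetical.

```python
MB = 1_000_000  # 1 Mb window, matching the distance threshold quoted in the text

def classify_sites(sites, cnv_regions, window=MB):
    """Classify integration sites by their distance to the nearest CNV region.

    sites: list of (chrom, position) tuples.
    cnv_regions: list of (chrom, start, end) tuples, inclusive coordinates.
    Returns counts of sites 'inside' a CNV region, 'near' one (within
    `window` bases but not inside), and 'far' (everything else).
    """
    regions_by_chrom = {}
    for chrom, start, end in cnv_regions:
        regions_by_chrom.setdefault(chrom, []).append((start, end))

    counts = {"inside": 0, "near": 0, "far": 0}
    for chrom, pos in sites:
        distances = [
            0 if start <= pos <= end else min(abs(pos - start), abs(pos - end))
            for start, end in regions_by_chrom.get(chrom, [])
        ]
        nearest = min(distances, default=None)
        if nearest is None or nearest > window:
            counts["far"] += 1
        elif nearest == 0:
            counts["inside"] += 1
        else:
            counts["near"] += 1
    return counts

# Hypothetical coordinates, invented for illustration only.
example_sites = [("chr5", 1_295_000), ("chr5", 2_500_000),
                 ("chr17", 7_500_000), ("chr19", 30_000_000)]
example_cnvs = [("chr5", 1_000_000, 2_000_000), ("chr17", 1, 5_000_000)]
example_result = classify_sites(example_sites, example_cnvs)
fractions = {k: v / len(example_sites) for k, v in example_result.items()}
```

The linear scan over regions keeps the sketch short; for cohort-scale data, an interval tree or sorted-interval search would be the more usual design choice.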
Another finding was that integration sites were associated with genomic methylation (CpG islands). In a cohort of 426 HCC patients, scientists found that HBV integration sites were significantly enriched within CpG islands in tumor tissues compared with normal tissues [16]. At the same time, the integration frequency was significantly lower at loci far from CpG islands [16]. Moreover, HBV integration is not random and tends to integrate into fragile sites in the host genome [16].

HBV DNA Integrations Are Involved in HCC Occurrence
Increasing evidence shows that the integration of HBV DNA results in enormous changes in the host cells. These changes include altering the expression and function of adjacent genes, which drives hepatocarcinogenesis; increasing the chromosomal instability of the host cell through integration and insertion; and causing continuous expression of full-length or truncated HBV genes [36,39]. Next, we discuss these aspects in detail. First, HBV integrations activate proto-oncogenes. The hTERT gene is the most critical of these, with the highest frequency of related integrations [15,16,36,39,45-48]. Most of these integrations are located in the promoter region of hTERT and, owing to the two enhancers in HBV DNA, result in marked overexpression of the gene [36]. TERT encodes telomerase reverse transcriptase, the catalytic subunit of the telomerase complex [53]. In normal cells, including hepatocytes, telomeres gradually shorten with each cell division, and when they reach a critical length, the cells are triggered to undergo programmed cell death. In cancer cells, by contrast, the telomeres can be extended in a telomerase-dependent manner [53]. This mechanism plays a critical role in the formation and development of cancer by protecting the cancer cells from apoptosis [53].
MLL4 is a member of the COMPASS family of histone H3 lysine 4 (H3K4) methyltransferases and is often mutated or aberrantly expressed in several types of malignancies, including blood cancers (acute myeloid leukemia, non-Hodgkin's lymphoma), digestive tract cancers (esophageal squamous cell carcinoma and colorectal cancer), and urinary system cancers (prostate cancer and bladder cancer) [54]. Another member of COMPASS is MLL2.
MLL2, also known as lysine-specific histone methyltransferase 2B (KMT2B), has been reported to be a cancer-promoting protein that plays an important role in cancer development by regulating related genes [55,56]. Three HBV integrations in introns 3-6 of KMT2B were reported to increase the expression of the gene and even change the protein structure [36]. These molecular alterations are commonly found in cancers, for instance, colorectal carcinoma, glioma, and HCC [56,57].
CCNE1, another gene commonly affected by integration, encodes cyclin E1, which can form a cell cycle-controlling complex with cyclin A1 and CDK2 that is responsible for the transition from G1 to S phase of the cell cycle [58]. Control of this transition is critical in the initiation and development of cancers [58]. Several integrations of HBV DNA were identified in the promoter region of CCNE1, significantly increasing the expression level of CCNE1 mRNA [39].
Additionally, HBV integrations preferentially occur in the short arm of chromosome 17, which could affect several antioncogenes in this region, such as TP53 [16,49,50].
Integrations also affect the chromosomal instability of the host cell. HBV integrations have been reported to be related to clonal expansion of CNVs in the host genome, which activates driver genes for carcinogenesis [36]. The instability induced by HBV DNA integration mainly affects whole chromosome arms, resulting in massive deletions or gains, for example, massive deletion of chromosome 17p (containing the TP53 gene), gain of chromosome 5p (containing the TERT gene), and massive gain of 8q (containing the MYC gene) [35].
Furthermore, integrations can induce sustained expression of HBV genes, including the potentially carcinogenic preS1/S2/S and HBX. Several studies using RNA-seq analysis of human liver tissue have shown that integrated HBV DNA is a major source of HBsAg in HBeAg-negative patients [59,60]. In liver cells, most of the HBV transcripts were truncated forms, indicating that only a small part of them derived from cccDNA [36]. A recent report described C-terminally truncated HBX variants in HCC tissues, which were transcribed from integrated HBV DNA [59]. C-terminally truncated HBX protein may be involved in carcinogenesis because the variants retained the ability to interact with the Cul4A-DDB1 E3 ubiquitin ligase complex, a repressor of the TP53 family [59,60].
Whether integrated HBV DNA is the major source of all HBV transcripts, with cccDNA serving as the template only for pgRNA and a minor part of the transcripts, should be confirmed by further investigations.

Conclusion
In the past decades, great progress has been made in the field of HBV DNA integration, although some key aspects remain unclear, such as the molecular mechanisms of integration and the role of HBV integration in HCC development and progression. The continued development of detection methods and research models, such as NTCP-expressing liver cells and animals, could lead to a comprehensive and detailed understanding of HBV DNA integration. As part of the host cell genome, integrated HBV DNA is more stable and harder to eradicate than cccDNA. Though many direct- and indirect-acting antiviral agents for cccDNA disruption have been developed, they have not addressed the problem of integrated HBV DNA. Thus, the critical contribution of integrated HBV DNA to HBV infection persistence and carcinogenesis has been only partially clarified and will be a hot area of research interest [61]. Taken together, the evidence indicates that HBV DNA integration plays a primary role in the initiation and development of HBV-related liver diseases, including HCC. There are reports indicating that the sites of integration are related to the level of HBsAg and AFP, age of prognosis, the size of tumor tissue, and prognosis of HCC [36,39].
HBV DNA integration should be used as a clinical indicator for disease monitoring and treatment in patients with HBV infection. Even if cccDNA can be completely eliminated in the future, a complete cure for HBV infection may not be achieved because of the challenge of the integrated HBV DNA. Therefore, the exact role of integrated HBV DNA in disease remains a challenge for therapeutic drug development [61]. Related drugs or treatments should be developed for the complete cure of HBV infection.

Data Availability
The original contributions presented in the study are included in the article.

Ethical Approval
No ethical review was required because all results were from published literature.

Conflicts of Interest
No potential financial or nonfinancial conflicts of interest were disclosed.