UGT1A1 Genetic Variations and a Haplotype Associated with Neonatal Hyperbilirubinemia in Indonesian Population

Neonatal hyperbilirubinemia (NH) is a common finding in newborn babies in Indonesia. Common and rare variants of UGT1A1 have been known to contribute to NH etiology. This study aims to identify UGT1A1 genetic variation and haplotype associated with NH in Indonesian population. DNA was isolated from 116 cases and 115 controls and a targeted-deep sequencing approach was performed on the promoter, UTRs, and exonic regions of UGT1A1. Determining association of common variants and haplotype analysis were performed using PLINK and Haploview. Ten and 4 rare variants were identified in cases and controls, respectively. The UGT1A1 rare variants frequency in cases (5.17%) was higher than that in controls (1.7%). Four of those rare variants in cases (p.Ala61Thr, p.His300Arg, p.Lys407Asn, and p.Tyr514Asn) and three in controls (p.Tyr79X, p.Ala346Val, and p.Thr412Ser) are novel variants. The frequencies of p.Gly71Arg, p.Pro229Gln, and TA7 common variants were not significantly different between cases and controls. A haplotype, consisting of 3 major alleles of 3′ UTRs common variants (rs8330C>G, rs10929303C>T, and rs1042640C>G), was associated with NH incidence (p = 0.025) in this population. Using targeted-deep sequencing and haplotype analysis, we identified novel UGT1A1 rare variants and disease-associated haplotype in NH in Indonesian population.


Introduction
Jaundice is a common physiological phenomenon in neonates as it occurs almost in 60% of healthy term newborns [1,2]. Neonatal hyperbilirubinemia (NH) is a condition when jaundice with the serum total bilirubin (STB) levels is above the 95th percentile for age in hours [3]. In the most severe cases, NH leads to chronic bilirubin encephalopathy (kernicterus) which is characterized by severe athetoid cerebral palsy, sensory hearing loss, dental-enamel dysplasia, paralysis upward gaze, and mortality. Neonatal hyperbilirubinemia occurs due to the imbalance between production and elimination of bilirubin [2,4]. One of the important processes in bilirubin elimination is glucuronic acid conjugation to bilirubin. Conjugated bilirubin is more polar and easier to eliminate compared to unconjugated bilirubin. The conjugation process occurs in the hepatocyte and is catalyzed by UDP-glucuronosyltransferase enzyme which is encoded 2 BioMed Research International by UGT1A1 gene [2,4]. The prevalence of NH varies in different populations with the highest being in the Navajo Indian, Greek, and East Asian such as Japanese and Korean [4]. The previous study showed that the risk of experience hyperbilirubinemia is 12.5-fold higher than that in controls population if a previous sib had STB levels higher than 15 mg/dL [4,5]. Furthermore, it has been reported that there was a high concordance of STB levels among identical twins in the European and Chinese. These findings suggest that a genetic component might contribute to the development of NH [4].
It has been reported that UGT1A1 rare and common variants were associated with NH in different populations [6][7][8][9][10][11]. It has been known also that UGT1A1 variants were underlying cause of prolonged unconjugated hyperbilirubinemia associated with breast milk in Japanese population [12]. These show that combination of genetic and nongenetic factors contributes to the development of the disease. More than 130 UGT1A1 variants have been reported causing Gilbert's syndrome and Crigler-Najjar syndrome type 1 (CN1) and type 2 (CN2), which are characterized by inherited nonhemolytic unconjugated hyperbilirubinemia. The insertion of TA sequence in the TATAA box of the promoter region (TA 7 ) had been reported to be associated with NH in North Indian population [13]. The meta-analysis performed by Long et al., 2011, showed that p.Gly71Arg common variant was reported to be associated with NH in Asian but not in Caucasian population [14]. However, the recent meta-analysis study performed by Yu et al., 2015, from 32 studies with total 6520 participants, showed that the TA 7 and p.Gly71Arg variants also significantly increased the risk of NH in both Caucasian and Asian population [15]. In this study we aim to clarify UGT1A1 genetic variation and haplotypes associated with NH in Deutromalay, one of the major ethnic groups in Indonesia.

Materials and Methods
2.1. Patients, Parents, and Controls. One hundred sixteen healthy term neonates with hyperbilirubinemia and 115 without hyperbilirubinemia from Deutromalay population in Indonesia were enrolled in this study. Diagnosis of hyperbilirubinemia was based on the American Academic of Pediatrics (AAP) criteria where cases are defined as neonates whose serum total bilirubin (STB) levels were above the 95th percentile of the corresponding age group. All cases and controls neonates were single birth. We excluded patients with hemolytic anemia, maternal diabetes, cephalohematoma, neonatal sepsis, incompatibility of ABO, and Rhesus blood grouping between mother and the babies and other congenital diseases which would affect the level of bilirubin in the serum. Written informed consent was obtained from parents and the study was approved by the Ethic Committee of the Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia.

DNA Isolation.
Genomic DNA was extracted from onemilliliter peripheral blood leukocytes using Genomic DNA isolation kit (Roche Life Sciences) in MagNA Pure LightCycler 32 instrument (Roche Life Sciences, Pleasanton, USA) according to the manufacturer's protocol.

Deep-Targeted Next-Generation Sequencing (NGS) and
Data Analysis. The UGT1A1 gene was enriched with the TruSeq Custom Amplicon assay (Illumina, San Diego, USA). The oligos used in this assay were designed using Design Studio (Illumina, San Diego, USA). Amplicons cover all exons, padded with 10 base pairs (bp) on each end and 250 bp of the promoter region. The amplicons were sequenced using paired-end sequencing of 2 × 250 bps on a MiSeq (Illumina, San Diego, CA, USA). Data processing was performed with the MiSeq Reporter software (Illumina, San Diego, CA, USA) that performs the alignment against the human reference genome build 19 (hg19) with the Burrows-Wheeler Aligner (BWA) [16]. Subsequently, genetic variants were called with the genome analysis toolkit (GATK) [17]. Identified variants were filtered for known variants in dbSNP138 database using SeattleSeq software (http://snp.gs.washington.edu/SeattleSe-qAnnotation138/). Variants not known in dbSNP138 or minor allele frequency (MAF) < 0.01 were categorized as rare variants and variants with MAF ≥ 0.01 were categorized as common variants.

Association Analysis of Identified Common Variants.
The association analysis of common variants was performed using PLINK [18]. We excluded variants from the association test that were either monomorphic, had a minor allele frequency (MAF) < 0.05, or were missing in 95% of all subjects. We used the False Discovery Rate (FDR) to correct values for multiple testing [19].

Haplotype
Analysis. Analysis of linkage disequilibrium (LD) and haplotype structure mapping were calculated and graphically displayed using Haploview software developed by Broad Institute in Cambridge, USA (https://www.broadinstitute.org).

Validation of Rare Variants. Polymerase Chain Reaction
(PCR) and Sanger Sequencing were performed to validate the rare variants identified by deep-targeted next-generation sequencing method. The validation step was performed only for few variants in which DNA samples were still available ( Figure 1). Primers used for PCR are presented in Table S1. A mixture of 50-100 ng of DNA, 25 l of 2x KAPA2G Fast PCR Kits (KAPA Biosystems, Wilmington, Massachusetts, USA), and 10 M of each forward and reverse primer were prepared in a total volume of 50 l. All exons containing identified UGT1A1 rare variants were amplified by touch-down PCR with the annealing temperature ranging from 68 ∘ C to 58 ∘ C in the first 10 cycles and 58 ∘ C for the last 25 cycles. PCR products were gel purified and sequencing was performed using Big Dye Terminator v3.  [20].

Statistical
Analysis. Statistical analysis was performed using R version 3.3.1 and GraphPad. Differences in proportion of rare and common variants frequencies between cases and controls were analyzed using Fisher's exact test. Differences of serum total bilirubin (STB) levels according to genotypes were analyzed using one-way ANOVA. In all analyses, value < 0.05 was considered statistically significant.

Patients Characteristics.
Characteristics of neonates and their mothers enrolled in this study are presented in Tables S2 and S3, respectively. The mean of birth weight was 3,125 ± 345 g in cases and 3,138 ± 371 g in the controls. The results of ABO and Rhesus blood grouping incompatibility between the mothers and the neonates were negatives in all samples. The results of blood analyses are presented in Table S4. There was no statistical difference in the type of feeding methods. The age of the mother and the number of parities between the mothers of neonates in cases and control groups were also not significantly different. The majority of the mothers from both groups had Cesarean delivery. Taken together, we have collected an appropriated cohort for studying the involvement of common and rare variants in the development of NH in Deutromalay.

Identified Rare and Common
Variants. The statistics of the next-generation sequencing result is presented in Table  S5. The majority of the identified variants (88 and 89%) were Single Nucleotide Polymorphism (SNP) in both cases and controls, respectively. The rest of identified variants were insertions and deletions (indels) which account for 11 and 12% in controls and cases, respectively. Up to 92% of the SNPs were located in the 3 UTRs region and the remaining SNPs were located in the coding region. In total we identified 10 and 4 different heterozygous rare variants in 12 cases (5.17%) and 4 controls (1.7%), respectively. Four of those rare variants in cases (p.Ala61Thr, p.His300Arg, p.Lys407Asn, and p.Tyr514Asn) and three in controls (p.Tyr79X, p.Ala346Val, and p.Thr412Ser) were never reported before (novel variants) ( Table 1). Those rare variants identified in cases and controls were scattered through all exons ( Figure 2). Most of the identified rare variants were missense variants, whereas one of them was a stop-gain variant (p.Tyr79X) and one silent variant (p.Ile47=). Two common variants located in the coding region (p.Gly71Arg and p.Pro229Gln) were identified. The frequency of p.Gly71Arg was 6.89% and 4.78% in cases and controls, respectively. The frequency of p.Gly71Arg in this population was higher than that in the Javanese population from Central Java, Indonesia (1.5%), and in well-term infants from Malays in Singapore (4%) [8,21]. The frequency of p.Pro229Gln was 4.74% and 2.60% in cases and controls, respectively. The frequency of p.Pro229Gln in controls in this population was lower than that in well-term infants from Malays (6%) but higher than that in Chinese population (2%) in Singapore [8]. p.Gly71Arg variant was reported to be associated with NH in Taiwanese, Japanese, and Malaysian Chinese population [22][23][24][25], while the p.Pro229Gln was reported to be identified in infants and adult with hyperbilirubinemia in Japanese and Chinese population, respectively [26]. In this study, the Odds Ratio (OR) of p.Gly71Arg and p.Pro229Gln was 1.47 and 1.59, respectively; however none of the values was significant, showing that none of those common variants was associated with NH (Table 1). A heterozygous rare variant p.Tyr486Asp was identified in an infant with hyperbilirubinemia in this study. Eleven and 12% of the identified variants in cases and controls, respectively, were indels; however those indels consist of only insertion and deletion of TA sequences in the TATAA box region of the UGT1A1 promoter (rs8175347). The insertion and deletion of TA were common variants as their MAF in the population is higher than 0.01. Three different genotypes of TA repeats in this region were identified: TA 6 /TA 6 (wild type), TA 6 /TA 5 , and TA 6 /TA 7 (Table S6). It has been known that the insertion of TA sequence in this region (TA 7 allele) could reduce the expression of UGT1A1 gene leading to insufficiency of UGT1A1 enzyme, while the TA 5 allele could increase the expression of UGT1A1 [27]. Insertion of TA sequence, either TA 7 or TA 8 , has been reported to be associated with NH in different population [8,9,28]. The frequency of the TA 7 variant observed in this study was 10.8% and 9.5% in cases and controls, respectively. These numbers are quite similar to those reported in infants with NH in the Chinese (11.5%) and Malay (10.7%) population but lower than that in Indian population (37.1%) in Singapore [8]. The OR of TA 7 alleles in this study is 1.20 with 95% CIs: 0.65-2.22 and value: 0.57 (Table S6). Validation of rare variants by Sanger Sequencing is performed in this study only for those whose DNA of the subjects were still available; those were p.His487Tyr, p.Ala346Val, p.Arg336Trp, p.Tyr486Asp, and p.Ile47= (Figure 1).

In Silico Analysis.
In silico analysis using PolyPhen-2, SIFT, and Mutation Taster predicted that most of the identified rare variants in the coding regions in both cases and controls are damaging variants based on at least two different software packages (Table 2).       (Table 4).

Association Analysis of Common Variants.
In total 51 different variants located in coding and noncoding regions were identified in both cases and controls. Out of those 51 variants there were only 5 Single Nucleotides Polymorphisms (SNPs) which were included in the association test using PLINK, since the remaining variants were either monomorphic, had a MAF < 0.05 (rare variants), or were missing in 95% of all subjects. After adjusting for multiple testing none of the 5 tested SNPs remains significant at a threshold of value = 0.05 (Table 3).

Haplotype Analysis.
To further analyze the effect of the combination of SNPs in UGT1A1 on the development of NH in this population, we performed haplotype analysis using Haploview. Three linkage disequilibrium blocks with 2 between 0.8 and 0.97 were identified (Figure 3). The block with the strongest linkage disequilibrium ( 2 = 0.97) consists of three SNPs located in the 3 UTRs region; those were rs8330 (C>G), rs10929303 (C>T), and rs1042640 (C>G). The same set of SNPs appeared in the results of association analysis of common variants using PLINK (Table 3). CCC haplotype (major alleles/wild type) was significantly associated with NH ( value = 0.025). The other two haplotypes (minor alleles: GTG and CTC) were more present in the controls than in the cases and only the frequency of CTC haplotype was significantly different between cases and controls ( = 0.0083) ( Table 4). This suggests that the CTC haplotype might have a protective effect on the development of NH. However, these results should be viewed with caution given the low sample size.

Comparison of STB Levels according to Genotypes.
Based on risk analysis, the OR of all common variants identified (p.Gly71Arg, p.Pro229Gln, TA 5 , and TA 7 ) were greater than one; however they were not significantly associated with the increased risk of NH in Deutromalay as the value > 0.05  Figure 4: Total bilirubin levels of infants in cases group carrying different UGT1A1 common variant or combination of several common variants (TA 6 TA 5 , TA 6 TA 6 , TA 6 TA 7 , TA 6 TA 7 + p.Pro229Gln, TA 6 TA 5 + p.Gly71Arg, and TA 6 TA 7 + p.Gly71Arg). There was no significant difference of serum total bilirubin expression in infants carrying single common variant with combination of two common variants ( value = 0.42, one-way ANOVA analysis). Circles: outlier values/data; stars: extreme outlier values/data. (Tables 1 and S6). Furthermore, STB levels of infants with TA 5 and TA 7 were not significantly different from infants carrying TA 6 (major allele); this strengthens the fact that neither TA 5 nor TA 7 was associated with NH in this study. We further analyzed the differences of STB levels of infants carrying combination of TA repeats with common variants in the coding region (TA 6/7 +p.Pro229Gln, TA 6/6 +p.Gly71Arg, or TA 6/5 +p.Gly71Arg). However, the STB levels of infants carrying those combinations of common variants were not significantly higher than those with only TA repeats counterpart variants (Figure 4). Taken together, our results showed that different UGT1A1 genotypes did not significantly affect the severity of NH in the Indonesian population.

Discussion
In this study, we reported the first UGT1A1 rare and common variants identified in infants with NH in Deutromalay population in Indonesia. Ten and four rare variants (MAF < 0.01) were identified in cases and controls, respectively. p.Gly71Arg and p.Pro299Gln variants, as in other Asian and Caucasian populations, are also common variants in Deutromalay population in Indonesia. Although the OR of p.Gly71Arg, p.Pro229Gln, and TA 7 are above one, none of their value is <0.05. These data show that none of those common variants is a risk allele for NH in Deutromalay population in Indonesia (Tables 1 and S6). Study by Yu et al. showed that TA 7 and p.Gly71Arg are associated with the increased risk of NH in Asian population observed in metaanalysis, in which more than 6520 participants were included in the study [15]. Two hundred thirty-one participants (116 cases and 115 controls) included in this study most likely do not have enough power to detect those associations.
Of 10 rare variants identified in cases, 6 have been reported in the previous studies to be causally related to CN1 and CN2, while 4 of the identified remaining rare variants were novel (p.Ala61Thr, p.His300Arg, p.Lys407Asn, and p.Tyr514Asn) [10,29]. Compound heterozygous p.Tyr486Asp with p.Gly71Arg and homozygous p.Gly71Arg have been identified in CN2 patients in different populations across Asia [29]. Instead with p.Gly71Arg, in this study, we identified a heterozygous compound of p.Tyr486Asp with deletion of TA sequence (TA 5 ) in the promoter of one patient (ID79) ( Table 2). It has been reported that the glucuronidation activity of the enzyme UGT1A1 containing p.Tyr486Asp reduces to only 10% of the wild type counterpart, while the TA 5 variant is known to increase the UGT1A1 expression [29]. If these two variants, which have contradictive effect on enzyme activities and gene expression, are located in different strand of DNA (trans-coinheritance) this might explain why this patient had a milder phenotype (STB level ∼ 18.26 mg/dL) compared to that of CN1 patients (STB level ∼ 30-50 mg/dL) which is more similar to that of CN2 patients (STB level ∼6-20 mg/dL) [26,29]. The same reason might be applied to other patients who have compounding heterozygous missense rare variant (p.His300Arg, p.Pro364Leu, p.His487Tyr, and p. Pro451Leu) with heterozygous deletion of TA sequence (TA 5 ) variant in the promoter region (Table 2). Compared to the reported rare variants identified in CN1 and CN2 patients which are usually homozygous or compounding heterozygous variants in the coding region, there was a trend in this study that rare coding variants identified in cases were heterozygous and almost always present in combination with deletion of TA sequence (TA 5 ) in the promoter ( Table 2). One patient (ID 44) has a combination of three common variants (p.Gly71Arg, p.Pro229Gln, and TA 7 ) and another patient (ID 178) has a combination of two heterozygous novel rare coding variants (p.Lys407Asn and p.Tyr514Asn) with the heterozygous common variant (p.Gly71Arg). Although the phenotypes of these two infants were not severe (STB levels were 19.3 and 20.7 mg/dL) as CN1, genetic counseling for their families might be required as the pattern of inheritance could be autosomal recessive (Crigler-Najjar syndrome).
Study by Kadakol et al., 2000, showed that the combination of TA 7 variant in the promoter of UGT1A1 in transcoinheritance with structural (coding) variants could enhance the pathogenicity of the coding variant, as the TA 7 variant decreases the expression of wild type UGT1A1 protein [30]. Study by Maruo et al., 2016 showed that different combination of rare and common variants resulted in different phenotype, starting from mild to severe, of NH and hyperbilirubinemia in adults [26]. Those studies showed that different UGT1A1 genotypes (combination of common and rare variants) affect the severity of the disease. In this study, we could not group the cases based on combination of rare and common variants, as most of the rare variants present only in one patient. If we group the genotype of patients based on combination of common variants, we observed that the STB level of infants with only TA repeat variant and those who have combination of TA repeat and coding common variant (such as p.Gly71Arg or p.Prof229Gln) did not significantly differ (Figure 3). In conclusion, different combination of UGT1A1 common variants (genotypes) observed in this study did not significantly affect the severity of NH phenotype in Indonesian population. In terms of genotype-phenotype correlation, we observed that infants with rare variants located in exon 5 (p.Pro451Leu, p.Tyr486Asp, p.His487Tyr, and p.Tyr514Asn) have STB level higher than 18 mg/dL ( Table 2). Those variants are located close to the transmembrane domain ( Figure 1); this might affect protein internalization into the membrane of endoplasmic reticulum (ER) and nuclear envelope in the hepatocyte cell and hence disturb its function. It is interesting to mention that in this population heterozygous p.Pro229Gln variant presents always in combination with TA 7 allele. Similar result had been reported in CN2 and Gilbert syndrome patients in Japanese population and in neonatal hyperbilirubinemia in Chinese population [25,26]. This might indicate that these two variants are in the linkage disequilibrium in Asian population.
Common and rare variants were identified in controls as well; although not significant ( = 0.07) more infants in cases (5.17%) are carrying rare variants compared to those in controls (1.7%) (Tables 1 and 2). One of those rare variants identified in controls was a variant caused a premature stop codon (p.Tyr79X). This variant supposedly leads to a truncated protein and hence most likely is pathogenic. However surprisingly this infant did not experience severe hyperbilirubinemia (STB level is 6.99 mg/dL). The possible explanation for this is that either the wild type allele alone could cover the overall UGT1A1 function or this rare variant, for unknown reason, did not lead to premature stopgain function. Interestingly, there was a stop-gain variant identified in CN1 patient proven to induce skipping of exoncontaining variant rather than cause premature stop codon in hepatic mRNA [31]. This rare variant produced an in-frame shorter protein which is still functioning. This could be a possible explanation why sometimes the predicting pathogenic stop-gain variants do not lead to severe phenotype. Whether or not the same mechanism applies on the p.Tyr79X variant still needs further studies.
A remark that needs to be treated with caution is that one needs to consider many aspects, including the nature of the disease itself, to conclude whether or not rare variant identified is pathogenic. Neonatal hyperbilirubinemia (NH), for example, is considered as a polygenic and multifactorial complex in nature. Many variants in many different genes might contribute in concert to induce NH. In this study, four subjects in the controls group from whom UGT1A1 rare variant was identified had STB level > 6 mg/dL and even two of them had STB level > 12 mg/dL. This shows that although the phenotype is not as severe as those in cases groups, still those rare variants might contribute to the high STB level in those subjects. Adding specific inducer (e.g., breast feeding) might lead them to have NH phenotype.
Haplotype analysis showed that in this study a single haplotype containing major alleles of rs8830, rs10929303, and rs1042640 (CCC) located in 3 UTRS was significantly associated with neonatal hyperbilirubinemia ( = 0.0025) ( Table 4). Those SNPs located in 3 UTRs are usually involved in the mRNA stability. The major alleles of those SNP most likely stabilize the UGT1A1 mRNA; hence one could assume that this haplotype might modify the risk of getting NH if it is in cis-position with pathogenic UGT1A1 coding variants. However, study by Court et al., 2013, showed that the minor alleles of those SNPs did not affect the mRNA stability but created exon splice enhancer which favored certain splicing site on exon 5 (exon 5a) [32]. Whether or not this alternative splicing could reduce the risk of developing hyperbilirubinemia in infants needs further study.

Conclusion
Our study provides information that UGT1A1 rare variants are identified more frequently in cases (5.17%) than in controls (1.70%) in Indonesian population. Combination of common variants located in the promoter (TA6, TA7, and T5) within the coding region of UGT1A1 (p.Gly71Arg or p.Pro229Gln) did not significantly affect the STB levels and hence did not contribute to the severity of NH phenotype in Indonesian population.

Disclosure
Some of the data had been presented in the thesis book of Dr. Dewi A. Wisnumurti in Universitas Padjadjaran, Indonesia.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Supplementary Materials
Table S1: list of primers used for amplified exons region of UGT1A1. Table S2: characteristic data of neonates in cases and controls group.