Common Expression Quantitative Trait Loci Shared by Histone Genes

A genome-wide association study (GWAS) was conducted to examine expression quantitative trait loci (eQTLs) for histone genes. We examined common eQTLs for multiple histone genes in 373 European lymphoblastoid cell lines (LCLs). A linear regression model was employed to identify single-nucleotide polymorphisms (SNPs) associated with expression of the histone genes, and the number of eQTLs was determined by linkage disequilibrium analysis. Additional associations of the identified eQTLs with other genes were also examined. Genome-wide analysis of 29 histone genes identified 31 eQTLs (P < 2.97 × 10^−10). Among them, 12 eQTLs were associated with the expression of multiple histone genes. Transcriptome-wide association analysis using the identified eQTLs showed their associations with an additional 80 genes (P < 4.75 × 10^−6). In particular, expression of the RPPH1, SCARNA2, and SCARNA7 genes was associated with 26, 25, and 23 eQTLs, respectively. This study suggests that histone genes share 12 common eQTLs that might regulate cell cycle-dependent transcription of histone and other genes. Further investigations are needed to elucidate the transcriptional mechanisms of these genes.


Introduction
Histone mRNA transcripts and proteins are important for packing DNA into chromatin and are thus tightly regulated in most human cells [1]. In humans, the genes encoding histones are clustered on chromosomes 1 and 6. It has been suspected that this clustered structure provides a manageable unit for coordinating transcription [1]. Recently, genome-wide chromatin interaction analysis with paired-end-tag sequencing (ChIA-PET) has shown that some histone genes can share promoters [2]. Although many efforts have been made to understand the mechanisms of histone gene transcription, they are not yet well defined. Nuclear protein of the ataxia-telangiectasia-mutated locus (NPAT), which promotes the transcription of histone genes, is located near the Cajal body [1]. Clusters of histone genes are also located near the Cajal body [3]. The positions of histone gene clusters near the Cajal body have been observed between the restriction point (R-point) and the G1/S transition (S-point) during the cell cycle [4]. The objective of this study was to select simultaneously expressed histone genes, identify their expression quantitative trait loci (eQTLs), and examine the functions of those eQTLs.
... reads (RPKM). Outliers were removed based on sample similarity, which was estimated by the Spearman rank correlation between RPKMs and the exon counts of the samples [6]. Sample swaps and contaminated samples were excluded based on allele-specific expression analysis [6]. For details on the quality control process, see 't Hoen et al. [7].

Statistical Methods.
We selected histone genes that were expressed simultaneously. Pairwise gene expression relationships were estimated using Pearson's correlation coefficient (r). A correlation was considered significant at P < 0.05.
We investigated genome-wide associations of the expression of the selected histone genes. A regression model was employed to identify SNPs associated with the expression of histone genes using PLINK [8]. The Bonferroni correction was applied to account for multiple testing, and significance was determined by P < 2.97 × 10^−10.
Linkage disequilibrium (LD) between the identified SNPs was estimated using the HaploView program [9]. LD blocks were determined according to the 95% confidence interval of the D′ value for pairwise LD between nucleotide variants with minor allele frequency > 0.05 [10].
Figure 1: Pearson's correlation (r) between expressions of histone genes using the ellipse visualization method. Correlation estimates without significance (P > 0.05) are marked with "X."
The identified eQTLs were further analyzed for their associations with the expression of nonhistone genes throughout the genome. A Bonferroni correction for multiple testing, based on the t-statistic, was also applied with a significance threshold of P = 4.75 × 10^−6.
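The pairwise LD measure D′ used for block definition above can be illustrated with a short calculation from haplotype and allele frequencies. This sketch computes point estimates only; it does not reproduce HaploView's confidence-interval procedure for calling blocks, and the function name and example frequencies are hypothetical.

```python
import math


def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: allele A always occurs together with allele B.
d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.4)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this example D′ reaches its maximum because one of the four possible haplotypes is absent, while r^2 stays below 1 because the two allele frequencies differ.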
The functions of the identified SNPs were examined using the Ensembl Variant Effect Predictor program [11] and RegulomeDB [12] (e.g., motifs from DNase footprinting assays, chromatin structure by DNase-seq, and protein binding by ChIP-seq).

Discussion
We analyzed the eQTLs for simultaneously expressed histone genes. We found significant correlations among the expression of 29 histone genes, all of which were clustered on chromosome 1 or 6. This clustered structure may serve to control simultaneous transcription, which is supported by the observation that the expression of other histone genes not located on chromosome 1 or 6, including H1FX and H2A family members, was not correlated with that of the 29 selected genes. Furthermore, correlation estimates showed two subgroups nested within the large group (one with 21 genes and the other with 10 genes, with strong correlation coefficients of r > 0.7), which likely provide a manageable unit for coordinating transcription. The genome-wide eQTL analysis revealed that 12 loci were associated with the expression of multiple histone genes. The eQTLs were located on chromosomes 2, 7, and 11. Since the 29 histone genes were all located on chromosome 1 or 6, we suspect that the identified eQTLs were trans-acting. This suggests that many histone genes are simultaneously transcribed by remote regulators.
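The reasoning that an eQTL on a different chromosome from its target gene must act in trans can be written as a simple rule. The sketch below is illustrative only: the function name, the 1 Mb cis window, and the coordinates are assumptions and do not come from the study.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end,
                  window=1_000_000):
    """Label an eQTL-gene pair as 'cis' or 'trans'.

    window: maximum distance in base pairs from the gene for a same-chromosome
    SNP to count as cis (assumption; 1 Mb is a common convention).
    """
    if snp_chrom != gene_chrom:
        return "trans"
    if gene_start - window <= snp_pos <= gene_end + window:
        return "cis"
    return "trans"


# Hypothetical coordinates: a chromosome 2 SNP tested against a chromosome 6 gene.
print(classify_eqtl("2", 150_000_000, "6", 26_000_000, 26_001_000))
```

Under this rule, every eQTL on chromosomes 2, 7, and 11 is classified as trans for histone genes on chromosomes 1 and 6, regardless of the window size chosen.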
Functional analysis of the identified eQTLs suggests that they are important for transcription. For example, rs79335804, an SNP within an eQTL on chromosome 2, was located in the binding motif for Kruppel-like factor 4 (Klf4) protein in various cells including LCLs. Klf4 has been associated with chromosomal aberrations and can prevent cell proliferation by acting as a transcription factor [13]. Aberrant chromatin formation could be caused by overproduction of a histone dimer set (H2A-H2B or H3-H4) [14]. Thus, we suspect that there is an association between the chromosomal aberrations linked to Klf4 and histone gene mRNA expression. Another SNP, rs849578, within a different eQTL on chromosome 2, was associated with autism in the Chinese Han population [15]. It is located in an intron of neuropilin 2 (NRP2), which may be an effector of apoptosis, proliferation, and neuronal development [16]. Histones are known to be related to developmental regulation [17], but additional study is required to elucidate the underlying mechanism of the relationship between histones and NRP2.
Transcriptome-wide association analysis revealed that many nonhistone genes were also associated with the identified eQTLs. In particular, some genes were associated with 23 or more eQTLs. One was RPPH1, an RNA component of RNase P, which may assist in the cell cycle-dependent transcription of ribosomal RNAs (rRNAs) by associating with chromatin [18]. The expression of rRNAs increased from G1 to S and peaked at G2 [18]. The transcription of histone genes rapidly increased before the S phase of the cell cycle and decreased shortly thereafter [1]. Thus, many eQTLs identified in this study might be involved in the cell cycle-dependent expression of both RPPH1 and histone genes. Understanding how these eQTLs are regulated would be a key step toward resolving the underlying mechanisms. The other genes associated with many eQTLs were those encoding scaRNAs (SCARNA2 and SCARNA7), which are located in the Cajal body, similar to the pre-mRNAs of histones, which move to the Cajal body for mRNA processing [3,19]. Interestingly, many genes controlled by the same eQTLs as those for histones do not have polyadenylated structures [20]. In particular, the genes associated with more than 10 eQTLs were all nonpolyadenylated: snoRNAs, scaRNAs, and RPPH1. Considering that histone mRNAs are also nonpolyadenylated, this may help us to understand the transcriptional regulation of histone genes by these eQTLs.
[Figure: linkage disequilibrium blocks; SNP labels run from rs191508159 to rs849567.]
Expression of histone genes plays an important role in controlling chromatin accessibility [21]. Improper expression of histone genes has been associated with tumorigenesis [22-24]. Expression of NPAT, a transcriptional activator of histone genes, is also associated with human tumorigenesis [25]. The influence of histone genes on tumor development might be supported by the eQTLs identified in the current study, because some of the eQTLs were located within putative tumor suppressor genes such as low-density lipoprotein receptor-related protein 1B (LRP1B) and utrophin (UTRN) [26,27].
In conclusion, we identified 31 eQTLs for histone genes. The eQTLs were also associated with nonhistone genes that exhibited both cell cycle-dependent expression and a nonpolyadenylated RNA structure. Further investigations are required to understand the mechanisms regulating the transcription of the histone and nonhistone genes identified in this study and to appreciate their influence on cancer and other diseases. Moreover, identifying eQTLs in disease-specific cell types would help resolve disease-specific mechanisms.

Conflicts of Interest
The authors declare that they have no competing interests.