Hematological traits are important health indicators and are used as diagnostic clinical parameters for human disorders. Recently, genome-wide association studies (GWAS) have identified many genetic loci associated with hematological traits in diverse ethnic groups. However, additional GWAS are necessary to elucidate the breadth of genetic variation and the underlying genetic architecture represented by hematological metrics. To identify additional genetic loci influencing hematological traits (such as hematocrit, hemoglobin concentration, white blood cell count, red blood cell count, and platelet count), we conducted GWAS and meta-analyses on data from 12,509 Korean individuals from population-based cohorts. Of interest is EGF, a factor that plays a role in the proliferation and differentiation of hematopoietic progenitor cells. We identified a novel EGF variant, which was associated with platelet count in our study (
Hematological metrics are used as essential medical indicators [
In addition, previous studies illustrated that significant differences in hematological traits exist between ethnic groups. For example, African Americans tend to have lower white blood cell counts, whereas persons of Japanese descent generally have fewer red blood cell-related anomalies than typically seen in other populations [
In this study, we sought to identify additional ethnic Korean-specific genetic variants associated with five hematological traits: hemoglobin (Hb), hematocrit (Hct), red blood cell count (RBC), white blood cell count (WBC), and platelet count (PLT). To this end, we carried out a GWAS and meta-analysis in Korean populations. Subsequently, we performed pleiotropic association analyses and functional annotation of the identified trait-associated loci. Our results may not only highlight the biologically important role of genetic variants in hematological traits found in Korean populations but also provide useful insight into genetic diversity between ethnic groups.
We performed GWAS on five hematological traits (Hb, Hct, WBC, RBC, and PLT) with data from 12,509 subjects from two population-based cohorts that are part of the Korean Genome Epidemiology Study (KoGES). In the discovery stage, we analyzed data for 8,842 subjects from the Korea Association Resource (KARE) project of KoGES [
This study was approved by the ethics committee of the Korea Centers for Disease Control and Prevention’s Institutional Review Board, and all study subjects provided written informed consent prior to taking part in the study.
Hematological trait values were available for up to 20,562 subjects (8,842 KARE subjects, 3,667 CAVAS subjects, and 8,053 Health2 subjects) taking part in KoGES. Fasting blood samples were drawn from study subjects into a test tube containing an anticoagulant (e.g., EDTA), and relevant traits were measured or calculated using an automated electronic cell counter (ADVIA 120 hematology system; Bayer Diagnostics, USA).
In the discovery stage, 10,004 KARE study samples were genotyped with the Affymetrix Genome-Wide Human SNP Array 5.0. Our quality control criteria excluded samples (i) with a missing genotype call rate >4%, (ii) with excessive heterozygosity (>30%), (iii) with gender inconsistencies, or (iv) from subjects with cancer, and SNPs with (i) a missing genotype call rate >5%, (ii) a low MAF (<0.01), or (iii) Hardy-Weinberg equilibrium (
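The sample-level and SNP-level filters listed above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Sample` record, the function names, and the genotype encoding (minor-allele counts 0/1/2, `None` for a missing call) are assumptions made for the example. The Hardy-Weinberg cutoff is not given in the text above, so it is a required argument rather than a default.

```python
from dataclasses import dataclass

# Thresholds quoted in the text. The Hardy-Weinberg cutoff is not stated
# there, so callers must supply it explicitly (hypothetical parameter).
MAX_SAMPLE_MISSING = 0.04   # samples with >4% missing calls are excluded
MAX_HETEROZYGOSITY = 0.30   # samples with >30% heterozygosity are excluded
MAX_SNP_MISSING = 0.05      # SNPs with >5% missing calls are excluded
MIN_MAF = 0.01              # SNPs with MAF < 0.01 are excluded


@dataclass
class Sample:
    """Hypothetical per-sample QC summary."""
    missing_rate: float
    heterozygosity: float
    sex_consistent: bool
    has_cancer: bool


def sample_passes(s: Sample) -> bool:
    """Return True if the sample survives all four sample-level filters."""
    return (s.missing_rate <= MAX_SAMPLE_MISSING
            and s.heterozygosity <= MAX_HETEROZYGOSITY
            and s.sex_consistent
            and not s.has_cancer)


def snp_stats(genotypes):
    """Return (missing rate, minor allele frequency) for one SNP.

    genotypes: list of allele counts (0, 1, 2) with None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    freq = sum(called) / (2 * len(called)) if called else 0.0
    return missing_rate, min(freq, 1 - freq)


def snp_passes(genotypes, hwe_p, hwe_threshold):
    """Return True if the SNP survives the call-rate, MAF, and HWE filters.

    hwe_p is a precomputed Hardy-Weinberg test P value; SNPs with
    hwe_p below hwe_threshold are treated as failing (assumed direction).
    """
    missing_rate, maf = snp_stats(genotypes)
    return (missing_rate <= MAX_SNP_MISSING
            and maf >= MIN_MAF
            and hwe_p >= hwe_threshold)
```

Keeping the thresholds as named constants makes it easy to see which values come from the text and which must be supplied by the user.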
For further replication of the novel locus, analyses were performed using two methods,
SNPs were imputed based on HapMap (phase 2, release 22, NCBI build 36 and dbSNP build 126;
To investigate the genetic causes for the five specified hematological traits, we carried out GWAS using a linear regression model via the PLINK program (
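As an illustration of the additive linear regression model named above, the sketch below regresses a quantitative trait on minor-allele dosage (0, 1, or 2) for a single SNP and returns the effect size, its standard error, and a two-sided P value. It is a simplified stand-in, not the PLINK implementation: covariate adjustment is omitted, the P value uses a normal approximation to the t distribution (reasonable only for large samples), and the function name `additive_association` is hypothetical.

```python
import math


def additive_association(dosages, trait):
    """Simple linear regression of a trait on allele dosage for one SNP.

    dosages: minor-allele counts (0, 1, 2) per subject.
    trait:   quantitative trait values in the same subject order.
    Returns (beta, standard error, two-sided P value).
    No covariates are included; this is an unadjusted sketch.
    """
    n = len(dosages)
    if n != len(trait) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_g = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")

    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, trait))
    beta = sxy / sxx                      # per-allele effect size
    intercept = mean_y - beta * mean_g

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(dosages, trait))
    se = math.sqrt(rss / (n - 2) / sxx)

    if se == 0:
        return beta, se, 0.0              # perfect fit; P is degenerate
    # Normal approximation to the t test (adequate for GWAS-scale n).
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p_value
```

The returned `beta ± se` pair has the same form as the effect sizes reported in the association tables, which are shown as beta ± S.E.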
Because we were interested in the effects of SNPs associated with PLT at genome-wide significance, we tested each SNP for association with lipid profile metrics (total cholesterol (TC), triglyceride (TG), LDL-cholesterol (LDL), and HDL-cholesterol (HDL)) in 8,842 KARE subjects and with CAD in a previously published set of 2,123 cases and 2,690 controls [
We conducted GWAS on 1,590,162 common SNPs (minor allele frequency (MAF) > 1%) and five hematological traits, namely, Hb, Hct, WBC, RBC, and PLT, for 8,842 subjects of the KARE project [
Among the 12,509 subjects analyzed, we identified 17 genetic regions including one novel genetic association for PLT (4q25, on the EGF gene) that reached our threshold for genome-wide significance (
Results of genome-wide association analyses for hematological traits.
Trait | CHR | SNP | Cytoband | Gene | Minor allele | MAF | Discovery effect size | Discovery P | Replication effect size | Replication P | Combined effect size | Combined P | Heterogeneity
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Novel | | | | | | | | | | | | |
PLT | 4 | rs2282786 | 4q25 | EGF | C | 0.25 | −6.66 ± 1.123 | 3.18 × 10⁻⁹ | −6.58 ± 1.741 | 1.58 × 10⁻⁴ | −6.64 ± 0.944 | 2.05 × 10⁻¹² | 0.97 (0.00)
Previously reported | | | | | | | | | | | | |
Hb | 22 | rs2076086 | 22q12.3 | TMPRSS6 | T | 0.49 | 0.09 ± 0.018 | 7.33 × 10⁻⁷ | 0.11 ± 0.026 | 3.24 × 10⁻⁵ | 0.09 ± 0.015 | 1.19 × 10⁻¹⁰ | 0.52 (0.41)
Hct | 6 | rs9376090 | 6q23.3 | HBS1L-MYB | C | 0.32 | −0.35 ± 0.055 | 1.68 × 10⁻¹⁰ | −0.18 ± 0.081 | 2.68 × 10⁻² | −0.30 ± 0.045 | 6.20 × 10⁻¹¹ | 0.08 (3.05)
RBC | 6 | rs7775698 | 6q23.3 | HBS1L-MYB | T | 0.33 | −0.07 ± 0.006 | 6.82 × 10⁻³⁴ | −0.06 ± 0.009 | 4.92 × 10⁻¹² | −0.07 ± 0.005 | 2.69 × 10⁻⁴⁴ | 0.20 (1.68)
RBC | 4 | rs17084406 | 4q12 | PDGFRA-KIT | G | 0.26 | −0.04 ± 0.006 | 2.09 × 10⁻¹¹ | −0.05 ± 0.010 | 1.81 × 10⁻⁸ | −0.05 ± 0.005 | 2.85 × 10⁻¹⁸ | 0.36 (0.85)
RBC | 6 | rs3218108 | 6p21.1 | CCND3 | T | 0.22 | 0.04 ± 0.007 | 2.15 × 10⁻¹⁰ | 0.05 ± 0.010 | 2.54 × 10⁻⁶ | 0.04 ± 0.006 | 2.59 × 10⁻¹⁵ | 0.79 (0.07)
RBC | 12 | rs7138216 | 12p13.3 | PARP11-CCND2 | C | 0.27 | −0.03 ± 0.006 | 2.70 × 10⁻⁷ | −0.03 ± 0.009 | 4.73 × 10⁻³ | −0.03 ± 0.005 | 5.19 × 10⁻⁹ | 0.54 (0.37)
RBC | 9 | rs8176743 | 9q34.2 | ABO | T | 0.21 | 0.03 ± 0.007 | 9.88 × 10⁻⁷ | 0.03 ± 0.010 | 5.73 × 10⁻³ | 0.03 ± 0.006 | 2.12 × 10⁻⁸ | 0.62 (0.25)
RBC | 2 | rs2218660 | 2p21 | PRKCE | G | 0.18 | −0.03 ± 0.007 | 2.90 × 10⁻⁶ | −0.03 ± 0.011 | 2.90 × 10⁻³ | −0.03 ± 0.006 | 2.94 × 10⁻⁸ | 0.85 (0.04)
WBC | 17 | rs8070454 | 17q21.1 | PSMD3-CSF3 | T | 0.47 | 0.16 ± 0.028 | 3.62 × 10⁻⁹ | 0.11 ± 0.043 | 8.68 × 10⁻³ | 0.15 ± 0.023 | 1.74 × 10⁻¹⁰ | 0.31 (1.04)
WBC | 7 | rs11981340 | 7q21.2 | CDK6 | C | 0.34 | −0.13 ± 0.029 | 4.99 × 10⁻⁶ | −0.20 ± 0.045 | 9.58 × 10⁻⁶ | −0.15 ± 0.024 | 4.33 × 10⁻¹⁰ | 0.21 (1.58)
PLT | 22 | rs1977081 | 22q13.31 | PNPLA3 | C | 0.41 | −5.84 ± 0.997 | 4.70 × 10⁻⁹ | −3.76 ± 1.608 | 1.94 × 10⁻² | −5.27 ± 0.847 | 5.10 × 10⁻¹⁰ | 0.27 (1.21)
PLT | 6 | rs9469032 | 6p21.33 | LY6G5C | G | 0.03 | 12.7 ± 2.757 | 4.42 × 10⁻⁶ | 16.6 ± 4.336 | 1.35 × 10⁻⁴ | 13.8 ± 2.327 | 3.12 × 10⁻⁹ | 0.45 (0.58)
PLT | 6 | rs9277053 | 6p21.32 | HLA-DOA-HLA-DPA1 | A | 0.34 | 5.18 ± 1.049 | 7.95 × 10⁻⁷ | 5.27 ± 1.622 | 1.18 × 10⁻³ | 5.21 ± 0.881 | 3.43 × 10⁻⁹ | 0.96 (0.00)
PLT | 12 | rs739496 | 12q24.12 | SH2B3 | A | 0.11 | −8.36 ± 1.572 | 1.06 × 10⁻⁷ | −5.75 ± 2.385 | 1.06 × 10⁻² | −7.57 ± 1.313 | 8.00 × 10⁻⁹ | 0.36 (0.84)
PLT | 6 | rs9399137 | 6q23.3 | HBS1L-MYB | C | 0.33 | 5.24 ± 1.072 | 1.06 × 10⁻⁶ | 5.00 ± 1.630 | 2.20 × 10⁻³ | 5.17 ± 0.896 | 8.07 × 10⁻⁹ | 0.90 (0.02)
PLT | 3 | rs13091574 | 3q27.1 | THPO | C | 0.16 | 6.39 ± 1.319 | 1.28 × 10⁻⁶ | 6.22 ± 2.142 | 3.73 × 10⁻³ | 6.34 ± 1.123 | 1.62 × 10⁻⁸ | 0.94 (0.01)
CHR, chromosome; BP, base position; MAF, minor allele frequency; Hb, hemoglobin; Hct, hematocrit; RBC, red blood cell count; WBC, white blood cell count; PLT, platelet count. Effect sizes are shown as beta ± S.E. A test of heterogeneity (
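To make the relationship between the discovery, replication, and combined columns concrete, the sketch below combines per-cohort estimates with an inverse-variance-weighted fixed-effect model and computes Cochran's Q for heterogeneity. That this is the model used for the combined columns is an assumption: it is chosen here because it reproduces the combined estimate reported for rs2282786 (−6.64 ± 0.944) from the discovery and replication values in the table. The function names are hypothetical, and the Q-based P value helper is valid only for two cohorts (1 degree of freedom).

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    estimates: list of (beta, standard_error) tuples, one per cohort.
    Returns (combined beta, combined SE, two-sided P value, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    # Two-sided P value from the normal distribution.
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    # Cochran's Q: weighted squared deviations from the combined estimate.
    q = sum(w * (b - beta) ** 2 for w, (b, _) in zip(weights, estimates))
    return beta, se, p_value, q


def q_pvalue_two_cohorts(q):
    """P value for Cochran's Q with exactly two cohorts (chi-square, 1 df)."""
    return math.erfc(math.sqrt(q / 2.0))


# rs2282786 (PLT): discovery and replication estimates from the table.
rs2282786 = [(-6.66, 1.123), (-6.58, 1.741)]
beta, se, p_value, q = fixed_effect_meta(rs2282786)
p_het = q_pvalue_two_cohorts(q)
print(f"combined: {beta:.2f} ± {se:.3f}, P = {p_value:.2e}")
print(f"heterogeneity: P = {p_het:.2f} (Q = {q:.2f})")
```

Because the weights are the reciprocal squared standard errors, the larger discovery cohort dominates the combined estimate, which is why the combined effect sits closer to the discovery value.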
Manhattan plots of the GWAS for five hematological traits in discovery stage. Vertical axis indicates −log10
Of the 17 regions we identified, seven included previously reported associations of erythrocyte-related traits (Hb, Hct, and RBC) with the following loci: 22q12.3 (TMPRSS6), 6q23.3 (HBS1L-MYB), 4q12 (PDGFRA-KIT), 6p21.1 (CCND3), 12p13.3 (PARP11-CCND2), 9q34.2 (ABO), and 2p21 (PRKCE) (Table
We identified a novel intronic variant, rs2282786, located in EGF at 4q25 that was associated with lower platelet counts (effect size =
A regional association plot of the novel genetic locus associated with PLT. Round shaped dots represent −log10
We examined associations between seven PLT-associated variants with genome-wide significance and other traits related to CAD and lipid profile, including TC, TG, LDL, and HDL (Table
Associations with CAD and lipid profiles for loci associated with PLT at genome-wide significance.
Trait | SNP | Cytoband | Gene | CAD OR (95% CI) | CAD P | TC effect size | TC P | TG effect size | TG P | LDL effect size | LDL P | HDL effect size | HDL P
---|---|---|---|---|---|---|---|---|---|---|---|---|---
PLT | rs13091574 | 3q27.1 | THPO | 0.96 (0.86–1.07) | 0.4571 | −0.39 ± 0.72 | 0.5944 | −1.63 ± 2.12 | 0.4426 | −0.67 ± 0.65 | 0.3096 | 0.37 ± 0.21 | 0.0704
PLT | rs2282786 | 4q25 | EGF | 1.07 (0.97–1.17) | 0.1803 | −0.33 ± 0.62 | 0.5954 | −2.40 ± 1.82 | 0.188 | −0.05 ± 0.56 | 0.9282 | −0.05 ± 0.17 | 0.7807
PLT | rs9469032 | 6p21.33 | LY6G5C | 1.15 (0.93–1.42) | 0.1943 | 1.69 ± 1.51 | 0.2613 | 4.37 ± 4.47 | 0.3288 | 1.00 ± 1.37 | 0.4655 | 0.01 ± 0.43 | 0.9818
PLT | rs9277053 | 6p21.32 | HLA-DOA-HLA-DPA1 | NA | NA | 0.93 ± 0.57 | 0.1059 | 0.16 ± 1.67 | 0.9252 | 0.79 ± 0.52 | 0.1296 | 0.14 ± 0.16 | 0.3875
PLT | rs9399137 | 6q23.3 | HBS1L-MYB | NA | NA | −2.79 ± 0.58 | 1.87 × 10⁻⁶ | −5.75 ± 1.72 | 8.41 × 10⁻⁴ | −1.78 ± 0.53 | 8.02 × 10⁻⁴ | −0.05 ± 0.17 | 0.7750
PLT | rs739496 | 12q24.12 | SH2B3 | 0.82 (0.72–0.94) | 3.70 × 10⁻³ | 0.99 ± 0.86 | 0.2522 | 4.77 ± 2.54 | 0.06059 | 0.05 ± 0.78 | 0.9487 | 0.36 ± 0.24 | 0.1420
PLT | rs1977081 | 22q13.31 | PNPLA3 | 0.94 (0.87–1.03) | 0.182 | −2.03 ± 0.55 | 2.08 × 10⁻⁴ | −5.94 ± 1.60 | 2.00 × 10⁻⁴ | −1.30 ± 0.50 | 8.93 × 10⁻³ | 0.07 ± 0.16 | 0.6558
CHR, chromosome; BP, base position; PLT, platelet count; CAD, coronary artery disease; TC, total cholesterol; TG, triglyceride; LDL, LDL-cholesterol; HDL, HDL-cholesterol. Logistic and linear regression analyses adjusted by age and gender were performed. Effect sizes are shown as beta ± S.E.
Association results of rs2282786 for platelet count.
Trait | CHR | SNP | BP | Gene | Function | MA | KARE + CAVAS effect size | KARE + CAVAS P | Health2 effect size | Health2 P | BioBank Japan effect size | BioBank Japan P | Combined effect size | Combined P | Heterogeneity
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
PLT | 4 | rs2282786 | 111141945 | EGF | Intronic | C | −6.64 ± 0.944 | 2.05 × 10⁻¹² | −4.21 ± 1.11 | 1.59 × 10⁻⁴ | −4.78 ± 1.04 | 4.47 × 10⁻⁶ | −5.23 ± 0.66 | 2.44 × 10⁻¹⁵ | 0.41 (0.68)
Recently, numerous genetic loci for hematological traits were discovered through several GWASs of European, African American, and Japanese populations [
Additionally, to examine the association between genetic variants and the level of gene expression, the novel PLT-associated locus was cross-referenced with expression quantitative trait loci (eQTL) associations using genetic variation and gene expression profiling data from Gene Expression Variation (GENEVAR) (
The EGF gene encodes epidermal growth factor; the encoded protein acts as a potent mitogenic factor, playing an important role in the growth, proliferation, and differentiation of numerous cell types [MIM: 131530]. It may play a role in growth, proliferation, and differentiation of megakaryocytes and platelet production. Previous studies reported that activated platelets induced by inflammation may secrete EGF and proinflammatory substances for subsequent thrombus formation in an inflammation-hemostasis cycle that is a tightly interrelated pathophysiologic process [
To date, many studies have reported genetic factors associated with hematological traits via GWAS across diverse ethnic groups [
In summary, we showed that a genome-wide approach identified genetic variants contributing to phenotypic variation of hematological traits in Korean populations. We identified one novel ethnic-specific variant associated with PLT that localized to a key regulator of hematopoiesis and confirmed previously implicated loci associated with five hematological traits. We also reported pleiotropic effects of PLT-associated variants that may support the biological role of genetic determinants for hematological traits. Our findings may help identify biological pathways that contribute not only to hematopoiesis but also to inflammatory and cardiovascular diseases in humans.
The authors declare no conflicts of interest.
Yun Kyoung Kim and Ji Hee Oh contributed equally to this work.
This work was supported by an intramural grant from the Korea National Institute of Health (2012-N73002-00) and grants from the Korea Centers for Disease Control and Prevention (4845-301, 4851-302, and 4851-307).