Genomic Regions and Candidate Genes Associated with Milk Production Traits in Holstein and Its Crossbred Cattle: A Review
International Journal of Genomics

Genome-wide association studies (GWAS) are a powerful tool for identifying genomic regions and causative genes associated with economically important traits in dairy cattle, particularly complex traits such as milk production. This has been made possible by advances in next-generation sequencing technology. This review summarizes information on candidate genes and genomic regions associated with milk production traits in Holstein and its crossbreds from various regions of the world. Milk production traits are important in dairy cattle breeding programs because of their direct economic impact on the industry and their close relationship with nutritional requirements. GWAS have been used in a large number of studies to identify genomic regions and candidate genes associated with milk production traits in dairy cattle, and many such regions and genes have already been identified in Holstein and its crossbreds. Genes and single nucleotide polymorphisms (SNPs) that significantly affect milk yield (MY) were found on all autosomes except chromosomes 27 and 29. Half of the reported SNPs associated with fat yield (FY) and fat percentage (FP) were found on chromosome 14, whereas a large number of significant SNPs for protein yield (PY) and protein percentage (PP) were found on chromosomes 1, 5, and 20. Approximately 155 SNPs with significant influence on multiple milk production traits have been identified. Several promising candidate genes, including diacylglycerol O-acyltransferase 1, plectin, Rho GTPase activating protein 39, protein phosphatase 1 regulatory subunit 16A, and sphingomyelin phosphodiesterase 5, were found to have pleiotropic effects on all five milk production traits. Thus, to improve milk production traits, it is of practical relevance to focus on significant SNPs and pleiotropic genes that are frequently found to affect multiple milk production traits.


Introduction
Milk is a highly nutritious and valuable human food consumed by millions of people every day in a variety of flavors and products. Milk production traits, such as milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP), and protein percentage (PP), are essential economic traits that are used to evaluate milk quantity and quality and play a major role in dairy development [1]. Milk traits are influenced by multiple genes, and therefore genomic evaluations have the potential to rapidly increase the rate of genetic improvement for these traits in dairy cattle [2]. Understanding genetic variation in dairy cattle is crucial to associating genomic regions with MY and milk composition traits. The sequencing of the bovine genome in 2004 sparked a worldwide effort to improve how cattle genetic values can be estimated using basic genetic coding information [3].
GWAS have been used extensively in recent years to identify genomic regions and candidate genes for milk production traits in Holstein and its crossbreds in cattle populations from various countries. Numerous candidate genes and quantitative regions associated with milk production traits in Holstein and its crossbreds have already been identified [7-9, 16, 17]. The objective of this review was to summarize the findings on genomic regions and candidate genes associated with milk production traits, including MY, FY, PY, FP, and PP, in Holstein and its crossbreds.

Methods
Data were gathered from Google Scholar, Science Direct, PubMed, Springer Link, Web of Science, and Scopus using the keywords GWAS, genomic markers, Holstein, crossbred, and milk production traits. The current review included published studies that discussed candidate genes and genomic regions that were significantly associated with milk production traits in Holstein and its crossbreds. We included studies that used a P-value as a statistical significance criterion and that reported both SNPs and candidate genes. Only articles published in English in peer-reviewed journals since 2009 were included; conference papers, books, book chapters, theses, and unpublished results were excluded. To ensure consistency throughout the review, SNP names that researchers reported in other formats were converted to the rs name format.
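The inclusion criteria above can be expressed as a simple screening filter. The following Python sketch is illustrative only: the Study record, its field names, and the three example entries are hypothetical (the review does not publish a screening schema), and "since 2009" is assumed to mean a publication year of 2009 or later.

```python
from dataclasses import dataclass


# Hypothetical record layout, introduced only for this illustration.
@dataclass
class Study:
    title: str
    year: int
    language: str
    source_type: str       # e.g. "journal", "conference", "thesis", "book"
    peer_reviewed: bool
    reports_p_value: bool  # a P-value was used as the significance criterion
    reports_snps: bool
    reports_genes: bool


def is_eligible(s: Study) -> bool:
    """Return True if the study meets every inclusion criterion in the Methods."""
    return (
        s.year >= 2009                  # assumption: "since 2009" is inclusive
        and s.language == "English"
        and s.source_type == "journal"  # excludes conference papers, books, theses
        and s.peer_reviewed
        and s.reports_p_value
        and s.reports_snps
        and s.reports_genes
    )


# Made-up candidate records, not real publications.
candidates = [
    Study("Example A", 2015, "English", "journal", True, True, True, True),
    Study("Example B", 2018, "English", "conference", False, True, True, True),
    Study("Example C", 2007, "English", "journal", True, True, True, True),
]

included = [s.title for s in candidates if is_eligible(s)]
print(included)
```

Here only the first record passes: the second is a conference paper and the third predates 2009.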

GWAS for Milk Production Traits in Holstein and Its Crossbreds
The phenotypic expression of milk production traits (MY and milk composition) is controlled by many genes. The detection of potential candidate genes affecting milk production traits in cattle has been made possible by the widespread availability of SNP markers through the fast-growing number of genotyped cattle [16]. Several GWAS have focused on identifying potential candidate genes and genomic regions underlying milk production traits (MY, FY, FP, PY, and PP). Most researchers conducted association studies using 50 K chips, except [7,18], who used 26 and 100 K chips, respectively. The methodologies used were linear, single-locus, multi-locus, and Bayesian mixed models. This review summarizes 462 significantly associated SNPs, of which 34 were reported by more than one study. Of these 34 SNPs, ten were reported three or more times: rs109421300, rs109350371, rs109146371, rs109558046, rs109752439, rs109234250, rs109968515, rs110199901, rs17870736, and rs43703011. The remaining 24 SNPs were reported twice. For instance, rs109421300 was reported by [11,13,14,17,21]. Diacylglycerol O-acyltransferase 1 (DGAT1) was the most frequently reported candidate gene associated with one or more milk production traits [9,11,13,14,17,19,22]. GHR was reported by [11,13,21,23], MAPK15 by [15,21,24], and KHDRBS3 by [7,16,21]. The remaining candidate genes were reported by fewer than four researchers. Researchers [8,16,18] conducted association studies for milk production traits in crossbred dairy cattle (crossbreds ranging from 87.50% to <100% Holstein, Holsteinized Black-and-White Pied, and Gir × Holstein (Girolando)) in Thailand, Russia, and Brazil, respectively, using a single-marker linear model. The remaining studies included in this review were conducted with Holstein and its crossbreds.
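The reporting frequencies quoted above can be tallied directly from the citation lists. The short Python sketch below uses only the four genes for which this section gives explicit reference numbers; the dictionary layout and variable names are choices made for illustration, not part of the reviewed studies.

```python
# Reference numbers copied from the paragraph above; only these four genes
# are listed there with explicit citations.
reports = {
    "DGAT1": [9, 11, 13, 14, 17, 19, 22],
    "GHR": [11, 13, 21, 23],
    "MAPK15": [15, 21, 24],
    "KHDRBS3": [7, 16, 21],
}

# Count distinct citing studies per gene.
counts = {gene: len(set(refs)) for gene, refs in reports.items()}

# Rank from most to least reported; ties are broken alphabetically.
ranking = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

for gene, n in ranking:
    print(f"{gene}: reported by {n} studies")
```

With these lists, DGAT1 ranks first with seven citing studies, followed by GHR with four, and KHDRBS3 and MAPK15 with three each.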

Milk Yield.
MY is the most economically important trait, and several researchers have sought to identify the genes and genomic regions that contribute to its variation in Holstein and its crossbreds [7,11,13,16,17]. Several publications that used GWAS for MY are shown in Table 1. These researchers reported 103 individual SNPs that were significantly associated with MY. These SNPs were found on all autosomes except chromosomes 27 and 29 in Holsteins and their crossbreds. Figure 1 shows the frequency of SNPs identified by different researchers on each chromosome. Chromosomes 14 and 20 have a high number of SNPs. This information could help focus research on these two chromosomes to improve MY.
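A per-chromosome summary of this kind can be derived from a list of (SNP, chromosome) pairs. The Python sketch below shows one way to do so; the SNP identifiers and chromosome assignments are placeholders and are not the SNPs from Table 1. The only biological fact assumed is that cattle have 29 autosomes.

```python
from collections import Counter

N_AUTOSOMES = 29  # Bos taurus has 29 pairs of autosomes

# Placeholder (snp_id, chromosome) pairs for illustration only;
# these are NOT the MY-associated SNPs summarized in the review.
hits = [
    ("snp_a", 14), ("snp_b", 14), ("snp_c", 20),
    ("snp_d", 6), ("snp_e", 20), ("snp_f", 14),
]

# Number of significant SNPs on each chromosome.
per_chrom = Counter(chrom for _, chrom in hits)

# Autosomes on which no significant SNP was reported.
without_hits = [c for c in range(1, N_AUTOSOMES + 1) if per_chrom[c] == 0]

# The two chromosomes carrying the most SNPs.
top_two = [chrom for chrom, _ in per_chrom.most_common(2)]

print(dict(per_chrom))
print(top_two)
```

Applied to the full list of MY-associated SNPs, the same counting would be expected to reproduce the pattern described above, with no hits on chromosomes 27 and 29.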

Fat Yield and Fat Percentage.
Fat is an important component of milk, and it is controlled by gene networks associated with several metabolic and biological pathways. Identifying the relevant genes and their locations can provide valuable information for selective breeding to improve milk quality. A total of 46 SNPs significantly associated with fat yield (FY) and 117 SNPs significantly associated with fat percentage (FP) were detected on various chromosomes in Holstein and its crossbreds. Two SNPs (rs109350371 and rs109421300) were reported more than twice as significantly associated with FP [9,12,17,19,20]. Figure 2 shows the number of significant SNPs associated with FY and FP on each chromosome in Holstein and its crossbreds. Chromosome 14 contains a large number of significant SNPs associated with FP, accounting for more than 75% of the SNPs on this chromosome. In contrast, for FY, chromosomes 5 and 14 have an equal number of significantly associated SNPs.

All Milk Production Traits.
A total of 136 SNPs were significantly associated with two or more milk production traits (MY, FY, PY, FP, and PP). According to Fontanesi et al. [22], rs109234250 was significantly associated with all five traits. As reported by [11,12,15,17,21,22], 14 SNPs affected four traits, 39 SNPs affected three, and 86 SNPs affected two. The numbers of significant SNPs associated with multiple milk production traits in Holstein and its crossbreds are shown in Figure 4. Chromosome 14 carried the greatest number of SNPs affecting multiple milk production traits. Thus, selection programs should focus on candidate genes and genomic regions that are known to influence multiple production traits.
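Grouping SNPs by the number of traits they affect is a simple bookkeeping step. The Python sketch below illustrates it under stated assumptions: only the entry for rs109234250 (all five traits, per [22]) comes from the text, and the identifiers snp_x, snp_y, and snp_z with their trait sets are placeholders.

```python
from collections import defaultdict

TRAITS = {"MY", "FY", "PY", "FP", "PP"}

# SNP -> set of traits with a significant association.
associations = {
    "rs109234250": {"MY", "FY", "PY", "FP", "PP"},  # from the text [22]
    "snp_x": {"MY", "FP", "PP"},                    # placeholder
    "snp_y": {"FY", "FP"},                          # placeholder
    "snp_z": {"MY"},                                # placeholder, single trait
}

# Keep only SNPs affecting two or more traits, grouped by trait count.
by_n_traits = defaultdict(list)
for snp, traits in associations.items():
    assert traits <= TRAITS, f"unknown trait for {snp}"
    if len(traits) >= 2:
        by_n_traits[len(traits)].append(snp)

for n in sorted(by_n_traits, reverse=True):
    print(f"{n} traits: {by_n_traits[n]}")
```

SNPs associated with a single trait, such as the placeholder snp_z, are excluded from the multi-trait groups.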

Conclusion
This review summarized information on candidate genes and genomic regions associated with milk production traits in Holstein and its crossbreds from various regions of the world. Most of the identified SNPs and candidate genes were on chromosome 14. One of the challenges in dairy cattle selection is that milk production traits are expressed only after the first calving. Candidate gene and genomic region information would permit earlier selection of males and females, shorten the generation interval, and accelerate genetic progress for milk production traits.