Functional Genomics in Chickens: Development of Integrated-Systems Microarrays for Transcriptional Profiling and Discovery of Regulatory Pathways

The genetic networks that govern the differentiation and growth of major tissues of economic importance in the chicken are largely unknown. Under a functional genomics project, our consortium has generated 30 609 expressed sequence tags (ESTs) and developed several chicken DNA microarrays, which represent the Chicken Metabolic/Somatic (10 K) and Neuroendocrine/Reproductive (8 K) Systems (http://udgenome.ags.udel.edu/cogburn/). One of the major challenges facing functional genomics is the development of mathematical models to reconstruct functional gene networks and regulatory pathways from vast volumes of microarray data. In initial studies with liver-specific microarrays (3.1 K), we have examined gene expression profiles in liver during the peri-hatch transition and during a strong metabolic perturbation—fasting and re-feeding—in divergently selected broiler chickens (fast vs. slow-growth lines). The expression of many genes controlling metabolic pathways is dramatically altered by these perturbations. Our analysis has revealed a large number of clusters of functionally related genes (mainly metabolic enzymes and transcription factors) that control major metabolic pathways. Currently, we are conducting transcriptional profiling studies of multiple tissues during development of two sets of divergently selected broiler chickens (fast vs. slow growing and fat vs. lean lines). Transcriptional profiling across multiple tissues should permit construction of a detailed genetic blueprint that illustrates the developmental events and hierarchy of genes that govern growth and development of chickens. This review will briefly describe the recent acquisition of chicken genomic resources (ESTs and microarrays) and our consortium's efforts to help launch the new era of functional genomics in the chicken.


Introduction
The chicken was first domesticated from red jungle fowl (Gallus gallus) in south-east Asia (now Thailand) more than 8000 years ago [1]. Domestic chickens (Gallus domesticus) were soon found along the Yellow River (Huang He) in northeast China and eventually they were carried into Europe through Persia and by the Roman conquests. The early domestication of the chicken played a significant role in the global spread of a flourishing human culture [2]. Today the domestic chicken continues to serve mankind as a widely used biological model and an important global source of high-quality protein from meat and eggs. Until recently, the chicken had received less attention for comparative and functional genomics, mainly due to a low number of expressed sequence tags (ESTs) and the lack of a completed genome sequence. Currently, there are only 8868 chicken Unigenes in GenBank (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene), which is about half the number of Unigenes listed for pigs or cattle. Over the past 3 years, there has been a remarkable increase in the number of chicken ESTs entered into the dbEST division of GenBank (Table 1); this has quickly advanced the chicken to the sixth-largest collection, with 460 385 ESTs, first place being held by the human collection of 5 471 545 ESTs (http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html). Perhaps more remarkable, a 6.6× coverage sequence of the chicken genome has just been completed by the National Human Genome Research Institute (NHGRI) at the Washington University Genome Sequencing Center (http://genome.wustl.edu/projects/chicken/) [3], within the predicted 1-year deadline [4]. In the near future, the availability of these genomic resources should drive the chicken towards the forefront of developmental and systems biology, and promote its use as a model for comparative and functional genomics research [5] (see ChickNET at: http://www.chicken-genome.org/).
The present review will recount some of these recent acquisitions and, in particular, our consortium's efforts to help launch the new era of functional genomics in the chicken [6].

Development of genomic resources for chickens
Making a comprehensive catalogue of genes expressed in chicken tissues
In 2000, only a few thousand chicken expressed sequence tags (ESTs) were found in GenBank; these ESTs were derived primarily from lymphoid tissue [7,8]. Under a USDA-IFAFS/Animal Genome Program consortium project for functional genomics in chickens, we initiated the first comprehensive EST discovery project in chickens, with high-throughput sequencing of a number of single and multiple-tissue cDNA libraries, which were genetically and developmentally complex [6] (see Acknowledgements). This original EST sequencing effort has been completed with the single-pass 5′-end sequencing of 42 870 chicken cDNA clones (Table 1) from a set of tissue-specific cDNA libraries (http://www.chickest.udel.edu). All public chicken ESTs were then assembled with the CAP3 program [11] into 33 949 high-fidelity contigs (Figure 1) that could represent the number of bona fide genes expressed in the chicken [12]. This overall chicken EST assembly greatly enhanced gene identification and clustering of unigene sets from our cDNA libraries (Figure 1A). Before CAP3 assembly of all public chicken ESTs, about 52% of the 42 964 ESTs in the UD collection had a high (>200) BLASTX score, while 26% had a low (<200) BLASTX score and 22% were classified as unknown (Figure 1A). Gene identity based on BLASTX (or BLASTN) scores of the 12 537 UD contigs was improved by our CAP3 assembly of all public chicken ESTs to 64% with high BLAST scores, 19% with low scores and 17% with no identification. Thus, the UD chicken EST collection contains 18 648 non-redundant sequences, with an overall redundancy rate of 4.6 ESTs/contig as compared to 8.3 ESTs/contig for the larger BBSRC collection.
Figure 1. CAP3 assembly of (A) 42 964 ESTs in the UD collection and (B) all chicken ESTs (∼407 K) found in public databases [12]. The CAP3 program was used at a stringency of 40 bp overlap and 90% sequence identity
The UD CAP3 assemblies were used to construct a Chicken Gene Index with 33 949 contigs (high-fidelity in silico cDNAs) and 84 070 unclustered singlets [9]. Furthermore, our assembly of publicly available chicken ESTs is in good agreement with The Institute for Genomic Research (TIGR) Gallus gallus gene index (GgGI, Version 5.0), which was also built using the CAP3 program. The UD CAP3 assembly has allowed us to establish a non-redundant set of genes from liver, adipose tissue, breast (white fibres) and leg (red fibres) muscle/epiphyseal growth plate, pituitary gland/hypothalamus/pineal, reproductive tract (Figure 2) and lymphoid tissues. Thus, the UD chicken EST collection is based on three major physiological systems (metabolic/somatic, neuroendocrine/reproductive and immune) [9]. Furthermore, the UD CAP3 database contains all information stemming from this assembly (i.e. the detailed alignment of contigs, EST sequences used to build contigs, BLASTN and BLASTX reports, and the PubMed links in GenBank). The UD chicken EST database (http://www.chickest.udel.edu) and our CAP3 assemblies (http://udgenome.ags.udel.edu/cogburn/) can be searched by nucleotide sequence or keyword. A portion of the UD EST collection (23 427 ESTs) was recently exploited for single nucleotide polymorphism (SNP) discovery by another UD group [13] (http://chicksnps.afs.udel.edu).

Development of chicken cDNA microarrays
A primer on the principles of microarray technology and its application to poultry genetics, breeding and biotechnology has recently been published [14]. Prior to this, there had been only a few papers published on gene expression profiling with chicken DNA microarrays [7,15-18]. Low-density arrays and differential mRNA display were used to examine the chicken's auditory system (i.e. the cochlea and brain) for auditory plasticity [15]. The first chicken lymphoid cDNA microarrays (1-3 K) were derived from about 5251 ESTs sequenced from an activated T cell cDNA library [7]. Two of [...] [17] or provoked responses [16]. An additional five papers were published on microarray analysis of chicken tissues in 2003; these interrogations of transcriptional units in the chicken genome involved tissue-specific DNA microarrays for liver [6], pineal [19], retina [20], intestine [21,22] and the bursa of Fabricius [23].
Under our USDA-IFAFS consortium project, we have developed and printed both tissue-specific and systems-wide chicken cDNA microarrays [6]. Our prototype liver-specific array (3.1 K unigenes) was printed on nylon membranes and used in several definitive studies [24-26]. The Chicken Metabolic/Somatic System (Figure 2A) and Neuroendocrine/Reproductive Systems (Figure 2B) microarrays were originally printed and used as independent arrays. Recently, we have combined both of these systems-wide gene sets into the Del-Mar 14K Chicken Integrated Systems Microarray (Figure 2C). This universal high-density microarray is currently being used for time-series transcriptional profiling across multiple tissues from divergently selected lines of broiler chickens [6]. An integrated immune system microarray (4 K) has been developed by Joan Burnside and Robin Morgan at UD/DBI from their collection of lymphoid ESTs. They currently use the lymphoid microarray for studies on the chicken's immune defence system. A chicken macrophage microarray (4 K) has been recently developed by another group at UD from several thousand chicken ESTs sequenced from activated-macrophage cDNA libraries [27] (www.aviangenomics.udel.edu). These ESTs have now brought the total number of chicken ESTs submitted to GenBank by the UD chicken genomics group to 47 853 (Table 1). Recently, a group from the Chicken Genome Consortium [Dave Burt, Roslin Institute; Joan Burnside, UD/DBI; Paul Neiman, Fred Hutchinson Cancer Research Center (FHCRC)] has developed a high-density (13 K) chicken cDNA array that mainly represents the high-scoring BLASTX contigs from the BBSRC collection and a few thousand lymphoid clones. This generic 13 K chicken microarray is available in Europe from ARK-Genomics (http://www.arkgenomics.org) and in North America from the FHCRC (genomics@fhcrc.org).
Currently, the FHCRC microarray appears to be the most widely used functional genomics platform available to the chicken genomics community. Furthermore, a Chicken GeneChip is under development by Affymetrix (http://www.affymetrix.com/index.affx) and the Chicken Genome Consortium Microarray Committee (http://www.chicken-genome.org/) for release later this year.

Modelling of gene networks and regulatory pathways
One of the most promising new developments in functional genomics is gene network modelling [28-31]. In this approach, a strong external perturbation is applied and the transcriptional snapshots from time-series experiments are used to estimate the regulatory strengths of gene-gene interactions [28,29,32,33]. The perturbation method [34] is widely used in yeast and plants, where each gene in a pathway under study is perturbed, one gene at a time. However, gene-by-gene perturbations are not practical in complex organisms like birds and mammals. We have used two strong metabolic perturbations, the embryo-to-hatching transition [25] and the fasting and re-feeding response [26], to take time-series transcriptional snapshots of chicken liver. A dynamic Bayesian model for analysis of microarray data (BAM) and a spanning tree clustering method were developed for mapping 'functional' clusters of genes that respond to these metabolic perturbations [35].
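As a rough illustration of how a spanning tree can group genes with similar time-series profiles, the following Python sketch builds a minimum spanning tree over pairwise profile distances and cuts its longest edges. It is not the BAM or spanning tree method of [35]: the distance measure (1 minus Pearson correlation), the rule for cutting edges, the gene labels and the expression values are all assumptions made for this example.

```python
import math

def pearson_distance(x, y):
    """Return 1 - Pearson correlation (assumes neither profile is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def minimum_spanning_tree(names, profiles):
    """Prim's algorithm; returns a list of (distance, gene_a, gene_b) edges."""
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge joining a gene inside the tree to one outside it.
        best = min(
            (pearson_distance(profiles[a], profiles[b]), a, b)
            for a in in_tree for b in names if b not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges

def cut_into_clusters(names, edges, n_clusters):
    """Drop the (n_clusters - 1) longest tree edges; return connected components."""
    kept = sorted(edges)[: len(edges) - (n_clusters - 1)]
    parent = {g: g for g in names}

    def find(g):
        while parent[g] != g:
            g = parent[g]
        return g

    for _, a, b in kept:
        parent[find(a)] = find(b)
    clusters = {}
    for g in names:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(clusters.values(), key=lambda c: sorted(c))

# Hypothetical expression values at six time points (not measured data).
PROFILES = {
    "gene_A": [9, 8, 7, 2, 1, 1],  # falls over time
    "gene_B": [8, 8, 6, 3, 2, 1],  # falls over time
    "gene_C": [1, 2, 2, 7, 8, 9],  # rises over time
    "gene_D": [2, 1, 3, 6, 9, 8],  # rises over time
}

names = list(PROFILES)
tree = minimum_spanning_tree(names, PROFILES)
print(cut_into_clusters(names, tree, n_clusters=2))
```

With these made-up profiles, the two falling genes and the two rising genes end up in separate clusters, because the only long edge in the tree is the one that links the two groups.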

Global gene expression profiling in liver of the peri-hatch chick
We have examined global gene expression in the liver of embryos (e16, e18 and e20) and hatchling chicks (1, 3 and 9 days) during the very critical and vulnerable peri-hatch period [6]. A multidimensional projection of 32 clusters of functionally related genes expressed in the liver during the peri-hatch period is presented in Figure 3A. Two major and distinct patterns of gene expression were revealed from a total of 756 differentially expressed genes in this cluster tree. One group of 49 genes (red clusters) had higher levels of expression in embryos, whereas the opposing blue clusters had the opposite pattern, with higher expression after hatching (Figure 3B). Gene cluster analysis, using our spanning tree model, shows the interconnectivity of functional gene clusters involved in the metabolic switch from embryonic to terrestrial life in the peri-hatch chick. Several enzymes, expressed at higher levels in embryos, are directly involved in fatty acid metabolism [acetyl coenzyme A acetyltransferase 2 (ACAT2), pyruvate dehydrogenase kinase 4 (PDK4) and carnitine palmitoyl-transferase 1 (CPT1)].
Figure 3. Cluster analysis of gene expression patterns in liver during the embryo-to-hatchling transition, using a liver-specific cDNA array (3.1 K). Opposing clusters of functionally related genes in this multidimensional tree also have opposing patterns of gene expression, e.g. the three red clusters contain genes whose expression is high during late embryonic development and then falls after hatching; the opposing blue branch clusters contain the genes that are highly expressed after hatching. The inserts provide some examples of these functionally related genes. Liver samples for microarray analysis were collected at three embryonic (e) ages (e16, e18, e20) and at three ages after hatching (1 day [1d], 3d and 9d)
The opposing blue clusters contain a number of transcription factors and metabolic enzymes that are expressed at higher levels in the liver of the newly hatched chick; these genes are involved in lipogenesis and energy metabolism [thyroid hormone responsive Spot 14 protein (THRSP), peroxisome proliferator-activated receptor-γ (PPARγ), CCAAT/enhancer-binding protein-α (CEBPα), fatty acid synthase (FAS), malic enzyme (ME) and HMG-CoA synthase (HMG CS)]. For example, THRSP (Spot 14) is a transcription factor which controls the expression of several enzymes in the lipogenic pathway (see Figure 4 in [6]). Furthermore, we have discovered an insertion/deletion polymorphism in chicken THRSP that is associated with abdominal fat traits [36]. Thus, time-series perturbation studies and gene cluster analysis provide a very powerful method for revealing the major topography of gene networks that control major metabolic pathways in chicken liver.
Figure 4. Transcriptional control of the TCA cycle and fat biosynthesis in chicken liver. This working model is based on functional clusters of genes identified from the analysis of two perturbation studies
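The two opposing expression patterns (higher in embryos versus higher after hatching) can be illustrated with a simple log-ratio calculation. The Python sketch below is only an illustration and is not the analysis used in the studies cited here: it uses the six sampling ages named in the text, but the intensity values, the gene labels and the two-fold threshold are hypothetical.

```python
import math

# Sampling ages from the text: three embryonic, three post-hatch.
TIME_POINTS = ["e16", "e18", "e20", "1d", "3d", "9d"]
EMBRYO = TIME_POINTS[:3]
HATCHLING = TIME_POINTS[3:]

def log2_hatch_ratio(profile):
    """log2 of (mean post-hatch signal / mean embryonic signal)."""
    embryo_mean = sum(profile[t] for t in EMBRYO) / len(EMBRYO)
    hatch_mean = sum(profile[t] for t in HATCHLING) / len(HATCHLING)
    return math.log2(hatch_mean / embryo_mean)

def classify(profile, threshold=1.0):
    """Label a gene by direction of change; threshold=1.0 means two-fold (assumed)."""
    ratio = log2_hatch_ratio(profile)
    if ratio >= threshold:
        return "higher after hatching"
    if ratio <= -threshold:
        return "higher in embryo"
    return "unchanged"

# Hypothetical array intensities (arbitrary units), not measured data.
EXAMPLES = {
    "example_gene_1": dict(zip(TIME_POINTS, [800, 760, 840, 200, 190, 210])),
    "example_gene_2": dict(zip(TIME_POINTS, [100, 100, 100, 400, 400, 400])),
    "example_gene_3": dict(zip(TIME_POINTS, [300, 310, 290, 305, 295, 300])),
}

for name, profile in EXAMPLES.items():
    print(name, round(log2_hatch_ratio(profile), 2), classify(profile))
```

A real analysis would also need normalization across arrays and a statistical test on replicates; the fixed threshold here only stands in for that step.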

Mapping of functional genes in metabolic pathways
Some of the metabolic enzymes and transcription factors identified by cluster analysis in the livers of the peri-hatch chick [6] or fasting and re-fed chickens [26] were integrated into a working model of transcriptional control of the TCA cycle and fat biosynthesis pathway (Figure 4). Several genes found in these functional clusters are directly involved in fatty acid metabolism [sterol regulatory element-binding protein (SREBP); ATP citrate lyase (ACL); FAS; ME; fatty acid desaturase 2 (FADS2)]. The metabolic genes found in these clusters agree with those known to regulate these pathways in mammals [37]. A number of genes (Spot 14, ACL, FAS, FADS2) are overexpressed after hatching and have the same expression pattern as SREBP, which regulates expression of lipogenic genes. The upregulation of fumarase in the TCA cycle and the production of acetyl CoA also contribute to increased lipogenesis. Further, pyruvate dehydrogenase kinase-4 (PDK4), which is upregulated in the liver of embryos, inhibits the activity of the pyruvate dehydrogenase complex (PDC) in the TCA cycle. Thus, downregulation of PDK4 would contribute to increased lipogenesis by increasing the production of acetyl CoA in the mitochondria. Furthermore, PDK4 is known to be upregulated by PPARα. PPARα promotes expression of genes involved in β-oxidation of fatty acids, and overexpression of PPARα inhibits SREBP promoter activity in a dose-dependent manner [38]. In addition, PPARα levels are strongly upregulated in the liver of fasting chickens, which reflects an increase in catabolism of stored fat. In contrast, PPARγ appears to support lipogenesis, since the hepatic expression of PPARγ is dramatically increased after hatching. Overexpression of genes in the lipogenic pathway and inhibition of the lipolytic pathway could be related to the nutritional transition between embryonic and hatchling metabolic states.
Lipogenesis in the chick liver is very low during the embryonic period and the first few days after hatching [39,40]. In chickens, lipogenesis, although likely to be controlled by the same genes as in mammals, takes place primarily in the liver, whereas adipocytes serve for the release and storage of triglycerides. The balancing and partitioning of nutrients between metabolic tissues could be controlled in a different way in chickens and mammals. Mapping of transcriptional networks requires high-throughput analysis of microarray scans, clustering of co-regulated genes and computational analysis for the presence of functional motifs (i.e. cis-regulatory elements and transcription factor binding sites) that exert control over major metabolic pathways [41]. The assembly of the chicken genome sequence in the near future will certainly enhance efforts to understand transcriptional regulation of major gene networks.
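To make the idea of scanning for functional motifs concrete, the following Python sketch searches both strands of a DNA sequence for matches to a consensus written in IUPAC codes. It is a minimal illustration, not the computational analysis of [41]: the test sequence is invented, and the pentanucleotide CCAAT is used only as a familiar placeholder consensus, not as a claim about any chicken promoter.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def matches(consensus, window):
    """True if every base in window is allowed by the consensus code at that position."""
    return all(base in IUPAC[code] for code, base in zip(consensus, window))

def scan(sequence, consensus):
    """Return (start, strand) for each match; start is 0-based on the given strand."""
    sequence = sequence.upper()
    consensus = consensus.upper()
    k = len(consensus)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if matches(consensus, window):
            hits.append((i, "+"))
        # A site on the opposite strand reads as the reverse complement here.
        if matches(consensus, reverse_complement(window)):
            hits.append((i, "-"))
    return hits

# Invented sequence carrying one forward and one reverse-strand CCAAT.
print(scan("GGCCAATCTATTGGTT", "CCAAT"))
```

Exact consensus matching is the simplest possible model; position weight matrices, which score each base by its observed frequency, are the more usual choice when enough known sites are available.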

Conclusions
The current bonanza of genomic resources (460 K ESTs, several high-density microarrays and a complete genome sequence) for the chicken should soon shift the domestic chicken to the forefront of developmental biology and functional genomics research. We have constructed and normalized five tissue-specific chicken cDNA libraries and completed high-throughput sequencing of 30 609 ESTs. Chicken unigene sets were identified by CAP3 clustering for development of tissue-specific (liver) and systems-wide (metabolic/somatic and neuroendocrine) chicken DNA microarrays. Using gene clustering and computational analyses of time-series transcriptional profiles, we have identified a number of polymorphic functional genes in key metabolic pathways that could control important phenotypes in chickens.