Characterization of Expressed Sequence Tags From a Gallus gallus Pineal Gland cDNA Library

The pineal gland is the circadian oscillator in the chicken, regulating diverse functions ranging from egg laying to feeding. Here, we describe the isolation and characterization of expressed sequence tags (ESTs) from a chicken pineal gland cDNA library. A total of 192 unique sequences were analysed and submitted to GenBank; 6% of the ESTs matched neither GenBank cDNA sequences nor the newly assembled chicken genomic DNA sequence. Three ESTs aligned with sequences designated as Z_random, and one matched a W chromosome sequence and could be useful in cataloguing functionally important genes on this sex chromosome. Additionally, single nucleotide polymorphisms (SNPs) were identified and validated in 10 ESTs that showed 98% or higher sequence similarity to known chicken genes. These resources may be useful in comparative and functional genomic analysis of genes expressed in the pineal gland, an important organ in a model and agriculturally important organism.


Introduction
Circadian rhythm is a general characteristic of living organisms. Both the physiological and the genetic factors involved in this process continue to be widely investigated in different organisms. In mammalian and avian systems, the general consensus is that the physiological and genetic processes of biological rhythms occur in a loop. The molecular mechanisms that control the loop appear to be conserved among diverse species. The avian circadian rhythm is unique in that it involves multiple organs whose inputs and interactions influence the oscillatory patterns of rhythmic behaviour (Ebihara et al., 1987). Some of the positive and negative regulator genes involved in the autoregulatory feedback loop mechanism for the circadian oscillator in the pineal gland have been described in diverse birds, including the quail (Yoshimura et al., 2000) and chicken.
The chicken pineal gland is an important model for vertebrate circadian clock systems because of its ability to retain circadian rhythm in culture. Several important genes have been identified in the pineal gland. One important component of the autoregulatory feedback loop of the circadian oscillator is the negative regulator gene, cPer2; the gene products of cBmal1, cBmal2 and cClock form heterodimers that bind to a promoter sequence of cPer2 and activate transcription. The photoreceptor pinopsin has been shown to be present, although its expression responds exclusively to light and not circadian patterns. The arylalkylamine N-acetyltransferase (AA-NAT) gene product, however, has been directly linked to melatonin production in a circadian rhythm (Takanaka et al., 1988). In addition, GCAP1, GCAP2 and GC, genes that are important in resetting rods and cones after light exposure, have been identified in the pineal gland (Semple-Rowland, 1999).

S. Hartman et al.
Since the chicken is considered an excellent model for further understanding the genetic and molecular basis of rhythmic behaviour, here we investigated the characteristics of expressed sequence tags (ESTs) isolated from the chicken pineal gland. While previous work by Hubbard et al. (2005) has yielded a number of ESTs in such important functional tissues as the liver, pancreas, heart, cerebellum, kidney and ovary, none has been described to date from the pineal gland. Bailey et al. (2003) used microarray technology to evaluate pineal genes expressed in periods of light and darkness with a focus on function rather than sequence comparisons. The primary goal of our study was to identify novel genes that could be useful in comparative genome analysis of the molecular mechanisms that underlie rhythmic behaviour. Additionally, we evaluated the level of variation in selected ESTs that matched known chicken genes using in silico analysis followed by PCR-based resequencing for validation.

Sequence analysis
The ESTs were obtained from a previously described chicken pineal gland cDNA library (Chong et al., 2000). Briefly, the library was established from 10- to 11-day-old White Leghorn birds under 12 h light. The ESTs were produced from single-pass sequencing of randomly selected clones, processed by a modification of the toothpick PCR described by Smith et al. (2001). The modification involved first converting the original library from HybriZAP2.1 into phagemid, following the manufacturer's (Stratagene, La Jolla, CA 92037) recommendations. The ESTs were characterized using BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start&org=Chicken&db=galGal2&hgsid=30295885) and BLAST to identify database matches corresponding to the recently released chicken genomic DNA sequence and known genes in GenBank, respectively.
The chicken radiation hybrid panel (Morisson et al., 2002) was used to map VTEST71 in order to validate the in silico chromosomal location of the EST. Forward and reverse primers specific for the EST, designed using Primer 3 (Rozen and Skaletsky, 1997), were used for the genotyping. The forward and reverse primers were 5′-GAT TTC AAA ACG GAC TTG AG-3′ and 5′-TGA GCA GTC ACT TTT AGC ATT-3′, respectively. The PCR was carried out in a final volume of 10 µl containing 1.5 mM Mg2+ buffer (Eppendorf, Westbury, NY), 200 µM dNTPs, 70 µg primer (MWG Biotech), 1 U Taq (Eppendorf), and 5 ng template. The cycling was performed using a Mastercycler (Brinkmann, Westbury, NY) with the following program: initial denaturation at 95 °C for 5 min, followed by 38 cycles of denaturation at 95 °C for 45 s, annealing at 55 °C for 45 s and extension at 72 °C for 45 s. A final extension at 72 °C was carried out for 7 min. The PCR product was run on a 2% agarose gel stained with ethidium bromide, and scored as 0, 1 or 2 for absent, present, or ambiguous, respectively. Mapping results were determined by the Morisson lab from these data.
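As a quick sanity check on the primers and cycling program given above, the following minimal Python sketch computes primer length, GC content and a rough melting temperature, together with the total programmed hold time. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a rule of thumb for short oligonucleotides and is used here only as an assumption; it is not stated to be the method used by Primer 3 or by the authors. The function names are illustrative, and ramp times between steps are ignored.

```python
# Primer sequences as given in the text (spaces removed).
FORWARD = "GATTTCAAAACGGACTTGAG"
REVERSE = "TGAGCAGTCACTTTTAGCATT"


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule of thumb is only a coarse estimate for short oligos.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def program_hold_seconds(initial=5 * 60, steps=(45, 45, 45), cycles=38, final=7 * 60) -> int:
    """Total programmed hold time in seconds, excluding ramping between steps."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, len(primer), "nt",
              f"GC={gc_percent(primer):.1f}%",
              f"Tm~{wallace_tm(primer)} C")
    print("hold time (min):", program_hold_seconds() / 60)
```

Under these assumptions both estimates fall slightly above the 55 °C annealing temperature used, and the program amounts to roughly an hour and a half of hold time.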

SNP analysis
An in silico analysis of 10 ESTs that closely matched chicken genes was used to identify candidate SNPs in the ESTs according to the pipeline protocol of Buetow et al. (1999). Validation of the candidate SNPs for three of the ESTs was carried out by PCR-based resequencing of amplicons from 10 unrelated commercial birds, using previously described protocols (Smith et al., 2001).
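As an illustration only, and not a reproduction of the Buetow et al. (1999) pipeline, the basic idea of scanning overlapping sequences for candidate SNPs can be sketched as a column-wise allele count over an existing alignment. The function name, the `min_minor` threshold and the toy sequences below are all hypothetical; a real pipeline would also weigh base quality and alignment confidence.

```python
from collections import Counter


def candidate_snps(aligned, min_minor=2):
    """Return (position, allele_counts) for alignment columns with two or more
    alleles that are each supported by at least `min_minor` sequences.

    `aligned` is a list of equal-length, already aligned sequences.
    Gaps and ambiguous bases (anything other than A, C, G, T) are ignored.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned if seq[pos] in "ACGT")
        supported = [base for base, n in counts.items() if n >= min_minor]
        if len(supported) >= 2:
            hits.append((pos, dict(counts)))
    return hits


if __name__ == "__main__":
    # Toy alignment (hypothetical data): a G/A difference at position 2 seen
    # in several reads, and a singleton difference at position 7 that is
    # treated as a likely sequencing error rather than a candidate SNP.
    reads = [
        "ACGTACGT",
        "ACGTACGT",
        "ACATACGT",
        "ACATACGT",
        "ACGTACGA",
    ]
    print(candidate_snps(reads))
```

Requiring each allele to be seen more than once is a simple guard against single-pass sequencing errors; candidates found this way would still need validation by resequencing, as described above.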

Results and discussion
Of the 200 clones sequenced, a total of 192 sequences exceeded a Phred quality score of 30 (Ewing et al., 1998). These 192 sequences were submitted to GenBank and have been assigned accession numbers ( ESTs matched known chicken gene or cDNA sequences. All but 28 ESTs aligned with genomic DNA sequences assigned to chicken chromosomes. Ninety-one (about 47%) ESTs aligned with sequences assigned to macrochromosomes (GGA) 1-6, and four sequences aligned to genomic DNA sequences assigned to the Z chromosome. An additional three ESTs aligned with sequences designated as Z_random, while one matched  (2004) to be about 12% of the chicken genome. Seven ESTs matched sequences assigned either to more than one region of a chromosome or to different chromosomes. A few ESTs aligned with sequences from Escherichia coli, which could be due to bacterial contamination or simply to conserved sequences. The chromosomal assignments of some of the ESTs should be considered putative, as there are still many errors in the draft chicken genomic DNA sequence. The incompleteness of the Gallus gallus DNA sequence may also account for the relatively high percentage of ESTs that showed no significant sequence similarity to known chicken sequences.
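For readers unfamiliar with the quality threshold used above, a Phred score Q is related to the estimated base-calling error probability P by Q = -10 log10(P). The short sketch below shows the conversion in both directions; the function names are illustrative, and how the threshold was applied to each sequence (per base or as an average) is not specified in the text.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Estimated probability that a base call is wrong, given Phred score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4g}")
```

A score of 30 therefore corresponds to an estimated error rate of about one miscalled base in 1000.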
The chromosomal assignment of VTEST71 to chromosome 18, based on the sequence alignment with the recently released genomic DNA, was confirmed by radiation hybrid mapping. VTEST71 is designated as locus VTC08 on the chicken radiation hybrid map and is flanked by MCW0217 and ADL0290, with LOD scores of 10.7 and 13.1, respectively. VTEST71 showed 99% sequence similarity to chicken histone protein H3 and 95% identity with human Histone H3.3 (AK130772). Previously, chicken H3 was also mapped to chromosome 18 by RFLP, while the human H3 was linked to chromosome 17 (Levin et al., 1994).
A total of 22 SNPs were identified and validated in the three ESTs scanned (data not presented). Eight of the SNPs were non-synonymous and are described in Table 2. All the SNPs appear to be novel and have not been previously described (Smith et al., 2002; Wong et al., 2004). Therefore, these SNPs, although few, may be useful in efforts to assign phenotypes to genotypes and to identify the effects of the three genes on different chicken traits. For example, knowledge of the function of cofilin, an essential protein for depolymerization of actin filaments, is still limited (Arber et al., 1998), and the three non-synonymous SNPs described may be useful in further defining its role in skeletal function and the dynamics of actin filaments. Similarly, the recently discovered collapsin response mediator gene product is thought to have a role in the incidence and/or severity of Alzheimer's disease (Yoshida et al., 1998). The SNPs described in this gene in Gallus gallus may be useful in investigating the role of this apparently important gene that is also expressed in the chicken pineal gland.
It is not surprising that only 45% of the ESTs aligned to GGA1-6 DNA sequences, which comprise approximately 65% of the chicken genome. In their analysis of the draft sequence, the International Chicken Genome Sequencing Consortium (2004) reported that the density of CpG islands showed a strong negative correlation with chromosome length. This distribution supports earlier studies by McQueen et al. (1998) and Smith et al. (2000), which reported a higher density of genes on the microchromosomes than on the macrochromosomes. Several explanations are possible for the 9% of ESTs that did not match known sequences in GenBank, including novelty among vertebrate sequences, sequences too short to match known sequences, or contamination. In a recent comparative gene analysis between the chicken and human genomes, Castelo et al. (2005) predicted that the proportion of undiscovered genes in the human gene set may be very low, with a predicted lower limit of about 0.2%.
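The distinction drawn earlier in this section between synonymous and non-synonymous SNPs depends only on whether a base change alters the encoded amino acid. A minimal sketch of that classification, assuming the standard genetic code and a known reading frame, is given below. The function name and the example codons are hypothetical and are not the SNPs reported in Table 2.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-codon substitution as synonymous or non-synonymous.

    Both codons must be in the same reading frame and use DNA letters (ACGT).
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


if __name__ == "__main__":
    # Hypothetical examples: GCT -> GCC keeps alanine, GAT -> GAA changes
    # aspartate to glutamate.
    print(classify_snp("GCT", "GCC"))
    print(classify_snp("GAT", "GAA"))
```

In practice the reading frame of an EST has to be inferred, for example from its alignment to a known gene, before a SNP can be classified in this way.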
The numbers of ESTs and SNPs described in the present work are small relative to the total numbers of both genomic reagents currently available in GenBank and other databases. That they are potentially useful, however, is evident from the novelty of some of the sequences. Since a significant fraction matched mammalian genes and/or DNA sequences, they can be used as resources for comparative genome analysis of genes expressed in the pineal gland. Such comparative analysis may be useful in assigning function to chicken sequences. A similar impact on chicken biology is also likely with the SNPs described. Finally, it is worthy of note that one of the ESTs matched a W chromosome-assigned sequence. Currently, the number of genes assigned to this chromosome is limited. As efforts such as ours, even though limited in scope, identify additional ESTs, they will provide the genomic reagents essential to further increase our understanding of a chromosome that continues to be little understood.