Assessment of Tools for Marker-Assisted Selection in a Marine Commercial Species: Significant Association between MSTN-1 Gene Polymorphism and Growth Traits

Growth is a priority trait from the point of view of genetic improvement. Molecular markers linked to quantitative trait loci (QTL) have been regarded as useful for marker-assisted selection in complex traits as growth. Polymorphisms have been studied in five candidate genes influencing growth in gilthead seabream (Sparus aurata): the growth hormone (GH), insulin-like growth factor-1 (IGF-1), myostatin (MSTN-1), prolactin (PRL), and somatolactin (SL) genes. Specimens evaluated were from a commercial broodstock comprising 131 breeders (from which 36 males and 44 females contributed to the progeny). In all samples eleven gene fragments, covering more than 13,000 bp, generated by PCR-RFLP, were analyzed; tests were made for significant associations between these markers and growth traits. ANOVA results showed a significant association between MSTN-1 gene polymorphism and growth traits. Pairwise tests revealed several RFLPs in the MSTN-1 gene with significant heterogeneity of genotypes among size groups. PRL and MSTN-1 genes presented linkage disequilibrium. The MSTN-1 gene was mapped in the centromeric region of a medium-size acrocentric chromosome pair.


Introduction
The value of aquaculture production was estimated at USD 98.4 billion in 2008 and has continued to show strong growth, increasing at an average annual growth rate of 6.2 percent [1]. Among fishes, the gilthead seabream Sparus aurata, from the Sparidae family, is one of the most important fish species cultivated in the Mediterranean region, where the main producer countries are Greece, Turkey, and Spain [2]. The strong competition between producer companies and the steep increase in production have caused the selling prices of the product to fall; this in turn has seriously reduced the profit margins of the companies, which in many cases have been squeezed to unsustainable levels. For this reason improvements in the systems of cultivation have become essential, and programmes of improvement by selection must be put into action to reduce production costs. Nevertheless, from cultivated species, only a few of them have ongoing selective breeding programmes. Traditionally, these have been carried out successfully using pedigree information by selecting individuals based on breeding values using an animal model. However, information on individual genes with medium or large effects cannot be used in this manner [3]. Molecular markers that directly affect or are linked to quantitative trait loci (QTL) have been regarded as useful for marker-assisted selection (MAS) or gene-assisted selection (GAS) programmes. The allelic variations at these individual genes have a major influence on overall phenotypic expression, and the number of genotypes managed is smaller, so it may be faster and more efficient to implement these improvement programs rather than others under industrial production conditions. Besides, traits as growth,  with low heritability and relatively few records per traits measured, are those most benefiting from incorporating marker information [4].
Tests of association between candidate gene polymorphisms and quantitative traits have been widely applied in recent years using genetic markers. In that context, PCR-RFLPs (polymerase chain reaction-restriction fragment length polymorphisms) have been demonstrated to be very useful genetic markers for candidate gene studies; such studies have revealed polymorphisms associated with quantitative traits in Atlantic salmon Salmo salar [5], oysters Crassostrea gigas [6], cattle [7,8], chickens [9], and sheep [10].
Growth is a priority trait from the point of view of genetic improvement since economic advantages can be gained from shortening the time required for production and improving the rate of feed conversion. Increasing homogeneity in the rate of growth of individuals is another characteristic of interest, since it would reduce the number of gradings to be carried out until harvest size is reached. In this context, the search for candidate genes influencing growth has been a major focus of research, and several candidate genes influencing growth in finfish have been isolated from the genome, and their effects quantified [11]. The growth is a complex trait from the genetic point of view because its genetic con-trol is not well known and there are dozens of candidate genes acting upon it.
The five candidate genes selected for this study were growth hormone (GH), insulin-like growth factor-1 (IGF-1), myostatin (MSTN-1), prolactin (PRL), and somatolactin (SL) genes. All of them are known to be related to the somatotropic axis or transforming growth factors. Growth hormone (GH) plays a major role in stimulating somatic growth primarily through the induction of insulin-like growth factor I (IGF) [12]. Myostatin (MSTN) or growth differentiation factor 8 gene (GDF-8) seems to be a negative regulator of skeletal muscle growth in mammals [13]. There are two genes in gilthead seabream that code for MSTN: MSTN-1 [14] and MSTN-2 [15]; MSTN-1 is expressed mainly in skeletal muscle, at both adult and juvenile stages, and MSTN-2 is expressed almost exclusively in the central nervous system, at late larval stages [14,15]. The insulinlike growth factor I gene (IGF-I) appears to be linked to nutritional condition, environmental adaptation, embryonic development, and growth regulation of teleosts [16]. Other candidate genes acting upon growth are the prolactin (PRL) gene, which performs several functions including mitogenic, somatotropic, and osmoregulatory activities [17] and the somatolactin (SL) gene, involved in a variety of physiological functions in teleosts [18].
Genetic mapping of genes allows a complete identification and the location of a gene and hence can be used in programs of genetic improvement in aquaculture. Physical maps enable the integration of linkage maps and karyotypes and are essential tools for comprehensive comparative genomic studies. In addition, the existence of a well-characterized physical map makes it more feasible to undertake a whole genome sequencing project [19]. Here, we report the localization of MSTN-1 gene, which presented in this work statistically significant association with growth traits, on S. aurata chromosomes.
With the object of gaining new understanding of genes related to growth traits in a marine commercial species, the objectives of the present study were to find and describe polymorphisms in candidate genes from S. aurata populations produced under commercial conditions and to test for associations between these polymorphisms and growth traits. Furthermore, the MSTN-1 gene, whose polymorphism has shown, in this study, a statistically significant correlation with growth traits, was mapped to the chromosomes of this species by means of the tyramide signal amplification (TSA) fluorescence system.

Animals and Phenotypic
Traits. The samples were obtained from CUPIMAR S.A., an aquaculture company that operates in the area of the Bay of Cadiz, on the Atlantic coast of southwest Spain. One broodstock and their offspring were analyzed at different stages of production ( Figure 1). The broodstock in these installations have mixed origins; wild specimens are added to those from the company's own aquaculture production, because the transformation of males into females after the third year results in a relative lack of males. The broodstock was utilised during several spawning seasons to generate the F 1 that are cultivated and grown for subsequent transfer to the on-growing facilities until reaching harvesting size. The breeders were maintained under environmental conditions of temperature, salinity, and oxygen, according to the management practices of the farm. A total of 131 individual breeders were sampled: 48 males and 83 females (N B = 131). Within each gender, the breeders were all of similar age.
Fish were anaesthetized with 0.025% 2-phenoxyethanol solution; fork length (L) and body weight (W) were recorded, and 1-1.5 mL of blood was extracted for DNA isolation. Maturation was induced by control of the photoperiod. Spawning took place in several tanks under production The Scientific World Journal 3 conditions, and offspring were then grouped together in one tank. The offspring originated from the spawning that took place in one single day.
Total residence time, calculated from the hatching date to the first sampling date, ranged between 112 days and 126 days (J120). A total of 148 juvenile fishes (N J120 = 148) were measured (fork length and body weight) during this period of fourteen days; specimens were graded, and either fin clips were excised or blood samples were taken, in both cases being preserved in 70% ethanol at 4 • C till processing for DNA isolation. Juveniles were kept in nursery tanks and transferred to floating cages, where they were cultivated until reaching adult stage. At this stage, the individuals from two cages were sampled at different dates, representing either 635 days (N A635 = 108) or 819 days (N A819 = 120) after the date of spawning, respectively, and the 228 individuals in total were measured for fork length and body weight ( Figure 1). Finally, quantities of 1.0 to 1.5 mL of blood were extracted for subsequent DNA analysis. A condition index was used to assess relative fatness and provide an estimate of the energy reserves, morphology, and health of the fish and is the parameter that best reflects the criterion utilised by the producer for its commercial classification of the product. Body weight × fork length −3 × 100 is the metric of the condition index (C).

DNA Extraction, Genotyping, and Parental Assignment.
Genomic DNA from breeders, adults, and some juveniles was extracted from blood according to the protocol described by Martínez et al. [21]. For the rest of the juveniles, genomic DNA was extracted from fin clips using a modification of the salting-out extraction method. In all cases, the DNA was kept at 4 • C until utilisation. All 376 offspring and 131 breeders were genetically characterised using microsatellite markers (multiplex) and familial assignments, determined according to Porta et al. [22].

Polymerase Chain Reaction (PCR) and
Sequencing. The 11 regions from the five candidate genes that were screened by restriction fragment length polymorphisms (RFLP) are listed in Table 1. Except for growth hormone intron I amplification, which was carried out using oligonucleotides as described in Almuly et al. [20], all other amplification products were performed using forward and reverse primers based on S. aurata sequences from GenBank and designed with Primer3 software [23]. Primers were designed to anneal in the exons and/or promoters of the candidate genes so that we could amplify 0.7-1.9 kb DNA regions including at least one intron or part of the promoter (Table 2 and Figure 2).
The PCR amplification reactions were performed in a Gene Amp PCR System 2700 (Applied Biosystem) thermal cycler that was programmed for denaturation at 94 • C for 5 min.  reaction buffer (EuroClone Genomics). The PCR products were examined by electrophoresis through a 1.5% agarose gel containing 0.5 μg mL −1 ethidium bromide. Amplified products of the predicted size were cut and purified from gels with a nucleoSpin extract II kit (Macherey-Nagel) before carrying out the sequencing reactions. DNA sequencing was performed with fluorescence-labelled terminator (BigDye Terminator 3.1 Cycle Sequencing Kit; Applied Biosystems) in an ABI3100 Genetic Analyzer. Sequence data from this paper have been deposited with the GenBank Data Library under Accession numbers FJ827497 to FJ827506.

RFLP Detection and Genotyping (PCR-RFLP Alleles).
The selection of the restriction enzymes needed to observe the RFLPs was performed using software available on the Internet, such as In silico [24] and Webcutter 2.0 (http://rna.lundberg.gu.se/cutter2/). Preferential selection was made for those enzymes that produced cuts within the target sequence and with the number of targets not higher than three, to avoid products of small size that might not be detected later in the electrophoresis. Different enzymes were tested for the different PCR products, with the aim of obtaining at least two RFLPs for each sequence amplified. The digestions were performed by incubation of the samples in a water bath at the optimum temperature for each enzyme according to the manufacturer. The reactions were carried out in a final volume of 50 μL containing 0.05-1.0 μL 4 The Scientific World Journal  of enzyme (Roche Molecular Biochemicals, Amersham Biosciences or Fermentas), 1x of the appropriate buffer, and 2-5 μL of the PCR product.
To determine the size of the products, the samples were loaded into agarose gels (1-1.5%) in 0.5x TBE with ethidium bromide (0.5 μg mL −1 ). Gels were visualised and analyzed using Gel Doc XR Molecular Imager (BioRad) and Quantity One 1-D Analysis programme, respectively.

Statistical Analysis.
One-way analysis of variance (ANOVA) was used to test significant associations between the marker genotype (at each of the eight RFLP loci) and B-M, B-F, and B-T: males, females, and total individuals among breeders, respectively; J120: juveniles sampled 120 days after hatching; A635 and A819: adults sampled 635 and 819 days after hatching.
the three phenotypic traits (L, W, and C). The procedures utilised in the comparison of means are not very sensitive to the lack of normality. When there are no atypical observations and the sample distributions are approximately symmetric, ANOVA can safely be used even for very few samples (n = 4-5) [25]. On the other hand, the results of the F test of the ANOVA can be considered reliable, when the largest standard deviation is less than twice as large as the smallest standard deviation [25]. The ANOVA test was applied only in those cases in which the standard deviations complied with this ratio. In those cases in which there were significant values and the sample standard deviations did not comply with this ratio, the nonparametric test of Kruskal-Wallis was applied.
As the gilthead seabream is a protandrous hermaphroditic teleost, in which sex is reversed in males after about 2 years of age, the broodstock presented a marked difference between the means of the males and females. For this reason, we analyzed length, body weight, and condition index variables for each locus by fitting a general linear model (GLM) to the data from breeders that included gender as a factor: where Y i j is the phenotypic trait (e.g., weight in breeder group), μ is the mean value of the trait, α i is the genotype at the RFLP marker, β j is gender, and ε i j is the random residual. The proportion of the phenotypic variation explained by each RFLP marker was calculated as r 2 = SS effect/SS total.
All these tests were applied using the SPSS 14.0 statistical software programme. In all cases when multiple comparisons were made, the Bonferroni adjustment was applied [26].
To determine the existence of linkage disequilibrium between pairs of loci analyzed, the null hypothesis "genotypes at one locus are independent from genotypes at the other locus" was tested using the GENEPOP programme package [27]. Contingency tables were created for all pairs of loci in each age group and across all the population, and Fisher's exact test was performed for each table.
The distribution of the genotypes in each age group, subclassified by sizes, was studied. The null hypothesis tested was "the genotypic distribution is identical across the size-group". For each locus, the test was performed on contingency tables. An unbiased estimate of the P value of a log-likelihood (G)based exact test was performed [28] using the GENEPOP program package [27]. The test was performed for all pairs of samples, for all loci. According to fork length and body weight distribution, adults and breeders were classified in two groups: small and large (adults) and males and females (breeders). Juveniles were classified in two size groups, small (J-S) and large (J-L) which were subclassified in small (S), medium (M), and large (L). When multiple comparisons were made, the Bonferroni correction was applied to adjust the α value for each age group.

Cytogenetic Techniques.
Chromosome preparations were made from 1-2-day-old S. aurata larvae according to Cross et al. [29]. To make the myostatin probe, first-strand cDNA was obtained from S. aurata larvae RNA using a SMART RACE cDNA Amplification Kit (Clontech Laboratories). The cDNA for the MSTN-1 probe was obtained by PCR from the first-strand cDNA using a forward primer in exon 1: 5 -AGAAGACACGGAGCTGTGC-3 and a reverse primer in exon 3: 5 -AAGAGCATCCACAACGGTCT-3 . The amplification yielded a fragment of 1025 bp, and its structure was verified by sequencing. Digoxigenin (dig-11-dUTP) labelling of 1 μg DNA probe was performed by random priming using the Decalabel DNA Labeling Kit (Fermentas) according to the manufacturer's instructions. Following the labelling, the sample was purified on a column of QIAquick Gel Extraction Kit (Qiagen) in a final volume of 50 μL. FISH-TSA were carried out essentially according to Krylov et al. [30] with minor modifications. Photographs were taken using a fluorescence microscope (Olympus BX-40) and a SONY EXwave HAD black and white camera. Images were processed and coloured using the ACC Program, v.5.0 (SOFO, Brno, Czech Republic).

Biometric Data of Breeders and
Progeny. The analysis of parentage revealed that a total of 80 fishes, 47 males and 33 females, from 131 breeders, contributed to the progeny (exclusion power > 0.999 and an estimated probability of identity of 2.13 · 10 −31 ). A total of 108 families of full and half-siblings, were represented in the sample of 376 offspring samples, with an average of 3.2 (±3.15) descendents per full siblings family. Table 3 summarizes biometric data (weight, length, and condition index) from breeder and offspring in the different sampled groups. Because of the biology of S. aurata, a clear size difference can be observed between male and female 6 The Scientific World Journal breeders; all individuals are males at the end of their first year of life, and at approximately two years of age, most individuals become females. In our study, all the offspring sampled were male. In the breeder group, the 83 females sampled presented mean weight and length values higher than those of the 48 males. The coefficients of variation (CV) can be deduced from Table 3, and for weight, CV is approximately 3 times the values obtained for L and for C.

3.2.
Polymorphisms of the Five Candidate Genes. We studied five candidate genes for growth rate in S. aurata and tested for association with three quantitative traits (body weight, fork length, and condition factor) at different stages of development: breeders, juveniles, and adults. To find usable RFLP markers in the proposed candidate genes, 11 gene fragments, covering more than 13000 bp, were amplified and sequenced. Several enzymes were chosen to test for usable polymorphism. Four of the 11 fragments studied (GH B, IGF-I B, SL A, and SL B) were nonpolymorphic, but seven exhibited polymorphism of size and/or restriction sites, as shown in Table 1 (RFLP patterns obtained in PRL 3 4 for HaeIII and FokI were identical, and therefore only one of them (FokI) was considered in the ANOVA analysis).

One-Tail ANOVA Analysis of the Association between Genotypes of the Different Age-Groups, of the Five Candidate
Genes and Phenotypic Traits. When the ANOVA test was applied, several putative associations between phenotypic traits (W, L, and C) and genetic marker genotypes were found (Table 4). However, as we tested seven independent hypotheses (seven RFLPs) in each age group the corrected value for α was considerably smaller (α Bonferroni = 0.05/7 = 0.00714). Under this α-correction for the ANOVA test, six putative associations between markers and traits remained significant: five of the six were associated with the MSTN1 RFLP (in some of the breeders, adults and juvenile groups) ( Table 4). After Bonferroni corrections in the Kruskal-Wallis nonparametric tests (applied to W and C in the juveniles group, and C in the A635 adults), two putative MSTN1-RFLP associations (both in breeders for W and L) remained significant (MSTN1 A-HaeIII). This marker accounted for 5.8% and 6.2% of the phenotypic variance in W-B and L-B, respectively.
Size differences between males and females are a characteristic of this species due to its protrandrous hermaphroditism, by which individuals mature first as males and later change to females. Because of this difference, the study of associations between genotypes and quantitative characteristics in breeders may be affected by sex. When a GLM was used to test for association between genotypic marker and the three quantitative traits (W, L and C) in breeders with gender factor (data not shown), the association between MSTN1 A-HaeIII and weight/length traits did not remain significant (P W = 0.084 and P L = 0.061).

Pairwise Analysis of the Association between Genotypes, Five Candidate Genes, and Size Classes in Each Age Group.
To determine the distribution of genotypes between pairs of size classes for each age group, an analysis using contingency tables was made. Pairwise tests revealed several RFLPs with significant heterogeneity of genotypes among size-group pairs in juveniles, again most of them at the MSTN-1 gene (P ranged from 0.002 to 0.017), and one significant P value (P = 0.019) in the A819 adult group at the IGF-I gene ( Table 5). As numerous comparisons were performed for each age group, Bonferroni's correction was applied. Under these corrected values of α, the comparison of the smalllarge groups (S-L), in the RFLP MSTN1 B-HaeIII in the J-S group, remained significant. With this new distribution of individuals, in function of their size, an ANOVA analysis was performed for the small juveniles (S-J120); it was found that, for the weight, there was a significant association (P = 0.003) with the RFLP MSTN1 B-HaeIII, confirming the putative relationship of MSTN-1 with growth traits obtained in the various statistical treatments reflected in Tables 4 and 5.
To assess whether heterozygosity at the MSTN-1 gene influences any of the traits studied, the genotypes were grouped by homozygous/heterozygous type, and this new factor was used in an analysis of variance (data not shown). It was observed that again, for weight of both breeders (MSTN1 A-HaeIII) and juveniles (MSTN1 B-HaeIII), the homozygous genotypes were larger (P < 0.05) than the heterozygous.
When we tested for the existence of linkage disequilibrium between pairs of loci, our results revealed that, curiously, MSTN1 A-DraI were in linkage disequilibrium with both PRL-1-BspLU11I (P ≤ 0.05) and PRL-1-EcoRV (P ≤ 0.05) (Fisher's exact test), which implies a statistically significant linkage disequilibrium between these two genes, MSTN1 and PRL.

Chromosome
Location of the MSTN-1 Gene. FISH-TSA experiments on S. aurata chromosomes have led us to localize the MSTN-1 gene of S. aurata in the centromeric region of two medium-sized acrocentric chromosomes (Figure 3(a)). Signal position was evaluated in 194 metaphase spreads, of which 65.9% displayed a specific signal. The remaining metaphase spreads did not show any FISH signal, apart from a few metaphase spreads (8.7%) that presented four possible signals; that second signal also appears in a centromeric region (Figure 3(b)).

Discussion
The infinitesimal model, which assumes that for any particular quantitative trait the individual effects of genes are unknown and usually are only small in their magnitude, does not fully explain the observed patterns of genetic variance. On the contrary, for many traits, allelic variation at individual genes has been shown to have a major influence on overall phenotypic expression [11]. Consequently, there has been substantial investment in livestock sciences directed towards the detection of major/candidate genes differentially influencing trait expression. Taking these factors into account, this study was conducted to obtain more practical results that are closely related to conditions in fish farming and to the biology of the species. The analysis of polymorphism Table 4: Analysis of variance (ANOVA) for weight, length, and condition index within genotypes for different age groups in S. aurata. Breeders are considered without respect to gender.  The Scientific World Journal Table 5: Probability values (P) for exact tests for homogeneity of RFLP genotype frequencies in candidate genes between analyzed groups of S. aurata.   in eleven DNA fragments, belonging to five candidate genes, allowed us to use seven PCR-RFLPs to test for association with three quantitative traits (W, L, and C). The majority of the associations found, after several corrections and nonparametric tests (when necessary), were associated with the MSTN-1 gene. In breeders, when ANOVAs were performed considering gender as a factor (data not shown), no significant values were obtained. The considerable weight lost at spawning in females may explain the effect of transition from male to female on length and weight traits. Pairwise tests revealed significant heterogeneity of genotypes among size groups in one RFLP marker (MSTN1 B-HaeIII) ( Table 5) and confirmed the putative relationship of MSTN-1 with growth traits observed in the ANOVA tests. When we tested for association between homozygous/heterozygous genotypes of the MSTN-1 gene and phenotype traits, the results suggest selection at the genotype level, with homozygotes being favoured at certain ages for certain alleles present in the population. In order to test rigorously and confirm the putative associations uncovered in this study, cross-test family-based analysis should be applied, under similar conditions of industrial production.

Locus
In this study, myostatin polymorphisms were putatively linked with growth traits in S. aurata under industrial production conditions. An association between this candidate gene and growth is not surprising because myostatin acts as a negative regulator of skeletal muscle growth [31], and many studies show that myostatin can affect muscle mass in several species [32][33][34], and the presence in the population of favourable alleles would result in greater growth of those individuals that were homozygous for growth-increasing alleles. Unlike terrestrial vertebrates, piscine MSTN mRNA has been detected in several tissues including muscle, gill, skin, brain, renal, and gonadal tissue [35]; this pattern of expression suggests that MSTN is involved in numerous physiological activities such as osmoregulation, gonadal development, growth, and maturation.
No significant correlations of molecular variants and quantitative traits were revealed in the present study at the other candidate genes analyzed. However, genes within the somatotropic axis and transforming growth factor superfamily have been shown to influence growth variation in terrestrial livestock, as well as in other vertebrates [11].
We tested the null hypothesis that genotypes at one locus are independent of genotypes at the other locus, for each locus pair across the population. A locus pair located at the same gene (e.g., introns) should be in linkage disequilibrium; the results show a significant linkage disequilibrium between MSTN-1 and PRL. However, not all of them revealed this expected disequilibrium across all age groups. This is because several or many families were present in each group, and different genotypic combinations could be observed. The finding is remarkable because it indicates that genotypes of two RFLP markers (located in MSTN-1 and PRL genes) are not independent. If these loci are located on the same chromosome, they can be selected at the same time. If they are not linked, it means that some genotypic combinations of the two loci are favourable (or are selected against) under some specific conditions and they can be used in marker-assisted selection (MAS), too. Sarropoulou et al. [36] localized the S. aurata growth hormone and prolactin genes on radiation hybrid group RH24 of the gene-based radiation hybrid map of gilthead seabream. This location is very interesting because our results suggest linkage disequilibrium between PRL and MSTN-1 genes; therefore, if this disequilibrium is because the two genes are linked in the genome, a linkage between three candidate genes, PRL, MSTN-1, and GH, for putative growth-related QTLs would have been revealed.
In relation to cytogenetic mapping, the application of fluorescence in situ hybridization (FISH) to fish genetics and genomics has been widely tested (reviewed in [37]). Chromosome mapping of single copy genes is useful in isolating QTL of importance in aquaculture, and it is the first step towards linking physical and linkage maps. In fishes, to date, only one study, on D. rerio, has been found in which the MSTN-1 gene was localized on chromosome 9, close to the marker Z8363 [38]. The fact that some metaphases, in our study, showed four signals may be due to the presence of two MSTN genes in S. aurata. The relative similarity at the nucleotide level of the two myostatin cDNA genes (72%) could be enough to find some metaphase spreads with four signals. In the light of the results, the two genes (MSTN-1 and MSTN-2) are located on different chromosomes, as was found in zebrafish, where the MSTN-2 gene was situated on chromosome 22 (Zv7 of the zebrafish genome, NCBI data base). Our results show that the genotypes of MSTN-1 gene should be considered for selective improvement programmes of S. aurata and could be a start point in order to detect candidate genes for growth in other aquatic species of economic interest.