Clonal Identification Based on Quantitative, Codominant, and Dominant Marker Data: A Comparative Analysis of Selected Willow (Salix L.) Clones

Clonal identification in forestry may employ different means, each with unique advantages. A comparative evaluation of different approaches is reported. Nine quantitative leaf morphometric parameters, 15 variable codominant (isoenzyme) loci, and 15 variable dominant (RAPD) loci were used. All clones presented unique multilocus isoenzyme genotypes and 86% presented unique multilocus RAPD genotypes. Quantitative, isoenzyme, and molecular data were subjected to principal component analysis, the latter two data sets after vector transformation. Most of the variability (quantitative 99%, isoenzyme 72.5%, RAPD 89%) was accounted for in the first three axes. This study has shown that (1) individual quantitative parameters were inefficient for clonal identification, (2) multilocus clonal identification was successful, (3) dominant markers were more polymorphic than codominant ones (1.5 variable loci per enzyme system versus 7.5 variable RAPD loci per primer), (4) 15 codominant marker loci could identify about 2.8 times more individuals than 15 dominant ones, but this advantage is surpassed when 42 dominant loci are employed, and (5) multivariate analysis of morphological, codominant, and dominant genetic data could not discriminate at the clonal level. It was concluded that, owing to the higher number of loci available, dominant markers perform better than codominant ones, despite the higher informativeness of the latter.


Introduction
Clonal identification in forest trees has traditionally been based on morphological and phenological traits [1,2]. Morphological characterization is a common step in plant breeding and offers a fast method to identify and characterize germplasm. Nevertheless, phenotypic characteristics are influenced by environmental or physiological factors that may increase variability in trait scoring and thus lower reliability. The employment of genetic markers that present Mendelian inheritance has provided a more precise identification of plant genotypes and shows great potential for the characterization of economically important cultivars (for a detailed discussion on method and marker choice see [3]). In practice, however, the choice of clonal identification method is based not only on the scientific merits of traits or markers but also on infrastructure and financial resources.
Willows offer an excellent means to test different clonal identification approaches. Intensive culture of fast growing species in feedstock plantations for wood, biomass, and energy has been established as a worldwide valid alternative to classical forestry, leading to the selection, mass production, and utilization of a large number of fast growing clones, hybrids, and varieties of willows [4]. Willows are at the early stages of domestication, hence large-scale clonal evaluation and distribution is essential for tree improvement activities. Such tasks require proper identification of genetic entries [5]. Traditionally, clones have been described based on their morphology and phenology [1,2]. However, morphological markers have been criticized as time-consuming and subject to environmental and ontogenetic effects in the Salicaceae [6], while results have shown the significant discriminative ability of genetic markers in clonal identification [5,7].
Comparisons of genetic markers for diversity measurements have been carried out in a number of plant species [8-11] including poplars [12]. However, studies directly evaluating different approaches for clonal identification in perennial woody plants based on exactly the same plant material, numbers of loci, and methodology of statistical analysis are scarce. Transcribed codominant isoenzymes and anonymous dominant RAPDs offer the opportunity to test markers with different properties for fingerprinting. Isoenzymes, as functional gene products, present a simple and reliable system for clonal identification, albeit of limited polymorphism reflecting a low mutation rate. RAPDs, which are nonfunctional DNA sequences representing different levels of DNA variability from single-base changes to insertions and deletions, do not depend on sequence information and reveal high numbers of loci, but of relatively lower informativeness due to their dominance. It should be noted that classification and organization of clones can be based on a wide array of morphological, biochemical, or molecular descriptors, all of which present some degree of genetic control. No marker is superior to all others for a wide application range [3,13]. Hence, no marker may be viewed in a vacuum and its true value can be evaluated only in comparisons based on the same genetic material [14].
In this paper willow clones selected for biomass production are characterized based on quantitative, biochemical, and molecular means, and a formal comparative analysis of the different identification approaches is presented. Few studies have compared the diversity generated by molecular markers to that revealed by morphometric traits [15-17]. A better understanding of the effectiveness of different molecular markers is considered a priority step toward germplasm characterization and classification and a prerequisite for more effective breeding programs [8].

Materials and Methods
Seven willow clones (three Salix eriocephala, two S. exigua, one S. eriocephala x exigua, and one S. exigua x eriocephala; Table 1) representing material collected from unrelated natural populations of Ontario, Canada, and pedigree material, which originated from the breeding program of the Forest Genetics Laboratory, Faculty of Forestry, University of Toronto (FGL-UT), were employed in this study. All the material has been growing in a clonal trial under the same conditions and treatments [4].
Seven leaf morphology parameters (leaf length, leaf width, petiole length, distance from leaf base to leaf widest point, number of teeth per centimeter, stipule length, and stipule width) were recorded to the nearest mm in 10 fully expanded leaves per clone (Table 2). Two additional covariables, leaf and stipule indices defined as the ratio of respective length to width, were also employed. These indices, which are independent shape variables, have been used extensively in leaf morphometrics [18,19]. Data normality was tested by means of the Shapiro-Wilk statistic, and transformations were performed when criteria were not met. Product moment correlation coefficients provided a measure for the elimination of highly correlated variables. Principal component analysis (PCA) was employed in order to examine the simultaneous contribution of all leaf parameters in discriminating between clones and to observe the ordination and grouping of clones in principal space. In the field of numerical taxonomy the use of PCA is prevalent, since it suits the analysis of data in which more than one variable is measured for each individual and it yields a sensible biological interpretation of the outcome [20,21].

Horizontal starch gel electrophoresis was employed in order to resolve isoenzyme markers suitable for clonal identification. Ten enzyme systems were investigated (Table 3). Established methodology and protocols for willows were followed [22,23]. Two random decamer primers (CHL-2 and CHL-4; Table 4) were selected based on their RAPD polymorphism in Salix [7]. Previously reported protocols regarding amplification reactions, electrophoresis, banding pattern visualization, and locus nomenclature were used [7,24]. The particular enzyme systems and primers were chosen because they revealed the same number of variable isoenzyme and RAPD loci, respectively.
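The PCA step applied to the leaf parameters can be illustrated with a minimal sketch. The study itself ran PCA in SAS; the NumPy version below is only an assumed equivalent that works on the trait correlation matrix, and the measurements are randomly generated placeholders, not values from Table 2.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (rows: leaves, columns: traits)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each trait
    R = np.corrcoef(Z, rowvar=False)                  # trait correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)              # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                              # coordinates in principal space
    explained = eigvals / eigvals.sum()               # proportion of variance per axis
    return eigvals, eigvecs, scores, explained

# Hypothetical measurements (mm): leaf length, leaf width, petiole length
rng = np.random.default_rng(0)
X = rng.normal(loc=[90.0, 15.0, 8.0], scale=[10.0, 3.0, 1.5], size=(20, 3))
leaf_index = X[:, 0] / X[:, 1]                        # leaf index: length / width
X = np.column_stack([X, leaf_index])

eigvals, eigvecs, scores, explained = pca_correlation(X)
print("cumulative variance, first three axes:", np.cumsum(explained)[:3])
print("components with eigenvalue > 1.0:", int((eigvals > 1.0).sum()))
```

The cumulative proportion over the first three axes and the count of eigenvalues above 1.0 are the same summaries reported for each data set in the multivariate results.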
Multilocus genotypes were determined, and for the identification of unique genotypes all 42 possible pairwise comparisons between any two clones were investigated for both markers. The number of loci at which unique multilocus genotypes and clones differed from each other was determined. The genotypes of individual clones at each of the variable loci were coded after vector transformation [25]. The coded genotypic data were investigated by PCA using SAS software [26] and defining individual clones as classes. Clonal ordination and associations on the first three principal components were determined.
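The pairwise comparison of multilocus genotypes can be sketched as follows. The clone names and scores are hypothetical, and the helper functions are illustrative only; they are not taken from the protocols cited above.

```python
from itertools import combinations

def count_differences(g1, g2):
    """Number of loci at which two multilocus genotypes differ."""
    return sum(a != b for a, b in zip(g1, g2))

def pairwise_differences(genotypes):
    """Map each unordered clone pair to its number of differing loci."""
    return {
        (c1, c2): count_differences(genotypes[c1], genotypes[c2])
        for c1, c2 in combinations(sorted(genotypes), 2)
    }

# Hypothetical presence/absence scores at five dominant loci
genotypes = {
    "cloneA": (1, 0, 1, 1, 0),
    "cloneB": (1, 0, 1, 1, 0),  # same multilocus genotype as cloneA
    "cloneC": (0, 1, 1, 0, 0),
}

diffs = pairwise_differences(genotypes)
indistinguishable = [pair for pair, d in diffs.items() if d == 0]
mean_diff = sum(diffs.values()) / len(diffs)

# Seven clones give 21 unordered pairs per marker system
n_pairs_seven_clones = len(list(combinations(range(7), 2)))
```

Seven clones yield 21 unordered pairs per marker system, which would be consistent with the 42 comparisons reported if the two marker systems are counted together.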
The visualization of loci within one enzyme system and the PCR product amplification with one random primer were each regarded as one assay unit. Fourteen criteria were used in order to compare the feasibility of codominant versus dominant markers for clonal identification: twelve reported in the relevant literature [8,27-31] and, in addition, information capacity ($I_C$), developed as $I_C = n_L P N_G$ ($N_G$: number of potential genotypes based on the mode of gene action, taking the value $N_G = n_A(n_A + 1)/2$ for codominant and $N_G = n_A$ for dominant markers), as well as information capacity per assay unit ($I_{C/U}$), developed as $I_{C/U} = (n_L/U) P N_G$ (Tables 5 and 6). Use of these criteria is based on the assumptions that there is no linkage disequilibrium among loci, and that artificial selection of elite clones is independent of isoenzyme or RAPD locus frequencies.
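The two information capacity criteria can be written out as a short sketch. The reading of the symbols is an assumption here, since the passage defines only N_G explicitly: n_L is taken as the number of loci, P as the proportion of polymorphic loci, n_A as the number of alleles per locus, and U as the number of assay units. The input values are hypothetical and are not those of Tables 5 and 6.

```python
def potential_genotypes(n_alleles, codominant):
    """N_G: n_A(n_A + 1)/2 for codominant markers, n_A for dominant markers."""
    if codominant:
        return n_alleles * (n_alleles + 1) // 2
    return n_alleles

def information_capacity(n_loci, p, n_genotypes):
    """I_C = n_L * P * N_G (symbol meanings assumed, see lead-in)."""
    return n_loci * p * n_genotypes

def information_capacity_per_unit(n_loci, p, n_genotypes, n_units):
    """I_C/U = (n_L / U) * P * N_G."""
    return (n_loci / n_units) * p * n_genotypes

# Hypothetical inputs, for illustration only
ng_codominant = potential_genotypes(3, codominant=True)
ng_dominant = potential_genotypes(2, codominant=False)
ic = information_capacity(10, 0.8, ng_codominant)
ic_per_unit = information_capacity_per_unit(10, 0.8, ng_codominant, n_units=5)
```

Under these definitions the per-assay-unit value is simply the total information capacity divided by the number of assay units, which is why a marker needing few assay units can rank higher per unit even when its total capacity is lower.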

Results

Morphometrics.
Descriptive statistics for the leaf morphometric parameters indicated that leaf length, leaf width, and petiole length characters were able to differentiate at the species level, while values of interspecific hybrids were generally similar to those of the maternal parent species (Table 2). Species differences were exemplified in the case of the stipule characters. However, within-species clonal identification could not be achieved.

Codominant and Dominant Genetic Markers.
A total of 19 loci (43 alleles) were scored in the 10 enzyme systems studied. These loci, which present Mendelian inheritance and codominant allelic expression in S. eriocephala and S. exigua [22,23], can be employed directly in clonal identification. The interclonal isoenzyme variability was controlled by 15 loci (Table 3). All clones presented unique multilocus genotypes. Clones were compared in pairwise tests, which showed genetic differences between any two clones at seven to 14 loci. Any two clones differed on average at 10.7 loci. Ten-locus genotypes were adequate for discriminating 42% of the studied clonal pairwise comparisons, while 12-locus genotypes were adequate for discriminating about 88% of the studied pairwise comparisons.

Seventeen stable and repeatable RAPD loci were scored in a fragment size range of 220 to 1830 bp (Table 4). Interclonal molecular genetic variability was controlled by 15 loci covering the observed fragment sizes. Pairwise tests for the identification of unique multilocus genotypes showed that the number of differences among clones ranged from zero to 13. The average number of genetic differences between any two clones was 7.70 loci. Nine-locus genotypes were sufficient for discriminating 48% of the clones, while 12-locus genotypes were enough for discriminating 92% of the clones.

Levels of Marker Polymorphism and Discriminating Capacity.
The number of polymorphic loci was set to be the same, providing a means of marker comparison with regard to the number of assay units. Isoenzymes needed five times more assay units than RAPDs to reach the same number of variable loci (Table 5), while the number of loci per assay unit was 1.9 for isoenzymes and 8.5 for RAPDs. The total number of bands (Table 5) did not correlate with the fraction of polymorphic loci, as polymorphism was higher in RAPDs (Table 6). The number of bands per assay unit was also not correlated with the total number of bands and it demonstrated the difference in marker abundance, being five times higher in RAPDs than in isoenzymes (Table 5). The number of genotypes that could potentially be identified with each marker system indicated that, for the same number of variable loci, isoenzymes could identify 83 genotypes compared to 30 for RAPDs (Table 5).
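The ratios quoted in this comparison follow directly from counts given in the text (19 isoenzyme loci in 10 enzyme systems, 17 RAPD loci from two primers, 83 versus 30 potentially identifiable genotypes). The short check below only reproduces that arithmetic; the dictionary layout is an illustrative choice.

```python
# Counts as reported in the text
isoenzyme = {"loci": 19, "assay_units": 10, "potential_genotypes": 83}
rapd = {"loci": 17, "assay_units": 2, "potential_genotypes": 30}

def loci_per_assay_unit(marker):
    """Scored loci divided by the number of assay units used."""
    return marker["loci"] / marker["assay_units"]

iso_lpu = loci_per_assay_unit(isoenzyme)    # loci per enzyme system
rapd_lpu = loci_per_assay_unit(rapd)        # loci per primer
unit_ratio = isoenzyme["assay_units"] / rapd["assay_units"]
lpu_ratio = rapd_lpu / iso_lpu
genotype_ratio = isoenzyme["potential_genotypes"] / rapd["potential_genotypes"]

print(f"loci per assay unit: isoenzymes {iso_lpu:.1f}, RAPDs {rapd_lpu:.1f}")
print(f"assay units needed, isoenzymes vs RAPDs: {unit_ratio:.0f}x")
print(f"loci per assay unit, RAPDs vs isoenzymes: {lpu_ratio:.2f}x")
print(f"potential genotypes, isoenzymes vs RAPDs: {genotype_ratio:.2f}x")
```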

Comparison of Informativeness.
Although isoenzymes showed a lower percentage of polymorphic loci, their average number of alleles per locus was higher (but only 1.5 times higher). Isoenzymes also exhibited a higher value for the effective number of alleles per locus, which was 1.99 compared to 1.79 for RAPDs (Table 6). This difference was more evident in the total number of effective alleles, which was 32.8 for isoenzymes and 26.9 for RAPDs (Table 6). Polymorphic information content was also slightly higher for the former marker system. The information capacity criterion produced interesting results when the total number of available variable loci was considered. Information capacity was higher for isoenzymes than for RAPDs (Table 6). However, when information capacity was considered on a per-assay-unit basis, RAPDs emerged as far better than isoenzymes ($I_{C/U}$ = 6.57 for isoenzymes and 14.99 for RAPDs; Table 6).
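For readers unfamiliar with the informativeness measures, the sketch below computes the effective number of alleles and the polymorphic information content for a single locus. The text does not state which formulas were used, so the commonly used definitions (the reciprocal of the sum of squared allele frequencies, and the Botstein-type PIC) are an assumption, and the allele frequencies are hypothetical.

```python
from itertools import combinations

def effective_alleles(freqs):
    """Effective number of alleles: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term

# Hypothetical allele frequencies at one locus
two_equal_alleles = [0.5, 0.5]
three_alleles = [0.6, 0.3, 0.1]

ne_two = effective_alleles(two_equal_alleles)
pic_two = pic(two_equal_alleles)
ne_three = effective_alleles(three_alleles)
pic_three = pic(three_alleles)
```

Both measures reach their maximum for a given number of alleles when the alleles are equally frequent, which is why a locus with more alleles is not automatically more informative.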

Multivariate Analyses of the Morphometric, Codominant, and Dominant Markers Data Sets.
In the PCA of the quantitative data (Table 7) the major proportion of variation (99%) was accounted for by the first three principal axes. The first two principal components had eigenvalues greater than 1.0 (Table 7). All components were bipolar and can be interpreted as shape components [20]. The first principal component separated the S. exigua clones from the rest of the clonal material (Figure 1) and was dominated by the highest absolute eigenvector coefficients for the teeth per cm and stipule index variables. The second principal component, which was characterized by the high absolute loading of the distance from leaf base to leaf widest point variable, separated the S. exigua x eriocephala clone from the rest of the clones. The third component, which accounted for a low portion of total variability (4%), achieved some further separation of within-group clones (Figure 1).

Regarding codominant marker data, a smaller amount of clonal variability was resolved in low multidimensional space. The first five principal components had eigenvalues above 1.0 and accounted for 96% of the total variability. Clonal relationships are depicted by their ordination on the first three principal component axes, which explain 73% of the total variation (Figure 2). The first component separated the clones into three groups: S. eriocephala, hybrids, and S. exigua clones with the inclusion of clone S289 (S. eriocephala). This component was bipolar and characterized by the highest absolute eigenvector coefficient of isoenzyme locus Per-2. The second component, which was characterized by the high loading of Pgi-2, further separated the S289 clone from the S. exigua group. The third component (20.1%) contributed to further within-group separation (Figure 2).

The PCA of the dominant marker multilocus data set resulted in the resolution of a significant amount of clonal variation in low principal space; in particular, 89% of the variability was resolved in the first three axes (Table 7, Figure 3), while the first four eigenvalues were above 1.0. The first principal component separated the S. exigua and the S. exigua x eriocephala clones from the rest (Figure 3). It was characterized by the highest negative and positive eigenvector coefficients of the RAPD loci CHL-2$_{1286}$ and CHL-4$_{984}$, respectively. The second component separated the S. exigua x eriocephala hybrid from the S. exigua clones (Figure 3). This component was characterized by the highest absolute loading of the RAPD locus CHL-4$_{308}$. The third component accounted for a generally low portion of total variability (8%) and resulted in some further within-group separation (Figure 3).

Discussion
The three approaches differed in their capability to identify clones. Leaf quantitative traits, while managing to separate at the species level, were unsuitable for within-species or among-hybrids clonal identification by simple univariate statistical analysis. The efficiency of molecular markers depends upon the amount of polymorphism and informativeness they can detect among the genotypes under investigation. This is reflected in the balance between the level of polymorphism and the capacity of a marker to identify multiple polymorphisms [28]. All clones presented unique multilocus isoenzyme genotypes and 86% of the clones presented unique multilocus RAPD genotypes. There was a single case of no differences among the multilocus genotypes of clones S259 and S289, which are both S. eriocephala clones. These clones were verified as being genetically different by the isoenzyme analysis (differences in 9 loci). It is clear that the use of a low number of dominant RAPD loci may be inadequate for identifying within-species clonal variation. Nevertheless, 12 loci were enough in both cases to discriminate approximately 90% of the clones. The general success of clonal identification is in agreement with pertinent literature regarding isoenzymes [32] and molecular markers [5-7,33] in the Salicaceae. This is an important result for Salix germplasm management, as there will be an urgent need for the identification of a high number of clones in the future.
RAPDs were notably more variable than isoenzymes, as the number of loci per assay unit was almost 4.5 times higher, and there were 1.5 variable loci per enzyme system compared to 7.5 RAPD loci per primer. Considering the same number of variable loci, isoenzymes could identify about 2.8 times more genotypes than RAPDs. Nevertheless, 42 variable loci are enough for RAPDs to exceed the discriminating ability of isoenzymes (considering a realistic value of 15 variable isoenzyme loci [22,23]), a number perfectly attainable for RAPDs [34]. Only when few clones are used in a particular application may isoenzymes be a better choice than RAPDs, considering the low cost and high reproducibility of the former. In a similar study RAPDs were found to be more efficient than isoenzymes; however, this result was highly confounded by the lack of sufficient isoenzyme variability in the Solanum accessions that were studied [14].
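The threshold of 42 dominant loci can be reproduced with simple arithmetic from the genotype counts reported earlier (83 for 15 variable isoenzyme loci, 30 for 15 variable RAPD loci). The text does not show the derivation, so the sketch assumes the number of potentially identifiable genotypes grows linearly with the number of dominant loci.

```python
import math

iso_genotypes = 83    # reported for 15 variable isoenzyme loci
rapd_genotypes = 30   # reported for 15 variable RAPD loci
rapd_loci = 15

# Assumption: each dominant locus contributes the same number of classes
classes_per_locus = rapd_genotypes / rapd_loci

# Smallest whole number of dominant loci whose total exceeds the isoenzyme count
min_rapd_loci = math.floor(iso_genotypes / classes_per_locus) + 1

print(f"classes per dominant locus: {classes_per_locus:.1f}")
print(f"dominant loci needed to exceed {iso_genotypes} genotypes: {min_rapd_loci}")
```

Under this assumption each dominant locus contributes two classes (band present or absent), so 41 loci give 82 and 42 loci give 84, the first value above 83.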
RAPDs were more polymorphic than isoenzymes for the same number of loci, a result also observed in poplars [7]. However, isoenzymes had a higher average and effective number of alleles per locus, a higher total number of effective alleles, and a slightly higher PIC value. For the same low number of loci, isoenzymes had a higher efficiency in discriminating clones than RAPDs regarding average differences between any two clones and percent of unique multilocus genotypes. This was also reflected in the information capacity, which was almost 2.5 times higher in isoenzymes. However, when information capacity was considered on a per-assay-unit basis, RAPDs emerged as about 2.3 times better than isoenzymes. Considering that 15 loci is the upper limit of available isoenzyme polymorphism in Salix, 38 variable RAPD loci are enough to surpass the information content of isoenzymes. Marker polymorphism evidently reflects the molecular properties of the particular markers [3,13]. Clearly, codominant isoenzymes inherently possess a higher information content, but this advantage is moderated by lower polymorphism and is lost when it is compared with dominant markers on a per-assay-unit basis. In general, the apparent advantage of the codominant isoenzymes in discriminating capacity when few loci are considered is surpassed by the high numbers of loci associated with dominant RAPDs.
These results, while referring to isoenzyme and RAPD data, are relevant to codominant SSRs and dominant AFLPs as well. In fact, taking into account the SSR values for number of loci, percent polymorphism, and number of alleles reported in the literature [35,36] and estimating information capacity, it appears that on average three SSR loci are adequate to surpass the information capacity of isoenzymes (at 15 variable loci) and eight SSRs are adequate to surpass the information capacity of RAPDs (at 99 variable loci). On the other hand, it is realistic to expect to reveal about 1000 polymorphic fragments in Salix AFLP analysis [37]; hence 88 SSR primers would be needed to exceed the AFLP information capacity, a number not yet available to the author's knowledge.
Principal component analysis managed to provide a clear separation of species and hybrids in low multidimensional space. Despite small differences, general congruence among the different data sets was observed. General concordance of results obtained with different sets of quantitative, biochemical, and molecular markers has generally been reported [14,38,39]. Ordination in principal space depicted the relationships among the sampled clones and resulted in the formation of loosely defined groups corresponding to the S. eriocephala and S. eriocephala x exigua clones and to the S. exigua and S. exigua x eriocephala clones. PCA of the quantitative data was very efficient in resolving variation and portraying clonal relationships. This result should be attributed to the use of the stipule parameters, which form a distinctive character among S. eriocephala and S. exigua [40]. The high loading of stipule index in the first principal component supports this notion.
Multilocus genotyping of Salix clones can have important practical applications in breeding and operational activities such as clonal certification and the handling and distribution of clonal stock. The method of clonal identification by genetic markers is objective, versatile, reliable, fast, and accurate. Sampling is always nondestructive, and genotyping can be achieved by harvesting tissue from greenhouse-rooted cuttings throughout the year. In contrast, quantitative criteria are not completely penetrant and have to be measured during the growing period on young or adult trees. As the number of genetically related clones selected for commercial applications becomes large, their morphological discrimination will be difficult, since hybrid clones may exhibit convergence on a few or many of the characters employed in identification. The choice of genetic marker for clonal identification should be based on polymorphism, information content, and the number of clones to be fingerprinted. AFLPs and SSRs have obvious advantages over isoenzymes and RAPDs; however, the latter two marker systems will remain useful under financial or infrastructure constraints. In the choice of dominant versus codominant markers, it seems that, at least in Salix, the high numbers of dominant markers available overwhelm the information content advantages of codominant genetic markers.

Table 1: List of the Salix clones studied and their origin.

Table 2: Means and standard deviations (in parentheses) of seven morphological characters studied in Salix clones selected for biomass production (LL: leaf length, LW: leaf width, PL: petiole length, BW: distance from leaf base to leaf widest point, TC: teeth per cm, SL: stipule length, and SW: stipule width).

Table 3: Enzymes investigated, their enzyme commission number (E.C. No.), abbreviation, and numbers of scored loci, variable loci, and alleles.

Table 4: Random primers investigated, their sequence, numbers of RAPD loci identified, range of RAPD loci per individual, and numbers of variable loci.

Table 5: Polymorphism levels and comparison of discriminating capacity of codominant (isoenzyme) and dominant (RAPD) markers.

Table 7: Principal component analysis of morphological, isoenzyme, and molecular data of Salix selected clones: eigenvalues and cumulative percent of variation explained by the first three principal components.