Retrotransposon-Based Molecular Markers for Analysis of Genetic Diversity within the Genus Linum

BioMed Research International

The SSAP method was used to study the genetic diversity of 22 Linum species from sections Linum, Adenolinum, Dasylinum, and Stellerolinum, and of 46 flax cultivars. All the studied flax varieties were distinguished using SSAP for retrotransposons FL9 and FL11. Thus, the validity of the SSAP method was demonstrated for flax marking, identification of accessions in genebank collections, and control during propagation of flax varieties. Polymorphism of Fl1a, Fl1b, and Cassandra insertions was very low in flax varieties, but these retrotransposons were successfully used for the investigation of Linum species. Species clustering based on SSAP markers was in concordance with their taxonomic division into sections Dasylinum, Stellerolinum, Adenolinum, and Linum. All species of sect. Adenolinum clustered apart from species of sect. Linum. The data confirmed the validity of the separation into these sections. Members of section Linum are not as closely related as members of other sections, so taxonomic revision of this section is desirable. L. usitatissimum accessions genetically distant from modern flax cultivars were revealed in our work. These accessions are of utmost interest for flax breeding and introduction of new useful traits into flax cultivars. The chromosome localization of the Cassandra retrotransposon in Linum species was determined.

Introduction
The genus Linum comprises about 200 species which are distributed throughout the temperate and subtropical regions of the world. The genus is subdivided by Ockendon and Walters into five sections: Linum, Dasylinum (Planch.) Juz., Linastrum (Planchon) Bentham, Syllinum Griseb., and Cathartolinum (Reichenb.) Griseb. [1]. Some taxonomists have moved the members of the L. perenne group from section Linum to an independent section Adenolinum (Reichenb.) Juz. [2,3]. The species L. stelleroides (Planch.), distributed in the Far East and China, was placed by Yuzepchuk [2] in a monotypic section Stellerolinum Juz. ex Prob. Phylogenetic analyses based on chloroplast (ndhF, trnL-F, and trnK 3′ intron) and nuclear ITS (internal transcribed spacer) DNA sequences revealed that the genus Linum was not monophyletic. It contains two major lineages: a yellow-flowered clade (sections Linopsis, Syllinum, and Cathartolinum) and a blue-flowered clade (sections Linum, Dasylinum, and Stellerolinum) [4]. The cultivated flax (L. usitatissimum L.) belongs to sect. Linum of the blue-flowered clade. L. usitatissimum is believed to have originated from the domestication of the wild species L. angustifolium Huds. approximately 8000 years ago [5][6][7][8]. Flax has long been cultivated as a dual-purpose crop grown for its fiber and linseed oil.
The taxonomy of the genus cannot be considered finally established because the phylogenetic relationships between the individual taxa have not been sufficiently investigated. The phylogeny of species of the genus Linum was previously studied using molecular and cytogenetic approaches [4,12-17], but some problems remain to be solved.
Transposon-based molecular markers are successfully used in phylogenetic studies. Transposable elements were shown to influence changes in genomic structure as well as in transcriptional regulation during evolution [18,19]. The presence of transposons in various plant species, their high integration activity, conserved sequences, and large copy numbers encouraged the use of transposons in studies of genetic diversity and in the profiling of plant varieties [20][21][22]. Several molecular marker systems based on the available sequences of transposable elements were developed for plants [20,22-27]. The SSAP (sequence-specific amplified polymorphism) method was shown to have a number of advantages over other marker systems. It produces many polymorphic fragments and allows differentiation of most samples using only a single combination of specific primers [23,28-30]. Various plants were successfully studied by SSAP analysis, but the method has not yet been applied to species of the genus Linum. Flax sequences have only recently appeared in databases [31][32][33][34], which has made it possible to develop a marker system based on flax transposable elements for the investigation of cultivated and wild species of the genus Linum.
In this study the SSAP method was used for assessment of genetic diversity. In addition, the applicability of marker-based profiling to the identification of L. usitatissimum varieties was analyzed. We studied 46 varieties of L. usitatissimum, mainly bred in Russia, together with a number of varieties grown in geographically close or distant regions. We also analyzed different types of cultivated flax (fiber, oilseed, large seeded, winter, and dehiscent flax) together with 21 wild species and subspecies from sections Linum, Adenolinum, Dasylinum, and Stellerolinum to estimate the possibility of using the SSAP method for the investigation of flax domestication history and phylogenetic relationships between different taxa of the genus Linum.

Confirmation of Species Determination.
Species determination of some accessions of wild Linum species was done in the course of our earlier cytogenetic investigations [14,17,37]. To confirm the species determination, the rest of the accessions were planted in the ground, and, additionally, chromosome analysis (determination of chromosome number) using acetocarmine staining was performed according to a previously developed approach [14]. Chromosome numbers of all the studied accessions of wild flax are presented in Table 2.

SSAP Analysis.
Genomic polymorphism of different Linum species was studied using the SSAP method [23] with modifications described earlier in genomic studies of wheat [38] and strawberry [39]. Total genomic DNA was extracted from young flax leaves according to Edwards et al. [40] with minor modifications. 30 ng of genomic DNA

Amplification was performed in two stages. At the first stage, only the primer to the LTR of the retrotransposon was used. Amplification was carried out in 25 µL of PCR mix containing 5 µL of ligation mix, 1 U of TrueStart Hot Start Taq DNA polymerase (Thermo Scientific, USA), TrueStart Taq DNA polymerase buffer, 0.5 mM MgCl2, 20 µM dNTP (Thermo Scientific), and 5 pmol of the LTR primer. The amplification program for the first stage was 95 °C for 15 min, 30 cycles (95 °C for 30 s, 62 °C for 1 min, 72 °C for 2 min), and 72 °C for 10 min. At the second stage of amplification, the adapter primer 5′-GTTTACTCGATTCTCAACCCGA-3′ and one of the primers to the LTR region of retrotransposons were used. Amplification was carried out in 25 µL of PCR mix containing 12 µL of the first-stage PCR product, 1 U of Taq DNA polymerase (Thermo Scientific, USA), Taq DNA polymerase buffer, 1.5 mM MgCl2, 200 µM dNTP (Thermo Scientific, USA), 25 pmol of LTR primer, and 25 pmol of the adapter primer. The amplification program for the second stage was 95 °C for 15 min, 35 cycles (95 °C for 30 s, 62 °C for 1 min, 72 °C for 2 min), and 72 °C for 10 min.

The PCR products were separated in a 2.5% agarose gel in electrophoresis buffer and then stained with ethidium bromide. Ten PCR products were excised from the agarose gel and characterized by sequencing on an Applied Biosystems 3730 DNA Analyzer to confirm the specificity of the SSAP reaction. The Bio-Rad Gel Doc system was used for gel documentation and photography as well as for visual detection of the presence or absence of polymorphic fragments in the samples from different accessions. These data were recorded in the form of a binary matrix in which the presence of a fragment was coded as 1 and its absence as 0.
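The 1/0 coding of fragment presence and absence can be illustrated with a minimal Python sketch. The accession names and fragment labels below are invented for illustration only and are not data from this study; the function name `build_binary_matrix` is likewise hypothetical.

```python
def build_binary_matrix(scored_bands):
    """Turn per-accession sets of scored fragments into a 0/1 matrix.

    scored_bands: dict mapping accession name -> set of fragment labels.
    Returns (accessions, fragments, matrix), where matrix[i][j] is 1 if
    accession i shows fragment j and 0 otherwise.
    """
    accessions = sorted(scored_bands)
    # Every fragment seen in at least one accession becomes a column.
    fragments = sorted(set().union(*scored_bands.values()))
    matrix = [
        [1 if frag in scored_bands[acc] else 0 for frag in fragments]
        for acc in accessions
    ]
    return accessions, fragments, matrix


# Hypothetical gel scoring (labels are placeholders, not real SSAP bands).
example = {
    "variety_A": {"f1", "f2", "f4"},
    "variety_B": {"f1", "f3"},
    "variety_C": {"f2", "f3", "f4"},
}
accessions, fragments, matrix = build_binary_matrix(example)
for acc, row in zip(accessions, matrix):
    print(acc, row)
```

Each row of the resulting matrix is one accession's fingerprint, and each column is one scored fragment.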
The genetic distances between varieties were calculated from the binary matrix of amplified fragments using Dice's formula [41]. The dendrograms were constructed using SplitsTree 4.10 software [42]. Cluster analysis was performed using the neighbor-joining method [43], and bootstrap values were determined based on 5000 permutations.

The amplicons were analyzed in a 2% agarose gel and then used as a template for biotin PCR labeling to obtain biotin-labeled probes for FISH. PCR labeling was carried out using the Biotin PCR Labeling Core Kit (Jena Bioscience, Germany) according to the manufacturer's protocol. Labeled PCR products were precipitated with ethanol.
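The distance calculation mentioned in this section (Dice's formula applied to the binary fragment matrix) can be sketched in Python as follows. This is only an illustration under stated assumptions: the distance is taken as one minus the Dice similarity 2a/(2a + b + c), where a is the number of shared fragments and b and c are the fragments unique to each profile; the example profiles are invented; and the neighbor-joining and bootstrap steps, which the study performed in SplitsTree, are not reproduced here.

```python
def dice_distance(x, y):
    """Dice distance between two equal-length 0/1 profiles.

    Computed as 1 - 2a / (n_x + n_y), where a is the number of fragments
    present in both profiles and n_x, n_y are the fragment counts of each.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)
    if total == 0:
        # Assumption: two profiles without any fragment count as identical.
        return 0.0
    return 1.0 - 2.0 * shared / total


def distance_matrix(profiles):
    """Pairwise Dice distances for a list of 0/1 profiles."""
    n = len(profiles)
    return [
        [dice_distance(profiles[i], profiles[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical fingerprints (not data from this study).
profiles = [
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
]
dist = distance_matrix(profiles)
for row in dist:
    print([round(d, 3) for d in row])
```

A matrix of this kind is the usual input for distance-based tree building such as neighbor joining.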

FISH with Cassandra Retrotransposon DNA Probe.
Chromosome preparation was carried out according to the technique developed earlier for plants having small-sized chromosomes [14]. The hybridization mixture contained 2x SSC, 50% formamide, 10% dextran sulphate, and 2 ng/µL of a biotinylated DNA probe of the Cassandra retrotransposon. The probe was hybridized overnight at 31 °C. After hybridization the slides were washed twice with 0.1x SSC at 38 °C for 10 min, followed by two washes with 2x SSC at 44 °C for 5 min and a final 5 min wash in 2x SSC at room temperature. The biotin-labeled DNA probe was detected using a highly sensitive Alexa Fluor 488 Tyramide Signal Amplification system (Invitrogen) according to the manufacturer's instructions.

"Stormont cirrus" was restricted twice, ligated, and amplified with the primers. The obtained PCR products were visualized in acrylamide and agarose gels (Figure 1). Electrophoretic spectra of PCR products obtained with primers 1826, 1838, 1845, 1868, 1886, and 1899 coincided completely, demonstrating high reproducibility of the SSAP results. The fingerprints obtained with primers 1846, 1854, and 1881 varied in individual amplified fragments, so the use of these primers for SSAP analysis of flax varieties will require further optimization of restriction, ligation, or PCR conditions. We selected primers with high reproducibility (1838, 1845, 1868, and 1899) as they yielded PCR products that were easily discernible in an agarose gel. These primers were used for analysis of 46 flax varieties by the SSAP method.

Analysis of SSAP
All the examined varieties produced identical or very similar fingerprints with primers 1838 and 1868 (FL1a,

Analysis of Genetic Diversity of Flax Varieties.
Visual analysis of SSAP fingerprints based on retrotransposons FL11 and FL9 revealed 44 polymorphic retrotransposon insertions (23 fragments for primer 1845 and 21 fragments for primer 1899) in 46 flax varieties. Each of the 46 varieties had its own unique spectrum of retrotransposon insertions, so all 46 varieties could be differentiated using only two SSAP primers. In order to analyze the genetic diversity of these varieties, we compiled a binary matrix of the presence/absence of polymorphic insertions of the above-mentioned retrotransposons, calculated the genetic distances between the varieties using Dice's formula [41], and constructed a dendrogram using the neighbor-joining method (Figure 2). The branching pattern of the obtained tree revealed no distinct clusters among the examined varieties.

All the groups contained at least one group-specific marker. The results of phylogenetic analysis of Linum species are shown in the dendrogram in Figure 4. The dendrogram shows nine clearly distinguished groups of species supported by high bootstrap values.

FISH with Cassandra Retrotransposon DNA Probe.
The highly sensitive tyramide FISH method was applied to investigate the abundance of Cassandra retrotransposons as well as their distribution along chromosomes in three species of sect. Linum (L. usitatissimum, L. grandiflorum, and L. narbonense) and in L. amurense (sect. Adenolinum). FISH revealed that Cassandra was dispersed along the whole length of the chromosomes in the karyotypes of the four studied species, but its distribution along the chromosomes was nonrandom (Figure 5). In species having small-sized chromosomes (L. usitatissimum, L. grandiflorum, and L. amurense), Cassandra was mainly localized in pericentromeric and subtelomeric chromosome regions. The patterns of Cassandra distribution were chromosome specific and were similar in homologous pairs of chromosomes (Figure 5(e)). In L. narbonense, which possessed large chromosomes, the patterns of Cassandra distribution resembled the patterns observed in karyotypes of the above-mentioned species (having small-sized chromosomes), though they were more regular.

In the analyzed flax varieties, 44 polymorphic insertions of the FL11 (primer 1845) and FL9 (primer 1899) retrotransposons were revealed. Every studied variety possessed a unique set of SSAP markers. Therefore, the SSAP method can be used to mark genotypes, to identify varieties of L. usitatissimum in genebank collections, to exercise control during propagation of flax varieties, and to obtain high quality seed material when varietal identity is particularly important.
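The idea that unique marker sets allow variety identification can be sketched in Python as below. This is a simplified illustration, not the procedure used in the study: the reference fingerprints are invented, the function names `all_unique` and `identify` are hypothetical, and exact matching ignores the scoring errors that occur with real gels.

```python
def all_unique(fingerprints):
    """Return True if no two accessions share an identical 0/1 profile."""
    profiles = [tuple(p) for p in fingerprints.values()]
    return len(set(profiles)) == len(profiles)


def identify(sample, fingerprints):
    """Return the names of reference accessions whose profile equals the sample."""
    target = tuple(sample)
    return [name for name, p in fingerprints.items() if tuple(p) == target]


# Hypothetical reference collection (not data from this study).
reference = {
    "variety_A": [1, 1, 0, 1],
    "variety_B": [1, 0, 1, 0],
    "variety_C": [0, 1, 1, 1],
}

print(all_unique(reference))            # every variety can be told apart
print(identify([1, 0, 1, 0], reference))
print(identify([0, 0, 0, 0], reference))
```

If `all_unique` holds for a collection, any correctly scored sample of a reference variety maps back to exactly one name; an empty result indicates a profile absent from the reference set.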

Use of SSAP Analysis for Identification of Flax
The genetic similarity of 46 flax varieties was characterized by genetic distances calculated from the SSAP data. The dendrogram (Figure 3) did not contain clearly isolated clusters of varieties. Thus, the studied flax varieties could not be subdivided into distinct groups. Our results were in agreement with earlier data showing that flax accessions examined by IRAP analysis did not form distinct clusters according to their origin or type of commercial use (fiber or oil). These data indicated an overlap in genetic diversity despite disruptive selection for fiber or seed oil types [32]. In our study, the SSAP method also failed to distinguish fiber and oilseed flax varieties. Since varieties with the best characteristics are commonly used as parents in breeding practice, some valuable flax forms are present in the genealogy of most modern varieties. Besides, the lines selected for crossing are usually characterized by low genetic diversity. Accordingly, commercial flax varieties were shown to be less diverse than wild flax species and landraces [32].
Although the examined flax varieties could not be clustered into different groups by the SSAP method, the method might be used to estimate their genetic similarity based on polymorphic insertions of retrotransposons. Such estimates can be used for choosing parents in breeding practice and also for creating core collections, which should include genetically diverse accessions.
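One way such distance estimates could feed into assembling a genetically diverse core collection is a greedy max-min selection, sketched below in Python. This is an assumption made for illustration, not a method described in this study; the function name `select_core` and the distance values are hypothetical.

```python
def select_core(dist, k):
    """Greedy max-min selection of k accessions from a symmetric distance matrix.

    Starts from the most distant pair, then repeatedly adds the accession
    whose smallest distance to the already selected ones is largest.
    Returns the sorted indices of the selected accessions.
    """
    n = len(dist)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of accessions")
    # Seed with the most distant pair of accessions.
    i, j = max(
        ((a, b) for a in range(n) for b in range(a + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    chosen = [i, j]
    while len(chosen) < k:
        candidates = [c for c in range(n) if c not in chosen]
        best = max(candidates, key=lambda c: min(dist[c][s] for s in chosen))
        chosen.append(best)
    return sorted(chosen)


# Hypothetical pairwise distances between four accessions.
dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.65],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.65, 0.3, 0.0],
]
print(select_core(dist, 3))
```

The greedy rule favors accessions that are far from everything already chosen, so near-duplicates such as the first two accessions in the example are not both selected while more distant candidates remain.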

Diversity and Phylogeny of Linum Species.
In the present study, 20 accessions from sect. Linum, 21 accessions from sect. Adenolinum, 4 accessions from sect. Dasylinum, and 2 accessions from sect. Stellerolinum were analyzed using the SSAP method. All the examined species clustered into 9 groups, mainly in accordance with the common taxonomic division of the genus Linum into sections (Figure 4).

Section Dasylinum.
Species from sect. Dasylinum clustered together and formed two related groups, B and C. Group B included L. hirsutum subsp. pseudoanatolicum and L. hirsutum subsp. anatolicum, and group C included L. hirsutum subsp. hirsutum. Thus, SSAP analysis singled out sect. Dasylinum as a well-supported clade. Our results were in agreement with the AFLP and ITS data as well as with chloroplast phylogenies, chromosome studies, and transcriptome analysis of Linum species [4,13,37,44]. It should be mentioned that the subdivision of accessions of sect. Dasylinum into two related clusters correlated with their difference in chromosome numbers and with the origin of the accessions. Thus, the accessions of L. hirsutum subsp. hirsutum (cluster C) from Europe were characterized by a chromosome number of 2n = 16, while the accessions from Turkey, L. hirsutum subsp. pseudoanatolicum and L. hirsutum subsp. anatolicum (cluster B), had a chromosome number of 2n = 32. Chromosome numbers for L. hirsutum subsp. pseudoanatolicum and L. hirsutum subsp. anatolicum were determined for the first time in the present study.

Section Stellerolinum.
The two accessions of L. stelleroides had rather similar SSAP fingerprints, which differed significantly from those of all the other Linum species, and formed a separate clade. Similar results were obtained by phylogenetic analyses of chloroplast and ITS DNA sequences [4]. Moreover, L. stelleroides was shown to have a chromosome number of 2n = 20, which is unique among blue-flowered flaxes [45].

Section Adenolinum.
All the members of sect. Adenolinum formed an independent group clustered separately from species of sect. Linum and other sections. Distinct isolation of this species group was also revealed in several molecular and karyological investigations [4,12-14,17]. The data were in good agreement with the opinion of Yuzepchuk [2] and Egorova [3], who separated the group from sect. Linum into an independent section Adenolinum.
SSAP fingerprints of the accessions within sect. Adenolinum were highly polymorphic, but SSAP markers did not allow us to reveal any species subclusters supported by a high bootstrap value. Thus, SSAP analysis used in the present study, like AFLP and RAPD analyses [13,17], separated individual accessions but did not identify individual species inside sect. Adenolinum.

Section Linum.
Sect. Linum was subdivided into 5 groups by neighbor-joining clustering. Accessions of L. marginale, L. grandiflorum, L. decumbens, and L. narbonense formed four independent single-species groups, while the fifth group combined accessions of L. angustifolium and L. usitatissimum. Similar results had been obtained earlier by AFLP, RAPD, molecular phylogeny based on the chloroplast rbcL sequence, and molecular cytogenetic methods (C/DAPI-banding patterns and localization of rRNA genes on chromosomes) [4,12-14].
Within the subgroup consisting of L. usitatissimum and L. angustifolium, the accessions of large seeded flax (breeding cultivar), dual-purpose flax, and L. angustifolium were rather similar. Their fingerprints did not differ significantly from the fingerprints of the 46 studied flax varieties. These data were in good agreement with the suggestion that L. angustifolium was the progenitor of L. usitatissimum [5,7].
The accession of large seeded flax landrace, the accessions of winter flax, and the accessions of dehiscent flax differed significantly from the other members of cluster H. Both accessions of dehiscent flax grouped together (supported by a high bootstrap value) and had species-specific SSAP markers.
It should be noted that flax accessions that are genetically distant from modern flax cultivars are particularly important for flax breeding. The genetic diversity of cultivated flax has decreased significantly during the last decades, which might lead to a lack of useful alleles in the genomes of modern cultivars [13]. Therefore, introduction of new useful traits from ancient primitive forms of cultivated flax and from wild species could increase the polymorphism of modern flax varieties. SSAP markers allowed us to identify unique accessions which are important for the investigation of the history of flax domestication.
L. marginale, the last member of sect. Linum, is a wild flax native to Australia. We found that it had the maximal chromosome number (2n = 84) in the genus Linum. This number indicated a high level of ploidy of the L. marginale genome. SSAP patterns of the species were significantly different from those of the other species of sect. Linum. Therefore, L. marginale clustered apart from the other species. The obtained results contradicted ITS and chloroplast topologies, which clustered the species together with L. bienne and L. usitatissimum [4]. Rogers [46] assumed that the Australian and New Zealand species L. marginale Cunn. and L. monogynum Forst. were related to the European species L. hologynum Reichenb. (sect. Linum). This assumption was based on the fact that the diploid chromosome number of L. hologynum (2n = 42) corresponded to the haploid chromosome number of L. monogynum and L. marginale. Moreover, all three species had fused styles and pantoporate pollen grains, which were unusual for blue-flowered flaxes. Thus, all the above-mentioned data indicated that the phylogenetic lineages of L. marginale need further investigation.
The data obtained in the present study, as well as the results of other molecular phylogenetic and chromosomal investigations, indicated that members of section Linum were not as closely related as members of other sections. Therefore, taxonomic revision of this section is desirable.

Chromosome Location of Cassandra Retrotransposon.
As differences in SSAP fingerprints for several flax species were found, we decided to analyze the distribution of the Cassandra retrotransposon along the chromosomes of Linum species. Cassandra is a terminal-repeat retrotransposon in miniature (TRIM) that carries conserved 5S rDNA sequences in its LTRs. Cassandra was found in a number of vascular plants [31]. In our work, we investigated the distribution of this retrotransposon along the chromosomes of Linum species using tyramide FISH. We revealed that Cassandra was localized in pericentromeric and subtelomeric regions of chromosomes, which is typical for transposable elements [47]. A more uniform distribution of the Cassandra retrotransposon was found in L. narbonense in comparison with L. usitatissimum, L. grandiflorum, and L. amurense. This was probably due to a higher content of transposable elements, correlated with the larger size of its chromosomes (and therefore of its genome).

Conclusions
The availability of LTR sequences of flax retrotransposons and the high polymorphism of SSAP markers offer promising potential for SSAP analysis of the genus Linum. SSAP analysis could be applied to L. usitatissimum and other Linum species for purposes such as evolutionary and phylogenetic studies, assessment of genetic diversity, accession identification, and the search for exotic genepools of cultivated flax. SSAP analysis was shown to be very useful for the characterization of flax varieties and the identification of accessions belonging to different species or sections, and it provided new information about phylogenetic relationships within the genus Linum.