Development and Validation of Gene-Based SSR Markers in the Genus Mesembryanthemum

Bioinformatics tools were used to develop gene-based simple sequence repeat (SSR) markers directly from sequence data. Analysis of 28,056 Mesembryanthemum expressed sequence tag (EST) sequences, amounting to approximately 17.07 Mb, identified 5,851 ESTs containing SSRs. Among these, 938 EST sequences harbored more than one SSR, and 788 EST-SSRs occurred in compound form. Mononucleotide repeats (MNRs) were the most prevalent motif type (44%), followed by dinucleotide repeats (DNRs; 37%) and trinucleotide repeats (TNRs; 16%). Notably, TNR or longer SSR motifs primarily consisted of shorter repeat lengths, with only 51 motifs containing 10 or more repeats. BLASTX analysis assigned putative functions to 4,623 (79%) of the SSR-containing EST sequences. Among the primer sets tested, 21 primers amplified a total of 65 alleles, with the PMA79 EST-SSR primer yielding the maximum of six alleles. The polymorphic information content (PIC) values ranged from 0 to 0.76, with a mean of 0.47. The marker index (MI) and discriminating power (D) reached maxima of 0.66 (primer PMA63) and 0.95 (primer PMA20), respectively. A dendrogram constructed with the unweighted pair group method with arithmetic mean (UPGMA) segregated the 24 Mesembryanthemum genotypes into three distinct clusters, with similarity coefficients ranging from 0.38 to 0.96. In total, we developed 83 EST-SSR primer pairs specific to the genus Mesembryanthemum. These newly developed EST-SSRs will serve as valuable tools for researchers, particularly molecular breeders, enabling gene-based identification and trait selection through marker-assisted breeding.


Introduction
Mesembryanthemoideae (Aizoaceae) comprises a single genus, Mesembryanthemum, which consists of approximately 101 species and is indigenous to arid and semiarid regions of South Africa [1]. It is also found in the Mediterranean region, the Atlantic Islands, Saudi Arabia, South Australia, and California [2]. Mesembryanthemum plays a significant role in its native habitat by thriving in harsh, arid environments where other plants struggle to survive [3]. Several species of Mesembryanthemum have been recognized for their antioxidant properties, nutritional and medicinal importance, and ability to accumulate salt, thereby contributing to bioremediation effects [2,4,5]. Despite its diverse significance, certain species of Mesembryanthemum are classified as endangered or critically endangered by the International Union for Conservation of Nature (IUCN) [6]. Furthermore, molecular research, including the assessment of genetic diversity and genome mapping, has been hindered by the limited availability of codominant molecular markers such as simple sequence repeats (SSRs).
Initially identified in humans, SSRs or microsatellites are repetitive DNA sequences consisting of 1-6 nucleotide core units [7,8]. These markers are widely distributed throughout most plant genomes. SSR markers possess several advantages, including high variability, codominant inheritance, easy detection, multiallelic nature, transferability between species, and amenability to PCR amplification [7,9,10]. However, the development of specific SSR markers typically involves labor-intensive, time-consuming, and costly procedures. Expressed sequence tag-simple sequence repeats (EST-SSRs), derived from EST and cDNA sequences [11], have become the preferred choice of SSR marker, given the growing availability of EST and cDNA sequences in global sequence databases such as NCBI [12]. Moreover, EST-SSR markers are located in the transcribed regions of the genome, making them ideal DNA markers for cross-species transferability and gene tagging for desired traits [13,14]. EST-derived SSR markers are expected to exhibit higher conservation and greater abundance among related species compared to anonymous sequence-derived SSR markers [14]. For example, of 165 barley (Hordeum vulgare L.) EST-SSR markers tested, approximately 78% amplified successfully in wheat, 75% in rye (Secale cereale L.), and 42% in rice (Oryza sativa L.) [14].
While EST-SSR markers have been developed and validated for numerous eudicot plants, including Vicia faba [15], Vigna angularis [16], and Lens culinaris Medik. [17], to the best of our knowledge, SSR markers have not yet been developed in Mesembryanthemum. Therefore, this study was conducted to generate EST-SSR markers specific to the genus Mesembryanthemum.

Materials and Methods
In May 2021, a total of 28,056 Mesembryanthemum EST sequences corresponding to 17.07 Mb were retrieved from the National Center for Biotechnology Information (NCBI) website (https://www.ncbi.nlm.nih.gov). These sequences were cleaned to remove poly-A and poly-T tails using the TRIMEST program from EMBOSS [18]. EST-SSRs were identified using the MISA-web program developed by Beier et al. [19]. Using the MISA-web engine online (https://webblast.ipk-gatersleben.de/misa/), mono-, di-, tri-, tetra-, penta-, and hexanucleotide tandem repeats were selected with minimum repeat counts of 10, 6, 5, 5, 5, and 5, respectively (Table 1). A total of 7,181 SSR loci were discovered across 5,851 EST sequences. EST-SSR primers were designed with the Primer3web software. The "targets" option was used to indicate the location of the SSR motif so that appropriate flanking primers were selected. The remaining software settings were kept at their defaults, except for the annealing temperature (set at 60 °C ± 3 °C) and primer length (set at 20 bp with a range of +6, −2 bp). A BLASTX search was conducted on the NCBI database to determine the putative function of the developed SSR markers. Of the designed EST-SSR primers, 28 were used to amplify genomic DNA from 24 Mesembryanthemum genotypes (Table 2). The iMEC online software [20] was used to calculate the polymorphism information content (PIC), heterozygosity index (H), discriminating power (D), marker index (MI), average heterozygosity (av. H), and resolving power (R) for each primer. In addition, a dendrogram representing the 24 Mesembryanthemum genotypes was constructed using NTSYS software and the unweighted pair group method with arithmetic mean (UPGMA) [21].
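The repeat-count thresholds described above can be illustrated with a small regular-expression scan. This is only a minimal sketch of perfect-SSR detection, not the MISA-web implementation: the function names `find_ssrs` and `is_primitive` and the example sequence are hypothetical, and compound SSRs are not handled.

```python
import re

# Minimum repeat counts per motif length (mono- to hexanucleotide),
# taken from the thresholds stated in the Methods.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration only; MISA itself may resolve
    overlapping or compound repeats differently.
    """
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "AA" runs, which are already counted as mononucleotide.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


# Made-up test sequence: an A(12) run, an (AG)7 run, and an (ACG)4 run
# that falls below the trinucleotide threshold of 5 repeats.
example = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "ACG" * 4
print(find_ssrs(example))
```

On this example only the mononucleotide and dinucleotide runs are reported, because the (ACG)4 run does not reach the minimum of five repeats.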

Results and Discussion
We present novel EST-SSR markers for Mesembryanthemum derived from readily accessible ESTs. Approximately 17.07 Mb of Mesembryanthemum EST sequence, totaling 28,056 sequences, was analyzed, and 7,181 EST-SSR loci were identified (Table 3). These 7,181 SSRs were distributed across 5,851 ESTs, indicating that 20.8% of the EST sequences harbored at least one SSR. The frequency of SSR occurrence was calculated as one repeat per 2.38 kb, which is comparable to, though somewhat higher than, the frequencies observed in Mentha piperita (1/3.4 kb) and pepper (1/3.8 kb) [22,23]. Varshney et al. [14] reported that around 5% of ESTs contain SSRs when the minimum repeat length is set to 20 bp, indicating that the frequency of SSRs can vary significantly depending on the search criteria employed. Of the 5,851 SSR-containing ESTs, 938 sequences contained multiple SSRs, and 788 SSRs occurred in compound form (Table 3). The distribution and frequency of different SSR motifs have been observed to vary widely across plant species. In this study, mononucleotide repeats (MNRs) were the most abundant (44%), followed by dinucleotide repeats (37%) and trinucleotide repeats (16%), as depicted in Figure 1. MNRs have been shown to be valuable in bridging gaps in linkage maps constructed using SSR markers [24].
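The summary figures in this paragraph follow from simple ratios of the reported counts. The short sketch below recomputes them, assuming 1 Mb = 1,000,000 bases; the function names are illustrative and not part of any tool used in the study.

```python
def ssr_density_kb(total_bases, n_ssrs):
    """Average kilobases of EST sequence per SSR locus."""
    return total_bases / n_ssrs / 1000


def fraction(part, whole):
    """Simple proportion, returned as a value between 0 and 1."""
    return part / whole


total_bases = 17.07e6   # 17.07 Mb of EST sequence (assumes 1 Mb = 1e6 bases)
n_ests = 28056          # ESTs analyzed
n_ssr_ests = 5851       # ESTs containing at least one SSR
n_ssrs = 7181           # SSR loci identified
n_annotated = 4623      # SSR-containing ESTs with a BLASTX assignment

print(f"one SSR per {ssr_density_kb(total_bases, n_ssrs):.2f} kb")
print(f"share of ESTs with an SSR: {fraction(n_ssr_ests, n_ests):.4f}")
print(f"share of SSR-ESTs annotated: {fraction(n_annotated, n_ssr_ests):.4f}")
```

The first line reproduces the reported density of one SSR per 2.38 kb; the other two ratios come out just under 0.21 and at about 0.79, in line with the percentages given in the text.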
The BLASTX searches assigned putative functions to 4,623 (79%) of the 5,851 SSR-containing ESTs. This information is valuable for guiding the development of specific markers targeting desired genes and facilitating further exploration of gene-related information [27].

Validation
Twenty-eight newly designed EST-SSR primers (provided in Table 1) were chosen to encompass all types of nucleotide repeats. These primers were used to amplify genomic DNA extracted from 24 Mesembryanthemum genotypes. Of the 22 primers that produced amplification, 21 exhibited polymorphic amplification profiles, yielding a total of 65 alleles (Table 5). The maximum number of alleles, six, was observed for the PMA79 EST-SSR primer. The polymorphic information content (PIC) values, which estimate the discriminatory power of a locus based on allele number and frequencies, ranged from 0 to 0.76, with an average of 0.47 (Table 6). The marker index (MI), which assesses the overall efficiency of a molecular marker, varied from 0 (PMA44) to 0.66 (PMA63), with a mean of 0.41. In addition, the discriminating power (D) of the primers ranged from 0 (PMA44) to 0.95 (PMA20), with a mean of 0.67 (Table 6).
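To make the dependence of PIC on allele number and frequencies concrete, the sketch below applies the commonly used formula of Botstein et al. for codominant loci, PIC = 1 − Σp_i² − ΣΣ 2p_i²p_j². This is an assumption about the form of the calculation: the exact formulas applied by iMEC may differ, and the allele frequencies shown are made up for illustration rather than taken from Table 6.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for allele frequencies p_i."""
    return 1 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC after Botstein et al.: H minus the sum of 2 * p_i^2 * p_j^2 over i < j."""
    correction = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return heterozygosity(freqs) - correction


# Hypothetical allele frequencies (not data from this study).
monomorphic = [1.0]                                   # single allele
two_equal = [0.5, 0.5]                                # two equally common alleles
six_alleles = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]    # six alleles, uneven

for label, freqs in [("monomorphic", monomorphic),
                     ("two equal alleles", two_equal),
                     ("six alleles", six_alleles)]:
    print(f"{label}: H = {heterozygosity(freqs):.3f}, PIC = {pic(freqs):.3f}")
```

A monomorphic locus gives a PIC of 0, which is how a value of 0 can arise for a non-polymorphic primer, while PIC rises as alleles become more numerous and more evenly distributed.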
The resulting UPGMA dendrogram (Figure 2), which is a visual representation of the genetic relationships, classified the Mesembryanthemum genotypes into three distinct clusters. This clustering indicates that there are underlying genetic similarities and differences among the genotypes. The UPGMA method organizes the genotypes based on their genetic profiles, allowing us to observe patterns of relatedness.
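The way UPGMA groups genotypes can be sketched in a few lines: at each step, the two clusters with the highest average pairwise similarity are merged. The example below is a minimal pure-Python illustration, not the NTSYS procedure used in the study; the genotype labels and similarity values are invented, and the function name `upgma` is hypothetical.

```python
def upgma(names, sim):
    """Agglomerative clustering by average pairwise similarity (UPGMA).

    names: list of genotype labels.
    sim:   dict mapping frozenset({a, b}) to the similarity of a and b.
    Returns the merges in order as (cluster_a, cluster_b, average_similarity).
    """
    clusters = [(name,) for name in names]
    merges = []

    def avg_sim(c1, c2):
        # Unweighted mean over all pairs of original members.
        pairs = [sim[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average similarity.
        s, i, j = max(
            (avg_sim(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], s))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Invented similarity values for four hypothetical genotypes.
names = ["G1", "G2", "G3", "G4"]
sim = {
    frozenset(("G1", "G2")): 0.96,
    frozenset(("G3", "G4")): 0.80,
    frozenset(("G1", "G3")): 0.40,
    frozenset(("G1", "G4")): 0.38,
    frozenset(("G2", "G3")): 0.42,
    frozenset(("G2", "G4")): 0.40,
}

for a, b, s in upgma(names, sim):
    print(a, "+", b, f"joined at similarity {s:.2f}")
```

In this toy example G1 and G2 join first, then G3 and G4, and the two pairs are finally linked at the mean of their four cross-pair similarities, which is the level at which a dendrogram would draw the deepest branch.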
The similarity coefficient, ranging from 0.38 to 0.96 with a mean of 0.67, provides a quantitative measure of genetic similarity among the genotypes. A higher similarity coefficient suggests a closer genetic relationship: genotypes with coefficients closer to 1.0 share a larger proportion of genetic material. The wide range of similarity coefficients signifies substantial genetic variation within the Mesembryanthemum genotypes studied, while the mean of 0.67 suggests a moderate level of genetic similarity on average, implying a balance of relatedness and diversity among the genotypes. Understanding the genetic diversity and relationships among these Mesembryanthemum genotypes is crucial for various applications, including breeding programs, conservation efforts, and understanding the evolutionary history of these genotypes.
Due to their gene specificity, EST-SSRs are valuable tools for gene tagging and comparative investigations. They can be employed in the development of linkage maps and studies on diversity across related species, as demonstrated by Sahu et al. [27] and Akash and Myers [12]. The newly developed set of EST-SSRs presented in this study offers molecular breeders enhanced resources for gene-based identification and selection of traits through marker-assisted breeding.

Table 1: Expressed sequence tag-simple sequence repeats (EST-SSRs) frequencies by repeat motif in Mesembryanthemum.

Table 2: The 24 Mesembryanthemum genotypes, along with details of their collection sites.

Table 4: Expressed sequence tag-simple sequence repeats (EST-SSRs) frequencies by nucleotide repeat type in Mesembryanthemum.

Table 5: List and characteristics of Mesembryanthemum expressed sequence tag-simple sequence repeat (EST-SSR) markers.

Table 6: Number of alleles, size range of amplified fragments, and polymorphism statistics calculated with iMEC for 24 Mesembryanthemum genotypes using 21 expressed sequence tag-simple sequence repeat (EST-SSR) loci.