Genetic Diversity of Aromatic Rice Germplasm Revealed By SSR Markers

Aromatic rice cultivars constitute a small but special group of rice and are considered the best in terms of quality and aroma. Aroma is one of the most significant quality traits of rice, and aromatic varieties command a higher market price. This research was carried out to study the genetic diversity among 50 aromatic rice accessions from three regions (Peninsular Malaysia, Sabah, and Sarawak), with 3 released varieties as a control, using 32 simple sequence repeat (SSR) markers. The objectives of this research were to quantify the genetic divergence of aromatic rice accessions using SSR markers and to identify potential accessions for introgression into the existing rice breeding program. Among the populations, the Shannon information index (I) ranged from 0.25 in the control to 0.98 in the Sabah population. The mean effective number of alleles and the mean percentage of polymorphic loci were 0.36 and 64.90%, respectively. Similarly, the allelic diversity was very high, with a mean expected heterozygosity (He) of 0.60 and a mean Nei's gene diversity index of 0.36. The dendrogram based on UPGMA and Nei's genetic distance classified the 53 rice accessions into 10 clusters. Analysis of molecular variance (AMOVA) revealed that 89% of the total variation observed in this germplasm came from within the populations, while 11% of the variation was found among the populations. These results reflect the high genetic differentiation existing in this aromatic rice germplasm. Using all these criteria and indices, seven accessions (Acc9993, Acc6288, Acc6893, Acc7580, Acc6009, Acc9956, and Acc11816) from three populations have been identified and selected for further evaluation before introgression into the existing breeding program and for future aromatic rice varietal development.


Introduction
Rice (Oryza sativa L.) is one of the most important food crops in the world. Approximately three billion people around the world consume rice as a staple food, which provides about 50 to 80% of their daily calories. Aromatic rice is preferred over nonaromatic rice for special occasions and for export, and thus it commands a higher market price. One of the major features of this kind of rice is its special aroma, which is appreciated by many people and represents a high value-added trait [1]. Three factors seem to have led to the growth in popularity of aromatic rice: globalization, health consciousness, and culinary changes [2]. Consequently, attention is needed to improve the cooking qualities of rice as well as several of its biochemical and morphological characteristics [3]. The demand for aromatic rice is increasing day by day. Unfortunately, aromatic rice production is affected by abiotic and biotic stresses, susceptibility to pests and diseases, and strong shedding [4].
The agronomic value of a rice variety depends on many characteristics [5]. The most important features include high yielding ability, resistance to diseases and pests, resistance to undesirable environmental factors, and high quality of the products. Genetic diversity studies occupy an important position in breeding and improvement programs, as they ensure efficient utilization of germplasm resources and an effective breeding system for the improvement of closely related crop species. Genetic variation analysis helps breeders in monitoring germplasm as well as predicting possible genetic potential [6]. Progress in rice breeding has declined steadily during the last ten years due to the narrow genetic base of the parent materials [7]. The study of rice genetic diversity is essential for cultivar rating, identification, conservation, and purity as well as breeding [8].
Genetic diversity is mainly measured based on the morphological differences of quantitatively important traits. However, this method has some disadvantages in terms of time, space, and labour cost. In addition, this method cannot define the exact level of genetic diversity among the germplasms, because of the additive gene action on the expression of the traits (economically important traits), which allows environmental factors to mask their true phenotypic performance [9,10]. Phenotypic expression is affected by the environment; therefore, selection based on morphological traits can be misleading [11,12]. Molecular markers serve as chromosomal landmarks through which an organism can be recognized, and their use has gained popularity as a genetic diversity tool [13]. Among the PCR-based markers, the SSR markers have proved to be very effective tools in the study of genetic diversity and organism relationships due to their highly polymorphic nature and transferability [13,14]. In recent years, microsatellite markers have been widely used to screen, characterize, and evaluate genetic diversity in many crop species [15].
Rice microsatellites have shown their utility for marker-assisted selection as well as gene tagging [16,17]. The SSR markers can be effectively applied for developing unique DNA profiles of rice genotypes because of their high level of polymorphism and greater information content. Statistical analysis has been used to measure the mutual relationships between various characters and yield improvement. Genotypic evaluation of yield components can identify their relationship with grain yield in aromatic rice, and information on these relationships can be helpful in finding superior aromatic rice genotypes [18]. In the present study, the genetic diversity of several high yielding aromatic rice genotypes was determined by using SSR markers. This was needed to identify potentially diverse genotypes for use as parents in future rice breeding programs.

Materials and Methods
Plant Material and Experimental Design. A total of 53 rice accessions, including three local check varieties (MRQ74, MR219, and MR253), were used in this study as shown in Table 1. These rice accessions and check varieties were obtained from the Malaysian Agricultural Research and Development Institute (MARDI). The accessions had been collected by MARDI from Sarawak (10), Peninsular Malaysia (10), and Sabah (30). All of the accessions were indica type. The experiment was conducted inside the net house at the experimental field of Universiti Putra Malaysia. The sprouted seeds of the 53 rice accessions were sown in separate pots using a randomized complete block design (RCBD) with three replications.
Selection of SSR Markers. A total of 147 SSR markers were selected for diversity analysis, out of which 32 primers showed clear, distinct polymorphic bands among the 53 aromatic rice accessions selected for the analysis as shown in Table 2.
DNA Extraction. The DNA was extracted from the leaves of 21-day-old seedlings of the aromatic rice genotypes using the hexadecyltrimethylammonium bromide (CTAB) method [19]. The quality of the DNA was determined by running it on a 1% agarose gel with 1x TBE buffer (Trizma base with EDTA and boric acid; pH was adjusted to 8.0 with NaOH) at 70 V for 45 minutes. The gel was viewed under a UV transilluminator. The DNA concentration was measured using a NanoDrop spectrophotometer (ND-1000, NanoDrop Inc., USA). The DNA was diluted to 50 ng using TE buffer and stored at 4 °C before the commencement of PCR.
SSR PCR Protocol and Bands Separation. Thirty-two (32) polymorphic SSR markers were used for genotyping all 53 rice accessions. The total PCR reaction volume was optimized to 15 µl, comprising 1 µl of about 50 ng DNA template, 7.0 µl DreamTaq PCR master mix (Thermo Scientific Inc.), 1 µl of each primer (forward and reverse), and 5.0 µl nuclease-free water. A touch-down PCR protocol was followed [20]. The bands were separated by running the PCR products on 3% MetaPhor agarose at 80 V for 60 min in 1x TBE along with a 50 bp DNA ladder. The gel was viewed using a Bio-Rad gel documentation system. The gel image was analyzed for band size using Bio-Rad Image Lab software. The data were saved in Excel for further analysis.
Data Analysis. Genetic diversity parameters such as percentage of polymorphic loci (PPL), effective allele number ($N_e$), gene diversity ($h$), Shannon's information index ($I$), gene frequency, and gene flow ($Nm$) were computed [21]. Genetic differentiation of population ($G_{ST}$), which is the measure of the proportional amount of variation within subpopulation as compared with the total population, was computed. When $G_{ST}$ is equal to 0, this implies that the subpopulations are identical; when the value is 1, they are completely different. The gene flow was computed from $G_{ST}$ as follows:

$$Nm = \frac{0.5\,(1 - G_{ST})}{G_{ST}},$$

where $N$ is the effective population size and $m$ is the fraction of individuals in a population that are migrants. If $Nm$ is <1, this indicates that populations tend to differentiate, and when $Nm$ ≥ 1, there is little differentiation among populations. Polymorphism information content (PIC) was computed from the formula given below:

$$PIC_i = 1 - \sum_{j=1}^{n} P_{ij}^{2},$$

where $P_{ij}$ is the frequency of the $j$th allele for the $i$th marker, summed over $n$ alleles.
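As a minimal illustration of the two quantities described above, the sketch below assumes the commonly used forms PIC = 1 − Σ p² (p being the allele frequencies at one marker) and Nm = 0.5(1 − GST)/GST. The function names and the band sizes are hypothetical and are not taken from the study's data or software.

```python
from collections import Counter


def pic(alleles):
    """PIC for one marker: 1 minus the sum of squared allele frequencies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def gene_flow(gst):
    """Gene flow estimate Nm = 0.5 * (1 - GST) / GST (assumed form)."""
    if gst <= 0:
        raise ValueError("GST must be positive for Nm to be defined")
    return 0.5 * (1.0 - gst) / gst


# Hypothetical band sizes (bp) scored at one SSR locus in eight accessions.
example_alleles = [150, 150, 160, 160, 160, 170, 170, 150]

print(round(pic(example_alleles), 3))  # PIC of the hypothetical locus
print(round(gene_flow(0.2), 2))        # Nm for a hypothetical GST of 0.2
```

A locus with a single allele gives a PIC of 0, and the value approaches 1 as alleles become more numerous and more evenly distributed.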
Analysis of molecular variance (AMOVA) was conducted to assess the genetic structure of the populations [22] using Arlequin software. Cluster analysis was performed to obtain a dendrogram based on the similarity coefficient using the unweighted pair group method with arithmetic mean (UPGMA). Additionally, the covariance structure of the 53 accessions was determined through three-dimensional principal component analysis using NTSYS-pc software (version 2.1).
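The UPGMA step can be sketched in a few lines of plain Python. This is an illustrative implementation of the general algorithm under simple assumptions (a symmetric distance matrix, ties broken by first occurrence); it is not the NTSYS-pc routine, and the four accession labels and distances are hypothetical.

```python
def upgma(labels, dist):
    """Cluster items by UPGMA; return the nested-tuple tree and merge list."""
    n = len(labels)
    # Active clusters: id -> (subtree, number of members).
    clusters = {i: (labels[i], 1) for i in range(n)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance to the merged cluster is the size-weighted mean.
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        # Drop distances that involve the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, height))
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Hypothetical genetic distances among four accessions.
example_labels = ["A", "B", "C", "D"]
example_dist = [
    [0.0, 0.2, 0.6, 0.8],
    [0.2, 0.0, 0.6, 0.8],
    [0.6, 0.6, 0.0, 0.4],
    [0.8, 0.8, 0.4, 0.0],
]
example_tree, example_merges = upgma(example_labels, example_dist)
print(example_tree)
```

Cutting the resulting tree at a chosen distance coefficient yields the cluster groups; the threshold itself is an analyst's choice.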

Results and Discussion
The present study evaluated the genetic diversity of 50 aromatic rice accessions and 3 check varieties as a control. These accessions were obtained from three different regions, namely, Sarawak, Peninsular Malaysia, and Sabah. The study of genetic diversity in any breeding population is essential as it constitutes the backbone of any breeding and improvement program. It helps in the development of crop that is suitable and adaptable to rapid climate change through the introduction of foreign genes [23,24]. Thus, genetic diversity is needed for developing ideal and desired crop varieties for present and future needs. In this study, one hundred and forty-seven (147) SSR markers were screened, out of which 32 SSR markers were found to be polymorphic and suitable for diversity analysis. The use of SSR for rice diversity study is very crucial as it provides accurate and unbiased assessment and reveals in-depth information on the genetic divergence of a germplasm material [25]. SSR marker has been widely recognized for its codominant inheritance pattern, high informative power, and transferability among the species, hence, its superiority as a marker of choice for plant germplasm improvement program [26,27].
Allelic Diversity. From 147 SSR markers screened, 32 markers (21.77%), which displayed clear and repeatable polymorphic bands, were selected for analysis as shown in Table 3 and Figures 1(a) and 1(b). A total of 131 alleles were recorded, and the number of alleles per locus ranged from 2 in RM3134 to 7 in RM462 with an average of 4.09. The expected heterozygosity differed among the markers and it ranged from 0.01 (RM23) to 1.13 (RM172) with an average of 0.60.
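The two summary figures quoted above follow directly from the counts reported in the text, as this short check shows. Only the variable names are introduced here; the numbers are those given in the paragraph.

```python
markers_screened = 147      # SSR markers screened
markers_polymorphic = 32    # markers retained as polymorphic
total_alleles = 131         # alleles recorded across the retained markers

# Share of screened markers that were polymorphic, in percent.
pct_polymorphic = 100 * markers_polymorphic / markers_screened

# Mean number of alleles per polymorphic locus.
mean_alleles_per_locus = total_alleles / markers_polymorphic

print(round(pct_polymorphic, 2))
print(round(mean_alleles_per_locus, 2))
```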
Results from the present investigation revealed remarkably abundant genetic variation among the 53 aromatic rice genotypes. The number of alleles ranged from 2 to 7. The number of alleles observed in this study was higher than findings of [28,29], where the alleles' number was reported within 2.40 to 3.35 per locus. On the other hand, higher number of alleles as much as 6.60 to 14.60 have been reported using other rice varieties [30,31]. A total of 128 alleles with an average of 3.28 alleles per locus and PIC value of 0.24 were observed by [32] using 39 SSR markers.
The number of alleles indicates the richness of the population. Since SSR are short tandem repeats, allele numbers of 2 to 7 per locus, as seen in this study, are generally considered good. Allele numbers of 1-6 alleles/locus with an average of 3.24 have been reported in colored upland rice germplasm in Malaysia [25]. The PIC, which measures how informative a marker is, ranged from 0.25 to 0.98 with an average of 0.61 in this study. By comparison, lower genetic diversity was reported with an average of 2.75 alleles per locus and an average PIC value of 0.38 from 40 Pakistan rice accessions [33]. On the other hand, 4.69 alleles per locus were observed with an average PIC value of 0.81 among 36 landraces having different therapeutic values from India [34].
High PIC of 0.25 to 0.98 as seen in this study revealed that the markers have the required properties to be used in diversity study. The average PIC value in this study was higher than the value from [35] that reported average PIC value equivalent to 0.39. Genetic diversity indices such as expected heterozygosity as well as Shannon's and Nei's index among the markers were very high (>0.5, except in Nei's index), thus reflecting the heterozygous nature of the population. Percentages of polymorphic bands for Sarawak, Peninsular Malaysia, Sabah, and check groups were 75.66, 66.07, 95.46, and 22.40%, respectively, with an average of 64.90%, as shown in Table 4. Among the populations, Sabah population exhibited highest genetic diversity levels (95.46%), while control varieties had the lowest genetic diversity (22.40%).
The average number of alleles per locus ($N_a$) varied from 0.80 (control) to 1.98 (Sabah). The effective number of alleles per locus ($N_e$) was small compared to the number of alleles per locus and it ranged from 0.16 (control) to 0.54 (Sabah) with a mean of 0.36. Shannon's information index ($I$) was very high and varied from 0.25 (control) to 0.98 (Sabah) with a mean of 0.58, as shown in Table 4. The high and low diversity indices seen in the Sabah and control populations, respectively, may be related to population size (30 Sabah accessions versus 3 check varieties). Population size plays an important role in genetic differentiation in germplasm. The larger the population, the greater the likelihood of genetic differentiation and, thus, the higher the heterozygosity [36]. The magnitude of the stochastic process and the degree of change in genetic properties of a population depend on its effective size [37].
Additionally, Shannon's information index, another population genetic parameter, ranged from 0.25 for control varieties to 0.98 for Sabah accessions with an average of 0.58. This average value was less than the value described earlier [38], which was found to be 0.88. The high value of Shannon's information index in the present study was another indication of the presence of high genetic diversity in the rice germplasm under consideration. Nei's gene diversity, which varied from 0.16 for check accessions to 0.58 for Sabah varieties with an average value of 0.36, also indicated the high level of divergence in the population. This value aligned with that of [39], which reported mean Nei's gene diversity of 0.37 but lower than 0.50 [40].
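For readers unfamiliar with these indices, the sketch below computes them for a single locus from its allele frequencies, assuming the standard textbook definitions: effective number of alleles 1/Σp², Nei's gene diversity 1 − Σp², and Shannon's index −Σ p ln p. The estimators used by the software behind Table 4 may differ in detail, and the function name and example frequencies are hypothetical.

```python
import math


def diversity_indices(freqs):
    """Single-locus diversity indices from allele frequencies summing to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    return {
        "effective_alleles": 1.0 / homozygosity,   # 1 / sum(p^2)
        "nei_gene_diversity": 1.0 - homozygosity,  # 1 - sum(p^2)
        "shannon_index": -sum(p * math.log(p) for p in freqs if p > 0),
    }


# Hypothetical locus with two equally frequent alleles.
example_indices = diversity_indices([0.5, 0.5])
for name, value in example_indices.items():
    print(name, round(value, 4))
```

Population-level values are then typically obtained by averaging the per-locus indices over all markers.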

Cluster Analysis and Principal Component Analysis.
Cluster analysis based on the UPGMA method grouped the 53 accessions into ten distinct clusters at a coefficient of 1.05. The distance coefficient ranged from 0.49 to 1.23 as shown in Figure 2. Cluster I consisted of 2 accessions, while clusters II and III had 7 and 3 accessions, respectively. Cluster IV had the highest number of accessions, with 29 rice accessions. Clusters V, VI, VII, VIII, IX, and X had 1, 3, 3, 2, 2, and 1 accessions, respectively, as shown in Table 5. Cluster II comprised accessions mostly of Sarawak (4) and Sabah (3) origin, while cluster IV comprised all three check varieties together with all the accessions from Peninsular Malaysia except one (Acc6292). Additionally, 17 of the 30 accessions of Sabah origin were also found in cluster IV. Accessions from the Sabah population were found in virtually every cluster group, with clusters I and II reported as the exceptions.
The covariance structure revealed by the three-dimensional PCA showed that the 53 accessions were clustered into 10 distinct groups, corroborating the output of the cluster analysis, as shown in Figure 3. Groups V and X had one accession each, while groups I, VIII, and IX contained 2 accessions each. On the other hand, groups III, VI, and VII had three members each, while groups II and IV were exceptional in terms of high membership, with 7 and 29 rice accessions, respectively. The PCA grouping patterns showed clear correspondence with the geographical origin of the accessions.
In a previous study, 29 rice genotypes were reported to have been grouped into two distinct clusters with the aid of 20 SSR markers [41]. In another study, three distinct clusters were also found from 18 rice cultivars studied using 44 SSR markers [39]. The clustering patterns in the current study gave some consideration to the geographical origin of the populations. As evident from the dendrogram, all the accessions in cluster I came from Sarawak population, while four out of seven (7) accessions from cluster II belonged to Sarawak population. Additionally, all the check varieties which were used as the control population were grouped together in cluster IV.
All these point to the accuracy and usefulness of the SSR markers in tracing the phylogeny or pedigree of a germplasm or breeding materials. This observation corresponds to the previous observations of other rice germplasm studies [42,43]. In another rice diversity study, 42 colored rice varieties were reported to have clustered according to their country and region of origin [25]. Similarly, clustering pattern has also been reported by [44] based on allelic and morphological data along with the location in rice varieties using SSR markers. Accessions that are found clustered together are assumed to have high genetic similarity, while those that are found far away from each other are considered to be divergent.
Analysis of Molecular Variance. The AMOVA results displayed highly significant genetic differences among accessions within the populations. Of the total genetic variation in the 53 accessions, 89% was due to genetic variation within the populations. This indicates the existence of high genetic differentiation among the genotypes within the groups. On the other hand, the genetic variation among the groups accounted for 11% of the total variation, as shown in Table 6. Consequently, differentiation between the overall germplasm and its geographical groups has occurred and has resulted in relatively high genetic diversity.
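The percentages reported by AMOVA are simply each variance component expressed as a share of the total. The sketch below shows that partition, with the among-population share written as a fixation-type statistic (often denoted ΦPT or ΦST). The variance components used here are hypothetical values chosen only so that the split reproduces the reported 11% and 89%; the actual components are those in Table 6.

```python
def amova_partition(var_among, var_within):
    """Express AMOVA variance components as percentages of the total."""
    total = var_among + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        "pct_among": 100 * var_among / total,
        "pct_within": 100 * var_within / total,
        "phi": var_among / total,  # among-population share of total variance
    }


# Hypothetical variance components (not the values from Table 6).
example_partition = amova_partition(var_among=1.1, var_within=8.9)
for name, value in example_partition.items():
    print(name, round(value, 2))
```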
A similar pattern of variation among rice germplasm has been reported in a previous study [45]. In one study, which involved 41 rice genotypes from three populations, 67% of the total variation was attributed to variation within the genotype while variation among the three populations accounted for the remaining 33%. The presence of high variability within the population represents a high level of genetic differentiation which will further strengthen the divergence of the population. High genetic differentiation is very important within the germplasm for creating a desirable heterotic group in base breeding populations [23]. Thus genetic diversity characterization is very important as it provides the basis for planning conservation strategy, utilization, and establishment of breeding and improvement for rice plant [46].

Conclusions
Genetic diversity is an important concept in any breeding program. It can be studied using SSR markers to identify potential parents in order to achieve heterosis in future aromatic rice breeding programs. SSR markers were exploited to provide an unbiased estimate of the diversity pattern in this rice germplasm. The current study found high levels of diversity among the 53 rice accessions, which is favourable for the introduction of new genes into the existing genotypes. The dendrogram constructed to identify the genetic similarities among these genotypes showed that accessions from the same regions mostly clustered together, implying a correlation between molecular groupings and their source of collection. The clustering pattern based on SSR markers provides ample information for the identification and confirmation of rice genotypes and accessions. This information is crucial for the development of pure aromatic rice breeding lines. Based on SSR diversity analysis and clustering patterns, the following accessions (Acc9993, Acc6288, Acc6893, Acc7580, Acc6009, Acc9956, and Acc11816) have been identified as diverse accessions suitable as parents in future breeding programs.

Conflicts of Interest
No potential conflicts of interest were reported by the authors.