Little data is available on microsatellite dynamics in the duplicated regions of the rice genome, even though efforts have been made in the past to align genome sequences of its two sub-species. Based on the coordinates of duplicated sequences in the
Microsatellites are a class of tandem DNA repeats with repeat units 1 to 6 bp long. These sequences occur in almost all organisms and frequently constitute the hypervariable regions of the genome. No specific functions have been assigned to most microsatellites to date. However, in at least some cases, microsatellite alleles provide a protective or adaptive advantage to the host [
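To make the definition above concrete, the following is a minimal sketch that scans a DNA string for perfect tandem repeats with a 1 to 6 bp unit. It is an illustration only and is not the mining procedure used in this study; the function name `find_microsatellites`, the `min_copies` threshold, and the example sequence are all assumptions introduced here.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, end, unit, copies) for each perfect tandem repeat.

    Hypothetical helper: the thresholds are illustrative, not taken from the paper.
    """
    # Lazy unit capture so the shortest repeating unit is reported,
    # followed by at least (min_copies - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits

# Made-up sequence containing four copies of CCG
print(find_microsatellites("TTGACCGCCGCCGCCGATGA"))
# -> [(4, 16, 'CCG', 4)]
```

Only perfect repeats are detected; interrupted or compound repeats would need a more tolerant search.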
Availability of whole-genome sequences for rice (
Whole-genome sequence of
Repeatmasker (
The data generated by mining of duplicated sequences and associated microsatellites were subjected to statistical analysis using
Microsatellites constitute nearly 1% of the eukaryotic genomes, though in some organisms like
Evidence exists for genome duplications in rice that occurred between 53 and 94 mya, sometime prior to the divergence of the cereal genomes [
Based on the data presented earlier by Yu et al. [
Occurrence of genes and microsatellite repeats in duplicated regions of the rice genome.
 | Duplicated segments | Intergenic | Genic (exon) | Genic (intron) | Gene frequency | Repeat frequency |
---|---|---|---|---|---|---|
Chromosome 1 corresponding chromosome 5 | ||||||
Segment 1.1 | 58 (81685) | 46 | 9 | 3 | 6807.08 | 16337 |
Segment 5A1.1 | 50 (75106) | 21 | 20 | 9 | 2589.86 | 8345.11 |
Segment 1.2 | 6 (9866) | 2 | 3 | 1 | 2466.5 | 0 |
Segment 5A1.2 | 4 (5909) | 2 | 1 | 1 | 203.76 | 0.00 |
Segment 1.3 | 163 (244228) | 98 | 42 | 23 | 3757.35 | 9769.12 |
Segment 5A1.3 | 169 (268072) | 110 | 41 | 18 | 9243.86 | 9928.59 |
Segment 1.4 | 3 (4300) | 2 | 0 | 1 | 4300 | 0 |
Segment 5A1.4 | 1 (2247) | 1 | 0 | 0 | 77.48 | 2247.00 |
| ||||||
Chromosome 2 corresponding chromosomes 4 and 6 | ||||||
Segment 2.1 | 342 (518707) | 200 | 89 | 53 | 3652.87 | 17886.45 |
Segment 4A2.1 | 347 (522868) | 199 | 97 | 51 | 18029.93 | 13071.70 |
Segment 2.2 | 102 (149764) | 49 | 43 | 10 | 2825.74 | 29952.8 |
Segment 6A2.2 | 105 (146574) | 56 | 35 | 14 | 5054.28 | 24429.00 |
Segment 2.3 | 81 (124845) | 50 | 14 | 17 | 4027.26 | 15605.63 |
Segment 6A2.3 | 77 (114157) | 45 | 22 | 10 | 3936.45 | 16308.14 |
| ||||||
Chromosome 3 corresponding chromosomes 7, 10, and 12 | ||||||
Segment 3.1 | 29 (42425) | 14 | 11 | 4 | 2828.33 | 21212.5 |
Segment 7A3.1 | 31 (47154) | 21 | 9 | 1 | 1626.00 | 23577.00 |
Segment 3.2 | 29 (41410) | 14 | 12 | 3 | 2760.67 | 20705 |
Segment 7A3.2 | 36 (49456) | 22 | 9 | 5 | 1705.38 | 16485.33 |
Segment 3.3 | 37 (59771) | 26 | 5 | 6 | 5433.73 | 11954.2 |
Segment 10A3.3 | 42 (66214) | 24 | 9 | 9 | 2283.24 | 16553.50 |
Segment 3.4 | 23 (28749) | 15 | 1 | 7 | 3593.63 | 28749 |
Segment 10A3.4 | 28 (39198) | 16 | 6 | 6 | 1351.66 | 19599.00 |
Segment 3.5 | 29 (41024) | 15 | 10 | 4 | 2930.29 | 41024 |
Segment 12A3.5 | 24 (37014) | 13 | 8 | 3 | 1276.34 | 18507.00 |
| ||||||
Chromosome 4 corresponding chromosomes 8 and 10 | ||||||
Segment 4.1 | 17 (26044) | 7 | 6 | 4 | 2604.4 | 8681.33 |
Segment 8A4.1 | 16 (21129) | 11 | 4 | 1 | 728.59 | 0.00 |
Segment 4.2 | 40 (62581) | 22 | 12 | 6 | 3476.72 | 15645.25 |
Segment 10A4.2 | 40 (59065) | 22 | 11 | 7 | 2036.72 | 59065.00 |
| ||||||
Chromosome 8 corresponding chromosome 9 | ||||||
Segment 8.1 | 28 (36632) | 21 | 4 | 3 | 5233.14 | 6105.33 |
Segment 9A8.1 | 33 (42741) | 28 | 3 | 2 | 1473.83 | 8548.20 |
Segment 8.2 | 130 (191894) | 73 | 47 | 10 | 3366.56 | 31982.33 |
Segment 9A8.2 | 122 (180824) | 72 | 41 | 9 | 6235.31 | 12054.93 |
| ||||||
Chromosome 11 corresponding chromosome 12 | ||||||
Segment 11.1 | 111 (168247) | 51 | 40 | 20 | 2804.12 | 12017.64 |
Segment 12A11.1 | 101 (158798) | 45 | 39 | 17 | 5475.79 | 14436.18 |
Segment 11.2 | 43 (59793) | 25 | 15 | 3 | 3321.83 | 8541.86 |
Segment 12A11.2 | 47 (65442) | 20 | 18 | 9 | 2256.62 | 21814.00 |
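As a reading aid for the frequency columns, here is a minimal sketch of the arithmetic that some rows appear consistent with: segment length in bp divided by a feature count. This is an inference from the tabulated numbers, not a formula stated in the text, and it does not reproduce every row; the helper name `bp_per_feature` is hypothetical.

```python
def bp_per_feature(length_bp, count):
    """Average bp per feature; 0.0 when the count is zero (as the table shows 0)."""
    if count == 0:
        return 0.0
    return round(length_bp / count, 2)

# Assumption: the count is the sum of the exon and intron columns.
# Segment 1.1: 81,685 bp with 9 + 3 entries
print(bp_per_feature(81685, 9 + 3))     # 6807.08, matching the tabulated gene frequency
# Segment 1.3: 244,228 bp with 42 + 23 entries
print(bp_per_feature(244228, 42 + 23))  # 3757.35, matching the tabulated gene frequency
```

Larger values therefore indicate sparser features along the segment.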
The size of an aligned pair and the alignment score between its two segments generally vary inversely with their divergence time. In the present case, however, no such relationship was observed, as the most recent pair of duplicated sequences on chromosomes 11 and 12 [
A representative figure of a duplicated segment mapped between chromosomes 11 and 12.
We earlier reported 45,782 microsatellites in 374.5 Mb of the rice genome [
Gene versus repeat density on the entire duplicated segments in the rice genome. Duplication ratio refers to the ratio of the segment reported duplicated by Yu et al. [
CCG repeats (and their direct and reverse-complementary permutations) were the most abundant in both sets of sequences, consistent with earlier reports [
Traceability of microsatellites originating from group I sequences into group II sequences.
Motif | Region | Length (bp) in group I sequences | Traceability in group II: Equal | Traceability in group II: Short | Traceability in group II: Long |
---|---|---|---|---|---|
Chromosome 1 corresponding chromosome 5 | | | 9 | 2 | 2 |
(CCG)n | Intergenic | 58 | | | |
(CCG)n | Intergenic | 78 | | | |
(CGG)n | Intergenic | 60 | | | |
(CGG)n | Intergenic | 60 | | | |
(GAAAA)n | Intergenic | 26 | | | |
(GAAAA)n | Intergenic | 33 | | | |
(TTTTC)n | Intergenic | 26 | | | |
(TTTTC)n | Intergenic | 26 | | | |
(TTTTC)n | Intergenic | 26 | | | |
(TTTTC)n | Intron | 26 | | | |
(TTTTC)n | Intron | 26 | | | |
(TTTTC)n | Intergenic | 22 | | | |
(TTTTC)n | Intergenic | 26 | | | |
Chromosome 2 corresponding chromosome 4 | | | 6 | 2 | 4 |
(CCG)n | Intron | 174 | | | |
(CGA)n | Intron | 150 | | | |
(CGA)n | Intron | 150 | | | |
(CGG)n | Intergenic | 58 | | | |
(CGG)n | Intergenic | 58 | | | |
(CGG)n | Intergenic | 211 | | | |
(CGG)n | Intergenic | 126 | | | |
(CGG)n | Intergenic | 211 | | | |
(GAAAA)n | Intergenic | 28 | | | |
(TTTTC)n | Intron | 22 | | | |
(TTTTC)n | Intergenic | 28 | | | |
(TTTTC)n | Intergenic | 27 | | | |
Chromosome 2 corresponding chromosome 6 | | | 2 | 1 | 2 |
(CCG)n | Intron | 74 | | | |
(CCG)n | Intergenic | 123 | | | |
(CCG)n | Intergenic | 75 | | | |
(TTTTC)n | Intron | 27 | | | |
(TTTTC)n | Intergenic | 27 | | | |
Chromosome 3 corresponding chromosomes 7, 10, and 12 | | | 0 | 0 | 2 |
(CGG)n | Intergenic | 59 | | | |
(GAAAA)n | Intergenic | 22 | | | |
Chromosome 4 corresponding chromosomes 8 and 10 | | | 0 | 0 | 0 |
Chromosome 8 corresponding chromosome 9 | | | 1 | 2 | 1 |
(CCG)n | Intergenic | 72 | | | |
(CCG)n | Intergenic | 155 | | | |
(CCG)n | Intergenic | 199 | | | |
(TAA)n | Intergenic | 29 | | | |
Chromosome 11 corresponding chromosome 12 | | | 1 | 2 | 1 |
(CCG)n | Exon | 76 | | | |
(CCG)n | Exon | 154 | | | |
(CGG)n | Intergenic | 147 | | | |
(TCG)n | Exon | 70 | | | |
Abundance of microsatellite motifs in duplicated regions of the rice genome.
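When motif abundance is tallied, motifs that are cyclic permutations or reverse complements of one another (for example CCG, CGC, GCC, CGG, GGC, and GCG) are commonly pooled into one class. The sketch below shows one way to do this; the function names and the choice of the lexicographically smallest permutation as the class label are assumptions made for illustration, not the procedure documented in this study.

```python
from collections import Counter

def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(motif))

def motif_class(motif):
    """Canonical label: smallest rotation of the motif or of its reverse complement."""
    motif = motif.upper()
    candidates = []
    for m in (motif, reverse_complement(motif)):
        # All cyclic rotations of m
        candidates.extend(m[i:] + m[:i] for i in range(len(m)))
    return min(candidates)

# Motifs listed for the chromosome 11 / chromosome 12 pair in the traceability table
motifs = ["CCG", "CCG", "CGG", "TCG"]
print(Counter(motif_class(m) for m in motifs))
# -> Counter({'CCG': 3, 'ACG': 1})
```

Under this convention (GAAAA)n and (TTTTC)n also fall into a single class, labeled AAAAG.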
Of the 259 microsatellites in the duplicated sequences, only 45 (17%) were conserved in the paralogous sequences. Considering the mutability of microsatellites per locus per generation in rice, as described by Grover et al. [
It was also interesting to note that at some of the genomic positions a single microsatellite repeat corresponded to two microsatellite repeats with the same motif (Table
Paralogous loci at which the microsatellite motif was altered, either by splitting and integration or by replacement with another motif.
Duplication pair | Motif at group I site | Motif at group II site |
---|---|---|
DP 1A5 | (CCG)n | (TCC)n |
DP 1A5 | (TTAA)n | (CCG)n |
DP 1A5 | (CGG)n | (CCG)n; (CGA)n |
DP 2A4 | (CGG)n | (CGA)n |
DP 8A9 | (CCG)n | (CCG)n; (TCG)n |
DP 8A9 | (TAA)n | (CGA)n |
DP 11A12 | (CCG)n | (CCG)n; (CCG)n |
DP 11A12 | (CGG)n | (CGA)n |
DP 11A12 | (CCG)n | (CCG)n; (CCG)n |
DP 11A12 | (CCG)n | (CCG)n; (CCG)n |
Of the 259 microsatellites, only 68 (26.25%) were associated with genes. Of these genic microsatellites, 17 (25%) were present in exonic regions and the remaining 51 (75%) were located in intronic regions. Interestingly, 18 of the repeats and their counterparts were located in different genomic entities; for example, one locus lay in an intergenic region while its paralogue occurred in a genic region. Such spatial distribution can occur due to homologous recombination [
The authors’ microsatellite research has been supported by the Indian Council of Agricultural Research (ICAR), the Department of Science and Technology (DST), and the Defence Research and Development Organization (DRDO). M. Roorkiwal acknowledges a research fellowship from the University Grants Commission (UGC).