Little is known about the general biology of minisatellites. The purpose of this study was to examine repeat mutations at the D1S80 minisatellite locus by sequence analysis in order to elucidate the mutational process at this locus. D1S80 is a highly polymorphic minisatellite located in the subtelomeric region of chromosome 1. We analyzed 90,000 human germline transmission events and found seven mutations at this locus. The D1S80 alleles of each parentage trio (child, mother, and alleged father) were sequenced and the origin of the mutation was determined. Using American Association of Blood Banks (AABB) guidelines, we found a male mutation rate of
The human genome can be grossly partitioned into three categories: nonrepetitive (single copy sequences), moderately repetitive (families of retroposon-like sequences), and highly repetitive (“classical” satellite) DNA [
Studies have shown that minisatellites may mutate by unexpectedly complex conversion-like events [
Little is known about the general biology of minisatellites [
Nucleotide sequences of observed repeat units. The consensus sequence represents the most common nucleotide observed at each position of the repeats. Twenty variations on the consensus 16-base repeat unit are tabulated. Each repeat unit is assigned a letter code. Dots (·) represent a match to the consensus sequence, which is represented by the type H repeat unit. Letters represent nucleotide differences from the consensus sequence and correspond to A, G, C, and T nucleotides. A dash (-) represents a missing nucleotide in repeat unit A.
[Table body not recoverable: one row per repeat-unit type, including the consensus (type H) row. Only the dot (·) match markers survived extraction; the letter codes and the nucleotide differences were lost.]
The general structure of the D1S80 locus is that of a monomeric-type VNTR consisting of four constant 5′ repeat units (A-B-C-D) and seven constant 3′ repeat units (H-I-J-I-I-L-G), with a variable number of repeats in between. These two motifs are essentially identical in almost all alleles sequenced in all population groups so far (data not shown). There is a 132-base-pair 5′ flanking region, which includes the forward PCR primer (P1), and a 32-base-pair 3′ flanking region, which includes the reverse PCR primer (P2) sequence. The two flanking restriction site polymorphisms, HinfI (G
The D1S80 locus was previously used for forensic analysis because of its small size; more recently, it has been used as an associative tool in the elucidation of the chromosome 1p36 deletion syndrome, which is one of the most common deletion syndromes located near a chromosomal terminus (one in 5000 births) [
Recent and past work with respect to this locus has included the use of two single nucleotide polymorphisms (SNPs). These included rs16824398, which is a SNP that involves a HinfI restriction site at a nucleotide 58 bases downstream from the forward primer and another SNP in the 3′ flanking region that involves an Fnu4HI restriction site, at the base next to the last repeat. Interestingly, all 18-repeat alleles to date have been found to be associated with HinfI(+) and Fnu4HI(−) restriction site polymorphisms at the 5′ and 3′ ends, respectively (Figure
Buccal epithelial cell samples from 45,000 trios were collected for paternity analysis, in quadruplicate from each individual, using cotton-tipped swabs and air-dried at ambient temperature. Each paternity short tandem repeat (STR) assay was performed on duplicate, independently prepared DNA extractions using proprietary in-house STR multiplexes. Each multiplex shared a locus (generally D1S80) with other multiplexes to confirm sample identity. D1S80 was assayed in combination with two to three other STR loci in multiplex reactions. The other STRs were selected to produce lower molecular weight fragments that would not interfere with analysis of the larger D1S80 amplified products. Amplified products were analyzed using nondenaturing high-resolution polyacrylamide gel electrophoresis with silver stain detection. Alleles were determined manually by comparison with allelic ladders containing all common D1S80 alleles. All STR-based parentage testing was performed with the full knowledge and consent of the tested individuals or an authorized parent/guardian. Appropriate institutional review board approval was obtained for sequence analysis of the samples.
To delineate the mutational mechanisms operating at the D1S80 locus and understand their implications, we examined the parent-child allele transfers analyzed at this locus during 1996 at Laboratory Corporation of America, Burlington, NC. Observed mutations were confirmed as coming from a concordant father or mother by analyzing short tandem repeat (STR) data from samples collected from the alleged father, mother, and child using Amp
Several factors were used to determine whether the mutation originated in the alleged father or in the mother of the child. The first was the phase of the SNPs in the 5′- and 3′-flanking sequences of the repeat array. The second was the match in repeat units between the child's allele and the parental alleles most likely to be associated with it. Finally, we assumed that the minimal size change was the most probable one.
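The first and third criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the study used: the `Allele` class, the `rank_progenitors` function, and the example trio are all hypothetical, and the second criterion (comparison of the repeat-unit arrays) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allele:
    """Hypothetical record for one sequenced D1S80 allele."""
    label: str      # free-text description of the allele
    repeats: int    # number of 16-bp repeat units
    hinfi: bool     # True if the 5' HinfI site is present
    fnu4hi: bool    # True if the 3' Fnu4HI site is present


def rank_progenitors(child, parental):
    """Keep parental alleles whose flanking SNP phase matches the child's
    mutant allele, then order them by smallest change in repeat number."""
    same_phase = [
        p for p in parental
        if (p.hinfi, p.fnu4hi) == (child.hinfi, child.fnu4hi)
    ]
    return sorted(same_phase, key=lambda p: abs(p.repeats - child.repeats))


# Invented example values, not data from the study.
child = Allele("child mutant allele", 25, hinfi=False, fnu4hi=True)
parental = [
    Allele("mother allele 1", 18, hinfi=True, fnu4hi=False),
    Allele("mother allele 2", 24, hinfi=False, fnu4hi=True),
    Allele("father allele 1", 31, hinfi=False, fnu4hi=True),
    Allele("father allele 2", 24, hinfi=True, fnu4hi=True),
]

ranked = rank_progenitors(child, parental)
for allele in ranked:
    print(allele.label, abs(allele.repeats - child.repeats))
```

In this invented trio, two parental alleles share the child's flanking phase, and the one differing by a single repeat is ranked first. A real assignment would also weigh the internal repeat-unit sequence.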
The mutation rate was then estimated as the number of sequenced samples containing a mutation relative to the number of trios sampled that year at Laboratory Corporation of America. Parentage calculations and mutation rate analysis were performed following guidelines put forth by the American Association of Blood Banks (AABB) [
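As a rough illustration of the arithmetic, the sketch below computes a pooled per-transmission rate from the counts stated in this paper (seven mutations in 90,000 germline transmission events, that is, two transmissions for each of the 45,000 trios). It is an assumption-laden sketch, not the AABB calculation: it pools maternal and paternal transmissions rather than giving sex-specific rates, and the Wilson score interval is our choice for expressing uncertainty, not a method named in the study.

```python
import math


def pooled_rate_with_wilson_ci(mutations, transmissions, z=1.96):
    """Return (rate, lower, upper): the point estimate and an approximate
    95% Wilson score interval for a binomial proportion."""
    p = mutations / transmissions
    denom = 1 + z**2 / transmissions
    centre = (p + z**2 / (2 * transmissions)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / transmissions + z**2 / (4 * transmissions**2))
        / denom
    )
    return p, centre - half_width, centre + half_width


trios = 45_000
transmissions = 2 * trios   # one maternal and one paternal transmission per trio
mutations = 7

rate, lower, upper = pooled_rate_with_wilson_ci(mutations, transmissions)
print(f"pooled rate = {rate:.2e} per transmission")
print(f"approximate 95% interval = {lower:.2e} to {upper:.2e}")
```

With only seven events the interval is wide, so the pooled figure is best read as an order-of-magnitude estimate.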
Total genomic DNA was isolated from the trios that contained the mutation by standard organic extraction with phenol/chloroform/isoamyl alcohol followed by purification and concentration using Amicon Ultracel centrifugal filters (Millipore Corporation, Billerica, MA, USA). Quantitation of total genomic DNA was performed using Quantifiler human DNA quantitation kit (Applied Biosystems, Foster City, CA, USA). Paternity was confirmed by analyzing a set of fifteen short tandem repeat (STR) markers using the Identifiler human identification kit (Applied Biosystems, Foster City, CA, USA) as per manufacturer’s recommendations. The samples were run on an automated 310 Genetic Analyzer and analyzed with GeneMapper ID software version 3.2 (Applied Biosystems, Foster City, CA, USA). The D1S80 locus was amplified using primers described by Kasai et al. [
Paternity calculations were performed using AABB guidelines [
DNA sequence characteristics of seven trios illustrating the mutational event. The standard paternity test, called a trio, involves the child, mother, and alleged father. Allele represents the number of 16-base-pair repeat units making up the entire allele. The letters in each sample represent the arrangement of the repeat types making up each allele. The letter code for the repeat types and the corresponding bases is given in Table
No. | Sample | Allele | HinfI | Repeat types | Fnu4HI | Comments
---|---|---|---|---|---|---
1 | Trio | | | | |
1 | Possible progenitor | | | | |
1 | | | | | |
1 | Possible progenitor | | | | |
1 | Child-18 allele 2 | 18 | Positive | A-B-C-D-D-E-C-H-H-I-I-H-I-J-I-I-L-G | |
1 | Mother-18 allele 2 | 18 | Positive | A-B-C-D-D-E-C-H-H-I-I-H-I-J-I-I-L-G | |
1 | Father-22 allele 2 | 22 | Positive | A-B-C-D-D-E-C-H-H-H-I-J-I-I-I-H-I-J-I-I-L-G | |
2 | Trio | | | | |
3 | Possible progenitor | | | | |
3 | | | | | |
3 | Possible progenitor | | | | |
3 | Child-18 allele 1 | 18 | Positive | A-B-C-D-D-E-C-H-H-I-I-H-I-J-I-I-L-G | |
3 | Mother-25 allele 2 | 25 | Negative | A-B-C-D-D-D-E-F-C-G-H-I-I-I-J-H-I-J-H-I-J-I-I-L-G | |
3 | Father-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-I-H-I-I-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
3 | Trio | | | | |
7 | | | | | |
7 | Progenitor | | | | |
7 | Child-24 allele 2 | 24 | Negative | A-B-C-D-D-E-F-C-G-H-I-I-I-J-H-I-J-H-I-J-I-I-L-G | |
7 | Mother-24 allele 1 | 24 | Negative | A-B-C-D-D-E-F-C-G-H-I-I-I-J-H-I-J-H-I-J-I-I-L-G | | Donor, child allele 2
7 | Father-24 allele 2 | 24 | Negative | A-B-C-D-D-E-F-C-G-H-I-I-I-J-H-I-J-H-I-J-I-I-L-G | |
7 | Mother-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-K-I-K-H-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
4 | Trio | | | | |
8 | | | | | |
8 | Progenitor | | | | |
8 | Child-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-I-H-I-I-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
8 | Father-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-I-H-I-I-K-I-K-I-K-I-I-H-I-J-I-I-L-G | | Donor, child allele 2
8 | Father-26 allele 1 | 26 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-I-H-I-I-I-I-L-S-J-I-I-L-G | |
8 | Mother-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-K-I-K-H-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
5 | Trio | | | | |
10 | Possible progenitor | | | | |
10 | | | | | |
10 | Possible progenitor | | | | |
10 | Child-28 allele 2 | 28 | Positive | A-B-C-D-D-E-C-G-H-I-K-K-I-I-H-I-I-H-I-J-I-H-I-J-I-I-L-G | |
10 | Mother-24 allele 1 | 24 | Positive | A-B-C-D-D-E-C-G-H-I-K-K-I-I-K-I-I-H-I-J-I-I-L-G | | Mother's sequence does not match child 26
10 | Father-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-K-I-K-H-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
6 | Trio | | | | |
11 | Possible progenitor | | | | |
11 | | | | | |
11 | Possible progenitor | | | | |
11 | Child-24 allele 2 | 24 | Negative | A-B-C-D-D-E-F-C-G-H-I-I-I-J-H-I-J-H-I-J-I-I-L-G | |
11 | Mother-18 allele 1 | 18 | Positive | A-B-C-D-D-E-C-H-H-I-I-H-I-J-I-I-L-G | |
11 | Father-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-K-I-K-H-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
7 | Trio | | | | |
12 | | | | | |
12 | Progenitor | | | | |
12 | Child-20 allele 2 | 20 | Negative | A-B-C-D-E-F-C-G-H-I-I-I-J-H-I-J-I-I-L-G | |
12 | Mother-20 allele 2 | 20 | Negative | A-B-C-D-E-F-C-G-H-I-I-I-J-H-I-J-I-I-L-G | | Donor, child allele 2
12 | Mother-18 allele 1 | 18 | Positive | A-B-C-D-D-E-C-H-H-I-I-H-I-J-I-I-L-G | |
12 | Father-31 allele 2 | 31 | Positive | A-B-C-D-D-E-F-C-G-H-I-I-H-I-H-I-I-K-I-K-I-K-I-I-H-I-J-I-I-L-G | |
¶ represents the mutated region of the parental and child alleles.
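One way to locate the region in which two repeat arrays differ is to trim their shared 5′ and 3′ repeat units and inspect what remains. The Python sketch below does this for two alleles taken from the table, the 18-repeat and 22-repeat alleles listed for trio 1. It is illustrative only: the function name `diff_repeat_arrays` is hypothetical, the comparison does not imply that one allele descends from the other, and it is not necessarily how the mutated regions in the table were identified.

```python
def diff_repeat_arrays(a, b):
    """Split two repeat arrays into (shared 5' prefix length, differing
    middle of a, differing middle of b, shared 3' suffix length)."""
    shortest = min(len(a), len(b))

    prefix = 0
    while prefix < shortest and a[prefix] == b[prefix]:
        prefix += 1

    suffix = 0
    # Stop before the suffix overlaps repeats already counted in the prefix.
    while suffix < shortest - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1

    return prefix, a[prefix:len(a) - suffix], b[prefix:len(b) - suffix], suffix


# Repeat-type codes as listed in the table for trio 1.
allele_18 = "A-B-C-D-D-E-C-H-H-I-I-H-I-J-I-I-L-G".split("-")
allele_22 = "A-B-C-D-D-E-C-H-H-H-I-J-I-I-I-H-I-J-I-I-L-G".split("-")

prefix, middle_18, middle_22, suffix = diff_repeat_arrays(allele_18, allele_22)
print(prefix, middle_18, middle_22, suffix)
# prints: 9 [] ['H', 'I', 'J', 'I'] 9
```

Here the two alleles share nine repeat units at each end and differ by a four-unit block (H-I-J-I) present only in the longer allele. Because the prefix is matched greedily, the reported position of such a block within a run of similar repeats is one of several equivalent placements.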
Only two published studies have dealt with the mutational events within this locus [
Mutation rates for minisatellites have been estimated to range from 0.5% to greater than 20% per generation in some studies; another study reported a rate of 0.53 × 10⁻³ to 1.53 × 10⁻³ [
Several studies have examined mutational mechanisms according to the infinite allele model (IAM) and the one-step stepwise mutation model (SMM). They found that minisatellites sometimes fit both models, whereas short tandem repeat (STR) loci were most similar to the simulation results under the SMM. In this study, we found that this locus fit the one-step SMM in six out of seven instances, similar to STR loci [
Jeffreys et al. [
In addition to the differences in the repeat regions of the progenitor and mutated alleles, Jeffreys et al. also found that the polymorphisms flanking the minisatellite repeat array revealed no exchange of flanking markers [
The data we have reported raise a number of questions. (1) Do repeats have functions, and do they have an effect on genome structure? (2) Are the allelic clades the result of constraints on mutation? (3) Is selection acting at the D1S80 locus? (4) Why is the D1S80 locus conserved among primates, including humans (unpublished data)?
The authors declare no conflicts of interest, financial or otherwise.
The authors would like to thank Mrs Yasotha Balasubramaniam, Toronto, ON, Canada for the artwork and the four anonymous reviewers for their comments and suggestions. This study was partly supported by funding provided by the National Institute of Justice, Department of Justice, Washington, DC, USA, under project number 2009-D1-BX-0293 to the University of Southern Mississippi and by a startup grant for one of the authors (K. Balamurugan) from the University of Southern Mississippi, Hattiesburg, MS, USA. Points of view in the document are those of the authors and do not necessarily represent the official view of the U.S. Department of Justice.