Validation of Simple Sequence Length Polymorphism Regions of Commonly Used Mouse Strains for Marker Assisted Speed Congenics Screening

The marker assisted speed congenics technique is commonly used to backcross mouse strains in nearly half the time that conventional backcrossing takes. Traditionally, the technique is performed by analyzing PCR amplified regions of simple sequence length polymorphism (SSLP) markers that differ between the recipient and donor strains: the offspring with the highest number of markers showing the recipient genome across all chromosomes is chosen to breed the next generation. Although well-defined panels of SSLP markers have been established for certain pairs of mouse strains, they are incomplete for most strains. The availability of well-established marker sets for speed congenic screens would enable the scientific community to transfer mutations across strain backgrounds. In this study, we tested the suitability of over 400 SSLP marker sets among 10 mouse strains commonly used for generating genetically engineered models. The panel of markers presented here can readily identify the specified strains and will be useful in marker assisted speed congenic screens. Moreover, unlike newer single nucleotide polymorphism (SNP) array methods, which require sophisticated equipment, the SSLP marker panel described here uses only PCR and agarose gel electrophoresis of the amplified products; it can therefore be performed in most research laboratories.


Introduction
In recent years, there has been a steady increase in the creation and use of genetically engineered mutant mice in biomedical research. Frequently, the genetic background of such mice allows only specific experiments, and the mutations have to be transferred to different genetic background(s) to facilitate other kinds of experiments. There are many examples where genetic background has been shown to influence the phenotype of a transgenic or knockout mouse line [1][2][3][4][5][6][7][8][9][10][11][12]. Traditionally, however, mutant mice are generated using strains that have shown exceptional performance in the production of transgenic or knockout mouse lines. For example, FVB and BDF1 strain mice are most commonly used for transgenic mouse production [13], and ES (embryonic stem) cells derived from 129 inbred strains are commonly used for knockout mouse production. For technical reasons, chimeras developed during knockout mouse generation carry a mixed genetic background (e.g., 129 and B6), adding further complexity to the analysis [14]. Furthermore, ES cells derived from the C57BL/6N inbred strain have been used in mouse genetic resources such as KOMP (Knock-Out Mouse Project) and EUCOMM (European Conditional Mouse Mutagenesis) [15]. Also, particular strains of mice seem to be better suited to specific research purposes. In addition, mixed-background crosses between C57BL/6J and DBA/2J were used for ENU mutagenesis projects, because a cross between two different inbred strains (males of one strain were exposed to ENU and mated with females of the other strain) is useful for mapping and positional cloning of the mutated gene using the genetic polymorphisms that exist between them [16,17]. The main disadvantage of a mutation on a particular strain background is that the background may limit its use for a specific research purpose.
In such a situation the mutation needs to be transferred into a strain background of choice through a process called backcrossing. Backcrossing involves about 10 generations of successive breeding into a recipient strain of choice to achieve a line that is 99.9% congenic (in genetic composition) for that strain (http://www.informatics.jax.org/silverbook/). This painstaking process takes about 2.5 to 3 years, a fact that often limits its feasibility and usefulness given the pace of scientific research. In some cases, studies are published with animals after only 5 generations of backcrossing, in an attempt to offset the time required and the need to obtain some results in the new strain [18]. A technique called "marker assisted speed congenics", in use for over a decade, helps in achieving a congenic strain in 5 or fewer generations, rather than the 10 generations required in traditional backcrossing [19,20]. The small sequence differences between mouse strains called "microsatellite markers" serve as useful tools for detecting the strain of origin of chromosome regions in the offspring when two inbred strains of mice are bred together. For use in these assays, many microsatellite markers have been identified and characterized by various researchers for the donor and recipient strains of their choice ([21], [22, page 6], and [23][24][25][26][27][28][29][30]). However, the markers that can differentiate between the strains commonly used for transgenic and knockout mouse generation have not been tested sufficiently, and the information is not readily available. In this study, we tested 423 markers, ∼10 to 30 per chromosome, using genomic DNA from 10 commonly used mouse strains, particularly the strains used in transgenic and knockout research. We evaluated the markers that could be used with agarose gel electrophoresis, a simple technique commonly used in most molecular biology laboratories.
The data presented here will serve as a valuable tool for various investigators in choosing the markers useful for their speed congenic breeding.
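The figure of about 10 generations for a 99.9% congenic line follows from the expectation that each backcross halves the donor share of the unlinked genome. The sketch below works through that arithmetic; it assumes the common convention that the F1 hybrid counts as generation N1 and ignores linkage to the selected locus, and the function name is illustrative only.

```python
def expected_donor_fraction(generation: int) -> float:
    """Expected donor share of the unlinked genome at backcross generation N<generation>.

    Assumes N1 is the F1 hybrid (50% donor) and that every further
    backcross to the recipient strain halves the donor share on average.
    """
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 0.5 ** generation


for n in (1, 5, 10):
    recipient_pct = 100 * (1 - expected_donor_fraction(n))
    print(f"N{n}: about {recipient_pct:.1f}% recipient genome expected")
```

These are averages without selection; marker assisted selection aims to pick offspring that are ahead of this expectation in each generation.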

Selection of Oligonucleotide Primers.
The primers were chosen based on the following criteria: (1) evaluation and establishment of at least 6 markers per chromosome, (2) a distance between adjacent markers kept to about 10 to 15 centimorgans (cM), and (3) polymorphic bands that are appreciable on a 4% agarose gel when resolved over an electrophoresis distance of up to 10 centimeters from the loading wells. All markers were chosen from the Mouse Genome Informatics (MGI) database links (http://www.informatics.jax.org/searches/probe_report.cgi?_Refs_key=22816 and ftp://ftp.informatics.jax.org/pub/datasets/index.html (numbers 7 and 12 in the list)).
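Criteria (1) and (2) can be checked mechanically once candidate markers and their map positions are listed. The following is a minimal sketch, not the procedure used in this study: the marker names and positions are invented, and treating 15 cM as the largest acceptable gap is our reading of criterion (2).

```python
from collections import defaultdict


def check_panel(markers, min_per_chrom=6, max_gap_cm=15.0):
    """Report chromosomes that fail the count or spacing criteria.

    markers: iterable of (name, chromosome, position_cM) tuples.
    Returns a dict mapping chromosome -> list of problem descriptions.
    """
    by_chrom = defaultdict(list)
    for name, chrom, pos in markers:
        by_chrom[chrom].append((pos, name))

    problems = {}
    for chrom, entries in by_chrom.items():
        entries.sort()  # order markers along the chromosome
        issues = []
        if len(entries) < min_per_chrom:
            issues.append(f"only {len(entries)} markers")
        for (p1, n1), (p2, n2) in zip(entries, entries[1:]):
            if p2 - p1 > max_gap_cm:
                issues.append(f"gap of {p2 - p1:.1f} cM between {n1} and {n2}")
        if issues:
            problems[chrom] = issues
    return problems


# Hypothetical panel: chromosome 1 is evenly covered, chromosome 2 is sparse.
panel = [(f"mk1_{i}", "1", 12.0 * i) for i in range(6)]
panel += [("mk2_a", "2", 5.0), ("mk2_b", "2", 40.0)]
print(check_panel(panel))
```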

Mice Strains.
The mice were purchased from Charles River Laboratories or The Jackson Laboratory. The strains, the rationale for including these strains in the panel, and the vendors are listed in Table 1.

DNA Extraction, PCR Amplification, and Agarose Gel Electrophoresis.
The DNA samples were extracted from tail pieces about 3-5 mm long using the Gentra Puregene Tissue Kit (Cat. # 158622). Twenty-µL PCR reactions were set up using 1X reaction buffer containing 20 mM Tris pH 8.4, 50 mM KCl, 3 mM MgCl2, and 1 unit of Taq DNA polymerase (New England Biolabs, Cat. # M0273) under the following conditions: one cycle of 95 °C for 2 min; 35 cycles of 95 °C for 30 sec, 55 °C for 30 sec, and 72 °C for 60 sec; and one cycle of 72 °C for 5 min, followed by a hold at 4 °C until the samples were removed from the machine. Fifteen µL of each PCR product was resolved on 4% agarose gels for 120 to 150 minutes at a constant 200 volts. The agarose was purchased from Phenix Research Products (Item Number RBA-500: Molecular Biology Grade), and the gels were prepared in 0.5X TAE buffer diluted from a 50X stock (Fisher Scientific, Cat. # BP1332-20). The gels contained ethidium bromide (0.5 µg/mL) to aid visualization of the PCR bands. Each gel panel included one or more lanes of 100-base-pair molecular weight marker (New England Biolabs, Cat. # N3231) to assess the PCR product sizes. The bromophenol blue dye front was allowed to run up to 10 centimeters from the wells, and the gels were imaged using a BioRad Gel Doc XR system. Wherever necessary, the gels were run longer to resolve the bands.
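For planning purposes, the cycling program and the buffer dilution above reduce to simple arithmetic. The sketch below adds up the programmed hold times (ramp times between temperatures are machine dependent and are not included) and computes how much 50X TAE stock is needed for a given volume of 0.5X buffer; the function names are illustrative only.

```python
# Programmed hold times taken from the cycling conditions described above.
INITIAL_DENATURATION_S = 2 * 60
CYCLES = 35
CYCLE_STEPS_S = (30, 30, 60)  # 95 °C, 55 °C, 72 °C
FINAL_EXTENSION_S = 5 * 60


def programmed_minutes() -> float:
    """Total programmed hold time in minutes, excluding ramping and the 4 °C hold."""
    total_s = INITIAL_DENATURATION_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S
    return total_s / 60


def tae_stock_volume_ml(final_volume_ml: float,
                        final_x: float = 0.5,
                        stock_x: float = 50.0) -> float:
    """Volume of concentrated stock needed for the requested final volume (C1*V1 = C2*V2)."""
    return final_volume_ml * final_x / stock_x


print(f"Programmed PCR time: {programmed_minutes():.0f} min plus ramping")
print(f"50X TAE needed for 1 L of 0.5X buffer: {tae_stock_volume_ml(1000):.0f} mL")
```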

Analysis and Interpretation of Polymorphic PCR Bands.
The cropped images were imported into an Excel file for analysis and interpretation. Bands were assigned numbers 1, 2, 3, or 4 to indicate their sizes relative to the other bands in that set. Number 1 was assigned to the smallest polymorphic band, 2 to the next larger band in the group, and so on. The Excel file, along with the gel images, was converted into a .pdf file to generate Figure 1.
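The numbering scheme amounts to ranking the distinct band sizes seen for one marker. The sketch below illustrates the idea under the assumption that a numeric size estimate is available for each strain; in this study the comparison was made from the gel images, and the sizes shown are invented.

```python
def rank_bands(sizes_bp):
    """Assign 1 to the smallest distinct band size, 2 to the next larger, and so on.

    sizes_bp: dict mapping strain name -> estimated band size in base pairs.
    Strains with the same size share the same number.
    """
    distinct_sizes = sorted(set(sizes_bp.values()))
    code_for_size = {size: i for i, size in enumerate(distinct_sizes, start=1)}
    return {strain: code_for_size[size] for strain, size in sizes_bp.items()}


# Hypothetical size estimates for a single marker.
example = {"C57BL/6J": 120, "129X1": 150, "FVB": 120, "DBA/2J": 138}
print(rank_bands(example))
```

Bands that differ by only a few base pairs may not be separable by eye on a gel, so a real scoring step would need a tolerance that this sketch omits.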

Results and Discussion
Mutant mice created using transgenic and knockout techniques are available only on certain specific backgrounds. To make the best use of such mutants for a specific research purpose, they routinely need to be bred onto other strain backgrounds through successive breeding for about 10 generations. The highest proportion of recipient genome can be attained more quickly by a process called speed congenic breeding, in which the polymorphic markers between the recipient and donor strains are screened among the offspring in each generation and the offspring with the highest recipient genome content is chosen as a breeder for the subsequent generation [19][20][21]. Although a few reports describe marker sets suitable for certain pairs of strains, there are no well-established marker sets available across the strains most commonly used in transgenic and knockout mouse techniques. The primary objective of this work was to test the suitability of several markers in an agarose gel electrophoresis method, a technique that uses the least expensive equipment and reagents and is readily available in most molecular biology laboratories. We sought to test a large number of microsatellite markers (Tables 2 and 3 and S1 in Supplementary Material available online at http://dx.doi.org/10.1155/2015/735845) among the 10 most commonly used mouse strains, particularly those used in transgenic and knockout mouse techniques.
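The selection step in each generation can be expressed as a simple count. The sketch below assumes that every marker in a backcross offspring has been scored either "R" (only the recipient band is present) or "H" (both recipient and donor bands are present); the pup identifiers and marker names are hypothetical.

```python
def recipient_marker_count(calls):
    """Number of markers scored as homozygous for the recipient allele.

    calls: dict mapping marker name -> "R" (recipient only) or "H" (heterozygous).
    """
    return sum(1 for call in calls.values() if call == "R")


def pick_breeder(offspring):
    """Return the ID of the offspring with the most recipient-type markers.

    offspring: dict mapping offspring ID -> dict of marker calls.
    Ties go to the first offspring listed.
    """
    return max(offspring, key=lambda mouse: recipient_marker_count(offspring[mouse]))


# Hypothetical litter scored at four markers.
litter = {
    "pup_1": {"mk_a": "R", "mk_b": "H", "mk_c": "H", "mk_d": "R"},
    "pup_2": {"mk_a": "R", "mk_b": "R", "mk_c": "H", "mk_d": "R"},
    "pup_3": {"mk_a": "H", "mk_b": "H", "mk_c": "H", "mk_d": "R"},
}
best = pick_breeder(litter)
print(best, recipient_marker_count(litter[best]))
```

In practice the chosen breeder must also carry the mutation of interest, a filter that would be applied before this count.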
We used the Mouse Genome Informatics (MGI) database and short-listed the primers based on the criteria described in Section 2. The MGI database lists many of the microsatellite markers identified in various inbred strains. Although the database is extensive, it is difficult to choose marker sets for speed congenic screening from it because it lacks critical information, such as the sizes of the PCR products of various markers in different inbred strains and whether the differences among strains can be identified using conventional agarose gel electrophoresis.
According to the information available in the MGI database, predicted PCR band size differences among strains ranged from only a few base pairs for some markers to over 100 base pairs. Selection criteria for the markers and the mouse strains are described in Section 2. Taking the C57BL/6J and 129X1 strains as a comparison pair, for example, we chose markers with differences of 30 bp and above (as per the MGI database), a range that can be easily resolved using about 2% agarose gels. We aimed to choose markers in this range for all chromosomal locations at intervals of 5 to 15 cM. However, we were unable to find suitable markers in some regions with this criterion. In such locations, we tested markers with size differences as small as 8 to 12 bp. Such small differences are best resolved using polyacrylamide gel electrophoresis (PAGE). However, in a typical speed congenic screening that involves analysis of about 100 markers for each sample, applying the PAGE method becomes quite laborious. We used 4% agarose gels to resolve markers even with very small size differences; this seemed to be sufficient to resolve the bands when they were run about 8 to 10 centimeters from the loading well. To keep the assay conditions uniform, we used 4% agarose gels for all the markers tested. Table 2 lists the markers that were tested and found to be polymorphic in at least one of the 10 strains analyzed. We tested a total of 423 markers, of which we present useful data for 195. The gel images and the interpretation of polymorphisms are tabulated in Figure 1. The gel images indicate that there were readily appreciable size differences for most markers, whereas some markers showed very narrow size differences between some strains. In general, the sizes of the majority of the markers matched the information in the MGI database, but some discrepancies were noticed. Agarose gel electrophoresis using 4% gels could be useful in detecting differences among the markers in different strains.
The data presented in Figure 1 indicate that some markers could readily distinguish several strains from each other. We initially aimed to identify a good panel of markers for every 10 cM. Although we found adjacent markers as close as 10 cM in most locations, we failed to achieve this in some regions. We screened several available markers in such regions and were unable to find markers that matched the criteria set forth in our assay. It should be noted that some markers differed only minutely among some strains. If such markers are chosen in an actual speed congenic screening, we recommend including samples from both the donor and recipient strains, and an equimolar mixture of the two, in the panel in order to confirm the banding pattern of the offspring. The gel images and the interpretation of polymorphism provided in Figure 1 will serve as a comprehensive tool for a researcher who wishes to undertake congenic breeding in a pair of strains from the list.
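Choosing a panel for a particular donor and recipient pair from the tabulated band-size numbers comes down to keeping the markers at which the two strains were assigned different numbers. The sketch below shows one way to do this; the table contents and marker names are invented and do not reproduce Figure 1.

```python
def informative_markers(rank_table, donor, recipient):
    """Return markers whose band-size numbers differ between the two strains.

    rank_table: dict mapping marker name -> dict of strain name -> band-size number.
    Markers lacking a score for either strain are skipped.
    """
    return [
        marker
        for marker, codes in rank_table.items()
        if donor in codes and recipient in codes and codes[donor] != codes[recipient]
    ]


# Hypothetical excerpt of a band-size table.
rank_table = {
    "mk_a": {"C57BL/6J": 1, "129X1": 2, "FVB": 1},
    "mk_b": {"C57BL/6J": 2, "129X1": 2, "FVB": 1},
    "mk_c": {"C57BL/6J": 1, "129X1": 3, "FVB": 2},
}
print(informative_markers(rank_table, donor="129X1", recipient="C57BL/6J"))
```

Markers that differ only minutely between the pair would still pass this filter, so the gel images should be consulted before a marker is adopted.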
There is some debate about the minimum number of markers needed per chromosome for identifying the regions that differ between the donor and recipient strains [32]. One study concluded that as few as three markers per chromosome were sufficient to achieve results as meaningful as those obtained with over 6 markers per chromosome [33]. Computer simulation [20] indicated that a relatively modest selection effort of 60 evenly spaced markers with 25 cM spacing (corresponding to 3 markers per chromosome) and 16 males per generation would typically reduce unlinked donor genome contamination to below 1% by four backcross generations (N5). We conclude that the list presented here can serve to choose panels of markers for most two-strain combinations with at least 3 to 4 markers per chromosome (with the exception of a very few combinations). Further studies that compare smaller and larger panels of markers in the same set of samples for marker assisted speed congenics are needed to address this question unequivocally.
In recent years there have been advancements in the approaches used for speed congenics. These methods include (i) use of fluorescently labeled primers to amplify SSLP markers followed by resolving the products in sequencing gels [29] and (ii) microarray chips of SNPs [29][34][35][36]. With the advent of newer methods, particularly those that use SNP based marker analysis, a very high number of markers per chromosome can be screened simultaneously, which increases the resolution severalfold compared to conventional SSLP based marker analysis. Although there are some computer simulation studies that assess the efficiency of speed congenic screening in general [32] and algorithm based reports that compare SNP and SSLP based speed congenic screens [37], there are no systematic studies that compare the efficiency and cost effectiveness of the two methods. Here, we compare the SNP and SSLP based approaches in terms of their adoptability and feasibility in most laboratory settings, including their cost. The hands-on time required for gel based assays is greatly reduced by the newer methods that use SNP arrays. However, these methods have some limitations compared to traditional agarose gel electrophoresis (AGE) based SSLP marker analysis. (1) The newer methods are expensive in terms of the initial investment in reagents and/or operational costs compared to the SSLP-AGE method. By contrast, the basic requirements for AGE based systems are readily available in any molecular biology laboratory, and the only additional investment needed is the synthesis of the required oligonucleotides. (2) With the newer methods, the subset of markers that no longer needs to be analyzed in subsequent generations cannot be skipped from the panel, unlike in the AGE based method, where the number of markers to be analyzed, and therefore the overall cost of the assay, is significantly reduced in successive generations.
Other advantages of AGE based SSLP marker analysis over these newer methods are as follows: (i) the equipment needed to perform the assay is readily available in most laboratories and (ii) it can be routinely performed by many researchers and technicians without the special training required for SNP based approaches.
Considering the high resolution that is possible with the SNP based method, it can be regarded as the superior method overall. However, the microchips that are currently available are expensive, and the cost of analyzing each mouse DNA sample runs to about US $100 to $150. Assuming that about 15 mouse DNA samples per generation for 5 generations are analyzed, a typical speed congenic project would cost about $15,000 to $22,500. On the other hand, with the SSLP-AGE approach, since marker analysis is done manually, the markers that were fixed in the previous generation can be skipped in successive generations; this makes the SSLP-AGE based method cost effective compared to the SNP based method. It is estimated that a typical SSLP based speed congenic screen that analyzes 15 samples per generation for 5 generations using 5 markers per chromosome would need about 2000 to 2300 PCR reactions. At a rate of $2 to $2.5 per reaction (cost analysis done in our laboratory), one full speed congenic project will cost about $4000 to $5750. Furthermore, the SSLP-AGE method can be performed in any simple molecular biology lab, whereas the SNP method requires expensive equipment.
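The SSLP-AGE cost estimate above can be reproduced from the stated reaction counts and per-reaction rates. The sketch below does so and, for comparison, computes how many reactions would be needed if no fixed markers were skipped, taking the figure of about 100 markers per sample mentioned earlier; the helper name is illustrative only.

```python
def cost_range(reactions, unit_cost):
    """Return (low, high) total cost from (low, high) reaction counts and unit costs."""
    (reactions_low, reactions_high), (cost_low, cost_high) = reactions, unit_cost
    return reactions_low * cost_low, reactions_high * cost_high


# Figures quoted in the text: 2000 to 2300 reactions at $2 to $2.5 each.
sslp_low, sslp_high = cost_range((2000, 2300), (2.0, 2.5))
print(f"SSLP-AGE project: ${sslp_low:,.0f} to ${sslp_high:,.0f}")

# Upper bound if every marker were run on every sample in every generation:
# 15 samples x 5 generations x about 100 markers per sample.
full_panel_reactions = 15 * 5 * 100
print(f"Reactions without skipping fixed markers: {full_panel_reactions}")
```

The gap between this upper bound and the estimated 2000 to 2300 reactions reflects the markers that become fixed for the recipient allele and are dropped from later generations.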

Conclusions
Although some information about microsatellite marker differences between the commonly used inbred mouse strains was available, there was no systematic study to validate a large panel of markers for SSLP-AGE based speed congenic screening. The panel of marker sets validated and presented in this study serves as a ready reference for researchers who wish to perform cost-effective speed congenic screening in a pair of strains from the panel. The assay can be performed in any standard molecular biology lab. The data in this report are available at ftp://ftp.informatics.jax.org/pub/datasets/index.html#Guru.