Construction of a Llama Bacterial Artificial Chromosome Library with Approximately 9-Fold Genome Equivalent Coverage

The llama is an important agricultural livestock species in much of South America and is increasing in popularity in the United States as a companion animal. Little work has been done to improve llama production using modern technology, and little information is available regarding the llama genome. We report the construction of a llama bacterial artificial chromosome (BAC) library of 196,224 clones in the vector pECBAC1. Using flow cytometry with bovine, human, mouse, and chicken as controls, we determined the llama genome size to be 2.4 × 10^9 bp. The average insert size of the library is 137.8 kb, corresponding to approximately 9-fold genome coverage. Further studies are needed to characterize the library and the llama genome. We anticipate that this new library will facilitate future genomic studies in the llama.


Introduction
There are six species in the family Camelidae: Camelus bactrianus, Camelus dromedarius, Lama glama (llama), Lama guanicoe (guanaco), Lama pacos (alpaca), and Vicugna vicugna (vicuña). Llamas are especially important for agriculture in Andean South America. They are also indispensable in the Bolivian and Peruvian high plain (Altiplano), where they are well adapted to the harsh conditions. In these locations, llamas are used for fiber, meat, and draft. In the impoverished communities of the Altiplano, these uses may account for a considerable share of a family's disposable income. Improvement in the production of llamas would help the rural families of the Altiplano secure a steady food source and income and become self-sufficient [1]. Llamas and alpacas are also rising in popularity in the United States.
Our long-term goal is to increase the understanding of the genetic basis of phenotypic differences in camelids. In most common livestock and companion animal species, bacterial artificial chromosome (BAC) libraries have facilitated genetic work. However, until recently, a BAC library has not existed for camelids, which has impeded the progress of genetic research in this family. Therefore, there was a critical need for a llama BAC library. It is important to recognize that the llama genome size has already been determined [2]. This demonstrates the growing importance of this livestock species.
BAC libraries can provide a way of obtaining complete gene sequences, including intron sequences. Intron sequences are important because they contain more polymorphisms than exons. In addition, intron sequences allow primers to be designed that amplify entire exons. BAC libraries have been developed for a number of plant and animal species of major and minor economic or agricultural impact, including durum wheat [3], coffee [4], pepper [5], ginseng [6], quinoa [7], bovine [8,9], domestic dog [10], and sheep [11]. These BAC libraries have been used to facilitate physical mapping, gene cloning, and many other types of studies [12].
A linkage map for the llama has yet to be created. However, this and other future projects will help in identifying the causative genetic differences in camelids and so improve selection for particular traits. Mutations responsible for phenotypes have already been identified in this way in sheep [13] and pigs [14], as well as in a number of other species.

Preparation of Megabase Size DNA.
Blood was drawn from llamas, after which red blood cells were separated from the white blood cells by centrifugation. The white cells were resuspended in sterile normal saline (0.9% NaCl). This procedure was repeated three times, and the white cells were counted using a haemocytometer. The cells were diluted in sterile normal saline to a concentration of 1 × 10^8 /mL. Eight milliliters of the white blood cell suspension was combined with 8 mL of low melting point (LMP) agarose to form 100 µL plugs. The formed plugs were incubated at 4 °C for 15 min and harvested into 50 mL centrifuge tubes. Forty-five milliliters of lysis buffer (0.5 M EDTA, pH 9.0; 1% lauryl sarcosine; and 1 mg/mL proteinase K) was added to each tube, and the tubes were incubated at 50 °C for 48 h with gentle rotation.
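The volumes above imply a plug count and a cell load per plug. The following Python sketch works through that arithmetic; it assumes that the two volumes are simply additive and that nothing is lost during pipetting, neither of which is stated in the protocol, and the function name is purely illustrative.

```python
# Illustrative arithmetic only. ASSUMPTIONS: volumes are additive and no
# material is lost when casting plugs; neither is stated in the protocol.

CELLS_PER_ML = 1e8      # white cell suspension concentration (from the text)
SUSPENSION_ML = 8.0     # volume of cell suspension (from the text)
AGAROSE_ML = 8.0        # volume of LMP agarose (from the text)
PLUG_UL = 100.0         # volume of one plug (from the text)


def plug_yield(cells_per_ml, suspension_ml, agarose_ml, plug_ul):
    """Return (number of plugs, cells embedded per plug)."""
    total_ml = suspension_ml + agarose_ml
    total_cells = cells_per_ml * suspension_ml
    n_plugs = total_ml * 1000.0 / plug_ul
    return n_plugs, total_cells / n_plugs


n_plugs, cells_per_plug = plug_yield(CELLS_PER_ML, SUSPENSION_ML, AGAROSE_ML, PLUG_UL)
print(f"plugs: {n_plugs:.0f}, cells per plug: {cells_per_plug:.1e}")
```

Under these assumptions the mixture yields 160 plugs, each holding about 5 × 10^6 white cells.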

Size Selection.
While still in the plugs, the purified DNA was washed and partially digested with either EcoRI or BamHI. The plugs were then loaded in a CHEF gel (Bio-Rad Laboratories, Hercules, CA, USA), and the DNA was fractionated by pulsed-field gel electrophoresis. The fractionated DNA was divided into three size categories, cut from the gel, resuspended, and purified. The size categories, based on a lambda DNA standard, were 100-200 kb, 200-300 kb, and 300-400 kb, subsequently referred to as low, medium, and high molecular weight, respectively.

Creation of Libraries.
The libraries were constructed with approximately 150 ng of digested llama DNA, which was used at 0.6-1.0 ng/µL in ligation reactions with linearized, dephosphorylated pECBAC1 according to Zhang [15], with minor modifications. ElectroMax DH10B cells (Invitrogen, Carlsbad, CA) were transformed with ligation mix, prepared at the recommended 1:4 molecular weight ratio (vector to DNA), by electroporation with a Bio-Rad (Hercules, CA) Gene Pulser at a voltage of 2.5 kV, a capacitance of 25 µF, and an impedance of 100 Ω, all according to the manufacturer's specifications. Cells were plated on LB agar containing 12.5 µg/mL chloramphenicol, 15 µg/mL IPTG, and 60 µg/mL X-gal and incubated at 37 °C for 24-36 h. White colonies were manually picked and transferred directly to 384-well microtiter plates containing 50 µL of LB freezing medium with 12.5 µg/mL chloramphenicol in each well, according to Zhang et al. [16]. After incubation at 37 °C for 24 h, the microtiter plates were placed in a −80 °C freezer for long-term storage.
BAC DNA was isolated from 200 and 280 clones from the BamHI and EcoRI libraries, respectively, by an alkaline lysis method, digested with NotI, and separated on 1% agarose CHEF gels. The insert size was determined by comparison with a molecular-weight ladder on the gel, and an average insert size was calculated for the set of clones tested from each library.
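The sizing step can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: it assumes that NotI releases the vector as a single band, that internal NotI sites may split an insert into several bands, and that the vector band is about 7.5 kb (an assumed value that is not given in the text). The clone names and band sizes are hypothetical and serve only to exercise the function.

```python
from statistics import mean

# ASSUMPTION: approximate size of the pECBAC1 vector band in kb; the text
# does not state it, so treat this constant as a placeholder.
VECTOR_KB = 7.5


def insert_size_kb(bands_kb, vector_kb=VECTOR_KB):
    """Sum all NotI bands of one clone and subtract the vector band."""
    total = sum(bands_kb)
    if total < vector_kb:
        raise ValueError("band sizes sum to less than the vector size")
    return total - vector_kb


# HYPOTHETICAL band sizes (kb) read against the ladder; not data from the study.
clones = {
    "clone_A": [7.5, 120.0],        # one insert band
    "clone_B": [7.5, 90.0, 45.0],   # insert cut once by an internal NotI site
    "clone_C": [7.5],               # vector only, i.e. no insert
}

sizes = {name: insert_size_kb(bands) for name, bands in clones.items()}
with_insert = [size for size in sizes.values() if size > 0]
average_insert = mean(with_insert)
empty_fraction = 1 - len(with_insert) / len(sizes)

print(sizes)
print(f"average insert: {average_insert:.1f} kb, empty clones: {empty_fraction:.0%}")
```

Averaging only over clones that carry an insert, and reporting the empty fraction separately, keeps the mean insert size from being pulled down by vector-only clones.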
Colonies containing individual BACs were double-spotted in high-density arrays on nylon membranes using a Genetix Q-Bot at the Arizona Genomics Institute (Tucson, AZ, USA) as described in [17].

Determining Complexity of the Llama Genome.
The llama genome size was determined by flow cytometry as previously described [18].

Creation of a Llama BAC Library.
Our laboratory developed the first, and to date the only, reported llama BAC library. The library consists of 196,224 clones with an average insert size of 137.8 kb, together equaling nearly 9-fold genome coverage. Although we calculated the size of the llama genome by flow cytometry to be 2.4 × 10^9 bp, Romanini reported in 1985 that the llama genome size is about 3.2 × 10^9 bp [2]. We deferred to the latter number and used it in all subsequent calculations.
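The coverage figure follows from the clone count, the mean insert size, and the genome size. The Python sketch below reproduces that arithmetic for both genome size estimates quoted above; the function name and the optional empty_fraction parameter are illustrative additions, not part of the original analysis.

```python
def genome_coverage(n_clones, mean_insert_bp, genome_bp, empty_fraction=0.0):
    """Genome equivalents = clones with inserts x mean insert size / genome size."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return n_clones * (1.0 - empty_fraction) * mean_insert_bp / genome_bp


N_CLONES = 196_224        # clones in the library (from the text)
MEAN_INSERT_BP = 137.8e3  # average insert size (from the text)

# Genome size reported in [2], used for the calculations in the text.
coverage_published = genome_coverage(N_CLONES, MEAN_INSERT_BP, 3.2e9)
# Genome size estimated by flow cytometry in this study.
coverage_flow = genome_coverage(N_CLONES, MEAN_INSERT_BP, 2.4e9)

print(f"3.2e9 bp genome: {coverage_published:.2f}-fold")
print(f"2.4e9 bp genome: {coverage_flow:.2f}-fold")
```

With the 3.2 × 10^9 bp genome size the result is about 8.45-fold, in line with the "nearly 9-fold" figure; with the 2.4 × 10^9 bp flow cytometry estimate it would be about 11.27-fold. Passing a nonzero empty_fraction lowers the estimate in proportion to the share of clones without inserts.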

Characterization of the Library.
Clonal BAC DNA was isolated from a number of individual clones and digested with NotI to release the insert. The digestion products were size fractionated in an agarose gel to determine insert length. Insert ends were sequenced. Across both libraries, insert sizes ranged from 75 kb to 306 kb, with an average of 137.8 kb. Inserts from the low size fraction averaged 112.5 kb and ranged from 97 to 200 kb. Inserts from the medium size fraction averaged 134 kb and ranged from 75 to 300 kb. Inserts from the high size fraction averaged 170.6 kb and ranged from 100 to 306 kb. After examination of 300 clones from all three size ranges, we estimate that 10% of clones lack inserts (see Figure 1).
The average insert size of the llama BAC library analyzed in this project was 137.8 kb, and the llama genome was reported to be 3.2 × 10^9 bp by Romanini [2]. At this insert size, roughly 23,200 non-overlapping inserts correspond to one genome equivalent, so the 196,224 inserts in the library represent about 8.5 genome equivalents. Of course, this calculation assumes that every segment of the genome is cloned equally often. In reality, certain segments of the genome are probably represented in several more inserts than others.
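Because inserts overlap and sample the genome unevenly, coverage is often restated as the probability that a given locus appears in at least one clone. The sketch below uses the standard Clarke and Carbon formula, P = 1 − (1 − I/G)^N, which is not part of the original analysis and is brought in here only for illustration. It assumes random, unbiased cloning and ignores empty clones; the function names are hypothetical.

```python
import math


def probability_locus_present(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon estimate: P = 1 - (1 - I/G)^N, assuming random cloning."""
    fraction = insert_bp / genome_bp
    # log1p keeps precision when the fraction is very small.
    return 1.0 - math.exp(n_clones * math.log1p(-fraction))


def clones_needed(probability, insert_bp, genome_bp):
    """Smallest N giving at least the requested probability for one locus."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


INSERT_BP = 137.8e3   # average insert size (from the text)
GENOME_BP = 3.2e9     # genome size from [2] (from the text)
N_CLONES = 196_224    # clones in the library (from the text)

one_fold_inserts = GENOME_BP / INSERT_BP
p_present = probability_locus_present(N_CLONES, INSERT_BP, GENOME_BP)
n_for_99 = clones_needed(0.99, INSERT_BP, GENOME_BP)

print(f"non-overlapping inserts per genome equivalent: {one_fold_inserts:.0f}")
print(f"P(locus present in library): {p_present:.4f}")
print(f"clones needed for P = 0.99: {n_for_99}")
```

Under these idealized assumptions, the probability that any single-copy locus is represented in the library exceeds 99.9%; cloning bias at particular restriction sites would lower it for some regions.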
Currently, the library comprises 511 plates of 384 wells each, representing 36 µL of ligation reaction. This corresponds to 196,224 clones (511 × 384), with an average insert size of 137.8 kb. We have an additional 950 µL of ligation reaction to electroporate and evaluate.
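The plate arithmetic, and a rough projection for the unused ligation reaction, can be sketched in Python as below. The projection is an upper-bound illustration only: it assumes that every well holds one picked clone and that the remaining ligation would yield picked clones at the same rate per microliter, neither of which the text claims.

```python
PLATES = 511            # 384-well plates (from the text)
WELLS_PER_PLATE = 384   # wells per plate (from the text)
LIGATION_USED_UL = 36.0         # ligation volume represented so far (from the text)
LIGATION_REMAINING_UL = 950.0   # ligation volume still available (from the text)

# ASSUMPTION: every well contains exactly one picked clone.
total_clones = PLATES * WELLS_PER_PLATE

# ASSUMPTION: the remaining ligation behaves like the portion already used.
clones_per_ul = total_clones / LIGATION_USED_UL
projected_additional = clones_per_ul * LIGATION_REMAINING_UL

print(f"clones in library: {total_clones}")
print(f"picked clones per microliter of ligation: {clones_per_ul:.0f}")
print(f"projected additional clones: {projected_additional:.2e}")
```

In practice, the number of additional clones would be limited by how many colonies can be picked and stored rather than by the ligation volume.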
We hybridized the membranes with two low-copy probes derived from coat color genes: bovine tyrosinase (GenBank NM_174480.3) and Lama pacos MC1R (GenBank EU135880). We observed 38 hits for the tyrosinase probe and 25 for MC1R. The average number of positive hits for these hybridizations was 31.5, which is higher than the estimated genome coverage of the library. This discrepancy may be due to an underestimation of the genome coverage of the library, or it may reflect the presence of duplicate genetic loci for these genes in the llama genome.
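The comparison between observed hits and expected coverage can be made explicit with a short Python sketch. It assumes that each reported hit corresponds to one positive clone; the text does not say whether hits were counted per clone or per spot on the double-spotted membranes, so the ratio should be read with that caveat.

```python
from statistics import mean

# Hit counts reported in the text.
hits = {"tyrosinase": 38, "MC1R": 25}

# Library parameters from the text; genome size as reported in [2].
N_CLONES = 196_224
MEAN_INSERT_BP = 137.8e3
GENOME_BP = 3.2e9

# For a single-copy locus, the expected number of positive clones is
# approximately the genome coverage of the library.
expected_hits = N_CLONES * MEAN_INSERT_BP / GENOME_BP

# ASSUMPTION: one hit equals one positive clone.
mean_hits = mean(hits.values())
excess_ratio = mean_hits / expected_hits

print(f"mean observed hits: {mean_hits}")
print(f"expected hits for a single-copy locus: {expected_hits:.2f}")
print(f"observed / expected: {excess_ratio:.2f}")
```

On these figures the observed mean is between three and four times the single-copy expectation, which is the size of the discrepancy discussed above.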

Conclusions
We successfully created a llama BAC library with approximately 9-fold genome coverage and characterized 480 of its 196,224 clones. The llama BAC library is an important genomic resource that can be utilized for the improvement of camelid production for subsistence farmers on the Altiplano and commercial producers throughout the Andean region. The library can also be used in the assembly of contigs for generation of a physical map. The physical map will be an important step in unraveling the ancestry of the domesticated South American camelids, llama and alpaca, and in better understanding their relationships with their wild relatives, vicuña and guanaco, which are found throughout the region. Moreover, physical mapping would allow synteny studies utilizing the DNA sequence information from camelids. We envision that the genomic resources developed for South American camelids may be useful in comparative analyses with closely related species of economic importance in Africa and Asia.
In addition, our llama BAC library can be used to physically map genes to specific loci using fluorescence in situ hybridization (FISH) technology. Genome mapping in sheep and pigs has substantially advanced genetic research in those species. For instance, marker-assisted selection based on genome maps has been used to remove deleterious traits within a short time [13,14]. Additionally, a wide array of research has been furthered using the RPCI-11 human BAC library created in 2001, including insert-end sequencing, clone fingerprinting, high-throughput sequence analysis, and diagnostic studies [19]. Our project and future projects using our BAC library will aid llama production throughout the world.