Microarray-Based Comparative Genomics: Genome Plasticity in Mycobacterium bovis

Introduction
Mycobacterium bovis is the causative agent of bovine tuberculosis, a disease responsible for annual losses to global agriculture of $3 billion and with serious repercussions for public health and animal welfare. The UK program for the control of bovine tuberculosis involves regular testing of cattle with a crude preparation of mycobacterial antigens (tuberculin), followed by compulsory slaughter of positive reactors. However, in the last decade the number of herd breakdowns has been increasing across the UK, especially in the south-west, where prevalence has now risen to 3.5% of cattle herds. This has worrying implications for the control strategy, which currently costs ∼£25 million/year.
A range of techniques exists for the genetic typing of M. bovis isolates. These include restriction fragment length polymorphism (RFLP) with probes such as the polymorphic glycine-rich sequences (PGRS), a minisatellite method (VNTR), and spacer-oligonucleotide typing (spoligotyping). The application of these techniques has allowed the integration of molecular and epidemiological data to aid in disease control. The current method of choice for isolates at the Veterinary Laboratories Agency (VLA) is spoligotyping, a rapid, simple method based on a polymorphic region called the direct repeat (DR) locus [1]. This locus is composed of multiple 36 bp DR copies interspersed with short, unique, non-repetitive sequences called spacers. Isolates of M. bovis differ in the presence or absence of spacers and adjacent DRs, allowing a barcode to be generated for each molecular type (Figure 1).
At the VLA approximately 16 000 strains have been spoligotyped. Analysis of these data shows that in the UK there are only 10 major spoligotypes. Furthermore, two of these spoligotypes, 09 and 17, represent over 65% of all isolates. Type 09 is dispersed throughout the world, while Type 17 appears unique to Great Britain (GB), suggesting recent clonal expansion. Indeed, the majority of GB isolates can be related back to Type 09 simply on the basis of spoligotype pattern (Figure 1). This suggests that DR deletions are clonal, and that progenitor clones would be predicted to have more spacers.
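The reasoning in the paragraph above can be illustrated with a minimal sketch: if spacers are only ever lost, a type can descend from a progenitor only if its spacers are a subset of the progenitor's. The function names, the binary-string encoding of the barcode, and the 10-spacer toy patterns are assumptions made for illustration; they are not the real Type 09 or Type 17 patterns and not the VLA's analysis code.

```python
def parse_spoligotype(pattern):
    """Return the 1-based positions of the spacers present in a barcode.

    The barcode is assumed to be a string of '1' (spacer present)
    and '0' (spacer absent); this encoding is an illustrative choice.
    """
    if set(pattern) - {"0", "1"}:
        raise ValueError("pattern must contain only '0' and '1'")
    return frozenset(i for i, c in enumerate(pattern, start=1) if c == "1")


def could_derive_from(candidate, progenitor):
    """True if candidate could arise from progenitor by spacer loss alone."""
    if len(candidate) != len(progenitor):
        raise ValueError("patterns must cover the same number of spacers")
    return parse_spoligotype(candidate) <= parse_spoligotype(progenitor)


def lost_spacers(candidate, progenitor):
    """List the spacers present in progenitor but absent from candidate."""
    return sorted(parse_spoligotype(progenitor) - parse_spoligotype(candidate))


# Hypothetical 10-spacer patterns, invented for this example only.
progenitor = "1101111011"
derived = "1101100011"    # lacks spacers 6 and 7 relative to the progenitor
unrelated = "1111100011"  # carries spacer 3, which the progenitor lacks

print(could_derive_from(derived, progenitor))    # True
print(lost_spacers(derived, progenitor))         # [6, 7]
print(could_derive_from(unrelated, progenitor))  # False
```

A subset relationship is only consistent with descent by deletion; it does not prove it, since unrelated strains could lose the same spacers independently.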
Prior to the availability of the M. bovis genome sequence, comparative genomics of the M. tuberculosis complex was performed using hybridization-based methods with micro- and macroarrays. These experiments revealed 10 deletions from the genome of M. bovis, ranging in size from ∼1 to 12.7 kb [3,6]. The deletions affected a range of metabolic functions and putative virulence factors. For example, loss of the RD5 locus removed the genes for three phospholipase C enzymes from the genome; phospholipase C is a known virulence factor in Listeria and Clostridium spp. [7]. However, a fourth phospholipase gene, plcD, is intact in M. bovis and may compensate for the loss of the other genes. The RD7 locus encompasses one of the mce operons, originally described by Riley and colleagues as encoding a putative mycobacterial invasin [2]. The genome sequence revealed that there are in fact four mce operons in M. tuberculosis, encoding a family of 24 proteins [5]. It is therefore possible that loss of one mce operon may be compensated by the remaining loci.
Analysis of the presence or absence of these deletions across the M. tuberculosis complex allowed a phylogenetic tree to be generated showing the evolutionary relationships between the strains [4]. From this analysis it was clear that the genome of M. bovis had undergone the greatest number of deletions, and that gene loss had been a major force in shaping the genome. However, the role these deletions played in the evolution of the bacillus is unclear. While they may represent host-adaptive mutations, it is also possible that they represent the fixation of deleterious mutations, or the removal of genetic redundancy. It is also unclear whether this process of deletion is continuing in 'modern' M. bovis.
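As a simplified illustration of how presence or absence of deletions can order strains, the sketch below checks whether a set of deletion profiles is nested, as would be expected if deletions accumulate one after another along a single lineage and are never reversed. This is not the tree-building method of [4]; the function name, strain labels and deletion labels are all hypothetical, and real data would generally need a branching tree rather than a single chain.

```python
def nested_lineage(profiles):
    """Order strains by deletion count if their deletion sets are nested.

    profiles maps a strain label to the set of regions deleted in it.
    Returns the strains from fewest to most deletions when every
    profile is a subset of the next one, otherwise None.
    """
    # Ties in deletion count are broken by label so the order is stable.
    ordered = sorted(profiles, key=lambda s: (len(profiles[s]), s))
    for earlier, later in zip(ordered, ordered[1:]):
        if not profiles[earlier] <= profiles[later]:
            return None  # profiles conflict with a single linear chain
    return ordered


# Hypothetical profiles, invented for this example only.
chain = {
    "strain_C": {"del1", "del2"},
    "strain_A": set(),
    "strain_B": {"del1"},
}
conflict = {
    "strain_A": set(),
    "strain_B": {"del1"},
    "strain_D": {"del3"},
}

print(nested_lineage(chain))     # ['strain_A', 'strain_B', 'strain_C']
print(nested_lineage(conflict))  # None
```

When the check fails, as in the second profile set, the strains cannot all sit on one line of descent, which is where a branching phylogeny becomes necessary.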

Results
The aim of this project is to determine whether clones of M. bovis, clustered on the basis of molecular type, share phenotypic characteristics that may explain their relative success as pathogens.
DNA microarray technology allows the large-scale analysis of whole genomes for comparative genomics. Using this technology we can therefore rapidly screen the genomes of M. bovis strains for deletions, using an M. tuberculosis H37Rv array and exploiting the >99.9% sequence identity between the two bacilli. This study will concentrate on the 10 most prevalent GB spoligotypes, i.e. Types 09, 17, 12, 11, 13, 22, 25, 35, 20 and 10. Our initial analysis has focused on the variation between the two dominant types, 09 and 17. Previously we had used a range of lipid profiling techniques to identify an alteration in the cell wall lipid profiles between these strains. However, because ∼10% of the coding capacity of the genome is dedicated to lipid metabolism, it was not possible to determine the genetic basis for this phenotype. The array-based approach has, however, identified