Assessment of the Genetic Diversity of Mycobacterium tuberculosis esxA, esxH, and fbpB Genes among Clinical Isolates and Its Implication for the Future Immunization by New Tuberculosis Subunit Vaccines Ag85B-ESAT-6 and Ag85B-TB10.4

The effort to develop a tuberculosis (TB) vaccine more effective than the widely used Bacille Calmette-Guérin (BCG) has led to the development of two novel fusion protein subunit vaccines: Ag85B-ESAT-6 and Ag85B-TB10.4. Studies of these vaccines in animal models have revealed their ability to generate protective immune responses. Yet previous work on another TB fusion subunit vaccine candidate, Mtb72f, has suggested that genetic diversity among M. tuberculosis strains may compromise vaccine efficacy. In this study, we sequenced the esxA, esxH, and fbpB genes of M. tuberculosis, which encode the ESAT-6, TB10.4, and Ag85B proteins, respectively, in a sample of 88 clinical isolates representing 57 strains from Arkansas, USA, and 31 strains from Turkey, to assess the genetic diversity of the antigens in the two vaccine candidates. We found no DNA polymorphism in the esxA and esxH genes in the study sample and only one synonymous single nucleotide change (C to A) in the fbpB gene, present in 39 (44.3%) of the 88 strains sequenced. These data suggest that the efficacy of the Ag85B-ESAT-6 and Ag85B-TB10.4 vaccines is unlikely to be affected by the genetic diversity of the M. tuberculosis population. Future studies should include a broader pool of M. tuberculosis strains to validate this conclusion.


Introduction
The need for an improved vaccine against tuberculosis (TB) has never been more urgent. One in three people today are infected with Mycobacterium tuberculosis, the causative agent of TB, and worldwide approximately three million people die from TB annually. The currently available TB vaccine, Bacille Calmette-Guérin (BCG), has failed to consistently protect against the most contagious form of the disease, adult pulmonary TB, despite its widespread use [1][2][3][4]. Developing a new vaccine, which may serve as a booster or a replacement for BCG, is of critical importance in the fight against worldwide TB-related morbidity and mortality [4,5].
Of the various vaccine candidates proposed, fusion subunit vaccines have received considerable attention in the recent literature, especially those composed of the antigenic proteins ESAT-6, Ag85B, and TB10.4 [3][4][5][6][7][8]. It appears that the multiple epitopes that fusion subunit vaccines offer make them more effective than single-peptide vaccines in dealing with the complexity of the host immune response against TB and the genetic restriction imposed by major histocompatibility complex molecules [3,9]. Two fusion subunit vaccines, Ag85B-ESAT-6 and Ag85B-TB10.4, which are the focus of the present study, have been found to induce protective cell-mediated immunity in animal models [3,6,7,9,10]. Ag85B-ESAT-6 is currently in expanded Phase I studies in which the vaccine is tested in BCG-vaccinated individuals, latently infected individuals, and individuals from TB-endemic regions [4]. As these candidates move forward in or toward clinical trials, it will be critically important to evaluate their protective potential as global vaccines via bioinformatic approaches built upon the comparative genomics of the pathogen population and the immunomics of the host population. Bioinformatic approaches are invaluable to the development of effective vaccine candidates. Comparative genomics of the pathogen population, for instance, allows vaccine candidates that are potentially ineffective due to genetic diversity of the pathogen population to be ruled out before they reach the costly stages of clinical trials. Bioinformatic approaches can also provide information on which a rational selection of clinical trial sites can be based. For subunit vaccines, comparative genomics can help determine whether antigenic targets are conserved among infectious strains of an organism, in order to ensure their protective efficacy across the diverse pathogen populations circulating in different geographic regions [8].
Although previous studies have suggested that M. tuberculosis has a relatively stable genome in comparison with other bacteria [11,12], recent genomic studies have revealed biologically significant variation among clinical strains [13]. Hebert and colleagues, for instance, revealed considerable genetic variation in the PPE18 gene of M. tuberculosis, with important implications for the ability of the Mtb72f vaccine candidate to provoke protective immunity against diverse populations of M. tuberculosis [14]. Furthermore, the interaction of the genetic variation of the PPE18 component of Mtb72f with the allelic variation of human MHC-II DRB1 proteins negatively affects vaccine epitope binding to DRB1 proteins [15]. In the context of vaccine development, findings like these are crucial to the survival of vaccine candidates as potential clinical vaccines. A similar comparative genomics study on the Ag85B-ESAT-6 and Ag85B-TB10.4 subunit vaccines may provide useful information for predicting the protective efficacy of these candidates in the pre- or early stages of their clinical evaluation.
Little information has been documented on the genetic variation of the genes encoding the ESAT-6, Ag85B, and TB10.4 proteins. If any of these three genes is highly variable, the protective efficacy of the Ag85B-ESAT-6 and Ag85B-TB10.4 subunit vaccines might be compromised on the global stage. To further investigate the ability of these two vaccine candidates to target naturally occurring M. tuberculosis strains, we investigated the genetic diversity of the esxA, esxH, and fbpB genes of M. tuberculosis, which encode the components of the two new subunit vaccines, in a sample of 88 M. tuberculosis strains collected from Turkey and Arkansas, USA.

M. tuberculosis Isolates.
The clinical strains used in the present study are from Arkansas, USA, and Turkey. Following the work of Hebert and colleagues, the isolates were selected to represent different geographical regions, namely Arkansas and Malatya, Turkey, to assess the impact of regional genetic variability on the two subunit vaccine candidates [14]. Each of the selected isolates represents a different strain of M. tuberculosis, with either a distinct IS6110 restriction fragment length polymorphism (RFLP) pattern with more than five bands or a distinct combination of a common IS6110 RFLP pattern with five or fewer bands and a unique spoligotyping pattern (Table 1). The rationale for including isolates from two geographical regions was to discern the potential impact of genetic variation on future vaccination with Ag85B-ESAT-6 and Ag85B-TB10.4 in separate populations.
Our initial intent was to analyze the same set of clinical isolates (n = 225) used by Hebert and colleagues in their work on PPE18 and pepA [14]. However, after initial sequencing of a randomly selected subset (n = 41) of the 225 isolates revealed no genetic variation in the ESAT-6, TB10.4, and Ag85B genes, we selected only 47 of the remaining 84 isolates that had previously shown variation in the PPE18 gene for the current study. This decision was made for reasons of cost-effectiveness, on the reasoning that local genetic variation would be indicative of broader genomic variability. The 88 study isolates represent 57 different strains from Arkansas, USA, and 31 distinct strains from Turkey. The 88 study isolates shared 49 different spoligotypes that represent 47 different spoligo international types, as determined by using the query tool of the fourth international spoligotyping database (SpolDB4) [16]. The present sample represents 80 of 84 strains (95.2%) that showed PPE18 variation in Hebert's study.

Genetic Diversity of esxA, esxH, and fbpB and Corresponding Amino Acid Sequences.
Among the 88 strains investigated, genetic analysis of esxA and esxH revealed no nucleotide polymorphisms in the genes encoding the ESAT-6 and TB10.4 proteins. Of the 88 strains, 38 (43.2%) belong to principal genetic group 1, 29 (33.0%) belong to principal genetic group 2, and 21 (23.9%) belong to principal genetic group 3. The principal genetic groups were defined by SNPs in the katG and gyrA genes, as described previously by Sreevatsan and colleagues [12]. Unlike the study by Hebert et al., in which genetic group 1 strains were found to have the highest frequency of DNA polymorphisms in the gene encoding PPE18, a component of the Mtb72f vaccine [14], strains in all three principal genetic groups showed no DNA variation in either esxA or esxH. This observation suggests that these gene regions might be conserved among M. tuberculosis strains of different geographic origins and among different genetic groups of the pathogen. In fact, Gey Van Pittius and colleagues have recently posited interspecies conservation of the ESAT-6 gene region as part of a novel Gram-positive secretion system with distant homologues in Bacillus subtilis, Bacillus anthracis, Staphylococcus aureus, and Clostridium acetobutylicum [18].
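The kind of comparison that underlies a "no polymorphism" result can be illustrated with a minimal Python sketch. It assumes each isolate sequence has already been aligned to a reference of equal length; the function name `find_snps` and the toy sequences are hypothetical and are not real esxA, esxH, or fbpB data, and this is not a description of the software used in the study.

```python
from typing import Dict, List, Tuple


def find_snps(reference: str, isolate: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, isolate base) for every mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, iso_base)
        for i, (ref_base, iso_base) in enumerate(zip(reference.upper(), isolate.upper()))
        if ref_base != iso_base
    ]


# Hypothetical toy sequences for illustration only.
reference = "ATGACCGGCTTC"
isolates: Dict[str, str] = {
    "isolate_1": "ATGACCGGCTTC",  # identical to the reference
    "isolate_2": "ATGACCGGATTC",  # one C-to-A substitution
}

snps_by_isolate = {name: find_snps(reference, seq) for name, seq in isolates.items()}
for name, snps in snps_by_isolate.items():
    print(name, snps)
```

A gene would be reported as having no polymorphism in the sample when every isolate yields an empty list.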
The analysis of fbpB, the gene encoding Ag85B, revealed only one synonymous C to A SNP, located at nucleotide position 714 of the gene sequence, in 39 (44.3%) of the 88 strains sequenced. Double-strand sequencing was conducted on the 39 isolates to confirm the existence of this SNP. Although this SNP has no effect on the amino acid sequence of the translated peptide, it is indicative of allelic variation in the M. tuberculosis gene pool. Of the 39 strains, 15 (17.0% of the 88 strains) belong to principal genetic group 1, 11 (12.5%) belong to principal genetic group 2, and 13 (14.8%) belong to principal genetic group 3. Furthermore, the SNP was not found to be associated with any specific geographic origin of the study strains, suggesting that this particular nucleotide polymorphism is of ancestral origin. These data suggest that the M. tuberculosis Ag85B antigen is highly conserved in at least certain populations of M. tuberculosis clinical strains.
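To make the notion of a synonymous SNP concrete, the following Python sketch maps a nucleotide position to its codon and checks whether a substitution changes the encoded amino acid under the standard genetic code. It assumes that position 714 is counted from the first nucleotide of the coding sequence, which would place it at the third base of codon 238. The example codons (`GGC`, `GAC`) are hypothetical illustrations, not the actual fbpB codon, and the helper names are not taken from the study.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_location(nt_position: int) -> Tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


def is_synonymous(codon: str, pos_in_codon: int, new_base: str) -> bool:
    """True if replacing the base at pos_in_codon (1-based) leaves the amino acid unchanged."""
    mutated = codon[: pos_in_codon - 1] + new_base + codon[pos_in_codon:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


# Assumption: position 714 is numbered from the start of the coding sequence.
print(codon_location(714))

# Hypothetical codons: a third-position C-to-A change can be silent or not.
print(is_synonymous("GGC", 3, "A"))  # glycine codon before and after
print(is_synonymous("GAC", 3, "A"))  # aspartate becomes glutamate
```

Whether a given third-position C to A change is silent depends on the codon involved, which is why the classification requires translating the sequence rather than inspecting the position alone.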

Implications for Immunization.
TB remains one of the deadliest infectious diseases of our times. Despite widespread use of the BCG vaccine, the disease continues to claim 2-3 million lives per year. The need for a new vaccine has never been more urgent. To gain insight into the potential efficacy of two new vaccine candidates, the fusion proteins Ag85B-ESAT-6 and Ag85B-TB10.4 [6,9,10], we analyzed the variability of the gene regions encoding the ESAT-6, TB10.4, and Ag85B proteins. Experiments involving these two candidates have shown them to be effective in generating protective immunity in animal models [6,9,10].
Yet, while animal models have played an important role in the development of new TB vaccines so far, they are not always representative of the internal human biological environment. As Flynn noted earlier, an important drawback of the murine model is that the pathology of pulmonary TB in mice is quite different from that in humans [19]. Specifically, the heterogeneity of granuloma types observed in the human host is not displayed in the mouse lung [19]. Similar difficulties arise when using other animal models. In the case of bovine infection, the pathology of TB is quite similar to the human host response in granulomatous reactions, but differs with respect to cavitation [20]. Nonhuman primate models also represent the human pathology of TB quite well but, like cattle models, are limited by economic and infrastructural factors [20].
Furthermore, current preclinical studies of the protection that new TB vaccines confer against M. tuberculosis infection in animal models do not take the population diversity of M. tuberculosis into consideration. However, as Hebert and colleagues noted previously [14], genetic diversity of M. tuberculosis genes can be found among clinical isolates, and such diversity may have important implications for the efficacy of the new vaccines. Thus, comparative genomics of the pathogen population stands as an additional useful tool for pre-clinical evaluation of new vaccines, providing information complementary to that from current in vivo and in vitro studies. Given the resource-demanding nature of clinical trials, comparative genomics serves as a method for predicting the potential protection of proposed vaccine candidates in the general population. Hebert and colleagues' work revealed that the PPE18 protein, part of the Mtb72f subunit vaccine, was quite variable among isolates collected from Turkey and Arkansas [14]. Analyzing the variability of antigens targeted by potential vaccines in a diverse set of isolates at the genomic level may indeed allow researchers to avoid developing a vaccine that, like the current BCG, is only variably effective. The findings of such a study can also inform the rational selection of study populations, with a view to covering the diverse pathogen populations in clinical trials of new vaccines.
Our observation that the ESAT-6, TB10.4, and Ag85B proteins were highly conserved in our study sample, comprising strains from two geographically distant regions and three different principal genetic groups, suggests that the efficacy of the Ag85B-ESAT-6 and Ag85B-TB10.4 subunit vaccines is unlikely to be affected by the genetic diversity of the M. tuberculosis population. Thus, the protective efficacy of these two novel vaccine candidates may have a wider reach than that of the Mtb72f vaccine, which contains a highly variable antigen of M. tuberculosis [14]. However, our findings also indicate the need for further bioinformatics research on the three genes investigated and their specific interaction with the host immune system. While highly conserved genes are indicative of homologous protein antigens, they may also suggest a lack of selective pressure by the host immune system and thus a lack of recognition by the host's immune response. Previous examples of the inability of animal models to accurately represent the human host environment suggest that the degree to which the human host system interacts with these important peptides remains to be studied.
The goal of pre-clinical evaluation of these two vaccines may be furthered by future studies that include a larger sample of isolates from a greater range of geographic origins, making the data more representative of the diversity of M. tuberculosis worldwide. The finding of this study that esxA, esxH, and fbpB were conserved across 88 clinical strains from Arkansas and Turkey does not confirm that they are conserved globally. Including a larger and more genetically diverse sample of isolates would address the two primary limitations of this study: the number and the diversity of the isolates used.
Another factor that must be taken into account is the potential impact that host diversity may have on the global coverage of these two new TB vaccines. This study looked at the diversity of pathogen genes coding for vaccine proteins, but even uniformly conserved proteins may fail to induce protective immunity if host diversity impedes their ability to effectively bind effector immune cells. McNamara and colleagues provide an elegant bioinformatic approach to studying the impact that host diversity may have on Mtb72f vaccine coverage by analyzing the allelic variation of human class II MHC DRB1 proteins and its impact on proper vaccine epitope binding [15]. A similar study on Ag85B-ESAT-6 and Ag85B-TB10.4 may provide insight into host diversity and its impact on the coverage of these vaccines.
With these future directions in mind, the results of the present study represent an important first step in the pre-clinical bioinformatic assessment of the Ag85B-ESAT-6 and Ag85B-TB10.4 vaccine candidates and of the impact that genetic diversity among their respective antigenic protein targets has on their potential success as global vaccines. The finding that the esxA, esxH, and fbpB genes are highly conserved in two distinct populations suggests that the Ag85B-ESAT-6 and Ag85B-TB10.4 vaccine candidates may be effective in geographically distinct areas of the world.