Bacillus Strains Most Closely Related to Bacillus nealsonii Are Not Effectively Circumscribed within the Taxonomic Species Definition

Bacillus strains with >99.7% 16S rRNA gene sequence similarity were characterized with DNA:DNA hybridization, cellular fatty acid (CFA) analysis, and testing of 100 phenotypic traits. When paired with the most closely related type strain, percent DNA:DNA similarities (% S) for six Bacillus strains were all far below the recommended 70% threshold value for species circumscription with Bacillus nealsonii. An apparent genomic group of four Bacillus strain pairings with 94%–70% S was contradicted by the failure of the strains to cluster in CFA- and phenotype-based dendrograms as well as by their differentiation with 9–13 species level discriminators such as nitrate reduction, temperature range, and acid production from carbohydrates. The novel Bacillus strains were monophyletic and very closely related based on 16S rRNA gene sequence. Coherent genomic groups were not however supported by similarly organized phenotypic clusters. Therefore, the strains were not effectively circumscribed within the taxonomic species definition.


Introduction
CBD 118 was one of the first two Bacillus strains not related to the B. cereus group reported to harbor the capsule genes carried on pXO2 by Bacillus anthracis (USF Center for Biological Defense (CBD)) [1,2]. Luna et al. isolated and sequenced the capsule operon (capA, capB, capC, capD, and promoter), repA, capR, acpA, IS1627, ORF43, ORF48, and ORF61 on a large plasmid in CBD 118 [1]. Its status as a carrier of B. anthracis capsule genes spurred research into determining its closest relatives, to aid in circumscribing the reservoir of genes essential for virulence in B. anthracis. When near full length 16S rRNA gene sequences were compared, the most similar type strains to strain CBD 118 were Bacillus circulans ATCC 4513 T (98.9%) and Bacillus nealsonii DSM 15077 T (99.3%). Strain CBD 118 differed from B. circulans ATCC 4513 T and B. nealsonii DSM 15077 T for 10 and 12 of 100 phenotypic traits evaluated, respectively. The percentages of DNA:DNA binding in two pairings each of strain CBD 118 to B. circulans ATCC 4513 T and B. nealsonii DSM 15077 T were 12.5 and 10.2% and 10.8 and 8.3%, respectively. Thus, strain CBD 118 is differentiated by phenotypic and genome-based methods from the only validly named species with greater than 98.7% 16S rRNA gene sequence similarity [3][4][5]. Strain CBD 118 was the sole exemplar of a novel species. Prior to the proposal of novel species, studies of ten or more strains are recommended in order to detail intraspecies diversity and to foster appropriate type strain assignment [6][7][8]. To identify the requisite closely related strains, the V1-V3 hypervariable regions of the 16S rRNA gene [9] from strain CBD 118 were compared to sequences available in GenBank. Eight potential sibling strains were obtained for study. Although the eight strains tested negative for capsule production and for the pXO2 genetic marker by PCR, the group retained taxonomic, if not biodefense, significance.
This work presents the polyphasic taxonomic characterization of these eight strains with respect to CBD 118. Incongruent strain-strain associations within this polyphasic data set illustrate the difficulties in applying a pragmatic, taxonomic, bacterial species definition to groups of strains that do not fall into coherent clusters based on genetic and phenotypic analyses.
Bacterial species are currently defined by pragmatic criteria in a coordinated, polyphasic scheme of 16S rRNA sequence-based phylogeny, indirect whole genome comparisons by DNA:DNA hybridization and analysis of numerous covariant phenotypic characters [3,5,10,11]. Key requisites of the taxonomic species definition can be condensed as follows: (i) a species should be a monophyletic group with a high degree of genetic similarity, (ii) the recommended thresholds of 70% DNA similarity and 5 °C ΔTm are guidelines, not absolute limits for circumscribing new species, (iii) genomic boundaries for a separate species should be defined after analysis of the collective phenotype, (iv) phenotypic intragroup homo- or heterogeneity can only be understood after analysis of as many traits as possible among at least five and preferably more strains, and (v) a bacterial species should not be classified unless it can be recognized by multiple independent methods and possesses a set of determinative phenotypic properties [3,5,11].
Underlying these guidelines are assumptions about the genetic and phenotypic characteristics of bacterial species that may not be equally applicable to all groups of bacteria [12][13][14][15][16]. That is, it is usually assumed that there are clusters of strains, for example, "sequence clusters" [17], "ecotypes" [18], and so forth, distinct from other clusters. Investigators have been encouraged to develop other genomic-based methods to supplement or even supplant DNA:DNA hybridization as the acknowledged standard for delineating genospecies clusters [3,4,6,16,19]. Various methods are increasingly used to define genetic and phenotypic similarity among strains, from multilocus sequence typing (MLST) [20] up to the analysis of whole genomes [13,14]. Ever more precise and detailed descriptions of similarity among strains and between clusters can be obtained by advances in sequencing technology, its application to more isolates and by polyphasic phenotypic analysis of increased numbers of characters. But a more fundamental and less tractable problem is that of the species level circumscription of related bacteria that do not appear to fit readily into sequence clusters and hence within the current taxonomic species definition [14]. Taxonomic species definitions continue to be refined as new techniques become available and new strains are described [3,4,6,16,19]. Our study illustrates complexities that can be encountered as polyphasic methods are applied to greater numbers of strains forming a broader sample of the microbial world.

Preservation and Authentication of Bacillus Strains.
Upon receipt, each strain was subcultured by streaking to tryptic soy agar (TSA) and TSA with 5% sheep red blood cells (TSA-BA) and grown at 30 °C. After 24, 48, and 72 h incubation, plated strains were examined for purity based on the presence of colonies of a single morphotype. A single, well-isolated and representative colony was designated as the progenitor colony and streaked for pure culture reisolation on plates of TSA, TSA-BA, and TSA with 5 mg L−1 MnSO4, incubated at 30 °C for up to 72 h. Characteristic, well-isolated colonies on these plates served as first passage sources of inocula for initial phenotypic characterization as detailed below. Colony morphologies for each strain were observed at 24 and 48 h for consistency with the progenitor colony and were described for standard colony features including color, surface texture and degree of luster, relative transmittance of direct light through the colony, shape, margin configuration, elevation, diameter in mm and hemolysis reaction. Each of the Bacillus strains in this study including the type strains presented one or more differential colony features that were documented and subsequently monitored as evidence of purity and authenticity whenever strains were subcultured. Phenotypic tests and other procedures utilizing broths were routinely subcultured at the incubation end point to TSA-BA check plates. After 24 and 48 h incubation at 30 °C, check plates were reviewed for the presence of colonies of a single, differential morphotype, characteristic of each strain.
The Bacillus strains, including type strains, were inoculated from the progenitor colony to aerated tryptic soy broth (TSB), grown to late log phase, subcultured to a TSA check plate, aseptically harvested by centrifugation, resuspended in TSB with 10% glycerol, aliquoted to multiple cryovials, and subjected to a controlled freeze prior to storage at −85 °C. One week after cryostocking, a cryovial of each strain was thawed, subcultured on TSA and TSA-BA plates, enumerated for viability and again evaluated for the single, differential colony morphotype. Prior to retesting of phenotypic characters and other analyses, strains were subcultured from the cryostocks and endospore production was induced on TSA with 5 mg L−1 MnSO4 plates. Serial transfer of strains was restricted by the use of single, characteristic endospore-producing colonies as inoculation sources for subsequent testing.
International Journal of Microbiology
Preservation of strain authenticity was evaluated at the end of the study. Four strains that formed an apparent genomic cluster were subcultured from cryopreserved stock, retested for seven differential phenotypic traits including colony morphotype and resequenced for the 16S rRNA gene. The resultant sequences were compared to the original sequences deposited in GenBank.

16S rRNA Gene Sequence Analysis.
Amplification of 16S rRNA gene sequences from Bacillus strains, sequencing of the approximately 1500 bp long products, fragment assemblies and alignment of 16S rRNA gene sequences from type strains of selected Bacillus species were performed as previously reported [2]; an additional primer, 534R, was employed in some amplifications. Identification of phylogenetic neighbors was carried out by BLAST 2.2.20+ [25] and megaBLAST (discontinuous option) [26] searches of GenBank [27]. Calculation of pairwise sequence similarity to nearest neighbors used the EzTaxon global alignment algorithm [28]. Alignments of 16S rDNA sequences were also made using the Infernal secondary structure based aligner and SeqMatch scores (S_ab) calculated with RDP10 at the Ribosomal Database Project website [29]. Nucleotide (nt) positions in the hypervariable regions V1-V3 were identified in the 16S sequence of strain CBD 118 by alignment with conserved regions at nt positions 48-70, 346-366, and 490-511 in rrnE of B. subtilis subsp. subtilis strain 168 T (NC_000964.2, Locus tag: BSUr022, GeneID: 2914197, updated 3/2010 to NC_000964.3, Locus tag: BSU_rRNA_30, GeneID: 8303085) [9]. In the E. coli numbering system, regions V1-V3 correspond to nt positions 69-99, 137-242 and 433-497, respectively [30]. Using GenBank bl2seq, pairwise alignments of 461 bp from strain CBD 118 (nt 26-461 corresponding to rrnE nt positions 48-511) were made to 16S rDNA sequences from closely related strains. Presumptive signature sequences (PSS) within the V1-V3 region were identified and compared to all GenBank sequences using BLASTN 2.2.20+ with parameters adjusted for short input sequences [25]. Dendrograms were constructed from approximately 1390 bp or 448 bp using neighbor-joining, maximum parsimony and maximum likelihood algorithms (PHYLIP v. 3.6.80) [31] with 1000 bootstrap replications performed to estimate support for each branch.
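The pairwise similarity values reported for 16S rRNA gene sequences can be illustrated with a minimal sketch of a percent-identity calculation over two sequences that have already been aligned. This is not the EzTaxon algorithm; the gap-handling convention, the function name `percent_identity`, and the two short toy sequences are assumptions introduced only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences carry a gap are
    skipped, and a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 20-column alignment (not real 16S rRNA gene data):
# one gap/base column and one substitution, so 18 of 20 columns match.
seq_x = "ACGTACGTAC-GTACGTACG"
seq_y = "ACGTACGTACAGTACGTTCG"
similarity = percent_identity(seq_x, seq_y)
```

Different tools treat gaps and terminal overhangs differently, so values from a sketch like this would not be expected to match published similarity percentages exactly.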

DNA:DNA Hybridization Studies.
Strains were subcultured from cryopreserved stock, grown to late log phase in aerated TSB, harvested by centrifugation and provided to the Deutsche Sammlung von Mikroorganismen und Zellkulturen (GmbH) (DSMZ) as ≥3 g wet weight biomass preserved in 50:50 sterile dI H2O:2-propanol. Prior to harvesting, each broth culture was screened for a single characteristic morphology in wet mounts using phase contrast microscopy at ×1000 under oil and subcultured on TSA-BA check plates grown at 30 °C. After 24 and 48 h incubation, check plates corresponding to the preserved biomass for each strain were reviewed for the presence of colonies of a single morphotype, consistent with that previously determined to be characteristic of and differential for the strain. Biomass was shipped only after no apparent evidence of contamination or mislabeling of strains was detected.
DNA:DNA hybridizations were performed by the Identification Service of the DSMZ. Cells of preserved biomass were disrupted in a French pressure cell and the DNA purified by chromatography on hydroxyapatite. DNA:DNA hybridization was carried out at 65 °C using a model Cary 100 Bio UV/VIS spectrophotometer equipped with a Peltier-thermostatted 6×6 multicell changer and a temperature controller with in situ temperature probe (Varian) [32,33].

Cellular Fatty Acid Analysis.
Cellular fatty acid (CFA) composition of Bacillus strains was determined by Microbial ID, Inc. (MIDI) at both study start and end point. Each strain was subcultured from cryopreserved stock, inoculated from a single, characteristic colony to a TSA slant and grown at 30 °C for 48 h to foster endospore formation. Prior to shipment, each slant was subcultured on a TSA-BA check plate grown at 30 °C and observed after 24 and 48 h incubation. The check plate for each strain was reviewed for colonies of a single, differential morphotype and the slant shipped to MIDI only after no evidence of contamination or mislabeling of strains was discerned.
Strains were grown under standardized conditions on tryptic soy broth agar quadrant streak plates at 28 °C for 24 h. To reduce disparities in the effective physiological age of the cells, biomass was harvested from colonies growing in the third streaked quadrant. Fatty acid methyl esters were extracted by a four-step procedure of saponification, methylation, extraction and sample clean-up. Fatty acid peaks were analyzed by gas chromatography and named by comparing retention times to those in a known mixture. A dendrogram program used a multivariate clustering algorithm to produce unweighted pair matching based on similar CFA content between strains and generated a tree scaled to Euclidian distance (ED).
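The Euclidian distance on which the CFA dendrogram is scaled can be sketched as follows. This is a minimal illustration, not the MIDI software's procedure: the function name `cfa_distance`, the treatment of an unreported fatty acid as 0%, and the profile values are all assumptions made for the example.

```python
import math


def cfa_distance(profile_a: dict, profile_b: dict) -> float:
    """Euclidean distance between two fatty acid profiles.

    Each profile maps a fatty acid name to its percentage of the total.
    Assumption: a fatty acid absent from one profile counts as 0%.
    """
    acids = set(profile_a) | set(profile_b)
    squared = sum(
        (profile_a.get(acid, 0.0) - profile_b.get(acid, 0.0)) ** 2
        for acid in acids
    )
    return math.sqrt(squared)


# Hypothetical profiles (percent of total CFA); not data from this study.
strain_1 = {"C14:0": 5.0, "C15:0 anteiso": 40.0, "C15:0 iso": 30.0, "C16:0": 10.0}
strain_2 = {"C14:0": 2.0, "C15:0 anteiso": 44.0, "C15:0 iso": 30.0, "C16:0": 10.0}
strain_3 = {"C14:0": 5.0, "C15:0 anteiso": 40.0, "C15:0 iso": 30.0}

distance_12 = cfa_distance(strain_1, strain_2)  # differences of 3 and 4
distance_13 = cfa_distance(strain_1, strain_3)  # C16:0 missing in strain_3
```

Because every fatty acid contributes its squared difference, a single abundant acid that differs strongly between two strains can dominate the distance, which is one reason small shifts in major CFAs can change linkage levels in a dendrogram.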

Phenotypic Characterization.
All tests were incubated at 30 °C unless otherwise noted [34]; incubation periods are specified. Differential tests were performed at minimum twice or as specified; prior to re-testing, strains were subcultured from cryostocks held at −85 °C in TSB, 10% glycerol. Control strains of Bacillus and Paenibacillus included B. cereus ATCC 14579 T, B. circulans ATCC 4513 T, B. megaterium ATCC 14581 T, B. nealsonii DSM 15077 T, B. pumilus ATCC 7061 T, B. thuringiensis ATCC 10792 T, and P. polymyxa ATCC 43865 T. Sporulation was induced on TSA with 5 mg L−1 MnSO4, grown for 40-48 h. Hemolysis reaction was determined on TSA with 5% sheep red blood cells (REMEL), grown for 48 h. Pigment production and mean colony diameter were evaluated on TSA, tryptone blood agar base and tryptone glucose yeast extract plates, 24, 48, 72 h and 1 week. Motility was determined by either stab inoculation of motility test medium (REMEL), observed at 24 and 48 h, or phase contrast observation of wet mounts made with aerated cells grown in TSB to log phase, 3 to 6 h. Cell morphology, endospore characterization and swelling of the sporangium, presence of parasporal bodies and motility were observed in wet mounts using phase contrast microscopy at ×1000 under oil. Anaerobic growth was evaluated after 1 week in the Mitsubishi Pack-Anaero anaerobic gas generating system with the following pre-reduced media: fluid thioglycollate medium with dextrose and indicator (REMEL), tryptone glucose yeast extract agar plates, and anaerobic agar [34], inoculated in the molten state. Oxidase reaction was tested with Kovács' phenylenediamine redox dye reagent (Becton, Dickinson). Growth of cells at defined temperatures was tested in 3 mL of TSB in 13 × 100 mm tubes for 48 h in water baths set to 30, 35, 40, 45, 50, 55, and 60 ± 1 °C and examined for turbidity at 24 h intervals.
Growth of cells at defined pH was tested in the same manner in TSB adjusted to pH 4.6, 5.6, 6.1, 6.5, 6.8, 7.3, 7.8, 8.1, and 8.5. Salt tolerance was tested on nutrient agar plates supplemented with 0, 1, 3, 7 and 10% NaCl, incubated for 5 days. Physiological tests performed on cells grown in commercial media (REMEL) included casein and starch hydrolysis, incubated 14 days; growth on mannitol egg yolk polymyxin agar, incubated 48 h; growth in methyl red Voges-Proskauer (MRVP) broth for final pH and VP reaction, tested at 3, 5, and 7 days; nitrate reduction tested in nitrate broth at 3, 7, and 14 days and on nitrate agar slants, at 3 and 5 days; and growth on Sabouraud's 4% glucose agar, pH 5.6, incubated 72 h. Gelatin hydrolysis was tested in 12% nutrient gelatin (REMEL) for 2 weeks. Hydrolysis of Tween-80 was tested on plates of a peptone-based medium [35], incubated for 4 weeks. Acid production from 49 carbohydrates or carbohydrate derivatives was tested using the API 50 CH panel and API CHB/E medium with mineral oil overlay, ≥4 test panels per strain, in combination with eleven biochemical tests from the API 20 E kit, ≥2 test panels per strain, incubated for 48 and 24 h, respectively (bioMérieux). Acid production was read at 24 and 48 h in a semiquantitative way, where 0 was assigned to negative reactions of the same alkaline red as the no-carbohydrate control and 5 assigned to yellow indicator shifts of maximum intensity. Values of 1, 2, 3, or 4 were given to intermediate reactions with 3, 4, and 5 being considered positive. Differential phenotypic traits between paired strains were enumerated. Each differential character state was assigned a numerical value (that is, 1 = negative, 2 = variable, 3 = positive) and subjected to hierarchical cluster analysis (SPSS for Windows, Release 15.0.1.1, 2007). A dendrogram was generated using average linkage between groups, scaled in Euclidian distance units.
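The scoring and clustering steps described above can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not a reproduction of the SPSS procedure: the helper names, the "+", "v", "-" shorthand for character states, and the three toy strains are hypothetical, while the 0-5 acid score cutoff, the 1/2/3 coding, and average linkage on Euclidian distance follow the text.

```python
import math
from itertools import combinations

# Coding from the text: 1 = negative, 2 = variable, 3 = positive.
STATE_VALUE = {"-": 1, "v": 2, "+": 3}


def acid_positive(score: int) -> bool:
    """Semiquantitative acid score 0-5; scores of 3, 4, and 5 are positive."""
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    return score >= 3


def count_differences(states_a: str, states_b: str) -> int:
    """Number of traits whose character state differs between two strains."""
    return sum(1 for a, b in zip(states_a, states_b) if a != b)


def euclid(u, v) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def average_linkage(vectors: dict) -> list:
    """Agglomerative clustering, average linkage between groups.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where distance is the mean of all between-cluster pairwise distances.
    """
    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclid(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    clusters = [(name,) for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((left, right, dist(left, right)))
        clusters = [c for c in clusters if c not in (left, right)] + [left + right]
    return merges


# Hypothetical character states for three toy strains (four traits each).
traits = {"A": "+-+v", "B": "+-+-", "C": "-+-+"}
coded = {name: [STATE_VALUE[s] for s in states] for name, states in traits.items()}
merges = average_linkage(coded)
```

In this toy set, strains A and B differ in one trait and merge first, and strain C joins at the mean of its distances to A and B. Counting differences and clustering on coded states answer related but distinct questions, which is why the text reports both a difference matrix and a dendrogram.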

Presumptive Signature Sequences.
Hypervariable regions V1-V3 in the Bacillus 16S rRNA gene sequence had been reported to be discriminatory for most Bacillus species [9]. The eight Bacillus strains were identified as potential sibling strains when 461 bp spanning V1-V3 hypervariable regions of the 16S rRNA gene sequence of strain CBD 118 were compared to GenBank database sequences. Presumptive signature sequences (PSS) were identified in V1 at nt positions 71-92 (PSS 1 A and PSS 1 B) and in V2 at nt positions 183-223 (PSS 2) (Table 1).

Phylogenetic Analysis.
In a neighbor-joining (N-J) tree (Figure 1) based on approximately 1390 bp of 16S rRNA sequence, strains most closely related to B. circulans and B. nealsonii were divided into two well-supported sister clades. Strain CBD 118 and Bacillus strains with ≥99.7% sequence similarity to CBD 118 formed a complex clade with B. nealsonii DSM 15077 T. In the subtree (Figure 1), the Bacillus strains grouped according to PSS 2 type, but without strong bootstrap support. The B. circulans clade included B. circulans ATCC 4513 T, one strain identified as B. circulans and two Bacillus spp. with ≥99.5% similarity to the type strain. The attribution of species-level identity based solely on 16S rRNA gene sequence similarity is known to be unreliable [6,7] especially among Bacillus [36]. Keswani and Whitman [36] studied the relationship of 16S rRNA sequence similarity (S) to DNA:DNA hybridization (D). Among 40 Bacillus

Table 1: Presumptive signature sequences (PSS) conserved in 16S rRNA gene at nt positions 71-92 a,b (PSS 1) and 183-223 a (PSS 2) in respective V1 and V2 hypervariable regions of Bacillus nealsonii-related strains. a Numbering based on rrnE of Bacillus subtilis subsp. subtilis strain 168 T (NC_000964.3); b gap at nt position 80 in PSS 1 sequence alignments to rrnE is reflected in the numbering.

Table 1 column headings: PSS type; Bacillus strain no.; GenBank accession no.; nucleotides of PSS.

Given the estimated 10% reproducibility of % S values (DSMZ), DNA:DNA pairings that have ≥80% S should meet or exceed the recommended 70% threshold to delineate taxa at the species level. Three of 21 strain pairings (P308 with C4T1F3B3; OSS 25 with C4T1F3B3; P308 with IAFILS6) tested at ≥80% S; therefore, these four strains appeared to represent a coherent genomic cluster. A diagram (Figure 2), in which the two measurements per pairing were averaged, illustrates varying degrees of genomic coherency among all 7 strains. Averaged % S of 93.5% strongly supports DNA relatedness between strains P308 (PSS 2 b) and C4T1F3B3 (PSS 2 b); % S of 83.4% also supports relatedness between P308 (PSS 2 b) and IAFILS6 (PSS 2 c). But relatedness between C4T1F3B3 (PSS 2 b) and IAFILS6 (PSS 2 c) is not similarly well supported at 70% S; thus the degree of relatedness of each strain to P308 was not reproduced in relation to each other. Also, within this apparent genomic cluster, averaged % S was 87.1% between OSS 25 (PSS 2 a) and C4T1F3B3 (PSS 2 b), 78.2% between OSS 25 (PSS 2 a) and P308 (PSS 2 b), but was 70.25% between OSS 25 (PSS 2 a) and IAFILS6 (PSS 2 c). A genomic cluster based on these four strains incorporates an ∼24% range for % S and values for two pairings that may, given ∼10% reproducibility, lie in the transitional range for species circumscription. Some strains of a species may show less than 70% S with the type strain or other strains of the same species, thus internal heterogeneity within genomic groupings and species is permitted [6,7,11,16]. However, studies of the average nucleotide identity (ANI) of all conserved genes between any two genomes [13,14,39] support adoption of a higher rather than a relaxed threshold for species circumscription.
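The arithmetic behind this reasoning can be laid out in a short sketch. The averaged % S values are those quoted in the text for the four-strain cluster; the variable names and the choice to apply the ∼10% reproducibility margin to the averaged values (rather than to individual measurements) are assumptions made for illustration.

```python
from math import comb

THRESHOLD = 70.0        # recommended % S threshold for species delineation
REPRODUCIBILITY = 10.0  # estimated reproducibility of % S values (DSMZ)

# Averaged % S values quoted in the text for the four-strain cluster.
averaged_s = {
    ("P308", "C4T1F3B3"): 93.5,
    ("P308", "IAFILS6"): 83.4,
    ("C4T1F3B3", "IAFILS6"): 70.0,
    ("OSS 25", "C4T1F3B3"): 87.1,
    ("OSS 25", "P308"): 78.2,
    ("OSS 25", "IAFILS6"): 70.25,
}

# A pairing is treated here as robustly above threshold only if it would
# still reach 70% after subtracting the reproducibility margin (>= 80%).
robust = sorted(
    pair for pair, s in averaged_s.items() if s - REPRODUCIBILITY >= THRESHOLD
)

# Spread of % S within the apparent genomic cluster.
spread = max(averaged_s.values()) - min(averaged_s.values())

# Number of distinct pairings among 7 strains.
n_pairings = comb(7, 2)
```

Under these assumptions, the three pairings at or above 80% are the same three named in the text, the spread is 23.5 percentage points (the ∼24% range), and seven strains yield the 21 pairings mentioned.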
The 70% threshold for species delineation based on DNA:DNA pairings corresponds to 95% ANI and 85% or 79% conserved protein coding genes between a pair of strains [39], thus substantial phenotypic differences were possible among two or more of these four strains.

Table 3. Consistent with Bacillus [34], the major cellular fatty acids (CFA) measured in the strains were C14:0, C15:0 anteiso, C15:0 iso and C16:0. Profiles from a second CFA analysis performed at the end of study (not shown) were consistent with those in Table 3. The second data set deviated in the absence of 1-7 very low % CFAs (most <0.5%; three <1.5%) from the profiles of each of the strains, suggesting differences between the two testing events in the effective physiological age of the strains. The ability to reproduce profiles for a single strain is dependent on standardized conditions for growth medium, incubation time and temperature, and effective physiological age of the cells ([34], MIDI technical literature). In the second analysis, slight changes in values of major and other CFAs for ≥9 strains followed a parallel pattern of elevation (C15:0 iso, C17:0 iso) or reduction (C14:0, C16:1 ω11c, C16:0), also suggesting differences in effective physiological age for those strains relative to the first testing event. However, salient differential CFAs were reproduced in the profiles of both data sets, supporting the authenticity of the strain set at both time points in our study. C15:0 anteiso was 60% in both profiles for strain OSS 25. Strain P307 was twice distinguished by the summed feature C17:1 anteiso B/Iso I and C19:0 anteiso. In both data sets, PSS 2 c strains IAFILS6, AD5A, U4A, and ADP4II were differentiated from all other strains by C17:1 iso ω10c, the summed feature C17:1 anteiso B/Iso I and C19:0 anteiso.
CFA profiles are known to vary widely in many named Bacillus spp., thus circumscribing species based on CFA content is usually possible only in cases of genomically-homogeneous strains [34]. Therefore, we considered only the linkage of nearest neighbors without attribution of taxonomic level. In a dendrogram scaled to Euclidian distance (ED) and based on the initial data set (Figure 3), the three strains sharing PSS 2 b (P307, P308, and C4T1F3B3) clustered together at near 6 ED. Strain OSS 25 (PSS 2 a) was distantly linked at near 20 ED to the other Bacillus spp. including CBD118 (PSS 2 a). Among PSS 2 c strains, ADP4II linked to U4A at ≤3 ED and IAFILS6 linked to AD5A at 7.5 ED but the linkage between the two pairs was at ≤13 ED. In the dendrogram (not shown) based on the end of study data set, small cumulative differences in individual fatty acid percentages relative to those in the initial data set resulted in changes in the level of ED linkage among strains. However, the profile similarities between strains P308 and C4T1F3B3, the 60% of C15:0 anteiso that distinguished strain OSS 25, and the differentiation of IAFILS6 by C17:1 iso ω10c, summed feature C17:1 anteiso B/Iso I and C19:0 anteiso were all reproduced in both data sets. In both dendrograms, strains P308 and C4T1F3B3 were clustered at ≤6 ED, IAFILS6 clustered with AD5A at ≥7.5 ED, and OSS 25 was isolated at ≥16 ED. The genome-based cluster of four strains (P308, C4T1F3B3, OSS 25, and IAFILS6) with DNA:DNA pairings of 94-70% S was not reproduced in the CFA-based dendrogram from either data set.

Differential Phenotypic Characterization.
Twenty-five of 100 phenotypic traits differentiate among the nine Bacillus and the two most closely related type strains (Table 4). Type strains B. circulans ATCC 4513 T and B. nealsonii DSM 15077 T are distinguished from the other strains by lack of acetoin production and by acid from 2-ketogluconate.
The numbers of characters that separate each pair of Bacillus strains were compiled in a matrix (Table 5) and presented in a dendrogram (Figure 4). The only consistency between the CFA-based (Figure 3) and phenotype-based dendrogram was the close linkage of two PSS 2 c strains, U4A and ADP4II, one of only two instances in which strains of the same PSS 2 type were directly linked in the phenotype-based dendrogram. Only 3 of 36 phenotypic pairings resulted in ≤5 character differences while 19 strain pairs had ≥10 differences (highest number = 13). As a group, strains IAFILS6, AD5A, U4A and ADP4II sharing PSS 2 c formed the most coherent group with 3-8 character differences. But in the phenotype dendrogram, closely paired strains U4A and ADP4II clustered with P307 (PSS 2 b), while AD5A links with OSS 25 (PSS 2 a) and C4T1F3B3 (PSS 2 b). Strains U4A and ADP4II differed by one CFA (Table 3) and 3 phenotypic characters, and may represent strain variants of a novel species. Only 6 or 5 characters differentiated P307 (PSS 2 b) from U4A and ADP4II, respectively. DNA:DNA pairing data on these three strains are not, however, available for comparison. The four strains (P308, C4T1F3B3, OSS 25, and IAFILS6) that comprised the genomic-based cluster were differentiated by 9-13 characters, including nitrate reduction, temperature range, and acid production from carbohydrates.

Incongruence of Character Sets and Application of a Bacterial Species Definition.
The taxonomic species definition mandates that a species be a monophyletic group with a high degree of genomic similarity that also shares a high order of similarity in many independent phenotypic features [3,5,10,11]. The eight strains collected for comparison to strain CBD 118 are monophyletic (Figure 1) when considering 16S rRNA gene similarity.
While recognizing that 16S rRNA sequence lacks resolving power at the level of bacterial species [6,7,11,36], we hypothesized that the PSS types might yet function as exclusionary thresholds, for example, species that shared a PSS 2 type might or might not be the same species, but strains with different PSS 2 types would not be the same species. While the highest degree of DNA relatedness (93.5% S) was between PSS 2 b strains P308 and C4T1F3B3 (Figure 2), the hypothesis of an exclusionary threshold was contradicted by ≥70% S between strains of different PSS 2 types. However, we suggest that the PSS 2 types remain effective tools to search 16S rRNA sequence databases for more strains of >99.7% similarity.
No strain tested at greater than 70%-50% S in pairings to CBD 118 (Table 2) and the three strains most closely related to CBD 118 (PSS 2 a) based on % S, namely P308 (PSS 2 b), C4T1F3B3 (PSS 2 b), and IAFILS6 (PSS 2 c), can be differentiated by 8, 11 and 12 characters, respectively (Table 5). Strain OSS 25, most closely related to CBD 118 based on 16S rRNA sequence similarity and PSS 2 a, differs by 9 phenotypic characters as well as having only transitional range % S to CBD 118 and distant linkage based on CFA. It is recommended for Bacillus and related genera [6] that the 70% S threshold for species delineation not stand alone in delimiting species but should be supported by other characteristics that differentiate strains of the proposed species from other species. In the application of the taxonomic species definition, phenotype continues to have a salient role in the determination of break-points in genomic data for species circumscription and no single parameter, whether genomic properties or phenotypic traits, should be given undue prominence [3,6,37]. The classification that results from application of the taxonomic species definition should be predictive, establishing determinative properties and therefore cannot be based only on genomic characters [3,7,11]. The 70% threshold could be interpreted flexibly [6,7,11,16,37] and a more relaxed boundary used to circumscribe a genomic grouping of these four strains with CBD 118. The resultant grouping would, however, lack sufficient phenotypic cohesion to be of predictive value and therefore does not justify circumscription as a taxonomic species.
In polyphasic taxonomic studies when the strains and phenotypic characters tested were both sufficiently numerous, the resultant clustering pattern has generally reproduced the genomic grouping [7]. In this instance, the four strains with highest % S to support species circumscription are differentiated by multiple phenotypic, species level discriminators (Tables 2, 3, 5). Strain OSS 25 (PSS 2 a) paired with P308 (PSS 2 b) and with C4T1F3B3 (PSS 2 b) at 78% and 87% S (Figure 2), but was differentiated from them by 11 and 9 characters respectively, as well as by significant differences in CFA profiles. In the phenotype dendrogram (Figure 4), OSS 25 was linked most closely with C4T1F3B3 but not with P308. Strains P308 (PSS 2 b) and IAFILS6 (PSS 2 c) share 83% S but are differentiated by CFA profiles and 12 traits. Strains P308 (PSS 2 b) and C4T1F3B3 (PSS 2 b) share the strongest DNA relatedness with 94% S and were closely linked based on CFA profiles but can be differentiated by 13 phenotypic characters and failure to be linked in the phenotype dendrogram. At the end of the study, these four strains were subcultured from cryopreserved stock, retested for six differential phenotypic traits and resequenced for the 16S rRNA gene. The resultant sequence for each strain was subjected to BLAST analysis and in each case resulted in a 100% match to the region of overlap with the ∼1500 bp previously accessioned into GenBank for the strain. For each strain, the re-evaluation of six phenotypic characters (degree of endospore-driven swelling, colony diameter, hemolysis reaction, growth at 45 °C, growth with 7% NaCl, and nitrate reduction) reproduced the results of previous testing shown in Table 4. These results indicate that the authenticity of these strains was maintained through the course of the study.
The internal diversity of strains P308, C4T1F3B3, OSS 25 and IAFILS6 confounds delineation in a phenotypically coherent unit and their circumscription as one species accommodating multiple biovars or ecovars does not, in our minds, support a predictive taxonomy. No common ecological or disease state can be cited to justify the nomination of a pragmatic species epithet for these strains. The designation of genomovars, as originally proposed by Ursing et al. [38], applies to two or more genomic strain clusters within a phenotypically coherent named species that cannot be phenotypically delimited from other strains of the nomenospecies. With these four strains, the converse is the case-one apparent genomic group of strains with four differential phenotypes. It is possible that each of these four strains is the sole exemplar of a novel species and that cohesive phenotypic clusters await the isolation and robust polyphasic characterization of more sibling strains. On the other hand, MLSA [20] on these and more sibling isolates could support the description of one or more species with a high degree of intraspecies diversity-thereupon, a species description could be justified. At this point, rather than being reinforced by coherent phenotypic clustering, potentially coherent genomic clusters among strains are contradicted by interstrain variability and are not therefore effectively circumscribed within the taxonomic species definition.
Difficulties in applying the taxonomic species definition are not new (see the taxonomic histories of Pseudomonas stutzeri [38] and Acinetobacter [40], to cite just two), whereas these nine Bacillus strains are demonstrably novel and their degrees of relatedness appear to confound the taxonomic species definition. Polyphasic data did not clarify relationships and illuminate coherent clusters among these strains; instead, potentially "transitional" forms were revealed. While acknowledging the current insufficiency of our data set, these strains are reminiscent of Model 9 of Istock et al. [12], "Highly variable partially recombining nonspecies", in which clusters of strains may be discerned, but transitional strains erase any clear demarcation between clusters. Likewise, these strains may be an example of the "continuum of diversity" suggested to characterize groups in which forces promoting coherence dominate those promoting divergence of populations [13]. More data is required to clarify relationships among these strains, particularly sampling more strains in order to determine the range of variation and whether or not discrete phenotypic clusters exist. Indeed, it is hoped that researchers holding closely related strains recognizable by 100% identity to the PSS 2 types will join in collaborating with labs having expertise in recommended methods of Bacillus identification [6] to characterize an expanded number of strains. To this end, the nine Bacillus strains in this study have been deposited in a publicly accessible culture collection.