Comparative Genomic Analysis of Lactococcus garvieae Strains Isolated from Different Sources Reveals Candidate Virulence Genes

Lactococcus garvieae is a major pathogen of fish. Two complete (ATCC 49156 and Lg2) and three draft (UNIUD074, 8831, and 21881) genome sequences of L. garvieae have recently been released. Here we present the results of a comparative genomic analysis of these fish and human isolates of L. garvieae. The pangenome comprised 1,542 core and 1,378 dispensable genes. The sequenced L. garvieae strains shared most of the possible virulence genes, but the capsule gene cluster was found only in the fish-pathogenic strain Lg2. The absence of the capsule gene cluster in other nonpathogenic strains isolated from mastitis and vegetables was also confirmed by PCR. The fish and human isolates of L. garvieae contained two and four specific adhesin genes, respectively, suggesting that these adhesion proteins may be involved in the host specificity differences of L. garvieae. The findings of this pangenomic analysis may provide significant insights into the biology of L. garvieae.

L. lactis is the most studied lactococcal species as a "generally recognized as safe (GRAS)" species, and the genomes of six L. lactis strains (IL1403, KF147, MG1363, NZ9000, SK11, and CV56) have been fully sequenced to date. In contrast, the genomic characteristics of L. garvieae had not been described until we recently determined the complete genome sequences of two strains of this fish pathogen, the virulent strain L. garvieae Lg2 and the nonvirulent strain L. garvieae ATCC 49156 [19]. The two strains shared a high degree of sequence identity, but Lg2 had a 16.5-kb capsule gene cluster that is absent in ATCC 49156. Eight genes in the capsule gene cluster were also conserved in several L. lactis strains and in the human microbiomes [19]. At approximately the same time, draft genome sequences of three other L. garvieae strains (UNIUD074, 8831, and 21881) were also published [20][21][22]. L. garvieae Lg2, UNIUD074, and 8831 were isolated from diseased fish, whereas L. garvieae 21881 was isolated from human blood. In the present study, we compared the genomic organization of the sequenced L. garvieae strains and discuss the genes that may be involved in the host specificity of L. garvieae.

2.1. Informatics.
Gene annotation was carried out for each of the draft genome sequences of L. garvieae strains UNIUD074, 8831, and 21881 (Table 1). An initial set of predicted protein-coding genes was identified using Glimmer 3.0 [23]. Genes consisting of <120 bp and those containing overlaps were eliminated. All predicted proteins were searched against a nonredundant protein database (nr, NCBI) using BLASTP with a bit-score cutoff of 60. Protein domains were identified by HMMER. Orthology across whole genomes was determined using BLASTP reciprocal best hits in all-against-all comparisons of amino acid sequences.
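As an illustration of the ortholog step described above, the following is a minimal Python sketch of reciprocal-best-hit pairing; it is not the authors' pipeline. It assumes that the BLASTP results have already been parsed into (query, subject, bit score) tuples. The function names, the toy gene identifiers, and the reuse of the bit-score cutoff of 60 for the all-against-all comparison are assumptions made for this example.

```python
MIN_GENE_LENGTH_BP = 120   # genes shorter than this are eliminated (from the text)
MIN_BIT_SCORE = 60.0       # cutoff stated for the nr search; reused here as an assumption


def keep_gene(length_bp, min_length=MIN_GENE_LENGTH_BP):
    """Return True if a predicted gene passes the minimum-length filter."""
    return length_bp >= min_length


def best_hits(rows, min_bit_score=MIN_BIT_SCORE):
    """Map each query to its highest-scoring subject.

    rows: iterable of (query_id, subject_id, bit_score) tuples.
    Hits below min_bit_score are ignored; on ties the first hit seen is kept.
    """
    best = {}
    for query, subject, score in rows:
        if score < min_bit_score:
            continue
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, min_bit_score=MIN_BIT_SCORE):
    """Return sorted (a_gene, b_gene) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b, min_bit_score)
    reverse = best_hits(b_vs_a, min_bit_score)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Hypothetical hits between two genomes A and B (invented identifiers).
    a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
              ("a2", "b2", 90.0), ("a3", "b1", 80.0), ("a4", "b4", 50.0)]
    b_vs_a = [("b1", "a1", 198.0), ("b2", "a2", 91.0),
              ("b2", "a1", 85.0), ("b4", "a4", 55.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this sketch a gene with a one-directional best hit (such as the hypothetical a3) is not paired, which is the property that makes reciprocal best hits a conservative orthology criterion.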

2.2. Bacteria.
L. garvieae ATCC 43921 (type strain) and ATCC 49156 were obtained from the American Type Culture Collection (ATCC). ATCC 43921 and ATCC 49156 were isolated from mastitis and diseased yellowtail, respectively. L. garvieae Lg2 was isolated in 2002 from yellowtail [24]. L. garvieae NRIC0607 and NRIC0611 were obtained from radish sprout and broccoli sprout, respectively [15]. These strains were cultured in Todd-Hewitt Broth (Becton, Dickinson and Company) for 20 h at 25 °C, suspended in 10% skimmed milk (Becton, Dickinson and Company) solution, and then stored at -80 °C until use.

Results
Genes were predicted using our own annotation pipeline in this study because the draft genome data of three L. garvieae strains (UNIUD074, 8831, and 21881) contain only contig sequences. The general features of the genomes of L. garvieae strains (ATCC 49156, Lg2, UNIUD074, 8831, and 21881) are summarized in Table 1. The genus Lactococcus is included within the family Streptococcaceae. We constructed a phylogenetic tree from concatenated sequences (Supplementary Figure 1).
To identify orthologs shared by L. garvieae strains, we produced a four-way Venn diagram (Figure 1). The pangenome consisted of 2,920 protein-coding genes with a core of 1,542 genes (53%). A total of 935 genes were specific to the Lg2 (204), UNIUD074 (312), 8831 (118), and 21881 (301) genomes. The majority of the unique genes in all strains were annotated as coding for (conserved) hypothetical proteins. The 8831 and 21881 genomes shared the most genes (1,730), in agreement with the phylogenetic tree obtained from concatenated sequences (Supplementary Figure 1). Of the 1,542 core genes, 1,130 (73%) were also conserved among the six sequenced L. lactis genomes, suggesting that these genes may constitute the core genome of lactococci, likely inherited from their common ancestor.
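The partition behind such a Venn diagram can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the analysis code used in the study: each strain is represented by a set of ortholog-cluster identifiers, and the identifiers in the toy example (c1, c2, ...) are invented. The final lines only re-check the arithmetic of the counts reported in the text.

```python
def partition_pangenome(presence):
    """Split ortholog clusters into pangenome, core, and strain-specific sets.

    presence: dict mapping strain name -> set of ortholog-cluster IDs.
    Returns (pangenome, core, unique), where unique maps each strain to the
    clusters found in that strain only.
    """
    strains = list(presence)
    pangenome = set().union(*presence.values())
    core = set.intersection(*(set(clusters) for clusters in presence.values()))
    unique = {}
    for strain in strains:
        others = set().union(*(presence[o] for o in strains if o != strain))
        unique[strain] = set(presence[strain]) - others
    return pangenome, core, unique


# Toy input with invented cluster IDs (the real gene sets are not reproduced here).
toy = {
    "Lg2": {"c1", "c2", "c3", "c9"},
    "UNIUD074": {"c1", "c2", "c4"},
    "8831": {"c1", "c2", "c3", "c5"},
    "21881": {"c1", "c2", "c5", "c6"},
}
pan, core, unique = partition_pangenome(toy)

# Consistency check of the counts reported in the text.
reported_unique = {"Lg2": 204, "UNIUD074": 312, "8831": 118, "21881": 301}
total_unique = sum(reported_unique.values())      # strain-specific genes
dispensable = 2920 - 1542                         # pangenome minus core
core_percent = round(100 * 1542 / 2920)           # share of core genes
```

The dispensable genome is simply the pangenome minus the core, so the strain-specific genes are a subset of it; the remainder are genes shared by two or three of the four strains.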
The comparative analysis of the genomes of L. garvieae Lg2 and ATCC 49156 revealed that Lg2 had a 16.5-kb capsule gene cluster that was absent in ATCC 49156 [19]. The 204 Lg2-specific genes included all 15 genes encoded in the capsule gene cluster, showing that UNIUD074, 8831, and 21881 also lack the capsule gene cluster (Table 2). This finding is consistent with the scanning electron microscopy observation that L. garvieae 8831 is a noncapsulated strain (Alicia Gibello, personal communication). Comparison of this genetic locus with the corresponding locus in the sequenced L. garvieae genomes revealed that the capsule gene cluster was apparently inserted into a locus that is syntenic among the sequenced L. garvieae strains (Figure 2).
To evaluate the relationship between virulence and the presence of the capsule gene cluster, we next investigated whether the nonpathogenic L. garvieae strains ATCC 43921, NRIC0607, and NRIC0611 have the capsule gene cluster. The PCR product of ATCC 49156 was confirmed by agarose gel electrophoresis to be approximately 750 bp, the size shown in Figure 3 (data not shown). No capsule was observed in L. garvieae ATCC 43921, NRIC0607, and NRIC0611, isolated from mastitis, radish sprouts, and broccoli sprouts, respectively [15,24], and they are not pathogenic to fish. The PCR products of ATCC 43921, NRIC0607, and NRIC0611 were approximately 400 bp or less, and no capsule gene cluster was detected in this region in any of the three strains (Figure 4). The capsule-specific primers used in Figure 4 were designed on the basis of the sequences conserved between Lg2 and ATCC 49156, and the binding sites of these primers were found in the UNIUD074, 8831, and 21881 genomes with only one- or two-base mismatches (Figure 3).
Aside from the capsule gene cluster, we have reported other possible virulence genes in the Lg2 genome [19]. Most of these possible virulence genes, such as hemolysin, were also conserved in UNIUD074, 8831, and 21881 (Table 2).
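The primer-site check mentioned in the PCR analysis above (binding sites located with only one or two mismatches) can be illustrated with a short Python sketch. It is a hedged example, not the procedure used in the study: it scans one strand by Hamming distance, ignores insertions and deletions, and uses invented toy sequences rather than the actual primers or genomes. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_primer_sites(template, primer, max_mismatches=2):
    """Return (position, mismatches) for every window of the template that
    matches the primer with at most max_mismatches substitutions."""
    template, primer = template.upper(), primer.upper()
    hits = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


def amplicon_length(template, forward, reverse, max_mismatches=2):
    """Predict the shortest product length for a primer pair, or None.

    The reverse primer is searched as its reverse complement on the same
    strand, downstream of the forward primer site.
    """
    forward_sites = find_primer_sites(template, forward, max_mismatches)
    reverse_sites = find_primer_sites(
        template, reverse_complement(reverse), max_mismatches)
    lengths = [
        (rev_start + len(reverse)) - fwd_start
        for fwd_start, _ in forward_sites
        for rev_start, _ in reverse_sites
        if rev_start >= fwd_start + len(forward)
    ]
    return min(lengths) if lengths else None


# Invented toy template: 3 nt flank, forward site, 5 nt spacer, reverse site, 2 nt flank.
toy_template = "TTT" + "ACGTACGTAC" + "GGGGG" + "TTGACCATGA" + "CC"
toy_forward = "ACGTACGTAC"
toy_reverse = reverse_complement("TTGACCATGA")
```

Comparing the predicted product length across genomes is one way to express, in silico, the size difference that distinguishes a locus carrying an inserted gene cluster from one that lacks it.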
In contrast to these widely conserved virulence genes, two adhesin genes (LCGL_1585 and LCGL_1672) were conserved in the fish isolates UNIUD074 and 8831 but absent in the human isolate 21881. The LCGL_1585 protein contained a collagen-binding domain (Pfam PF05737), and the LCGL_1672 protein contained a mucin-binding domain (PF06458). On the other hand, the 301 genes specific to 21881 included four possible adhesin genes (Supplementary Table 1). All four adhesins contained an LPxTG-type motif; two of them contained the mucin-binding domain (PF06458), and one had a domain (PF05738) conserved in the collagen-binding surface protein of Staphylococcus aureus.
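A simple way to screen predicted proteins for the LPxTG-type motif mentioned above is a regular-expression scan. The following Python sketch is an illustration only and is not the method reported in the study; the function name, the tail_window parameter, and the toy sequence are assumptions. Because LPxTG sorting signals are typically located near the C-terminus, the sketch also flags whether each match falls within a chosen distance of the end of the protein.

```python
import re

# "x" in LPxTG stands for any amino acid, hence the "." wildcard.
LPXTG_PATTERN = re.compile(r"LP.TG")


def find_lpxtg(protein_seq, tail_window=50):
    """Return (start, motif, near_c_terminus) for each LPxTG match.

    tail_window is a hypothetical threshold: a match is flagged as
    C-terminal if it starts within that many residues of the sequence end.
    """
    seq = protein_seq.upper()
    hits = []
    for match in LPXTG_PATTERN.finditer(seq):
        near_c_terminus = len(seq) - match.start() <= tail_window
        hits.append((match.start(), match.group(), near_c_terminus))
    return hits


# Invented toy protein: signal-free filler, an LPKTG motif, then a short tail.
toy_protein = "M" + "A" * 60 + "LPKTG" + "A" * 20 + "KRK"
toy_hits = find_lpxtg(toy_protein)
```

A motif match alone does not establish that a protein is a surface-anchored adhesin; in practice it would be combined with domain evidence such as the Pfam assignments cited in the text.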
No capsule gene cluster was encoded in L. garvieae 21881, isolated from a human with septicemia (Table 2), and no capsule was detected in L. garvieae isolated from a human with endocarditis [25]. Another group has reported, using suppressive subtractive hybridization, that L. garvieae HF isolated from a human contained 23 genes that were absent in L. garvieae UNIUD074 [26]. We evaluated the presence of these 23 genes in the sequenced L. garvieae strains (Table 3). Of the 23 genes, 18 were conserved in L. garvieae 21881, but only 2-5 were found in the fish isolates.

Discussion
Little genomic information on L. garvieae had been reported until recently, when the complete genome sequences of two L. garvieae strains (Lg2 and ATCC 49156) and the draft genome sequences of three L. garvieae strains (UNIUD074, 8831, and 21881) were released. In this study, a comparative genomic analysis of three fish isolates and one human isolate of L. garvieae revealed the pangenome structure.
The pathogenic mechanisms of L. garvieae are poorly understood. It has only been demonstrated that the virulence of L. garvieae for fish is, in part, dependent on its ability to form a capsule [9]. Our previous study revealed that the capsule gene cluster in Lg2 is a genomic island [19]. Supporting these findings, the capsule gene cluster was found only in the fish-pathogenic strain Lg2 and was apparently inserted into a locus that is syntenic among the sequenced L. garvieae strains (Figure 2). ATCC 49156 was originally isolated from diseased yellowtail, has undergone phenotypic changes during its descent from the ancestral strain, and is now nonpathogenic to yellowtail [12,16,17]. We are thus tempted to speculate that a common ancestor of ATCC 49156 and Lg2 acquired the capsule gene cluster before their divergence, and that ATCC 49156 might then have lost the capsule gene cluster during its subculturing.

Table 3: Presence of genes specific to L. garvieae HF isolated from human (column headers: Accession Number; Presence in).
Several possible virulence genes in Lg2, such as hemolysin, were also conserved in UNIUD074 and 8831, whereas the capsule gene cluster was not (Table 2). A previous study showed that subculturing of L. garvieae in synthetic media resulted in the loss of the capsule [15]. Further, we have previously demonstrated that the capsule gene cluster is crucial for the virulence of L. garvieae for fish [19]. These findings suggest that UNIUD074 and 8831 may be noncapsulated and nonpathogenic for fish.
Although several cases of human infection caused by L. garvieae have been reported [1,24], there is little information about the precise mechanisms and factors by which L. garvieae causes infection and disease in humans. L. garvieae 21881, isolated from a human with septicemia, lacked the capsule gene cluster (Table 2). Another study also showed that no capsule was detected in L. garvieae isolated from a human with endocarditis [25]. Genes of L. garvieae HF, isolated from a human, were also more conserved in 21881 than in the fish isolates. This result suggests that human isolates of L. garvieae may share genes that are absent in L. garvieae strains isolated from other environments, and that L. garvieae strains containing those specific genes, even if noncapsulated, may