Identification and Genome Analysis of an Arsenic-Metabolizing Strain of Citrobacter youngae IITK SM2 in Middle Indo-Gangetic Plain Groundwater

Whole-genome sequencing (WGS) data of a bacterial strain, IITK SM2, isolated from an aquifer located in the middle Indo-Gangetic plain are reported here, along with its physiological, morphological, biochemical, and redox-transformation characteristics in the presence of dissolved arsenic (As). The aquifer exhibits oxidizing conditions with respect to As speciation. Analyses based on 16S rRNA and recN sequences indicate that IITK SM2 clusters with C. youngae NCTC 13708T and C. pasteurii NCTC UMH17T. However, WGS analyses using digital DNA-DNA hybridization and Rapid Annotations using Subsystems Technology suggest that IITK SM2 is a strain of C. youngae. This strain can effectively reduce As(V) to As(III) but cannot oxidize As(III) to As(V). It exhibited high resistance to As(V) [32,000 mg L−1] and As(III) [1,100 mg L−1], along with certain other heavy metals typically found in contaminated groundwater. WGS analysis also indicates the presence of As-metabolizing genes such as arsC, arsB, arsA, arsD, arsR, and arsH in this strain. Although these genes have been identified in several As(V) reducers, their clustering in the forms arsACBADR, arsCBRH, and an independent arsC gene has not been observed in any other Citrobacter species or in other selected As(V)-reducing strains of the Enterobacteriaceae family. Moreover, the numbers of genes corresponding to membrane transporters, virulence and defense, motility, protein metabolism, phages, prophages, and transposable elements in IITK SM2 differed from those in other strains. This genomic dataset will facilitate subsequent molecular and biochemical analyses of strain IITK SM2 to identify the reasons for high arsenic resistance in Citrobacter youngae and to understand its role in As mobilization in middle Indo-Gangetic plain aquifers.


Introduction
The Citrobacter genus, which belongs to the Enterobacteriaceae family, was first described in 1932 [1]. To date, 18 species of Citrobacter have been identified [2,3] from varied sources such as soil, water, sewage, feces, and the intestines of animals and humans [4,5]. Members of Citrobacter are enteric, Gram-negative, rod-shaped coliform bacteria, 1.0 × 2.0-6.0 μm in size [6]. Some strains of Citrobacter are opportunistic pathogens and can cause infections in immunocompromised patients [7]. Citrobacter species are often reported as a cause of meningitis in infants [8]. Among the different species of Citrobacter, C. youngae causes inflammation of the peritoneum, the membrane that covers the inner wall of the abdomen [9]. Furthermore, different strains of Citrobacter are known to be resistant to several heavy metals, including arsenic (As) [10,11]. Arsenic is a geogenic metalloid contaminant that has affected the health of animals and humans [12-15]. Sustained consumption of As-polluted water (>10 μg/L, the WHO permissible limit [16]) can cause acute to chronic health problems in humans [13,15]. Depending on its electronic configuration, As can exhibit multiple oxidation states: +V [arsenates; As(V)], +III [arsenites; As(III)], 0, and -III [15,17]. In nature, +V and +III are the predominant oxidation states of As [14]. These arsenic species can exist in inorganic (iAs(V/III)) and methylated (mAs(V/III)) forms [18]. In mammalian systems, As(III) is identified as more toxic than As(V) [19]. Monomethylated arsenous acid [MMA(III)] is considered to be the most toxic of all forms of As(V) and As(III) [20,21]. However, MMA(III) and other mAs forms, such as monomethylated arsenic acid [MMA(V)] and the dimethylated As species DMA(III) and DMA(V), are generally not detected in food or water. At times, they are found in urine or its by-products [22,23].
The iAs(V/III) forms are the major forms of As present in groundwater [15]. In aquifers, As redox transformations can be affected by site-specific conditions such as the prevalent pH, redox potential (E_H), co-ions, and the presence of labile organic matter [12,14]. Although the source of arsenic pollution in groundwater is mainly inorganic [24-26], the involvement of the indigenous microbial population in As redox transformations and mobilization cannot be ignored if abundant labile organic matter is present [18,27]. These microbes can metabolize arsenic because they carry As-metabolizing genes [18,19]. These As-metabolizing organisms are classified as (a) As(V) reducers [28,29] and/or (b) As(III) oxidizers [30,31]. The presence of these organisms in groundwater could indicate microbially mediated arsenic transformation and resultant speciation [27]. Specifically, As(V) reducers have the potential to control As concentrations in groundwater due to their capacity to bioreduce host minerals of As(V) [32]. The bacterial genera known for their efficient reduction of As(V) to As(III) are Sulfurospirillum, Bacillus, Wolinella, Clostridium, Staphylococcus, Desulfomicrobium, and Citrobacter [33-35].
In this study, a strain of Citrobacter youngae, IITK SM2 (hereafter referred to as strain "IITK SM2"), was isolated from middle Indo-Gangetic plain (IGP) groundwater in India, under conditions that were oxidizing with respect to arsenic speciation. The objectives of the study were as follows: (i) identification of major genes, including As-resistance genes, present in this strain through whole-genome sequencing (WGS); (ii) determination of physiological, morphological, biochemical, and redox-transformation characteristics of this isolate in the presence of dissolved arsenic; and (iii) identification of features in the arsenic-operon system contained in this strain relative to other organisms. To the best of our knowledge, this is the first strain of Citrobacter youngae shown to effectively reduce As(V) to As(III). This information will be helpful in the identification of the As(V)-metabolizing enzymes and other proteins involved in the influx and efflux of As in this bacterium.

Materials and Methods
2.1. Study Site, Sample Collection, and Analysis. Groundwater was sampled from a previously identified As-polluted aquifer in Baikunthpur, Uttar Pradesh, India (26°33′47.3″N and 80°15′18.5″E), situated in the Indo-Gangetic plains [36] (Figure S1 of the Supporting Information). Two different groundwaters, As-polluted and As-unpolluted, were sampled from this site (Figure S1). For each sample, pH, temperature (°C), conductivity (μS cm−1), and redox potential (E_H; V) were measured at the site using a portable multiparameter meter (Thermo Orion Star A329) and suitable electrodes. Three sets of water samples were collected from the polluted aquifer. Two sets were filtered using 0.2 μm nylon syringe filters (Cole-Parmer). One of these filtered sets was immediately acidified using 1% (v/v) trace-metal grade HNO3 for the determination of dissolved total arsenic (As_T) and other elements using inductively coupled plasma mass spectrometry (ICP-MS). The other filtered set was left unacidified and was collected without headspace in 15 mL centrifuge tubes. This filtered, unacidified sample set was used for dissolved carbon analysis using a total organic carbon (TOC) analyzer and for the measurement of dissolved inorganic As(V) and As(III) using ion chromatography coupled with ICP-MS (IC-ICP-MS). Because of reported inaccuracies in electrode-measured redox potentials [37-41], E_H was also estimated from the measured As(V) and As(III) concentrations using the Nernst equation for As-polluted groundwater [42] (Section S1 of the Supporting Information).
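As a reminder of the form such an estimate takes, the Nernst equation for a generic As(V)/As(III) couple can be written as below. This is only an illustrative sketch: the half-reaction shown (fully protonated species, two-electron transfer) is an assumption made here for illustration, and the actual species, standard potential, and activity corrections used in this study are those given in Section S1 of the Supporting Information.

```latex
% Assumed illustrative half-reaction (not necessarily the one used in Section S1):
%   H3AsO4 + 2 H+ + 2 e-  <=>  H3AsO3 + H2O
\[
E_H \;=\; E^{0} \;-\; \frac{RT}{2F}\,
\ln\!\left(\frac{\{\mathrm{H_3AsO_3}\}}{\{\mathrm{H_3AsO_4}\}\,\{\mathrm{H^+}\}^{2}}\right)
\]
% E^0: standard potential of the couple; R: gas constant; T: absolute temperature;
% F: Faraday constant; braces denote activities.
```

In this form, a higher As(V)/As(III) activity ratio at a given pH corresponds to a higher (more oxidizing) E_H.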
For bacterial culturing, unfiltered groundwater samples were collected in sterilized tubes without headspace and were sealed with Parafilm. All these samples were placed in ice gel packs and transported within 1 h to the laboratory. Subsequently, the unfiltered samples were transferred to predefined As-amended agar plates inside a laminar hood.
2.2. Chemicals. All solutions and buffers were prepared in ultrapure water (Milli-Q, resistivity > 18.2 MΩ·cm) and were either filtered using 0.2 μm nylon syringe filters or autoclaved at 121°C for 20 min before use. For preparing 100 mg mL−1 stock solutions of As(V) and As(III), Na2HAsO4·7H2O and NaAsO2, respectively, were used. For the qualitative As redox transformation test, a stock solution of 1 M AgNO3 was prepared prior to use and was stored in the dark at 4°C. Chemicals used in this study, and their manufacturers and purities, are detailed in Table S1 of the Supporting Information.
2.3. Analytical Techniques. Elemental concentrations, including total dissolved arsenic (As_T), were measured using inductively coupled plasma mass spectrometry (ICP-MS; Thermo iCAP-Qc), with germanium as an internal standard for As measurement. All standards and samples were analyzed in a 1% HNO3 matrix. Dissolved As(V) and As(III) concentrations were measured using ion chromatography coupled with ICP-MS (IC-ICP-MS; Thermo Scientific iCAP Q with the Thermo Scientific Dionex ICS-5000 IC). Analytes were eluted in 100 mM (NH4)2CO3 over an IonPac AS7 analytical column.

BioMed Research International
Dissolved total carbon (TC) and inorganic carbon (DIC) concentrations were measured with a total organic carbon analyzer (TOC-L; Shimadzu TNM-L ROHS), and dissolved TOC was estimated from TC and DIC (TOC = TC − DIC). The optical density of cultures was monitored by measuring absorbance at 600 nm (OD600; Biospectrometer; Eppendorf). The method detection limits of the various techniques are listed in Table S2 of the Supporting Information. Morphology of the bacterial strain was determined microscopically. The Gram-staining test followed by optical microscopy was performed at 100x magnification (Quasmo; Ecostar-plus). For higher magnification and better resolution, tungsten scanning electron microscopy with associated energy dispersive X-ray spectroscopy (W-SEM-EDX; JEOL JSM 6010 LA) was used. Before SEM-EDX analysis, samples were coated with a 7-10 nm layer of gold.

Isolation of Arsenic-Resistant Bacteria.
To isolate As-resistant bacteria, 100 μL of As-polluted groundwater was added to each of the As-amended Lysogeny agar plates (0-400 mg L−1) and spread using sterilized glass beads in a laminar flow hood. These plates were incubated at 37°C for 24 h, and sixteen distinct colonies were isolated upon visual identification. However, only two isolates grew on both the As(III)- and As(V)-containing Lysogeny agar plates at 400 mg L−1. Of these isolates, the strain (IITK SM2) that showed more efficient growth in the presence of arsenic was selected for WGS. Strain IITK SM2 was inoculated in Lysogeny broth (LB) and diluted several times by streaking the culture on variably As-supplemented (0-10,000 mg L−1) Lysogeny agar plates until single colonies were obtained. For further analyses, these cultures were preserved in 12-15% glycerol solution at -80°C.

Morphological, Physiological, and Biochemical Characterization of Strain IITK SM2. Experiments were performed with strain IITK SM2 to determine its bacterial group, optimum growth conditions, and response of bacterial growth to specific biochemical tests. Experimental details of these tests are provided in Section S2 of the Supporting Information.

Effect of Arsenic on the Growth: Kinetics and Morphology. The growth profile of IITK SM2 was studied in the absence and presence of arsenic [As(V) or As(III)]. Before starting this experiment, it was confirmed that dissolved As was not present in the background media. Initially, a single colony of this isolate from a Lysogeny agar plate was inoculated into arsenic-free minimum salt media (MSM; Table S3 of the Supporting Information), supplemented with 10 mM glucose as the only C source, and incubated at 30°C (optimum temperature) and 120 rpm. Optical absorbance at 600 nm (OD600) of this culture was regularly recorded (BioSpectrometer® basic; Eppendorf) until an OD600 of ~0.1 was reached. Thereafter, ~1% of the bacterial suspension was individually transferred to set up different cultures, which contained 0 mM, 1.33 mM, 3.33 mM, 6.67 mM, 10 mM, 13.33 mM, and 20 mM of either As(V) or As(III) in 10 mM glucose-supplemented MSM. These cultures were incubated at 30°C and 120 rpm. After 8 h of transfer, the OD600 was measured for all the cultures. This experiment was performed three times in triplicate. Among all the concentrations investigated, the maximum biomass was obtained at a 10 mM As(V) dosage (Figure S2 of the Supporting Information). Consequently, 10 mM of As was chosen as the optimum dosage for the comparative growth profile study in the presence of either 10 mM As(V) or 10 mM As(III), along with the As-free control. Doubling times for these three cultures [10 mM As(V), 10 mM As(III), and 0 mM As] were calculated from the respective growth curves by considering a two log-unit increase in bacterial population. For these three cultures, morphological characterization of bacterial cells was performed using SEM. Details of sample preparation for SEM analysis are discussed in Section S3 of the Supporting Information.
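The relation between the growth rate and the doubling time can be illustrated with a minimal Python sketch. It assumes simple exponential growth between two log-phase OD600 readings and the textbook relation DT = ln 2 / k; the function names and the OD600 readings in the usage note are hypothetical, and the exact calculation used in this study (based on a two log-unit increase) may differ in detail.

```python
import math


def growth_rate(od_start, od_end, hours):
    """Specific growth rate k (h^-1) between two log-phase OD600 readings.

    Assumes exponential growth: OD(t) = OD(0) * exp(k * t).
    """
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("OD600 readings and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


def doubling_time(k):
    """Doubling time (h) for a specific growth rate k (h^-1): DT = ln(2) / k."""
    if k <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / k
```

For example, with hypothetical readings, `doubling_time(growth_rate(0.1, 0.4, 16))` gives 8.0 h, because OD600 quadruples (two doublings) in 16 h. The growth rates and doubling times reported in Section 3.2.3 agree with this relation to within the rounding of k.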
2.5.2. Resistance to As and Other Heavy Metals Typically Present in Contaminated Groundwater. Minimum inhibitory concentration (MIC) tests were performed to evaluate the resistance of strain IITK SM2 to As(III) and As(V), along with certain heavy metals found in contaminated groundwaters, such as Fe(III), Cr(VI), Mn(II), Ba(II), and Zn(II). The MIC was defined as the lowest metal concentration at which no bacterial growth was observed [43]. The detailed procedure used for determining the MIC is discussed in Section S4 of the Supporting Information.
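The MIC definition above can be expressed as a short Python sketch. The data structure (a mapping from tested concentration to whether growth was observed) and the example values are hypothetical; the experimental procedure itself is the one described in Section S4.

```python
def minimum_inhibitory_concentration(growth_by_conc):
    """Return the lowest tested concentration at which no growth was observed.

    growth_by_conc maps a tested concentration (e.g., in mg/L) to True if
    growth was observed and False otherwise. Returns None if growth was
    observed at every tested concentration (MIC above the tested range).
    """
    no_growth = [conc for conc, grew in growth_by_conc.items() if not grew]
    return min(no_growth) if no_growth else None
```

With a hypothetical dilution series `{0: True, 100: True, 200: False, 400: False}`, the function returns 200. This sketch takes the definition literally and does not check whether growth reappears at a concentration above the first no-growth level.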
2.6. Redox Transformation of Arsenic. The capability of strain IITK SM2 to transform As(V) to As(III), or vice versa, was qualitatively assessed from the formation of colored precipitates upon addition of silver nitrate, following a slightly modified version of a previously described procedure [44]. This procedure is discussed briefly in Section S5 of the Supporting Information. Again, MSM supplemented with 10 mM glucose, without any background As(V) or As(III), was used as the culturing medium. Arsenic was added as either As(V) or As(III) to obtain final concentrations of 0, 50, 100, and 500 mg L−1 of each form of arsenic. Also, systems without bacterial cultures were initiated as controls with the same set of As concentrations (0-500 mg L−1). For confirmation of dissolved arsenic concentrations, As_T was measured using ICP-MS, and As(V) and As(III) were measured using IC-ICP-MS, before and after the addition of strain IITK SM2.

Molecular Characterization.
To identify the genus and species of strain IITK SM2, 16S rRNA and whole-genome sequencing were performed. The methods used for the isolation of genomic DNA and for 16S rRNA sequencing are detailed in Section S6 of the Supporting Information.
2.7.1. Whole-Genome Sequencing (WGS). Libraries for microbial WGS were constructed following the recommendations of the Nextera™ DNA Flex library preparation kit from Illumina Inc. To determine the mean fragment size, libraries were loaded and analyzed on a high-sensitivity D1000 ScreenTape. The Illumina libraries were diluted to 4 nM, pooled, spiked with 5% PhiX (a premade Illumina library), and loaded onto a MiSeq v2 kit. The sequencing was performed for 2 × 150 cycles. The raw data obtained from the Illumina MiSeq were recorded as FASTQ files. Adapter-free reads were obtained using an adapter trimming plugin. The quality of the reads was checked using FastQC v0.11.3 [45]. High-quality reads were obtained using Trimmomatic v0.39 [46]. De novo assembly and scaffolding were done using SPAdes v3.14.1 [47], where filtered reads were assembled without biasing the assembly toward any known genome. The quality of the assembly was checked using QUAST v5.0.2 [48]. For assembly of the genome, only contigs longer than 500 bp were considered. The g-DNA sequences were assembled into 30 different contigs. The assembly was annotated using the National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline (NCBI-PGAP v2020-09-24 build4894) [49]. The g-DNA of the strain was mapped to two reference organisms, Citrobacter freundii FDAARGOS 549 (GenBank accession number NZ_CP033744.1) and Citrobacter youngae NCTC 13708 (GenBank accession number NZ_UFWE01000006.1), using Bowtie 2 [50].

Sequence Comparison of IITK SM2 with Different Strains. To identify whether the isolate belongs to a new species, the Type (Strain) Genome Server (TYGS) was used with formula 2 to compute the digital DNA-DNA hybridization (dDDH) value [51]. Furthermore, subsystem features in strain IITK SM2 were compared with features of different Citrobacter species using Rapid Annotations using Subsystems Technology (RAST) [52]. The different strains used for these analyses are detailed in Table S4 of the Supporting Information.
Based on the coding sequences (CDs), orthologous gene clusters were determined by comparing the genome of isolate IITK SM2 with genomes of closely related strains. These strains were chosen based on dDDH values. For this clustering, the OrthoVenn web server was used [53]. Default parameters, namely an e-value cut-off of 10⁻⁵ and an inflation value of 1.5, were used to compare protein similarity and to generate orthologous clusters, respectively.
Through the WGS, arsenic-resistant genes were also identified in the g-DNA of this strain. The arrangement of these genes, or ars operon in IITK SM2, was compared with genetic arrangement of different species of Citrobacter and selected strains of Enterobacteriaceae (Table S4 of the Supporting Information).

Phylogenetic Analysis.
The evolutionary relationship of strain IITK SM2 was determined by comparing its 16S rRNA sequence with sequences of closely related bacterial species. Furthermore, to accurately differentiate among different species, the highly conserved recN gene sequence was used. For these analyses, sequences of related bacterial strains were downloaded from the NCBI database using the basic local alignment search tool (BLAST) [54]. Multiple alignments of protein sequences of the As(V)-reductase gene arsC identified in strain IITK SM2 were performed using CLUSTAL_W [55]. Also, to understand the evolutionary history of these genes, arsC sequences of our isolate were compared with sequences of bacterial species from genera other than that of strain IITK SM2. Accession numbers of the different strains used for these analyses are detailed in Table S4 of the Supporting Information.
The bootstrap method was used for testing the phylogeny with 1000 replicates [56]. Phylogenetic trees of different sequences mentioned above were prepared using the neighbor-joining [NJ] method [57]. For computing the evolutionary distances, the p-distance method was used [58] and the units were reported in base differences per site. MEGA X was used for conducting these evolutionary analyses [59].
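The p-distance used above is simply the proportion of aligned sites at which two sequences differ. A minimal Python sketch follows. It is not the MEGA X implementation: the function name is hypothetical, and skipping sites where either sequence has a gap (pairwise deletion) is an assumption, since the gap treatment used in this study is not stated.

```python
def p_distance(seq_a, seq_b, gap_chars="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence carries a gap character are skipped
    (pairwise deletion; an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # ignore gapped sites
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

For the toy alignment `p_distance("ACGTACGT", "ACGTTCGT")`, one of eight sites differs, so the distance is 0.125 base differences per site.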

Results and Discussion
3.1. Geochemistry of As-Polluted Aquifer. The As-polluted groundwater sampled from the study site exhibited oxidizing conditions with respect to arsenic speciation, as suggested by the measured dissolved arsenic, co-solutes, and calculated redox potential (E_H^m ~112 mV). On average, ~70 μg L−1 of total dissolved arsenic (As_T) was recorded, of which the dissolved inorganic As(V) and dissolved inorganic As(III) concentrations were ~43 μg L−1 and ~22 μg L−1, respectively. Methylated forms of arsenic were not measured in groundwater because it is well documented in the literature that simple methylated forms such as MMA(III), DMA(III), MMA(V), and DMA(V) are usually not detected in food or water [15,22,23,60,61] but are detected in urine or its by-products [62-64]. Furthermore, the sum of the dissolved inorganic As(V) and As(III) was within 10% of As_T, which also suggests that methylated forms of arsenic, if present, were in negligible quantity in the groundwater of interest (Table 1). The higher concentration of As(V) compared with As(III) supports the prevalence of oxidizing conditions with respect to arsenic speciation. However, the presence of a significant concentration of the reduced form of arsenic [~22 μg L−1 as As(III)] in such an oxidizing aquifer indicated a potential role of As(V)-reducing organisms. Furthermore, the high average DIC (~1296 mg L−1) and TOC (~40 mg L−1) concentrations, compared with those of the unpolluted aquifer (DIC ~186 mg L−1 and TOC ~11 mg L−1), suggest a potential role of microbial activity in this groundwater. Other water quality parameters are detailed in Table 1.
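The mass-balance check behind the "within 10% of total As" statement can be reproduced with a few lines of Python, using the approximate averages quoted above (~43, ~22, and ~70 μg L−1). The function name is hypothetical, and the calculation is only an illustration of the arithmetic, not part of the original analysis.

```python
def speciation_balance(as_v, as_iii, as_total):
    """Compare the sum of measured inorganic As species with total dissolved As.

    Returns (species_sum, deficit_percent), where deficit_percent is the
    fraction of total As not accounted for by As(V) + As(III), in percent.
    All inputs must share the same concentration unit.
    """
    if as_total <= 0:
        raise ValueError("total As must be positive")
    species_sum = as_v + as_iii
    deficit_percent = 100.0 * (as_total - species_sum) / as_total
    return species_sum, deficit_percent


# Approximate averages quoted in the text (ug/L).
species_sum, deficit = speciation_balance(43, 22, 70)
```

With these rounded values, the inorganic species sum to 65 μg L−1, leaving about 7% of the total unaccounted for, which is within the 10% margin stated above.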
One of the most widely accepted mechanisms of arsenic mobilization in groundwater is the reductive dissolution of iron (oxy)hydroxide [FeOOH(s)] [24,25,65-73]. However, this mechanism is more prevalent under reducing conditions [66]. For oxidizing groundwater, such as that from which strain IITK SM2 was isolated in this study, As(V) might still be released by reductive dissolution of FeOOH, but the presence of As(III) in such aquifers hints at microbially mediated reduction of As(V) to As(III). However, a detailed and careful study would be required to identify the role and resultant mechanisms of such microbes in arsenic speciation in oxidizing aquifers. This would require systematic investigation and comparison of the indigenous microbial populations of both As-free and As-polluted groundwaters.

Characterization of Bacterial Strain IITK SM2
3.2.1. Classification of the Isolate. The isolate was a rod-shaped, Gram-negative, catalase-positive, and motile bacterium that tested negative for starch hydrolysis. Strain IITK SM2 can grow over pH 4-10, with an optimum pH of 7.25. Furthermore, the isolate grew over 15-45°C, with optimum growth at 30°C at pH 7.25 (Figure S3 of the Supporting Information). The strain could tolerate NaCl up to 6% (w/v) and showed optimal growth at 1.5% (w/v) of the salt. After incubation for 2 d in an anaerobic chamber, colonies grown on Lysogeny agar plates were circular, opaque, and yellow, which suggested that the isolate was a facultative anaerobe. Furthermore, strain IITK SM2 showed resistance to ampicillin (100 μg L−1) and hygromycin (50 μg L−1), but growth was not observed in the presence of kanamycin (50 μg L−1), chloramphenicol (25 μg L−1), ciprofloxacin (20 μg L−1), gentamicin (10 μg L−1), or streptomycin (50 μg L−1).

Resistance to Heavy Metals. Among the metal species considered in this study, IITK SM2 showed resistance to As(V), As(III), Fe(III), Cr(VI), Mn(II), Ba(II), and Zn(II) up to certain levels (Table 2). Of these species, the maximum resistance was observed for As. The minimum inhibitory concentrations (MIC) of As(V) and As(III) were 32,000 mg L−1 and 1,100 mg L−1, respectively.
3.2.3. Impact of Arsenic on Growth and Morphology. The growth of IITK SM2 was higher in the presence of arsenic than in the As-free condition in minimum salt media (MSM) supplemented with glucose (Figures S2 and 2(a)). The lag phase of this strain varied with the type of As stress provided (III versus V; Figure S2). In the comparative growth profile study for the 0 mM As, 10 mM As(V), and 10 mM As(III) conditions, the shortest lag phase was observed in the absence of any As (8 h), followed by increasing lag phases in the presence of 10 mM As(III) (32 h) and 10 mM As(V) (48 h) (Figure 2(a)). Logarithmic growth was observed between 8-24 h, 32-60 h, and 56-96 h for As-free, 10 mM As(V)-containing, and 10 mM As(III)-containing conditions, respectively. The maximum biomass was observed in the presence of As(V), followed by As(III) and As-free conditions (Figure 2(a)). The growth rate (k) and doubling time (DT) observed in the absence of As were 0.08 h−1 and 8.9 h, respectively. Under As-stressed conditions, DT increased to 10.5 h and 17.4 h in the systems containing 10 mM As(V) (k = 0.07 h−1) and 10 mM As(III) (k = 0.04 h−1), respectively.
Figure 1: Phylogenetic position of strain IITK SM2 relative to other strains of genus Citrobacter based on 16S rRNA sequences. The neighbor-joining method was used for tree construction [57]. Percentage bootstrap values corresponding to 1000 replicates are shown next to the branches in "bold". Branch lengths are shown in "narrow italics" below each branch. The tree was drawn to scale.
Although the addition of dissolved As(V) and As(III) retarded the growth rate, no inhibitory effect of arsenic on the growth of this strain was observed at 10 mM of As dosage. On the contrary, higher biomass was obtained in As-stressed conditions suggesting that the isolate metabolizes As and obtains energy for its growth [74,75]. Furthermore, more growth in the presence of As(V) as compared to As(III) indicated that strain IITK SM2 possibly had a mechanism to effectively metabolize As(V) relative to As(III).
The IITK SM2 strain was rod-shaped, as confirmed by SEM analysis (Figures 2(b)-2(d)). The average length of a bacterial cell in the absence of dissolved As was 2.8 ± 0.6 μm (Figure 2(b)). However, in the presence of As(III), the length of the bacteria increased to 5.6 ± 2.2 μm, which suggested that As(III) imposed stress on this isolate (Figure 2(c)) that resulted in filamentation [76]. This elongation indicated that cell division might be affected by As(III) stress. In the presence of As(V), however, cells of this strain were found clustered together, with no significant change in cell size (2.0 ± 0.6 μm) compared with the As-free condition (Figure 2(d)).

Redox Transformation of Arsenic by IITK SM2.
The qualitative silver nitrate assay indicated that strain IITK SM2 was an As(V) reducer (Figure 3). The formation of yellow- and brown-colored precipitates was observed with standard salts of As(III) [As(III)_s; Figure 3(a)] and As(V) [As(V)_s; Figure 3(c)], respectively, possibly due to precipitation of Ag3As(III)O3(s) and Ag3As(V)O4(s) [44]. However, in the presence of strain IITK SM2, the yellow color of the precipitated solids was retained under As(III)-stressed conditions [As(III)_I; Figure 3(b)], but brown precipitates were not observed for As(V) conditions [As(V)_I; Figure 3(d)]. In fact, the precipitated solids were increasingly yellow with increasing As(V) concentrations. Further, measurements of As_T by ICP-MS, and of As(V) and As(III) by IC-ICP-MS, before and after the reaction with IITK SM2, confirmed these qualitative results. A complete reduction of As(V) to As(III) was observed in As(V)-amended conditions, whereas no change was observed in the conditions initially containing As(III) (Table S6 of the Supporting Information). Overall, the results indicate that the isolate can mediate As(V) reduction to As(III), but not vice versa. The presence of arsenate reducers, like C. youngae IITK SM2, in the middle Gangetic plain groundwater could be the reason for the significant concentrations of dissolved As(III) under conditions that are oxidizing with respect to arsenic speciation.

Genes Identified in Arsenic Metabolism of IITK SM2
3.3.1. Whole-Genome Sequencing (WGS) and Comparison with Other Organisms. To identify the species of IITK SM2, WGS was performed. The mean fragment size of the PCR-enriched library of g-DNA of the strain was found to be 564 bp, with a concentration of 14.1 ng μL−1. In total, 2,672,974 raw reads were obtained for the g-DNA, of which 1,778,915 reads survived trimming and filtering. The final genome size was found to be 4,857,938 bp, with a guanine-cytosine (GC) content of 51.70% and an N50 value of 438,827 bp. The respective assembly length was ~4.8 Mbp. The NCBI-PGAP annotation showed that the g-DNA contained a total of 4725 genes, which were associated with 4553 coding sequences (CDs) and 66 tRNA, 6 rRNA, and 9 noncoding RNA (ncRNA) sequences. In addition, 91 pseudogenes were present.
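For readers unfamiliar with the assembly statistics quoted above, the following Python sketch shows how N50 and GC content are conventionally computed from a set of contigs. It is an illustration only, not the QUAST implementation; the function names are hypothetical, and the optional `min_length` filter mirrors the 500 bp contig cut-off described in Section 2.7.1.

```python
def n50(contig_lengths, min_length=0):
    """N50: the contig length L such that contigs of length >= L together
    cover at least half of the total assembly length.

    Contigs not longer than min_length are discarded first.
    """
    lengths = sorted((n for n in contig_lengths if n > min_length), reverse=True)
    if not lengths:
        raise ValueError("no contigs left after filtering")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def gc_content(sequence):
    """GC content in percent, counting only unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted
```

As a toy example, `n50([2, 3, 4, 5, 6, 7, 8, 9, 10])` returns 8, because the three longest contigs (10 + 9 + 8 = 27) reach half of the total length of 54.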
Distance-based phylogeny developed from the recN sequence indicated that IITK SM2 clustered with C. youngae NCTC 13708T and C. pasteurii NCTC UMH17T (Figure S4 of the Supporting Information). However, the [...] youngae NCTC 13708T (85.8%), followed by C. youngae CCUG 30791T (83.8%). As the proposed cut-off for species delineation is 70% [77,78], these distances confirm that IITK SM2 is a strain of C. youngae (Table 3). Furthermore, the difference in GC content between IITK SM2 and the type strains of C. youngae was ≤0.1%, which confirms that the isolate is a strain of C. youngae. Although IITK SM2 belongs to C. youngae, it differs from other C. youngae strains, as suggested by the comparison of subsystem features and gene clustering of different Citrobacter species. The coverage of subsystem features in IITK SM2 and the comparison of counts of each feature in different Citrobacter species using the Rapid Annotations using Subsystems Technology (RAST) server suggested that IITK SM2 was different from the type strains of C. youngae (Figure 4 and Table 4). Subsystem feature counts for virulence, disease, and defense (F3); phages, prophages, transposable elements, and plasmids (F7); and membrane transport (F8) were much higher in strain IITK SM2 than in the type strains of C. youngae (NCTC 13708T and CCUG 30791T; Table 4). In contrast, much lower feature counts for motility and chemotaxis (F14) and protein metabolism (F12) were observed in IITK SM2. Furthermore, a Venn diagram of protein clustering of strain IITK SM2 with the closely related Citrobacter strains NCTC 13708, CCUG 30791, and FDAARGOS 549 indicated a total of 4653 protein clusters (Figure 4(b)). Of these, 4590 were orthologous clusters that contained at least two strains, and 3686 were single-copy clusters. Although IITK SM2 shared 3718 orthologous protein clusters with the other three strains, the maximum number of clusters (4184) was shared with CCUG 30791.
Most of the unique orthologous clusters identified in IITK SM2 represented proteins with unknown functions. However, two protein clusters were identified as the IS11595 family transposase and metal binding proteins. These analyses confirmed that our isolate is a novel strain of Citrobacter youngae and was named Citrobacter youngae IITK SM2.

Presence of Arsenic-Resistant Genes.
As-resistance genes were identified in the draft genome of the isolate Citrobacter youngae IITK SM2 (Figure 5(a)). These genes belong to the ars operon system [33,35,79]. Genes corresponding to arsC, arsB, arsA, arsD, arsR, and arsH were present in the g-DNA of this isolate. These genes have specific functions. The arsenate reductase gene arsC encodes a protein of low molecular weight (13-15 kDa) that belongs to the thioredoxin superfamily and mediates the reduction of As(V) to As(III) in the cytoplasm [79,80]. The ars operon further contains arsB, which encodes arsenite permease, an arsenic-specific efflux pump that extrudes As(III) out of the cell [81]. Resistance to As(V) and As(III) is provided by the expression of the arsC and arsB genes, which is controlled by a transcriptional repressor, arsR [80,82]. The presence of these As-resistance genes could be a reason for the high MICs of As(V) and As(III) observed in strain IITK SM2. In addition to these genes, the ars operon also contains arsD and arsA. It is known that arsD exhibits weak As(III)-responsive transcriptional repressor activity [83], and the gene arsA provides higher resistance to elevated levels of As(III) by encoding an intracellular ATPase, which forms a dimer with arsB [79].
The arsenic-resistance gene arsH, which encodes an NADPH-dependent flavin mononucleotide reductase [84], was also identified in the g-DNA of IITK SM2. It was reported that the presence of an arsH gene increased resistance to inorganic As species (iAs(V/III)) in some bacteria [85,86]. However, some studies showed that neither overexpression nor mutation of arsH provided resistance to inorganic As in Thiobacillus ferrooxidans [87] and the cyanobacterium Synechocystis sp. PCC 6803 [88]. The exact function of the arsH gene remains unclear. Recently, a study showed that arsH detoxified organoarsenic compounds such as MMA(III) and aromatic arsenic species by oxidizing them, for example MMA(III) to MMA(V) [89]. In strain IITK SM2, these As-resistance genes were arranged in three distinct ways: (1) arsCBRH, (2) arsACBADR, and (3) an independent arsC gene (Figure 5(a)). It is possible that these genes play a role in the high arsenic resistance exhibited by IITK SM2.

Figure 4: Distribution of subsystem features (a) and orthologous protein clusters (b) in the genome of Citrobacter youngae strain IITK SM2. Annotation of subsystem features was performed using the Rapid Annotations using Subsystems Technology (RAST) server, where the RASTtk annotation scheme was used [57]. The Venn diagram of the clustering of proteins based on the coding sequences (CDSs) was constructed using the whole-genome sequences of isolate IITK SM2 and of strains of Citrobacter youngae (CCUG 30791 and NCTC 13708) and Citrobacter freundii (FDAARGOS 549). Subsystem feature counts shown in panel (a): Metabolism of aromatic compounds (9); Stress response (87); Respiration (114); Dormancy and sporulation (2); Nitrogen metabolism (40); Fatty acids, lipids, and isoprenoids (56); DNA metabolism (85); Secondary metabolism (21); Regulation and cell signaling (66); Motility and chemotaxis (16); Cell division and cell cycle (7); Protein metabolism (231); Nucleosides and nucleotides (86); RNA metabolism (56); Iron acquisition and metabolism (53); Membrane transport (91); Phages, prophages, transposable elements, plasmids (25); Miscellaneous (33); Photosynthesis (0); Potassium metabolism (17); Virulence, disease and defense (63); Cell wall and capsule (43); Cofactors, vitamins, prosthetic groups, pigments (181); Carbohydrates (376); Phosphorus metabolism (31); Sulfur metabolism (26); Amino acids and derivatives (362).

Table 4: Comparison of subsystem features in different Citrobacter species using the Rapid Annotations using Subsystems Technology (RAST) server. The annotation scheme used was RASTtk [57]. Accession numbers of different type strains are mentioned in [...]

Comparison of As-Resistant Genes in Strain IITK SM2 with Genes in Reference Organisms.
The distribution of ars genes identified in the isolate differed from that in other type strains of Enterobacteriaceae, and especially from that in other Citrobacter species (Figure 5). C. youngae NCTC 13708 was the closest to IITK SM2: it contained the unique six-gene operon (arsACBADR), but the operon was oriented in the opposite direction compared to IITK SM2 (Figure 5(b)). Such a sextet gene cluster was observed neither in any other Citrobacter species nor in any of the pioneer strains of the Enterobacteriaceae family considered here. Besides this operon, IITK SM2 contained the arsCBRH operon and an independent arsC gene (Figure 5(a)), which were absent in NCTC 13708. These differences suggest that our isolate could be even more effective in arsenic resistance than NCTC 13708. Although the arsCBRH operon was present in other Citrobacter species, such as C. braakii ATCC 51113T, C. freundii FDAARGOS 549T, and strain bta3-1T, the independent arsC gene was not identified in any of the chosen reference strains (Figures 5(b) and 5(c)). Earlier studies have referred to such proteins as "fusion proteins," which, if functional, could provide an evolutionary advantage in sensing and/or detoxifying As(III) in the environment [90,91]. Other reference chromosomes or plasmids mostly contain the five-gene arsCBADR operon, identified in C. tructae SNU WT2T, C. freundii FDAARGOS 549T, and E. coli (R773 and R46), or the three-gene arsRBC operon, present in C. sedlakii NBRC [...], and E. coli chromosomes. An exception in the arrangement of As-metabolizing genes was observed in C. cronae Tue2-1T, where two trans-acting repressors were identified (arsRCBDAR). The arsCBADR and arsRBC operons were not identified in strain IITK SM2. Even though a dDDH value of ~84% suggested that the other strain of C. youngae, CCUG 30791, was closer to IITK SM2, no As-metabolizing genes were identified in this type strain. The unique arrangement of ars genes identified in this isolate is consistent with the possibility that IITK SM2 is a novel As-resistant strain of C. youngae.
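
The comparison in this subsection can be restated as simple set operations. The sketch below records only the arrangements that the text explicitly attributes to each strain, so the sets may be incomplete for the reference genomes. Gene orientation is ignored, and the reference strain whose designation is incomplete in the text is left out. The variable and function names are illustrative assumptions, not part of the original analysis.

```python
# Arrangements reported in the text for the isolate.
ISOLATE = {"arsCBRH", "arsACBADR", "arsC"}

# Arrangements the text attributes to each reference (possibly incomplete).
REFERENCES = {
    "C. youngae NCTC 13708": {"arsACBADR"},   # opposite orientation per the text
    "C. youngae CCUG 30791": set(),           # no As-metabolizing genes identified
    "C. braakii ATCC 51113": {"arsCBRH"},
    "C. freundii FDAARGOS 549": {"arsCBRH", "arsCBADR"},
    "strain bta3-1": {"arsCBRH"},
    "C. tructae SNU WT2": {"arsCBADR"},
    "C. cronae Tue2-1": {"arsRCBDAR"},
    "E. coli R773 and R46": {"arsCBADR"},
    "E. coli chromosomes": {"arsRBC"},
}


def carriers_by_arrangement(isolate, references):
    """Map each isolate arrangement to the sorted reference strains sharing it."""
    return {
        arrangement: sorted(
            strain for strain, found in references.items() if arrangement in found
        )
        for arrangement in isolate
    }


carriers = carriers_by_arrangement(ISOLATE, REFERENCES)
# Arrangements seen in at least one reference but not in the isolate.
absent_in_isolate = set().union(*REFERENCES.values()) - ISOLATE
# References that carry every arrangement found in the isolate.
full_matches = [s for s, found in REFERENCES.items() if ISOLATE <= found]

for arrangement in sorted(carriers):
    print(arrangement, "->", carriers[arrangement] or "not reported in any reference")
print("absent in isolate:", sorted(absent_in_isolate))
print("references with the full isolate combination:", full_matches)
```

Under these inputs, the independent arsC gene has no carrier among the references, and no reference carries all three arrangements together, which is the sense in which the text describes the combination as unique to IITK SM2.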

Presence of Three Arsenate-Reductase (arsC) Genes.
The isolate IITK SM2 contained three different arsC genes, designated arsC1, arsC2, and arsC3 (Figure 6). The arsenate reductases they encode comprised 119, 141, and 141 amino acids, respectively. The alignment of their protein sequences suggested that 41 amino acids were completely conserved across the three sequences (Figure 6(a)). Furthermore, the protein sequences of arsC2 and arsC3 showed ~83% similarity. The phylogenetic tree based on the NJ method suggested that the smaller arsenate reductase, arsC1, clustered with S. enterica (GAS71778.1), whereas arsC2 and arsC3 did not cluster with the arsC genes of any of the selected bacterial strains. However, arsC2 and arsC3 clustered together (Figure 6(b)). This analysis suggested that the smaller arsenate reductase in strain IITK SM2 might have evolved from Salmonella-type strains, whereas the larger arsC genes might be native to Citrobacter. The presence of these arsenate reductases along with arsB, arsA, arsD, arsR, and arsH in strain IITK SM2 could be responsible for its high arsenic resistance and for As(V) reduction to As(III) (Figure S5). A detailed study of the mechanism and kinetics of microbially mediated arsenate reduction by this isolate would help to understand the role of IITK SM2 in arsenic speciation in groundwater.

Figure 6: Multiple alignment (a) and phylogeny (b) of arsenate-reductase gene (arsC) sequences identified in the strain Citrobacter youngae IITK SM2. Multiple alignments of arsC sequences (arsC1, arsC2, and arsC3) of IITK SM2 were performed using CLUSTAL_W. Nucleotides conserved in all the three arsC genes are highlighted in yellow. The neighbor-joining method [57] was used for developing distance-based phylogeny of arsC genes. For this phylogeny, arsC genes in strain IITK SM2 were compared with those of other As(V)-reducing organisms (excluding strains of Citrobacter). Percentage bootstrap values corresponding to 1000 replicates are shown next to the branches. The tree shown in (b) was drawn to scale.
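
The two alignment statistics quoted for the arsC sequences, the number of fully conserved positions and a pairwise percentage, can be computed from any gapped alignment along the following lines. This is a minimal sketch under stated assumptions: the toy sequences are invented and are not the arsC1, arsC2, or arsC3 sequences; the function names are hypothetical; and plain percent identity is used as a simple stand-in, since the text does not state how the reported similarity was scored.

```python
def conserved_columns(aligned):
    """Count columns where all sequences share the same residue and none is a gap."""
    seqs = list(aligned.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for column in zip(*seqs)
        if column[0] != "-" and all(residue == column[0] for residue in column)
    )


def percent_identity(a, b):
    """Identical residues as a percentage of columns where neither sequence is gapped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Toy gapped alignment, invented for illustration only.
toy_alignment = {
    "seqA": "MK-LTIYHNP",
    "seqB": "MKALTIYHNP",
    "seqC": "MRALSIYHNP",
}

print(conserved_columns(toy_alignment))
print(percent_identity(toy_alignment["seqB"], toy_alignment["seqC"]))
```

Applied to a real CLUSTAL_W alignment, the same functions would take the aligned sequences as equal-length strings with "-" marking gaps. A similarity score based on a substitution matrix would generally give a higher value than identity.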

Conclusions
The Gram-negative, rod-shaped, facultatively anaerobic bacterial strain IITK SM2 could survive under high concentrations of dissolved As and could reduce As(V) to As(III). Apart from dissolved As, this isolate also showed resistance to some of the other heavy metals found in groundwater, such as Fe(III), Mn(II), Zn(II), Ba(II), and Cr(VI). Enhanced bacterial growth was observed in the presence of dissolved As(III) and As(V), but the former was more toxic to the cells than the latter. Strain IITK SM2 is a novel strain of Citrobacter youngae and differs from other strains in a number of critical subsystem features related to membrane transporters, virulence and defense, motility, protein metabolism, and phages, prophages, and transposable elements. As-metabolizing genes such as arsC, arsB, arsA, arsD, arsR, and arsH were identified in the genome of this strain. A unique clustering of As-resistant genes was also found in the g-DNA of IITK SM2 as (1) arsCBRH, (2) arsACBADR, and (3) an independent arsC gene, a combination that was not observed in any other Citrobacter species or in the selected strains of the Enterobacteriaceae family. Furthermore, two different varieties of arsC genes were identified, of which one As(V)-reductase gene might have evolved from Salmonella and the other may be native to Citrobacter.
The information presented in this study could contribute to the mechanistic understanding of the biogeochemical processes that control elevated arsenic prevalence in groundwater, which could help in developing long-term in situ mobilization techniques for As remediation.

Data Availability
The GenBank accession number for the 16S rRNA sequence is MZ477215. The accession number of the whole-genome shotgun project of Citrobacter youngae IITK SM2 registered at DDBJ/ENA/GenBank is JAGIYN000000000. In this study, the version described is JAGIYN000000000.1. Other data are included within the manuscript.

Conflicts of Interest
The authors declare that no competing financial interests exist, and this work has been carried out in compliance with ethical standards.

Supplementary Materials
A Supporting Information file of 24 pages containing six sections, five figures, and six tables is available online. This file contains details of sample collections, analysis techniques, experimental procedure, phylogenetic analysis based on recN sequence, map of sampling location, composition of media used, results from biochemical experiments, details of chemicals and media used, method detection limits of various techniques used, and comparison of strain IITK SM2 with other arsenate-reducers identified in the literature.