Amplicon-Based Analysis of the Fungal Diversity across Four Kenyan Soda Lakes

Microorganisms have been able to colonize and thrive in extreme environments characterized by low/high pH, temperature, salt, or pressure. Examples of extreme environments are soda lakes and soda deserts. The objective of this study was to explore the fungal diversity across soda lakes Magadi, Elmenteita, Sonachi, and Bogoria in Kenya. A new set of PCR primers was designed to amplify a fragment long enough for the 454-pyrosequencing technology. Analysis of the amplicons generated showed that the new primers amplified diverse fungal groups. A total of 153,634 quality-filtered, nonchimeric sequences derived from the 18S region of the rRNA operon were used for community diversity analysis. The sequence reads were clustered into 502 OTUs at a 97% similarity cut-off using BLASTn analysis, of which 432 were affiliated to known fungal phylotypes and the rest to other eukaryotes. Fungal OTUs were distributed across 107 genera affiliated to the phyla Ascomycota, Basidiomycota, Glomeromycota, and other unclassified groups referred to as Incertae sedis. The phylum Ascomycota was the most abundant in terms of OTUs. Overall, fifteen genera (Chaetomium, Monodictys, Arthrinium, Cladosporium, Fusarium, Myrothecium, Phyllosticta, Coniochaeta, Diatrype, Sarocladium, Sclerotinia, Aspergillus, Preussia, and Eutypa) accounted for 65.3% of all the reads. The genus Cladosporium was detected across all the samples at varying percentages, with the highest being in water from Lake Bogoria (51.4%). Good's coverage estimator values ranged between 97 and 100%, an indication that the dominant phylotypes were represented in the data. These results provide useful insights that can guide cultivation-dependent studies to understand the physiology and biochemistry of the as-yet uncultured taxa.


Introduction
Microorganisms have been able to not only colonize but also thrive under unique or extreme environmental conditions, which are characterized by low/high pH, temperature, salt, or pressure. Examples of extreme environments are soda lakes, which are characterized by high alkalinity (pH values ranging between 9 and 12) and by Na+ concentrations that can reach saturation. Their surface area fluctuates owing to the intense evaporation and low levels of precipitation experienced where they are located. Despite the extreme physicochemical conditions in the soda lake ecosystems, a high level of species diversity has been reported [1][2][3][4]. These habitats may exhibit higher productivity than freshwater bodies [5].
High-throughput sequencing allows rapid estimation and identification of microorganisms without cultivation [15]. Using this approach, a high prokaryotic and eukaryotic diversity has been reported from several alkaline lakes such as Magadi in Kenya [16,17], Ethiopian soda lakes [1], Central European hypersaline lakes [18], and sediments from the Tibetan Plateau [19]. Therefore, a sequence-based approach has made it easier to understand the diversity and structure of microbial communities in diverse environments [20,21]. Most of the high-throughput sequencing technologies used in diversity studies have an amplification step. The earliest PCR primers to gain wide acceptance in fungal studies were designed for the nuclear ribosomal internal transcribed spacer (ITS) region and were described by White et al. [22] and thereafter modified by several researchers [23][24][25]. However, the ITS region does not suffice to resolve the species level in some groups of fungi [15,23,26,27,28].
Besides the ITS region, the small (18S/SSU) and large (28S/LSU) subunits of the rRNA operon have been targeted for amplification in fungal diversity studies [15,29]. Several group-specific 18S rRNA gene primers have been developed [30][31][32]. However, coverage and phylogenetic resolution to lower taxonomic levels are always a challenge, especially when dealing with less explored habitats. In this study, we designed a new set of primers targeting the 18S rRNA gene for high-throughput sequencing and tested them using various samples collected from different soda lakes in Kenya. The main objective was to explore whether fungal diversity varies between the lakes Magadi, Elmenteita, Sonachi, and Bogoria and across each lake due to differences in physicochemical parameters. The study provides new insights into the spatial diversity across various soda lakes with varying physicochemical parameters.

Description of Study Sites and Sampling Design.
The study sites were the soda lakes Magadi, Elmenteita, Bogoria, and Sonachi. The hypersaline Lake Magadi (2°00′S, 36°13′E) is at an elevation of 600 m above sea level. It lies in a naturally formed closed lake basin and has an annual rainfall of approximately 500 mm [33]. The lake covers an area of 90 km², and evaporation is intense during the dry season. Lake Elmenteita (0°27′S, 36°15′E) is a moderately saline lake located 1776 m above sea level. It has no direct outlet. The lake is approximately 20 km², but the total surface area changes with seasons. The lake often floods during heavy rains. Lake Bogoria (0°20′N, 36°15′E) lies at an altitude of 975 m. It has a low rainfall of 708 mm, and there are several geysers around the lake. The alkaline, saline crater lake Sonachi lies in a closed basin in the Eastern Rift Valley (0°49′S, 36°16′E).

Sample Collection and Nucleic Acid Extraction.
Wet sediment, water samples, microbial mats, dry sediments, and grassland soil were collected from lakes Bogoria, Elmenteita, Sonachi, and Magadi as described [6]. For each soil or sediment sample, 1 g was weighed into a sterile Eppendorf tube. For the water samples, 500 ml was filtered through a 0.22-μm filter, which was then cut into small pieces with a sterile scalpel and transferred to a sterile 2-ml tube. Total DNA was extracted using the phenol:chloroform protocol [34], except that proteinase K was substituted with 6 M guanidine isothiocyanate (GITC) for protein denaturation. In our experience, extraction of high molecular weight DNA from the soda lake samples using kits is problematic due to the high salt content of the samples.
Thereafter, the primers were modified for pyrosequencing by attaching an adaptor sequence, a key, and a unique 12-nucleotide MID for multiplexing purposes. Each PCR (50 μL) contained forward and reverse primers (10 μM each), dNTPs (10 mM each), Phusion GC buffer (Finnzymes), Phusion high-fidelity polymerase (0.5 U μL−1), and 25 ng of template DNA. Cycling conditions were as follows: initial denaturation at 98°C for 3 min followed by 25 cycles of denaturation at 94°C for 30 sec, annealing for 30 sec at 58°C, extension at 72°C for 90 sec, and a final extension step of 72°C for 5 min. Amplification was confirmed by separating 2 μl of the PCR product on a 1% TAE agarose gel (40 mM Tris base, 20 mM glacial acetic acid, 1 mM EDTA, and 1.5% (w/v) agarose) run for 1 h at 100 V. Later, three independent PCR products per sample were pooled in equal amounts, separated on a gel, and extracted using the PeqGOLD gel extraction kit (PeqLab Biotechnologie GmbH, Erlangen, Germany). PCR products were quantified using a Nanodrop (PEQLAB Biotechnologie GmbH, Erlangen, Germany) and a Qubit fluorometer (Invitrogen GmbH, Karlsruhe, Germany) as recommended by the manufacturers. Sequencing of the PCR-derived amplicons was performed on a Roche GS-FLX 454 pyrosequencer with Titanium chemistry (Roche, Mannheim, Germany) at Göttingen Genomics Laboratory, Georg August University Göttingen, Germany.
The raw sequence reads have been deposited into the SRA under the accession SRP019052.

Sequence Analysis.
Sequence reads were denoised and evaluated for potential chimeric sequences using UCHIME within the USEARCH package v.11.0 [42]. OTU picking was done from the quality-filtered, denoised, and nonchimeric sequences using a sequence identity cutoff of 97%. Representative OTUs were picked using vsearch v2.14 [43]. Taxonomy was assigned to the representative sequence of each cluster by BLAST searches against the ARB-SILVA database, release 132 [35,36].
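The idea behind clustering reads into OTUs at a 97% identity cutoff can be illustrated with a toy greedy centroid scheme. This is a minimal sketch and not the pipeline used in the study: the function names are hypothetical, the reads are assumed to be pre-aligned and of equal length, and identity is taken as the fraction of matching positions rather than being computed from a pairwise alignment as dedicated tools do.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences.

    Toy assumption: sequences are already aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("toy identity assumes equal-length sequences")
    if not a:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_otus(seqs: List[str], threshold: float = 0.97) -> List[List[str]]:
    """Greedy centroid clustering.

    Each sequence joins the first cluster whose centroid (the first member)
    it matches at >= threshold; otherwise it founds a new cluster and becomes
    that cluster's representative sequence.
    """
    clusters: List[List[str]] = []
    for seq in seqs:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            # No centroid was close enough: start a new OTU.
            clusters.append([seq])
    return clusters


# Toy reads of 100 nt each.
ref = "ACGT" * 25
near = "N" + ref[1:]        # 1 mismatch  -> 99% identity to ref
far = "N" * 8 + ref[8:]     # 8 mismatches -> 92% identity to ref
otus = greedy_otus([ref, near, far])
```

In this toy input, `ref` and `near` fall into one OTU while `far` founds a second one. The result of a greedy scheme depends on input order, which is one reason real tools typically sort reads (for example by abundance or length) before clustering.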

Statistical Analysis.
Rarefaction analysis using the script alpha_rarefaction.py in QIIME [44] was done to assess whether the sampling effort was representative of the microbial diversity in the samples. OTUs were assigned to ecological guilds using the annotation tool FUNGuild [45]. A heatmap to display the most abundant OTUs was created using the ampvis2 R package [46]. Diversity estimates (Good's coverage, Chao1, and Shannon) were generated using the alpha_diversity.py script in QIIME [44]. The package "indicspecies" [47] in R was used to find out whether there were genera that were significantly associated with different sample types. Nonmetric multidimensional scaling (NMDS) of fungal communities was conducted in R using the vegan package [48] based on unweighted UniFrac [49] distance matrices.
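For readers unfamiliar with the three diversity estimates named above, their standard textbook definitions can be sketched as follows. This is an illustrative sketch only, not the QIIME implementation: the function names are hypothetical, the bias-corrected form of Chao1 is assumed, and the logarithm base for the Shannon index is left as a parameter (base 2 is used as the default here purely as an assumption; the natural logarithm is equally common).

```python
import math
from typing import Sequence


def goods_coverage(counts: Sequence[int]) -> float:
    """Good's coverage: 1 - (number of singleton OTUs / total reads)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total


def chao1(counts: Sequence[int]) -> float:
    """Bias-corrected Chao1 richness: S_obs + F1*(F1 - 1) / (2*(F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts: Sequence[int], base: float = 2.0) -> float:
    """Shannon index H = -sum(p_i * log(p_i)) over OTUs with p_i > 0."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum(
        (c / total) * math.log(c / total, base) for c in counts if c > 0
    )


# Hypothetical OTU read counts for one sample.
example = [10, 5, 2, 1, 1, 1]
coverage = goods_coverage(example)   # 1 - 3/20
richness = chao1(example)            # 6 + (3*2)/(2*2)
```

Good's coverage depends only on the share of reads that belong to singleton OTUs, so a value close to 100% indicates that few reads come from OTUs seen only once.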

Evaluation of the New Primer Set.
The newly designed primers amplified eukaryotic groups only, and no bacterial sequences were detected in this study. In each sample, the success rate for amplifying fungal groups was above 90%, which was considered good for environmental DNA. The amplicons could also be assigned to taxonomy with high confidence owing to the sequence length generated using the 454 technology. In addition, the samples used ranged from dry sediments to microbial mats, and therefore, good quality DNA is key to amplification.

Sequence Data.
The clean data from 32 samples comprised 153,634 quality-filtered, denoised, and nonchimeric sequences with no singletons. Of these, 152,834 were fungal amplicons, which represents a primer success rate of 99.48%. These fungal sequences were clustered into 502 OTUs at 97% similarity, of which 417 were affiliated to known fungal phyla. The remaining OTUs represented Bacillariophyta (25 OTUs), Streptophyta (1 OTU), and other unclassified eukaryotic groups termed Incertae Sedis (6 OTUs). Fungal OTUs per sample ranged from 13 in the Bogoria wet sediments (sample BWS10) to 68 in the dry sediments from Lake Sonachi (sample BDS10) as shown in Table 1.
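The primer success rate quoted above follows directly from the two read counts given in the text. A minimal check, with a hypothetical helper name:

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


total_reads = 153_634    # quality-filtered, nonchimeric reads (from the text)
fungal_reads = 152_834   # reads assigned to fungi (from the text)

success_rate = percent(fungal_reads, total_reads)
print(f"{success_rate:.2f}%")  # prints 99.48%
```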

Diversity at the Phylum Level.
The 417 fungal OTUs with assigned taxonomy were distributed across 113 fungal genera affiliated to the phyla Ascomycota, Basidiomycota, Glomeromycota, and a smaller percentage to other nonfungal eukaryotic groups (Figure 1). The phylum Ascomycota was the most dominant phylum, and its orders Capnodiales, Pleosporales, Hypocreales, Myrmecridiales, Sordariales, and Xylariales were the most abundant. In this phylum, the order Pleosporales was the most diverse with 13 genera followed by the orders Capnodiales (9 genera), Hypocreales (11 genera), and Xylariales (11 genera). However, we noted that in Elmenteita wet sediments (sample EWS), the nonfungal phylum Bacillariophyta accounted for 45% of the OTUs (Figure 1). Sequences affiliated to Basidiomycota were detected in a few of the samples and were distributed in the orders Agaricales, Boletales, Polyporales, Cystofilobasidiales, Filobasidiales, Sporidiobolales, and Malasseziales. The phylum Glomeromycota was represented by a single species, Diversispora eburnea, that was detected in the Elmenteita grassland soil. Overall, we identified fifteen (15) ascomycete genera that constituted 65.3% of all the reads, and these were Arthrinium, Cladosporium, Fusarium, Chaetomium, Monodictys, Myrothecium, Phyllosticta, Coniochaeta, Diatrype, Sarocladium, Sclerotinia, Aspergillus, Preussia, and Eutypa (Figure 2(a)). The genus Cladosporium was detected across all the samples at varying percentages, with the highest being in water from Lake Bogoria (51.4%). However, at the species level, the most abundant reads were affiliated to uncultured phylotypes in the subphylum Pezizomycotina (Figure 2(b)).

Diversity across the Different Samples.
We evaluated and compared the diversity across the 32 samples. We used different metrics (Richness, Simpson, Shannon, Evenness, Fisher, and Good's coverage) to evaluate the alpha diversity across the samples. Good's coverage estimator values were between 97 and 100% (Table 1). This is an indication that the dominant phylotypes were represented in the data. This was even more evident when the samples were clustered by sample type (P � 0.05). The lowest diversity was found in the brine samples and the highest in the sediment samples (Figure 3).

Differences in Fungal Diversity across the Lakes.
Bray-Curtis dissimilarity analysis (Figure 4) demonstrated that the samples were separated into 3 clusters. The samples from Lake Bogoria formed a distinct cluster. This could be due to differences in OTU composition. For example, no OTUs related to the genus Cladosporium were detected in samples from Lake Bogoria, whereas the genera Myrothecium, Sclerotinia, Lasiodiplodia, and Peziza were only recovered from Lake Bogoria samples.
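The Bray-Curtis dissimilarity used to compare samples is the sum of absolute differences in OTU abundances divided by the total abundance of both samples. It ranges from 0 for identical communities to 1 for communities that share no OTUs. The sketch below is illustrative only: the function name and the example abundance vectors are hypothetical, and both vectors are assumed to list the same OTUs in the same order.

```python
from typing import Sequence


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity: sum(|a_i - b_i|) / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must cover the same OTUs")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical OTU abundance vectors for three samples.
sample_a = [6, 4, 0]
sample_b = [4, 6, 0]
sample_c = [0, 0, 10]

d_ab = bray_curtis(sample_a, sample_b)  # (2 + 2 + 0) / 20
d_ac = bray_curtis(sample_a, sample_c)  # no shared OTUs
```

Samples dominated by OTUs that are absent elsewhere, such as genera recovered from a single lake, produce values near 1 against all other samples and therefore tend to form a separate cluster.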

Discussion
The overall diversity and significance of fungal communities in the soda lakes have not been understood owing to the limited data available as compared to bacteria. The Kenyan soda lakes are situated in geographically remote areas that experience intense solar radiation; evaporation rates exceed precipitation rates; hence, there is a concentration of salts, which contributes to the elevated salinity levels. This may be one of the reasons why they are not well explored. The diversity reported so far has been based on culture-dependent studies [6,50,51] or on molecular approaches [16,17]. Amplicon sequencing provides a better and more detailed understanding of the fungal diversity than does cultivation in these unique habitats. Out of the 432 fungal OTUs, 389 were identified to the genus level, while 320 were tentatively identified to the species level. Additionally, 3% of the fungi detected could not be classified and may represent novel autochthonous soda lake fungal phylotypes. In microbial diversity studies, reports show that a major percentage of the observed species are tagged as uncultured [52][53][54][55]. This necessitates more isolation or genome sequencing efforts to understand the physiology and metabolism of these novel groups.
In this study, the phylum Ascomycota accounted for more than 80% of the reads across the samples, with the most abundant classes being Dothideomycetes, followed by Sordariomycetes, Leotiomycetes, Eurotiomycetes, and Pezizomycetes. Sharma et al. [56] reported that 98% of the isolates recovered from Lonar lake belonged to Ascomycota, subphylum Pezizomycotina. Ascomycetes have also been reported to be dominant in marine sediments of Kongsfjorden, Svalbard [57], constituting 54.8% of the OTUs. In marine sediments of the Arabian Sea, Ascomycota were reported to be the most abundant phylum at 83%, and the rest (17%) were Zygomycota [58]. However, Basidiomycota has been reported to be dominant in other hypersaline environments [59,60]. The genus Cladosporium was detected across all the samples, with the highest relative abundance being in water from Lake Bogoria (51.4%). Some of the phylotypes (Cladosporium sphaerospermum, Fusarium sp., and Penicillium sp.) have also been observed in marine sediments [58,61]. Chaetomium globosum has been isolated from the Dead Sea as well as saline habitats of Wadi El-Natrun [13], while isolates with close similarity to Sarocladium kiliense were recovered from Lake Sonachi sediments [14]. Salinity and pH have an impact on fungal growth and spore formation, which in turn may affect the overall diversity [5]. Production of extremolytes and extremozymes and accumulation of K+ ions and compatible organic solutes in the cells are ways of coping with osmotic stress [62][63][64]. Unique features such as the thick mycelium observed in Phoma herbarum are important in stress tolerance, while pigments such as those produced by Zasmidium cellare, Aspergillus keveii, and Cladosporium velox enable them to thrive in the harsh environments [6,65].
Previous culture-dependent studies [6] of Lake Magadi recovered isolates distributed across several fungal genera, namely, Aspergillus, Penicillium, Acremonium, Phoma, Cladosporium, Septoriella, Talaromyces, Zasmidium, Chaetomium, Aniptodera, Pyrenochaeta, Septoria, Juncaceicola, Paradendryphiella, Sarocladium, Phaeosphaeria, and Biatriospora. These isolates grew better when lake water was used in media preparation than when synthetic mineral medium was used. This may be an indication that they are adapted to the haloalkaline environment.

Conclusion
A key ecological question is whether the observed fungal groups originate from the terrestrial environment via runoff or are actual residents exclusive to the soda lake habitats. It is possible that runoff from the surrounding soil introduces spores into the lakes; any such species may over time have adapted to the haloalkaline environment. In summary, the diversity and function of most fungal taxa in the soda lake ecosystem remain poorly understood. Therefore, a combination of traditional culture-based methods and metatranscriptomics may help to answer important ecological questions.

Data Availability
The raw sequence reads have been deposited into the SRA under the accession SRP019052.