Molecular Fingerprints to Identify Candida Species

A wide range of molecular techniques have been developed for genotyping Candida species. Among them, multilocus sequence typing (MLST) and microsatellite length polymorphisms (MLP) analysis have recently emerged. MLST relies on DNA sequences of internal regions of various independent housekeeping genes, while MLP identifies microsatellite instability. Both methods generate unambiguous and highly reproducible data. Here, we review the results achieved by using these two techniques and also provide a brief overview of a new method based on high-resolution DNA melting (HRM). This method identifies sequence differences by subtle deviations in sample melting profiles in the presence of saturating fluorescent DNA binding dyes.


Introduction
Candida species are opportunistic pathogens that can cause diseases ranging from mucosal infections to systemic mycoses, depending on the vulnerability of the host. The major pathogen worldwide is Candida albicans [1,2]. This fungus is detected in the body microbiota of healthy humans [3] and accounts for 75% of the organisms residing in the oral cavity [4]. It is diploid and has a largely clonal mode of reproduction. However, it can generate considerable variability through gene regulation and/or genetic changes, including chromosomal alterations, mutations, and loss of heterozygosity (LOH). In fact, LOH events lead to MTL homozygosis [5], azole resistance [6][7][8], and microevolution during infection [9][10][11], passage through a mammalian host [12], or in vitro exposure to physiologically relevant stresses [13].
Non-albicans Candida species such as Candida glabrata, Candida parapsilosis, Candida tropicalis, Candida krusei, and Candida dubliniensis are also found with increasing frequency [14][15][16][17]. C. glabrata has been reported to be the second most common etiologic agent, after C. albicans, of superficial and invasive candidiasis in adults in the United States [18,19], whereas in Europe and Latin America, C. parapsilosis is the species responsible for approximately 45% of all cases of candidemia [14,20].
The ability to discriminate Candida isolates at the molecular level is crucial to better understand the spread of these species, particularly in hospitals, and to assist in early diagnosis and initiation of appropriate antifungal therapy, as these organisms show a range of susceptibilities to existing antifungal drugs. C. albicans, C. parapsilosis, and C. tropicalis remain susceptible to polyenes, azoles, and echinocandins [21]. However, C. glabrata and C. krusei show reduced triazole susceptibility [22,23]. In addition, the majority of clade 1 isolates of C. albicans are less susceptible to flucytosine [24]. The faster and more accurately species and strains can be identified, the greater the impact on the patient's clinical response. Several methods, such as pulsed-field gel electrophoresis, restriction enzyme analysis, Southern-blot assays, random amplified polymorphic DNA, and amplified fragment length polymorphism, have been used to track differences among Candida isolates [25,26]. However, these approaches have limitations: they can be time consuming, require radioactive elements, or show poor reproducibility and/or discriminatory power [25,26]. In the present review, we summarize the most exact and/or recent DNA-based techniques developed for a better understanding of the epidemiology of Candida species. The availability of the C. albicans genome sequence [27][28][29] facilitated studies in comparative genomics and genome evolution.

Multilocus Sequence Typing
Multilocus sequence typing (MLST) is based on the analysis of nucleotide sequences of internal regions of several independent housekeeping genes. MLST studies for C. albicans, C. glabrata, C. tropicalis, C. krusei, and C. dubliniensis have been reported (reviewed in [30]). MLST of C. albicans was introduced during the early 2000s [31,32]. On the basis of collaborative work, an international consensus set of seven genes for C. albicans MLST has been proposed [33]. This gene set includes AAT1a, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b (Table 1). MPIb has been renamed PMI1 [34]. Table 1 also shows primers for the amplification and sequencing of the seven gene fragments.
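The bookkeeping behind an MLST scheme can be sketched in a few lines. The following is a minimal illustration, not the procedure used in the cited studies: it assumes the common convention that each distinct sequence at a locus receives an allele number and each distinct combination of allele numbers defines a sequence type (for diploid species, heterozygous positions are usually written with IUPAC ambiguity codes such as Y). The function name, isolate names, and six-base sequences are hypothetical, and only three of the seven consensus loci are used to keep the example short.

```python
from typing import Dict, List, Tuple

def assign_sequence_types(
    isolates: Dict[str, Dict[str, str]], loci: List[str]
) -> Dict[str, Tuple[Tuple[int, ...], int]]:
    """Number alleles per locus in order of first appearance, then number
    each distinct allele profile as a sequence type (ST)."""
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    st_ids: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, Tuple[Tuple[int, ...], int]] = {}
    for name, sequences in isolates.items():
        profile = []
        for locus in loci:
            seq = sequences[locus].upper()
            table = allele_ids[locus]
            if seq not in table:
                # First time this sequence is seen at this locus: new allele.
                table[seq] = len(table) + 1
            profile.append(table[seq])
        key = tuple(profile)
        if key not in st_ids:
            # First time this combination of alleles is seen: new ST.
            st_ids[key] = len(st_ids) + 1
        result[name] = (key, st_ids[key])
    return result

# Hypothetical toy data: real MLST fragments are hundreds of bases long.
toy_loci = ["ACC1", "ADP1", "SYA1"]
toy_isolates = {
    "isolate_A": {"ACC1": "ATGGCA", "ADP1": "TTGACC", "SYA1": "GGCATA"},
    "isolate_B": {"ACC1": "ATGGCA", "ADP1": "TTGACC", "SYA1": "GGCATA"},
    # Y (C or T) marks an assumed heterozygous site in a diploid isolate.
    "isolate_C": {"ACC1": "ATGGYA", "ADP1": "TTGACC", "SYA1": "GGCATA"},
}

profiles = assign_sequence_types(toy_isolates, toy_loci)
for name, (profile, st) in profiles.items():
    print(name, profile, "ST", st)
```

In this toy set, isolates A and B share every allele and therefore the same sequence type, whereas a single heterozygous position at ACC1 is enough to give isolate C a new allele and a new type.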
The MLST system has proved to be a useful method for epidemiological differentiation of C. albicans clinical isolates [31,32]. Indeed, C. albicans strains recovered from human patients seem to be specific to the patient but not associated with different anatomical sources or hospital origin [9,10,35,36]. MLST studies also revealed a population structure with five major clades of closely related strain types (numbered 1, 2, 3, 4, and 11) plus various minor clades [37]. Clades do not represent cryptic species as genetic exchange between and within clades is limited [38]. Clade 1 is particularly rich in flucytosine-resistant isolates [39,40]. All clade 1 flucytosine-resistant isolates carry a point mutation (R101C) in the FUR1 gene, which encodes uridine phosphoribosyl transferase [40].
A potential weakness of the C. albicans international standard gene set is that three of the chromosomes are not represented and two gene pairs are located on the same chromosome (Table 1). In order to include highly informative polymorphisms, an MLST-biased single nucleotide polymorphism (SNP) microarray has been developed [41]. This system, which includes 7 loci from the consensus scheme and 12 additional discrete loci located at intervals along the 8 chromosomes, may provide a basis for a standardized system.
An MLST scheme has also been reported for C. glabrata [42]. This typing system is based on fragments of six genes ([42], Table 2). Utilizing this MLST method, several studies have described the population structure of geographically diverse collections of C. glabrata isolates [43][44][45]. A recent MLST analysis of 230 isolates of C. glabrata from five populations that differed both geographically and temporally confirmed that the six unlinked loci provide genotypic diversity and differentiation among isolates of this species [46]. MLST studies also revealed that C. glabrata strains causing bloodstream infections have similar population structures and fluconazole susceptibilities compared to those normally residing in/on the host [47]. When the susceptibility of colonizing isolates from patients receiving azole therapy was studied, MLST revealed that resistance developed far more frequently in C. glabrata than in any other species [48]. This resistance to azole prophylaxis has led to an increased use of echinocandins for primary therapy of C. glabrata infections. However, decreased susceptibility to echinocandin drugs can be observed among C. glabrata isolates with mutations in the FKS1 and FKS2 genes. These genes encode the Fks1p and Fks2p subunits of the 1,3-β-glucan synthase complex, which synthesizes the principal cell wall component β-1,3-glucan and is the target of echinocandin drugs. In light of this, MLST analysis performed on isolates with FKS mutations indicated that the predominant S663P mutation in the FKS2 gene was not due to the clonal spread of a single resistant phenotype [49].
The MLST system for C. tropicalis comprises six housekeeping genes ([52], Table 2). Data indicate that C. tropicalis phylogenetically resembles C. albicans [53]. Both are diploid organisms, exhibit a predominantly clonal mode of reproduction, and support a high level of recombination events, which mimic sexual reproduction processes [53]. However, unlike C. albicans [35], C. tropicalis shows a clonal cluster enriched with isolates with fluconazole-resistant or "trailing growth" phenotypes [54]. The term "trailing growth" describes the growth that some isolates exhibit at drug concentrations above the minimum inhibitory concentration (MIC) after 48 h of incubation, although the isolates appear fluconazole susceptible after 24 h of incubation. However, Wu et al. [55] reported that the genotypes of C. tropicalis isolates were unrelated to the fluconazole resistance pattern, suggesting that antifungal resistance may develop geographically. An association between the MLST type of each isolate and flucytosine resistance has also been observed [40,56,57]. Interestingly, the MLST genotypes were only distantly related, indicating that flucytosine-resistant strains emerged independently in different geographic areas [56].
MLST gene sets for C. krusei and C. dubliniensis have also been described [50,51]. Characteristics of the housekeeping loci used for these species are described in Table 2.

Microsatellite Length Polymorphisms Analysis
Microsatellite length polymorphisms (MLP) analysis identifies microsatellite instability. Microsatellites, also called simple sequence repeats (SSRs) or short tandem repeats (STRs), are tandem repeats of 1-6 bp motifs dispersed throughout the genome. These sequences undergo considerable length variations due to DNA polymerase slippage and, as a consequence, are highly mutable [58]. In Candida species, this technique has been applied for strain typing [43][44][45][59][60][61][62][63], analysis of population structure [64,65], and epidemiological studies [57,61,[66][67][68]. For C. albicans, several polymorphic microsatellite loci have been identified (Table 3 and references therein). They are located in the promoter sequence of the elongation factor 3 (EF3) [60,69], in coding regions of the extracellular-signal-regulated kinase gene (ERK1) [70], downstream of the coding sequences of the cell division cycle protein (CDC3) [59,60,71] and imidazole glycerol phosphate dehydratase (HIS3) genes [60], and in noncoding regions (CARABEME, CAI, CAIII, CAV, CAVI, and CAVII) [44,66,72]. These markers were used alone or in combination. The best discriminatory powers (DPs) obtained were 0.998 for CAI, CAIII, and CAVI [44] and 0.999 for EF3, CAREBEME, CDC3, HIS3, KRE6, LOC4 (MRE11), ZNF1, CAI, CAIII, CAV, and CAVII [73]. The DP estimates the ability of a method to differentiate between two unrelated strains. A high DP value (close to 1) indicates that the typing method is able to distinguish each member of a strain population from all other members of that population [74]. Notably, the CA markers were specific for C. albicans [44,66]. In fact, CA microsatellites were named after C. albicans and numbered according to the order of the analysis [44]. These markers are highly polymorphic since they are located outside known coding regions and are therefore under little selective pressure.
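To make the DP values quoted in this section concrete, the sketch below computes a discriminatory power from a list of genotypes. The review does not spell out the formula; the code assumes the widely used numerical index based on Simpson's index of diversity, DP = 1 - [sum of n_j(n_j - 1)] / [N(N - 1)], where N is the number of unrelated isolates and n_j is the number of isolates of the j-th type. The function name and the toy fragment lengths are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable

def discriminatory_power(genotypes: Iterable[Hashable]) -> float:
    """Probability that two isolates drawn at random, without replacement,
    have different genotypes (Simpson-based index, assumed here)."""
    counts = Counter(genotypes)
    n_isolates = sum(counts.values())
    if n_isolates < 2:
        raise ValueError("at least two isolates are required")
    # Ordered pairs of isolates that share the same genotype.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))

# Hypothetical MLP genotypes: (allele 1, allele 2) fragment lengths in bp
# at a single microsatellite locus of a diploid isolate.
toy_genotypes = [
    (117, 125),
    (117, 125),  # indistinguishable from the first isolate
    (117, 129),
    (121, 125),
    (125, 133),
]

dp = discriminatory_power(toy_genotypes)
print(round(dp, 3))
```

With five isolates of which two share a genotype, only 2 of the 20 ordered pairs are indistinguishable, giving a DP of 0.9. Combining several loci into one genotype tuple splits shared types further, which is why multilocus panels reach values close to 1.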
Recently, an allelic CDC3 ladder has been developed for interlaboratory comparison of C. albicans genotyping data [75]. This ladder proved to be important as an internal standard for a correct allele assignment.
Genotyping systems based on SSR markers have also been described for C. glabrata. In 2005, Foulet et al. [67] adopted three polymorphic microsatellite markers located upstream of the mitochondrial RNase P precursor (RPM2), metallothionein 1 (MTI), and 5,6-sterol desaturase (ERG3) genes to generate a rapid strain typing method with a DP of 0.84. These markers were specific for C. glabrata isolates. Addition of three new microsatellite markers (GLM4, GLM5, and GLM6) generated a typing system with a DP value of 0.941 [76]. However, by combining only four microsatellite markers (MTI, ERG3, GLM4, and GLM5), the authors achieved a DP value of 0.949. A different set of six microsatellite markers located in noncoding regions (Cg4, Cg5, and Cg6) and in coding regions (Cg7, Cg10, and Cg11) has been described [68], although the highest DP value, 0.902, was reached by using a combination of only four markers (Cg4, Cg5, Cg6, and Cg10). Another research group adopted eight polymorphic microsatellite markers distributed among different chromosomes [77]. This method has a DP value of 0.97, making it suitable for tracing strains. Studies using this system indicate that C. glabrata is a persistent colonizer of the human tract, where it appears to undergo microevolution [78].
A highly polymorphic CKTNR locus for molecular strain typing of C. krusei has been identified [43]. This locus consists of CAA repeats interspersed with CAG and CAT trinucleotides. Analysis of the CKTNR allele distribution suggested that the reproductive mode of C. krusei is mainly clonal [43].
MLP analysis also proved to be a reproducible method for molecular genotyping of C. parapsilosis [79]. Seven polymorphic loci containing dinucleotide repeats, most of them located in noncoding regions, were analyzed. The DP calculated for these loci was 0.971. These microsatellites were not amplified with DNA from single representatives of the related species Candida orthopsilosis and Candida metapsilosis [79]. Recently, another research group conducted C. parapsilosis typing studies using one of the previously reported markers (locus B, [79]) and three new microsatellite loci located outside known coding regions [80]. This multilocus analysis resulted in a DP of 0.99. These markers were also specific for the molecular typing of C. parapsilosis, since no amplification products were obtained with DNA of C. orthopsilosis and C. metapsilosis.

High-Resolution DNA Melting
High-resolution DNA melting (HRM) is a novel technique for SNP genotyping and for the identification of new genetic variants in real time (Figure 1). First, PCR is used to amplify specific polymorphic DNA regions in the presence of saturating DNA fluorophores [81]. The dye does not interact with single-stranded DNA but binds to double-stranded DNA, yielding a brightly fluorescent complex. After PCR amplification, at the beginning of the HRM analysis, the fluorescence is high. As the DNA samples are heated, the double-stranded DNA dissociates and releases the dye, which leads to a decrease in the fluorescence intensity (Figure 1(a)). The observed melting temperature (Tm) and the shape of the melt curve are characteristic of the specific sequence of the fragment (primarily its GC content and length). Data can also easily be interpreted by derivative melting curves (Figure 1(b)) and by plotting the fluorescence difference between a sample and a selected control at each temperature (Figure 1(c)) [81]. Some recent studies used HRM to differentiate clinical Candida species [82][83][84]. HRM has been proven to be a sensitive, reproducible, and inexpensive tool for a clinical laboratory but exhibits low DP values. The DP for the CDC3, EF3, and HIS3 markers was 0.77 [84]. However, HRM can be used alongside other genotyping methods to increase the resolving power. In fact, the combination of HRM with MLP and SNaPshot minisequencing of the CDC3 locus provided a DP value of 0.88 [83].
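The three views of HRM data described above (normalized melt curve, derivative curve, and difference plot) can be illustrated with a short sketch. It is not the analysis pipeline of any instrument or of the cited studies: the melt curves are simulated with a simple sigmoid, the melting temperature is taken as the peak of the negative derivative -dF/dT, and all function names, temperatures, and the curve width are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence

def simulated_melt(temps: Sequence[float], tm: float, width: float = 0.5) -> List[float]:
    """Toy melt curve: fluorescence drops sigmoidally around tm (assumed model)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

def normalize(fluor: Sequence[float]) -> List[float]:
    """Rescale fluorescence to the 0-1 range so curves can be compared."""
    lo, hi = min(fluor), max(fluor)
    return [(f - lo) / (hi - lo) for f in fluor]

def negative_derivative(temps: Sequence[float], fluor: Sequence[float]) -> List[float]:
    """-dF/dT by central differences; entry j belongs to temps[j + 1]."""
    return [
        -(fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]

def melting_temperature(temps: Sequence[float], fluor: Sequence[float]) -> float:
    """Temperature at which the derivative melting curve peaks."""
    deriv = negative_derivative(temps, fluor)
    peak = max(range(len(deriv)), key=deriv.__getitem__)
    return temps[peak + 1]

def difference_curve(sample: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Sample minus reference fluorescence at each temperature."""
    return [s - r for s, r in zip(sample, reference)]

# Temperature ramp from 75.0 to 90.0 degrees C in 0.1 degree steps.
temps = [75.0 + 0.1 * i for i in range(151)]
reference = normalize(simulated_melt(temps, tm=82.0))  # selected control
variant = normalize(simulated_melt(temps, tm=82.6))    # hypothetical sequence variant

tm_reference = melting_temperature(temps, reference)
tm_variant = melting_temperature(temps, variant)
diff = difference_curve(variant, reference)
print(round(tm_reference, 1), round(tm_variant, 1), round(max(diff), 2))
```

A shift of a few tenths of a degree in Tm is hard to see on the raw curves but produces a clear peak in the difference plot, which is why the difference view is often the most convenient for calling variants.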

Conclusions
The development of DNA sequence-based technologies has led to great progress in understanding the epidemiology of clinical isolates of Candida species. Both MLST and MLP analysis offer a number of technical advantages over conventional typing methods, including extremely high DP values and reproducibility, ease of use, and rapid, reliable data. The selection of the technique depends on the purpose of the study, the accessibility of genotypic strain archives, the time available to complete the analysis, and the cost. MLST remains the most reliable method for the assessment of population structure, diversity, and dynamics among C. albicans, whereas MLP analysis is most suitable for a rapid and less expensive study of a limited number of isolates.