A Phylogenetic Analysis of Salix (Salicaceae) Based on matK and Ribosomal DNA Sequence Data

The genus Salix has proven a fertile taxon for a host of evolutionary and ecological studies, yet much work remains in the development of a definitive phylogenetic context for those studies. We performed phylogenetic analyses, using both maximum likelihood and maximum parsimony techniques, of chloroplast-encoded matK and nuclear-encoded ribosomal DNA (rDNA) gene sequences, gathered from specimens deemed representative of the existing subgeneric classification, with the objective of identifying and elaborating the phylogenetic relationships within Salix. Comparisons between the two phylogenetic hypotheses indicate a high degree of polyphyly in the matK-based phylogeny. This we attribute to the effects of hybridization, introgression, and lineage sorting. Comparisons with previous molecule-based phylogenetic hypotheses indicate a fair degree of congruence and all are unanimous in placing Chosenia arbutifolia within the genus Salix. The phylogenetic analysis of our ITS data set has produced results that generally support the most-recent infrageneric classification.


Introduction
The genus Salix L. has provided opportunity for a wide variety of evolutionary and ecological studies including plant-fungus associations [1][2][3][4], plant-herbivore interactions [5][6][7], the evolution of ecosystems [8], and hybridization-related phenomena [9][10][11][12]. The robustness of conclusions drawn in studies such as these depends on an accurate understanding of the phylogenetic relationships among the taxa studied. Morphology-based systematic treatments of Salix have varied widely (see Azuma et al. [13] for a summary), and to date only two DNA-based studies of the genus have been reported [13,14].
Leskinen and Alström-Rapaport [14] performed a parsimony analysis based on nucleotide sequences of the nuclear-encoded ribosomal DNA region (i.e., the entire 5.8S RNA gene and the internal transcribed spacers ITS 1 and ITS 2) to study the relationship between Salicaceae and Flacourtiaceae. The specimens of Chosenia and Salix included in their analysis comprised a well-supported monophyletic group, albeit with little resolution, except a well-supported clade composed of S. alba, S. amygdaloides, S. fragilis, and S. pentandra [14]. Azuma et al. [13] performed a parsimony analysis based on nucleotide sequences of the chloroplast-encoded rbcL gene. Their results suggest that the species of three commonly recognized genera (Chosenia, Salix, and Toisusu) comprise a monophyletic group, which is in turn composed of two well-supported but poorly resolved clades that did not correspond with any traditional infrageneric taxa.
Our objective was to further elaborate phylogenetic relationships within Salix with additional analyses of DNA sequence data, that is, sequences of the chloroplast-encoded matK gene and nuclear-encoded ribosomal DNA (rDNA) genes.
In angiosperms matK evolves approximately three times faster than rbcL [15], potentially providing more phylogenetically informative characters and improved phylogenetic resolution among the study taxa. Extensive hybridization among Salix species is known and chloroplast capture has

Materials and Methods
2.1. Specimen Selection, DNA Extraction, PCR Amplification, and Gene Sequencing. Representative species of Salix were chosen to cover the subgeneric classification of the genus. The species choices were made with the assistance of three North American willow experts: George Argus, Robert Dorn, and Walter Buechler (personal communications). European samples were also obtained from the Finnish willow study group at the University of Joensuu by Jorma Tahvanainen, Heikki Roininen, and Riitta Julkunen-Tiitto. To augment field collections, DNA samples were obtained from Kew Gardens, from a living willow collection in Boise, Idaho, and from preserved specimens in the University of Idaho Stillinger herbarium (Table 1).
Total genomic DNA was isolated by the CTAB (i.e., cetyl trimethylammonium bromide) method of J. J. Doyle and J. L. Doyle [18], as modified in Brunsfeld et al. [16], using 1.5 grams of fresh leaf tissue from branch tips. In the field, fresh leaves were collected from plants for lab analysis or pressed into voucher samples. The remainder of the field material was deposited as voucher specimens in the Stillinger Herbarium (ID). Total genomic DNA content was estimated using a DNA Fluorometer (Hoefer), and DNA stocks were diluted to 10 ng/µL.
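The dilution of DNA stocks to 10 ng/µL follows the standard relation C1 × V1 = C2 × V2. The sketch below illustrates that arithmetic only; the function name, the 250 ng/µL stock concentration, and the 100 µL final volume are hypothetical values chosen for illustration and are not taken from the study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=10.0, final_volume_ul=100.0):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2, where C1 is the measured stock concentration
    and C2 is the working concentration (10 ng/uL in the text).
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical fluorometer reading of 250 ng/uL for one extraction.
stock_ul, diluent_ul = dilution_volumes(stock_ng_per_ul=250.0)
print(f"{stock_ul:.1f} uL stock + {diluent_ul:.1f} uL diluent")
```

With these assumed inputs the stock is diluted 25-fold; a stock that reads below the working concentration is rejected rather than silently "diluted".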
The entire matK gene of the trnK intron and the rDNA internal transcribed spacer (ITS) region were amplified using the polymerase chain reaction (PCR) following Brunsfeld et al. [19] and Hardig et al. [12], respectively.
An ExoSAP-IT (USB) cleaning reaction was performed on PCR products. Sequencing reactions were performed using Big Dye terminator (Applied Biosystems) following Brunsfeld et al. [19] and Hardig et al. [12] for matK and ITS, respectively. Sequence reaction products were cleaned in Sephadex columns (Princeton Separations) and run through a 5% Long Ranger (FMC Bioproducts) gel on an ABI 377 automated sequencer.

Phylogenetic Analyses. After a preliminary analysis of the Salix data set, we pruned the matrix to 30 Operational Taxonomic Units (OTUs) for computational efficiency, including Chosenia and the outgroup Populus deltoides (Table 1). Both matK and ITS sequences were aligned with Sequencher (Genecodes). Both maximum likelihood (ML) and maximum parsimony (MP) techniques were employed for phylogenetic analysis using the program PAUP* (v. 4.0b10 [20]). Nodal posterior probabilities were estimated by Bayesian analysis (see Brunsfeld and Sullivan [21] for details). A rooted network of matK haplotypes was created using character state polarizations determined in the phylogenetic analysis.
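For readers unfamiliar with the parsimony criterion, the following minimal sketch counts the number of character-state changes a fixed tree requires, using Fitch's algorithm for unordered characters. It is not PAUP*'s implementation and performs no tree search; the four tip labels, the toy alignment, and the two candidate topologies are hypothetical and serve only to show that different topologies can require different numbers of changes.

```python
def fitch(node, states):
    """Return (state_set, changes) for one character on a rooted binary tree.

    A tree is either a tip label (str) or a pair (left, right) of subtrees.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    lset, lcost = fitch(left, states)
    rset, rcost = fitch(right, states)
    common = lset & rset
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, lcost + rcost
    # Children disagree: one change is required on a descending branch.
    return lset | rset, lcost + rcost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over every aligned site."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {tip: seq[i] for tip, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total


# Hypothetical four-taxon alignment and two candidate topologies.
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_length(tree_1, alignment), parsimony_length(tree_2, alignment))
```

Under the parsimony criterion the topology requiring fewer changes is preferred; a full analysis would score many topologies found by a heuristic search rather than two chosen by hand.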

Redundant Sequences.
Three redundant matK sequences were detected in this study. matK Sequence 1 was recovered from all specimens (one each) of S. magnifica, S. acmophylla, S. reticulata, and S. alaxensis, and from one of two specimens each of S. sericea, S. cordata, and S. bebbiana, and from one of four specimens of S. eriocephala examined. matK Sequence 2 was recovered from all specimens (one each) of S. humboldtiana, S. amygdaloides, and S. gooddingii examined. matK Sequence 3 was recovered from one of the two specimens of S. sericea and from one of two specimens of S. bebbiana (Figure 1).
Four redundant ITS sequences were detected in this study. ITS Sequence 1 was recovered from all specimens (one each) of S. exigua, S. interior, and S. exigua var. hindsiana examined. ITS Sequence 2 was recovered from all specimens (one each) of S. humboldtiana, S. amygdaloides, and S. gooddingii. ITS Sequence 3 was recovered from all specimens (one each) of S. magnifica and from one of four specimens of S. eriocephala examined. ITS Sequence 4 was recovered from one of two specimens of S. cordata and from one of four specimens of S. eriocephala examined (Figure 2).
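Redundant sequences, in the sense used here, are identical sequences recovered from more than one specimen. A minimal sketch of how such groups might be detected from an alignment is given below; the specimen labels and the short sequences are hypothetical placeholders, not data from this study, and the comparison assumes that all sequences have already been aligned and trimmed to the same region.

```python
from collections import defaultdict


def redundant_groups(sequences):
    """Group specimens that share an identical sequence.

    `sequences` maps a specimen label to its aligned sequence.
    Only groups with two or more specimens are returned.
    """
    groups = defaultdict(list)
    for specimen, seq in sequences.items():
        groups[seq.upper()].append(specimen)  # case-insensitive comparison
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical specimens and sequences, for illustration only.
toy_alignment = {
    "specimen_1": "ATGGCCTTA",
    "specimen_2": "ATGGCCTTG",
    "specimen_3": "atggcctta",
    "specimen_4": "ATGGACTTA",
    "specimen_5": "ATGGCCTTG",
}
print(redundant_groups(toy_alignment))
```

Each returned group corresponds to one "redundant sequence"; specimens with a unique sequence are omitted.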
The rooted network of matK haplotypes identified six presumably extinct haplotypes and 27 extant haplotypes that are descendants of two separate lineages. Twenty-three of the 27 extant haplotypes appear to be terminal while the remaining four are nodal (Figure 3). Only two of the extant haplotypes (i.e., IVa and IIIb, Figure 3), both nodal, occurred in more than  2).

Phylogenetic Observations
4.1.1. Polyphyletic and Paraphyletic Species. Five of the seven species represented by multiple specimens in this study appear to be polyphyletic with respect to the matK phylogeny: Salix cordata, S. eriocephala, S. sericea, S. bebbiana, and S. lucida. Four of these species (i.e., S. cordata, S. eriocephala, S. bebbiana, and S. sericea) form a major part of the well-supported matK Clade 5. In the ITS phylogeny all specimens of these species, with the exception of S. lucida, are weakly paraphyletic but potentially closely related (e.g., S. cordata as a recent derivative of S. eriocephala) (Figure 2). We can envision two scenarios to explain the current distribution of chloroplast haplotypes amongst the specimens/species and the polyphyly apparent in our matK-based phylogeny. First, an otherwise monophyletic group would appear polyphyletic, with respect to a uniparentally inherited molecule of DNA (e.g., chloroplast gene), if extant specimens possess different chloroplast haplotype lines retained from an early polymorphic common ancestor (i.e., lineage sorting) [22,23]. For example, the two specimens of S. lucida included in this study are clearly polyphyletic in the matK phylogeny (Figure 1) yet appear to be closely related in the ITS phylogeny (Figure 2). Figure 4(a) shows how this incongruence could occur as a consequence of lineage sorting. The most recent common ancestral haplotype of the  4(a)). Therefore, the contemporary haplotype lines present in S. lucida would predate the origin of S. lucida, and their current taxonomic distribution would suggest that while they have been retained during descent in S. lucida, they have been lost during the origins of other species of Salix.
A second process that could account for the polyphyly of S. lucida in the matK phylogeny is chloroplast capture, that is, the lateral transfer of chloroplasts between extinct and/or contemporary species via hybridization and introgression. Figure 4(b) shows how this incongruence could occur as a result of chloroplast capture. The chloroplast haplotype present in S. lucida no. 2 appears to have been derived from the haplotype (inferred) present in the common ancestor  be responsible for the contemporary pattern of haplotype diversity. For instance, the polyphyly of S. sericea, S. bebbiana, S. cordata, and S. eriocephala, evident in the matK phylogeny, could be explained either by the sole effect of lineage sorting (Figure 4(a)) or by the combined effects of introgression in the common ancestor of these species (i.e., the capture of Haplotype IIIb), followed by haplotype diversification and lineage sorting during the subsequent evolution of the common ancestor's descendants (Figure 4(b)).
Comparison of Figures 4(a) and 4(b) illustrates fundamental differences between the two scenarios. One obvious difference is the relative amount of haplotype polymorphism inferred in common ancestors. For example, in the strict lineage sorting scenario (Figure 4(a)) it appears that the common ancestor of the entire genus possessed, at a minimum, three haplotypes, while in the introgression scenario the same common ancestor possesses, at a minimum, two haplotypes; this difference is even more pronounced in the inferred common ancestor of ITS Clade 1, that is, a minimum of eight haplotypes in the lineage sorting scenario and a minimum of three haplotypes in the introgression scenario (Figures 4(a) and 4(b), resp.). Another interesting point of comparison between the two scenarios is the difference in the evolutionary history of Haplotype IIIb (also referred to as "Redundant matK Sequence 2") suggested by each. The nodal Haplotype IIIb (Figure 3) is extant in S. amygdaloides, S. humboldtiana, and S. gooddingii while four of its derivative lineages are present in the two principal clades of the ITS phylogeny (i.e., Clades 2 and 5; Figure 2). Given a strict lineage sorting scenario, Haplotype IIIb would have arisen early in the evolution of Salix, perhaps in the common ancestor of ITS Clade 1 (Figure 2). Subsequently, Haplotype IIIb spawned four derivative haplotypes that sorted variously during the evolution of Clade 1, while simultaneously persisting unchanged during the course of the evolution of S. amygdaloides, S. humboldtiana, and S. gooddingii. By contrast, the strict introgression scenario moves the origin and subsequent evolution of Haplotype IIIb to a later point in the evolution of the genus.
The combined effects of introgression and lineage sorting have been invoked to explain phylogenetic incongruities in various plant groups, for example, Stephanomeria [24], Phlomis [25], domestic Phaseolus [26], Achillea [27], and Hieracium [28]. Given the known propensity for hybridization amongst species of Salix, we are inclined to believe that both introgression and lineage sorting have had some part to play in the current distribution of haplotypes in the genus.
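The kind of incongruence discussed in this section, in which specimens of one species group together in the nuclear tree but not in the chloroplast tree, can be checked mechanically. The sketch below tests whether a set of tips forms a monophyletic group in a rooted tree written as nested tuples. The two toy topologies and the tip labels (x1 and x2 standing for two specimens of one species, y and z for other species) are hypothetical and are not the trees of Figures 1 and 2. The test distinguishes only monophyly from non-monophyly; it does not separate paraphyly from polyphyly.

```python
def tips(node):
    """Return the set of tip labels below a node (str tip or tuple of children)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def is_monophyletic(tree, group):
    """True if some node of the rooted tree has exactly `group` as its tips."""
    group = set(group)

    def search(node):
        if tips(node) == group:
            return True
        if isinstance(node, str):
            return False
        # Only a child that still contains the whole group can hold the clade.
        return any(search(child) for child in node if group <= tips(child))

    return search(tree)


# Hypothetical topologies for the same four tips.
nuclear_tree = ((("x1", "x2"), "y"), "z")
chloroplast_tree = (("x1", "y"), ("x2", "z"))

print(is_monophyletic(nuclear_tree, {"x1", "x2"}))      # specimens group together
print(is_monophyletic(chloroplast_tree, {"x1", "x2"}))  # specimens are separated
```

A disagreement between the two results flags the species as a candidate for lineage sorting or chloroplast capture, but the check alone cannot distinguish between those two explanations.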

Redundant Gene Sequence Distributions.
Eight separate species were found to possess the redundant matK Sequence 1 (Figure 1). In the ITS phylogeny seven of these species occur in the large polytomy (Clade 2), while the eighth (S. acmophylla) occurs in a potential sister clade (Clade 5). The presence of matK Sequence 1 in S. magnifica, S. reticulata, S. sericea, S. eriocephala, S. cordata, S. bebbiana, and S. alaxensis may indicate common ancestry, but its presence in the more distantly related S. acmophylla can also be explained in terms of lineage sorting and/or introgression. In the ITS phylogeny S. acmophylla occurs in a well-supported clade (Clade 6, Figure 2) along with other members of subgenus Salix, yet in the matK phylogeny S. acmophylla occurs in a clade composed primarily of species from subgenera Vetrix, Salix, and Chamaetia (Clade 2, Figure 1). Again, Figure 4(a) depicts how this phylogenetic incongruence could have occurred as a result of lineage sorting and Figure 4(b) depicts how it could have occurred as a result of introgression.
One specimen each of S. sericea and S. bebbiana were found to possess the redundant matK Sequence 3.
Specimens of S. humboldtiana, S. amygdaloides, and S. gooddingii share a redundant ITS sequence (ITS Sequence 2) that appears basal to the remainder of the genus in the ITS phylogeny (Figure 2). The same specimens also possessed the same redundant matK sequence (matK Sequence 2, Figure 1). This group was composed of three of the five species of subgenus Protitea included in the study. It is possible that these species represent an old and central part of the genus. Presumably then, the redundant matK Sequence 2 might closely resemble the ancestral matK haplotype for the genus, yet it clearly does not assume such a position (i.e., basal to the rest of the genus) in the matK phylogeny (Redundant matK Sequence 2 in Figure 1, Haplotype IIIb in Figure 3).

Comparison with Earlier Phylogenetic Hypotheses. The sole infrageneric clade recovered in the analysis of Leskinen and Alström-Rapaport [14] consisted of specimens of S. alba, S. amygdaloides, S. fragilis, and S. pentandra. This clade corresponds in part to our ITS Clade 5 (S. alba, S. pentandra, S. acmophylla, and S. lucida). Specimens of S. acmophylla and S. lucida were not included in their study and specimens of S. fragilis were absent from ours, thus precluding an exact comparison. However, the relative placement of S. amygdaloides in the two cladograms appears to be incongruent: it is basal to the rest of the genus (along with S. humboldtiana and S. gooddingii) in ours (Figure 2) but derived and sister to S. alba in theirs (Figure 4 in Leskinen and Alström-Rapaport [14]). We suggest that our matK Clades 1 and 2 are homologous to the rbcL Clades 1 and 2, respectively, of Azuma et al. [13]. Our Clade 2 contains three of the four taxa present in their Clade 2 (S. reticulata, S. bebbiana, and C. arbutifolia), the one exception being S. discolor, which occurs in our Clade 1. Our Clade 1 contains four of the five taxa present in their Clade 1 (S. interior, S. amygdaloides, S. pentandra, and S. alba), the one exception being S. chaenomeloides, which occurs in our Clade 2. We believe both of these incongruences can best be explained in terms of chloroplast capture events (see above discussion and Figure 4(b)).
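The clade-by-clade comparison with Azuma et al. [13] amounts to a set overlap. The sketch below restates it using only the taxa named in the preceding paragraph; the sets for our matK clades are therefore deliberately partial (each clade contains further taxa not listed here), and the variable and function names are hypothetical.

```python
# rbcL clades of Azuma et al. [13], as listed in the text.
rbcl_clade_1 = {"S. interior", "S. amygdaloides", "S. pentandra",
                "S. alba", "S. chaenomeloides"}
rbcl_clade_2 = {"S. reticulata", "S. bebbiana", "C. arbutifolia", "S. discolor"}

# matK clades of this study, restricted to the taxa named in the text.
matk_clade_1 = {"S. interior", "S. amygdaloides", "S. pentandra",
                "S. alba", "S. discolor"}
matk_clade_2 = {"S. reticulata", "S. bebbiana", "C. arbutifolia",
                "S. chaenomeloides"}


def compare(reference, candidate):
    """Return (taxa shared with the reference clade, reference taxa missing)."""
    return reference & candidate, reference - candidate


for label, ref, ours in [("Clade 1", rbcl_clade_1, matk_clade_1),
                         ("Clade 2", rbcl_clade_2, matk_clade_2)]:
    shared, missing = compare(ref, ours)
    print(f"{label}: {len(shared)} of {len(ref)} shared; "
          f"placed elsewhere: {sorted(missing)}")
```

The two taxa reported as placed elsewhere are exactly the ones that swap clades between the rbcL and matK trees.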
Chosenia arbutifolia is clearly nested within the genus Salix in both our ITS and matK results, entirely consistent with the findings of Leskinen and Alström-Rapaport [14] and Azuma et al. [13].

Implications for Subgeneric Taxonomy. Due to the propensity and prevalence of hybridization among species of Salix, demonstrated in the garden and observed in the field, and the potentially confounding effects of chloroplast capture on phylogenetic hypothesizing, we opt to compare our ITS-based phylogeny with the most recent Salix infrageneric taxonomy [17]. That being said, we must confess to a certain degree of skepticism as to the accuracy of the ITS phylogeny (with respect to an organismal phylogeny) owing to the potentially confounding effects of concerted evolution acting on rDNA in heterozygous hybrids. Nonetheless, a comparison is needed and the ITS tree is thought to more closely reflect actual species relationships in Salix.
In his treatment of Salix for The Flora of North America, Argus [17] proposes five subgenera: Vetrix, Salix, Chamaetia, Longifoliae, and Protitea, each of which was represented in this study by two or more species.
The relationship among the species of subgenus Chamaetia included in this study (S. herbacea and S. reticulata) was unresolved in the ITS phylogeny, but the possibility that these species ultimately form a monophyletic group cannot be dismissed (Figure 2).
All specimens of subgenus Longifoliae included in this study (S. exigua, S. exigua var. hindsiana, S. interior, S. taxifolia, and S. melanopsis) formed a moderately supported clade in the ITS phylogeny (Clade 3, Figure 2). These same specimens appear to be closely related in the matK phylogeny where specimens of S. exigua, S. exigua var. hindsiana, S. interior, and S. melanopsis form a clade (Clade 6, Figure 1), to which S. taxifolia could ultimately be sister (Figure 1).
There were five specimens from subgenus Protitea included in our study, one each of S. amygdaloides, S. gooddingii, S. humboldtiana, S. acmophylla, and S. floridana. As discussed above, specimens of S. amygdaloides, S. gooddingii, and S. humboldtiana possessed identical ITS and matK sequences, and in the ITS phylogeny they take a position that is basal (i.e., sister) to the remainder of the genus (Figures 1 and 2). However, the other two specimens of subgenus Protitea, S. floridana and S. acmophylla, come out in a separate well-supported subclade in the ITS phylogeny composed of species from subgenus Salix (i.e., S. alba, S. lucida, and S. pentandra) (Clade 5, Figure 2), thus making the subgenus pronouncedly polyphyletic. However, subgenus Protitea may correspond, in part, to a natural and basal group of Salix species.
There were six species from subgenus Salix included in our study and specimens of three of them (S. pentandra, S. lucida, and S. alba), along with a specimen of S. acmophylla (subgenus Protitea), formed a strongly supported clade in the ITS phylogeny (Clade 6, Figure 2). A S. pentandra-alba clade was also evident in the rbcL-based phylogeny of Azuma et al. [13]. Four of the six specimens comprising ITS Clade 6 also possessed matK sequences that placed them in a well-supported clade in the matK phylogeny (Clade 3, Figure 1), reinforcing the idea of common ancestry. A second specimen of S. lucida present in Clade 6 of the ITS phylogeny (Figure 2) appears in Clade 2 of the matK phylogeny; we suggest that this specimen has inherited a "foreign" matK sequence that has jumped species boundaries via hybridization-mediated introgression, that is, chloroplast capture (see previous discussion). The three remaining species of subgenus Salix included in this study (S. chaenomeloides, S. magnifica, and S. triandra) occur outside of the ITS Clade 6 (Figure 2), to varying degrees. The placement of these specimens may reflect the influence of chloroplast capture, or it may be indicative of inaccuracies in the current infrageneric taxonomy.
The naturalness of subgenus Vetrix is uncertain. In our ITS-based phylogeny the evidence of a monophyletic group concordant with subgenus Vetrix is equivocal. All seven of the subgenus Vetrix species included in our ITS analysis came out in the large polytomy (Clade 2, Figure 2) that also contained Chosenia arbutifolia, but in positions that could potentially be resolved into a nearly intact natural group. On the other hand, the matK phylogeny would seem to belie any naturalness of the subgenus due to the apparent polyphyly of S. sericea, S. cordata, S. discolor, and S. eriocephala. However, as discussed previously, we believe that these apparent cases of polyphyly are actually homoplasies caused by hybridization-mediated introgression.
4.3. Summary. The phylogenetic analysis of our ITS data set has produced results that generally support the infrageneric classification of Argus [17]. All subgenera (i.e., Vetrix, Chamaetia, Longifoliae, Protitea, and Salix), as represented by specimens in our data set, appear, for the most part, to be monophyletic, or potentially monophyletic. The phylogenetic analysis of our matK data set is generally consistent with this conclusion. Incongruence between chloroplast and nuclear gene sequence phylogenies probably reveals only a small part of the influence of hybridization, that is, only in cases where actual chloroplast transfers have occurred. Subgenus Salix is particularly rich in taxonomic problems because it is defined by different subsets of ancestral characters. Given the wealth of potential ecological and evolutionary studies afforded by willows, we believe a concerted and comprehensive effort should be made, using all suitable forms of macromolecular, morphological, anatomical, and physiological characteristics, to derive the definitive phylogenetic hypothesis for the genus to support future research involving species of Salix.

Figure 3: Chloroplast matK haplotype network. Boxes with dashed lines represent inferred ancestral haplotypes and boxes with solid lines represent extant haplotypes; the species found to possess each haplotype are listed inside. Tick marks represent mutations and arrowheads indicate the forward direction of mutations (as inferred from outgroup analysis). An asterisk (*) marks a homoplasious substitution. Taxa represented by more than one specimen are color coded.

Figure 4: (a) matK haplotype network superimposed on ITS-based phylogenetic outline, assuming vertical transmission of all cp haplotypes. (b) matK haplotype network superimposed on ITS-based phylogenetic outline, assuming lateral transmission (dashed lines) of cp haplotype due to introgression (see Figure 3 for explanation of symbols).
Specimens from which only matK sequences were obtained.
one species. All seven of the species represented by multiple specimens in this study (i.e., S. bebbiana, S. cordata, S. sericea, S. eriocephala, S. lucida, S. pentandra, and S. herbacea) were found to possess multiple matK haplotypes (Figure 3).
3.3. Phylogenetic Relationships within Salicaceae Inferred from ITS Sequences. Both MP and ML analyses of the ITS sequences resulted in nearly identical topologies; therefore we present only the ML results here (Figure 2). Maximum likelihood analysis of the ITS sequence data indicates that a majority of the Salix species in the data set (21) occur in a moderately supported trichotomy (Clade 1, 68%) that is sister to those species possessing the redundant ITS Sequence 2 (i.e., S. humboldtiana, S. amygdaloides, and S. gooddingii). This trichotomy (i.e., Clade 1) consists of a single specimen of S. triandra; a well-supported clade composed of two specimens each of S. lucida and S. pentandra S. exigua, S. interior, S. exigua var. hindsiana, S. melanopsis, and S. cordata examined (Clade 2, <50%). This polytomy is also home to specimens possessing the redundant ITS Sequences 1, 3, and 4. Three clades are evident within this large polytomy: (i) Salix taxifolia, S. exigua, S. interior, S. exigua var. hindsiana, and S. melanopsis (Clade 3, 79%); (ii) S. eriocephala and S. cordata (three and two specimens each, resp.) (Clade 4, 56%); (iii) S. chaenomeloides and S. bebbiana (Clade 7, 66%) (Figure
Maximum likelihood analysis of matK sequences. Numbers on branches are Bayesian posterior probabilities. Only values greater than 50% are shown. Color codes correspond to the subgeneric classification scheme of Argus [17]. Boxed OTUs represent taxa with redundant sequences.
contemporary chloroplast haplotypes present in S. lucida no. 1 and S. lucida no. 2 is the inferred ancestral haplotype of the entire genus (i.e., Haplotype I in Figure