Genome-Wide Identification, Phylogeny, and Expression Analysis of ARF Genes Involved in Vegetative Organs Development in Switchgrass

Auxin response factors (ARFs) have been reported to play vital roles during plant growth and development. In order to reveal specific functions related to vegetative organs in grasses, an in-depth study of the ARF gene family was carried out in switchgrass (Panicum virgatum L.), a warm-season C4 perennial grass that is mostly used as bioenergy and animal feedstock. A total of 47 putative ARF genes (PvARFs) were identified in the switchgrass genome (2n = 4x = 36), 42 of which were anchored to the seven pairs of chromosomes and found to be unevenly distributed. Sixteen PvARFs were predicted to be potential targets of small RNAs (microRNA160 and 167). Phylogenetically speaking, PvARFs were divided into seven distinct subgroups based on the phylogeny, exon/intron arrangement, and conserved motif distribution. Moreover, 15 pairs of PvARFs have different temporal-spatial expression profiles in vegetative organs (2nd, 3rd, and 4th internode and leaves), which implies that different PvARFs have specific functions in switchgrass growth and development. In addition, at least 14 pairs of PvARFs respond to naphthylacetic acid (NAA) treatment, which might be helpful for us to study on auxin response in switchgrass. The comprehensive analysis, described here, will facilitate the future functional analysis of ARF genes in grasses.


Introduction
Auxin, an essential plant hormone, plays vital roles in various aspects of plant growth and development, such as embryogenesis, organogenesis, tropic growth, shoot elongation, root architecture, flower and fruit development, tissue and organ patterning, and vascular development [1][2][3][4][5][6][7][8][9]. Most of these processes are controlled by auxin response genes, which are regulated at transcriptional level by cis-acting DNA elements in their promoter regions, including the auxin response element (AuxRE, TGTCTC), core of auxin response region (AuxRR-core, GGTCCAT), and TGA-element (AACGAC). Of these, AuxREs are reported to be specifically bound and regulated by a class of transcription factors, called auxin response factors (ARFs) [10,11]. ARF proteins generally contain a DNA-binding domain (DBD) in the amino (N)-terminal region, a central region that functions as an activation domain (AD) or a repression domain (RD) [12,13], and a carboxyl (C)-terminal dimerization domain (CTD), which is a protein-protein interaction domain that mediates ARF homo-and heterodimerization and also the heterodimerization of ARF and Aux/IAA proteins, another category of auxin response regulators [12][13][14][15][16].
Because of their important roles in auxin signaling pathways, which are indispensable to plant growth and development, ARF gene families have been studied in many plant species. For example, there are 23 ARF transcription factors in Arabidopsis (Arabidopsis thaliana) [17], 25 in rice (Oryza sativa) [18], 39 in poplar (Populus trichocarpa) [19], 24 in Medicago truncatula [20], and 36 in maize (Zea mays) [21]. In previous studies, ARF proteins were split into three clades (clades A, B, and C) based on phylogenic relationships, which could be traced back to the origin of land plants [22]. In particular, phylogenetic analysis of the ARF gene family in many species has been widely reported. Arabidopsis ARFs were divided into four subgroups, which is in accordance with the phylogenetic classifications of ARFs in rice [18], banana (Musa acuminata L.) [23] and Salvia miltiorrhiza [24]. Maize and poplar ARFs are classified to six subgroups [25], whereas Medicago ARFs were divided into eight subgroups [20]. In general, the wide variety of ARF phylogenetic grouping patterns are based on the diversification of its gene structure and motif locations, which may be the result of gene truncation or alternative splicing [22].
Biochemical and genetic analyses have established the crucial roles of ARF genes in plant growth and development. In Arabidopsis thaliana, AtARF2 regulates floral organ abscission, leaf senescence, and seed size and weight [26][27][28]. AtARF5 affects vascular development and early embryo formation [29]. AtARF8 controls the uncoupling of fruit development from pollination and fertilization, and lossof-function mutations in these genes result in seedless fruit [30]. AtARF7 and AtARF19 redundantly regulate auxinmediated lateral root development [31]. In rice, OsARF1 is required for vegetative and reproductive development [32]. OsARF16 is essential for iron and phosphate deficiency responses in rice [33]. In addition, some ARF genes are involved in the response to abiotic stresses, such as drought, salt, or cold stress [34,35]. Taken together, these studies have shown that the ARF gene family function in multiple signal transduction pathways to regulate multiple aspects of plant growth and development.
Switchgrass (Panicum virgatum L.) is a warm-season C4 perennial grass used as a bioenergy and animal feedstock [36,37]. To avoid competing with food crops for arable fields, a large proportion of switchgrass fields will be located on marginal lands where various abiotic stresses, such as salt, drought, and extreme temperatures. The genome sequence of switchgrass has been published recently [38] and provides a powerful resource to identify ARF gene family members. Considering the value of switchgrass as a bioenergy and animal feedstock, we mainly focused on vegetative organs in this study.
Here, we identified 47 switchgrass ARF genes and comprehensively characterized the physical location, conserved motif architecture, and expression profile of the PvARFs. We also subdivided these 47 PvARF genes based on phylogenetic relationships based on the well-studied ARF genes in other species. To determine which ARF genes potentially work on different developmental processes, the temporalspatial expression pattern in vegetative organs (2nd, 3rd, and 4th internode and leaves) and expression response to auxin treatment in seedlings were determined by real-time PCR (qRT-PCR). Our works provide preliminary information about ARF genes in switchgrass and lays the foundation for the further elucidation of the biological roles of ARF genes in grasses.
For auxin treatments, plantlets grown in tissue culture until 20 days after rooting were incubated for 1, 2, and 3 h in hormone-supplemented 5 μM naphthylacetic acid (NAA) medium [18]. Control plants were grown in hormone-free medium. Whole seedlings were sampled from NAA-treated and control plants at the same time points. All experiments included three biological replicates. All of the samples were stored at −80°C.

Sequence
Retrieval and Identification. The conserved ARF domain based on the Hidden Markov Model (HMM) (Pfam06507) was obtained from the Pfam protein family database (http://pfam.sanger.ac.uk/) and used as a query to search against the switchgrass genome database in Phytozome v11 (http://www.phytozome.net/). Sequences were selected for further analysis if the E value was less than 1e-10. Several coding sequences (CDS) were corrected based on the switchgrass unique transcript sequence database [38]. Peptide length, molecular weight, and isoelectric point of each PvARF were calculated using the online ExPASy program (http://www.expasy.org/).

Phylogenetic Analysis.
The putative PvARF proteins from another fifteen species were used to construct a phylogenetic tree. ARF protein sequences were obtained from the public genome database Phytozome. The BlastP program was used to identify putative ARF proteins from the genomic databases of well-sequenced species, including Arabidopsis, sweet orange (Citrus sinensis), Chinese cabbage (Brassica rapa), poplar, Medicago (Medicago truncatula), cotton (Gossypium raimondii), Grandis (Eucalyptus grandis), soybean, tomato (Solanum lycopersicum), grape (Vitis vinifera), maize, rice, foxtail millet (Setaria italica), sorghum (Sorghum vulgare), and Brachypodium distachyon. Multiple sequence alignments of the full-length ARF sequences were performed using Clustal X1.83, and the edges of the alignments were manually trimmed. An unrooted neighbor-joining (bootstrap value = 1000) tree was constructed using MEGA5 and then

Chromosomal
Locations of PvARF Genes. The lowland switchgrass cultivars are allotetraploid (2n = 4x = 36) and consist of two highly homologous subgenomes, designated as Chr.a and Chr.b [40]. Specific physical locations of each PvARF were obtained from the Phytozome database. Chromosome locations were then determined using MapChart 2.2 based on the genetic linkage map [41,42]. Tandem gene duplicates were defined as paralogous genes located within 50 kb and separated by fewer than five nonhomologous spacer genes [43].  (Phytozome v11) by BlastP and tBlastN program. A total of 47 putative ARF proteins were found, and Pfam analysis confirmed that all of these proteins contain the ARF domain. The putative candidates were designated as PvARF1 to PvARF47, based on the alignments of predicted amino acid sequences. The predicted PvARF proteins ranged from 243 (PvARF34) to 1182 (PvARF13) amino acids (aa) in length and from 27.4 kDa to 130.9 kDa in molecular weight ( Table 1). The isoelectric points (pI) ranged from 5.33 (PvARF8) to 9.52 (PvARF25) ( Table 1), suggesting that different PvARFs might have roles in specific subcellular environments.

Gene
To examine the chromosomal distribution of PvARFs, the physical locations of the PvARFs on chromosomes (Chrs) were obtained through BlastN searches against the switchgrass genome database in Phytozome. Due to the allotetraploidization of switchgrass (2n = 4x = 36), the PvARF genes exist as paralogous gene pairs in the genome with only one exception, PvARF39, which might be lost in the evolutionary process. Of the 47 PvARFs, 42 were putatively anchored onto seven of the nine switchgrass chromosomes (Figure 1), while the other five PvARFs (PvARF19, 20, 39, 42, and 43) are located on unmapped scaffolds. The chromosomal distribution and density of PvARF genes are not uniform. Chr 1, 4, 5, and 7 contain four PvARF gene pairs, respectively. Chr 3 has three pairs, Chr 6 and 9 have only one pair of PvARFs, and no gene is located on Chr 2 and 8. Consistent with expectations, 14 gene pairs obviously exist on the two set of chromosomes (Figure 1), while the other 7 gene pairs were putatively located on the chromosomes based on their sequence similarity. The indeed relationships among these PvARFs need to be explained by phylogenetic analysis. An unrooted neighbor-joining (bootstrap value = 1000) tree was constructed using MEGA5 on the basis of multiple alignments of conserved domain sequences of the ARF proteins from monocot species (switchgrass, foxtail millet, maize, sorghum, rice, and Brachypodium) and dicot species (Arabidopsis, sweet orange, Chinese cabbage, poplar, Medicago, cotton, soybean, tomato, Grandis, and grape). And the detailed information was listed in Table S2. selected ARFs from another 15 species, which have public genome database in Phytozome, to construct a phylogenetic tree together with PvARFs. These species include five monocots (foxtail millet, maize, sorghum, Brachypodium, and rice) and ten dicots (Arabidopsis, sweet orange, Chinese cabbage, poplar, cotton, soybean, Medicago, tomato, Grandis, and grape). Seven separate clusters of ARF proteins were defined based on the NJ tree topology and bootstrap values (higher than 50%) ( Figure 2). As previously reported by Finet et al., three large groups of ARF proteins were classified as clades A, B, and C. In detail, clusters I and II in our study together make up clade C, and clusters III, IV, and V make up clade A. ARF members in these two clades are considered to be more ancient than those in clade B [22], which comprises clusters VI and VII in our study.
Considering that switchgrass is an allotetraploid plant, the gene number of PvARFs in each cluster should be approximately twice than the other monocots, especially in foxtail millet, the most closely relatives to switchgrass among the selected species (Table S2). Cluster VII, which has the largest number (11 out of 47) of PvARFs, contains six foxtail millet ARFs. Cluster I, III, IV, and VI also contain eight switchgrass and four foxtail millet ARFs, and clusters II and V have only two PvARFs, respectively (Table S2).
PvARF39 in cluster VII is most closely related to the foxtail millet ARF protein Seita.8G135700.1, which indicates that the paralog of PvARF39 has been lost or mutated gradually during the evolutionary process of switchgrass genome.
In order to comprehensively clarify the evolutionary process of the PvARFs, we carried out the tandem repeat duplication analysis based on the chromosomal location and phylogenetic analysis of the PvARFs. The results showed that no tandem repeat duplication events were found in PvARFs. In addition, we calculated the Ka/Ks analysis between PvARFs and OsARFs. Compared with rice ARF genes, 18 pairs of orthologs originated from positive selection (Ka/Ks ratio was larger than 1), while 6 orthologs showed purifying selection (Ka/Ks ratio was less than 1) ( Table 2).

PvARF Gene Structures and Locations of Conserved
Motif. To better understand the phylogenetic relationships of the PvARFs, the exon/intron arrangements were determined by aligning cDNA sequences to genomic sequences. Another phylogenetic tree was firstly constructed only using switchgrass ARF protein sequences. The PvARF genes were clearly displayed in the form of gene pairs (Figure 3(a)), which confirmed the previous speculation in the chromosomal distribution (Figure 1). All PvARFs have introns in the coding sequence (CDS), and the number of introns ranges from 2 to 14 (Table 1, Figure 3(b)). In particular, members belonging to clade C (clusters I and II) contain relatively fewer introns (two to four). In contrast, PvARFs in clade A (clusters III, IV, and V) have much more introns (11 to 14), with the exception of PvARF2, which might have lost the exons in N-terminus. The number of introns in clade B (clusters VI and VII) were ranging from 5 to 13. This variability of intron number might be correlated to the multiple functions of clade B ARFs in higher plants. Additionally, we further identified the putative microRNA target sites of ARF genes in switchgrass. 16 out of 47 PvARFs were found to contain the potential microRNA target sites. Eight PvARFs were predicted to be the targets of miR160 and miR167, respectively ( Figure 3(b), Figure 4).
Analysis of motif locations in PvARF proteins was performed to explore structural diversity and to predict their functions. A total of 12 conserved motifs were identified using the MEME program (Figure 3(c), Figure S1). The DNAbinding domain (DBD) (motifs 1, 2, and 9 corresponding to Pfam02362) was lost in four members (PvARF22, 25, 25, and 38). The ARF domain (motifs 3, 5, 8, and 11 corresponding to Pfam06507) exists in all PvARFs. The AUX/IAA domain (motifs 4 and 10 corresponding to cl03528) has been lost in almost all of the PvARFs in clusters I, II, and VI. These results confirm the phylogenetic relationship between the PvARFs in clades A/ C and B and indicate that there has been functional differentiation among PvARFs in different clusters.

Expression Patterns of PvARFs in Different Organs of Switchgrass.
To analyze the expression levels of PvARFs, we firstly acquired the probesets of the PvARFs in switchgrass expression atlas from the public database [38]. The results showed that PvARFs were expressed in root, node, internode,  Figure 5).
Biomass yield is one of the most important criteria used to evaluate the quality of switchgrass. Vegetative organs, especially internodes and leaves, are the primary sources of biomass. Auxin is one of the most important phytohormone, which regulate the plant growth and development. To PvARF47   86   70   74   100   100   99   99   99   85   81   III  III  III  III  III  III  III  III  IV  IV  IV  IV  IV  IV  IV  IV  V  V   VI  VI  VI  VI  VI  VI  VI  VI  VII  VII  VII  VII  VII  VII  VII  VII  VII  VII  VII   II  II  I  I  I  I  I  I  I  I   5′  0  1  2  3  4  5  6   investigate whether and how PvARFs work on vegetative organs, we selected the second, the third, and the fourth internode and the corresponding leaves at the second reproductive (R2) stage to test the expression profile of the PvARFs by qRT-PCR analysis. Eight pairs of PvARF genes and PvARF39 are not expressed or are extremely lowly expressed in the tested tissues, whereas the other 15 pairs of PvARF genes show substantial expression in internodes and leaves. In internodes, 14 pairs of PvARFs have higher expression levels in the upper internode (I4) than the other two internodes (I2 and I3), with one exception (PvARF17/18) having no obvious difference in the expression level in the three internodes ( Figure 6). In leaves, ten pairs of PvARF genes (PvARF3/4, 15/16, 21/22, 23/24, 29/30, 35/36, 37/38, 40/41, 44/45, and 46/47) show no significant changes in expression level in the three tested leaves. There was lower expression of PvARF5/6, 7/8, and 17/18 in the upper leaf (L4) than in the bottom leaves (L2 or L3), whereas PvARF11/12 and 42/43 are more highly expressed in L4 compared to L2, and even lower expression is observed in L3 ( Figure 6). These results suggest that the biosynthesis and transport of endogenous auxin in switchgrass might affect the expression profile of PvARF genes, especially in the internode.

Expression Analysis of PvARFs in Response to Auxin
Treatment. In order to clarify the biofunctions of PvARF proteins, cis-acting DNA elements were analyzed using 2 kb promoter sequence of PvARFs. The results showed that PvARFs in different clades were putatively involved in specific process. For example, PvARFs in clade A might participate in nodule formation, while clade B genes function on wounding response (Table S3). However, all of the PvARF proteins mostly tended to be involved in plant growth and development, such as phytohormone signaling, abiotic stress response, carbon metabolism, pollen development, and so on (Table S3).
As a key component of the auxin signaling pathway, ARF proteins play vital roles in auxin response. It has been reported that auxin induces or represses the expression of some ARF genes in Arabidopsis [31], rice [18], and maize [21]. To examine the response of PvARF genes to the exogenous auxin, one-month-old switchgrass seedlings were treated with 5 μM NAA for 0, 1, 2, and 3 hours, and the expression patterns of the PvARFs were determined. The qRT-PCR results revealed that auxin repressed the expression of eleven pairs of genes (PvARF5/6, 7/8, 11 /12, 15/16, 29/30, 35/36, 37/38, 40/41, 42/43, 44/45, and 46/47) at all three time points, whereas it induced the expression of three pairs of genes (PvARF3/4, 23/24, and 25/26) at 1 hour and then reduced at the latter two time points. In contrast, PvARF21/22 expression was not significantly affected by auxin (Figure 7). These results suggest that exogenous auxin could induce or repress the expression of most PvARF genes to regulate switchgrass growth and development.

Discussion
Extensive studies have shown that ARFs play crucial roles in plant growth and developmental processes [10]. However, a systematic analysis of ARF gene members in switchgrass has not been done. In this study, we identified 47 PvARF genes, almost twice than Arabidopsis (23) and rice (25) [18,31], for the reason of allopolyploidization in switchgrass evolutionary process [41]. Gene structure analysis showed that the number of exons in PvARF genes ranged from 3 to 14, while similar results were found in Arabidopsis [31], rice [18], and tomato [9], which indicates that the plant ARF gene family has highly conserved structures and potentially similar functions across dicotyledonous and monocotyledonous plant species. Based on phylogenetic analysis, 47 switchgrass ARF genes were assigned to seven separate clusters, which was similar to the previous studies [22]. The number of ARFs of switchgrass in each cluster was about twice than those of five other monocots (maize, rice, sorghum, foxtail millet, and Brachypodium) but not consistent with the number of ARFs in ten dicot species (Arabidopsis, citrus, Chinese cabbage, poplar, cotton, soybean, Medicago, tomato, Grandis, and grape), which indicates that differences in the evolution of ARF genes in monocotyledonous and dicotyledonous plants. According to the phylogenetic tree of ARFs from different species, the orthologous relationship was found to dramatically divorce. In cluster V (PvARF17/18-AtARF5-CiARF5-BrARF5-1/5-3-PoptrARF5.1/5.2-GrARF5a/5b-EgrARF5-GmARF40/47-Sl-AF5-VvARF18-ZmARF4/29-OsARF11-Seita.3G028100.1-Sobic.006G255300.1-Bra-di5g25157.1), the ratio of orthologous gene number between species is 1 : 1 which suggests that the functions might be well-conserved across species. Orthologous clusters with ratios greater or smaller than 1 : 1 were also found, indicating the functional diversity of ARF genes in switchgrass.
In general, the members of a subgroup are characterized by the presence of conserved domains. According to previous studies, the ARF genes contain several conserved domains, such as motifs 1, 2, and 9 made up the DNA-binding domain,  Switchgrass is an important resource for bioenergy and feedstock materials, and biomass yield is the most important target in molecular breeding of switchgrass. Comprehensive analysis of PvARF gene expression patterns helped us screen for candidate PvARF genes with potentially distinct functions in regulating vegetative organ growth and development. Taken together, 15 pairs of PvARF genes were detected to have high levels of expression in vegetative organs. Similar patterns of expression were also found in tea plants [48], apple [47], and tomato [9]. 13 of the 15 CsARF genes were expressed in root, stem, leaf, flower, and fruit [24]. Eight of the 31 MdARF genes were expressed in stem, leaf, flower, and fruit [46], and 17 SlARF genes were expressed in root, stem, leaf, flower bud, and ovary [9]. In our study, 14 pairs of PvARFs were more highly expressed in the I4 than in the I2 and I3, which suggests that these genes might play vital roles during the formation of young stems, and these results are consistent with the reported function of their homologous genes in Arabidopsis [26,49]. However, in leaves, the expression level of most PvARFs did not change significantly in different developmental stages. PvARF11/12 is most highly expressed in the fourth leaf, suggesting that these genes may play roles in leaf development like their Arabidopsis   Figure 7: The expression of PvARF genes in response to treatment with 5 μM NAA solution for 1, 2, and 3 hours. Control plants were grown in hormone-free medium. Error bars represented variability of qRT-PCR results from three replicates. Data are means ± SE of three separate measurements. Statistically significant differences were assessed using Student's t-tests ( * * P < 0 01).
homologs, ARF7 and 19 [31]. Of the 47 PvARF genes, 17 were not expressed in the tested tissues, which indicates that they might not function in these organs, or that the functions of these genes may have been lost during evolution. In general, most PvARF genes have different expression profiles in the internodes and leaves, indicating that they might be regulated by the distribution and concentration of endogenous auxin. The in-depth studies will be needed to confirm these results in future. Because ARF proteins are transcription factors that regulate the expression of auxin response genes, we determined the response of PvARF genes to NAA treatment. The regulation of gene expression in response to auxin has been reported in Arabidopsis [3,31], rice [20,32,34], maize [21], tomato [2], Medicago [20], and so on. In this study, we found that at least 14 pairs of PvARF genes were responsive to NAA treatment in seedlings but showed diverse expression patterns. Eleven pairs of PvARF genes (PvARF5/6, 7/8, 11 /12, 15/16, 29/30, 35/36, 37/38, 40/41, 42/43, 44/45, and 46/47) were downregulated by exogenous auxin treatment across all time points, indicating that their expression was negatively regulated by NAA, similar to their homologs in rice and maize (OsARF5, 14, and 21 and ZmARF5 and 18), of which expression levels decreased marginally in response to auxin [21,32]. In contrast, the other three pairs of PvARF genes (PvARF3/4, 23/24, and 25/26) were upregulated by auxin treatment at 1 hour point and then downregulation at later time points, indicating that NAA significantly induced the target gene in a short period of time, as their homologs in Arabidopsis, rice, and maize (AtARF4, 19; OsARF1, 23; and ZmARF3, 8, 13, 15, 21, 27, and 30). In brief, expression of these ARF genes increased slightly in response to auxin [21,31,32], implying that these genes are potential primary auxin responsive genes. Generally, the expression level of the ARF genes was directly regulated by auxin. Considering that the endogenous auxin concentration is sufficient for plant growth and development, the extra auxin (NAA) applied exogenously might act as inhibitor of auxin response genes in our study, and the further study will be carried out in the future to clarify the mechanism of auxin response in grasses.

Conclusions
We identified 47 switchgrass ARF genes and established the evolutionary relationship between these genes using phylogenic, gene structure, and conserved protein motif analyses. Expression analyses revealed the potential role of PvARF genes involved in growth and development of switchgrass internodes and leaves and in response to NAA treatment in seedlings. These data provide a solid foundation for future functional characterization of ARF genes and ARF-mediated signal transduction pathway in switchgrass.