Evolutionary Pattern and Regulation Analysis to Support Why Diversity Functions Existed within PPAR Gene Family Members

Peroxisome proliferators-activated receptor (PPAR) gene family members exhibit distinct patterns of distribution in tissues and differ in functions. The purpose of this study is to investigate the evolutionary impacts on diversity functions of PPAR members and the regulatory differences on gene expression patterns. 63 homology sequences of PPAR genes from 31 species were collected and analyzed. The results showed that three isolated types of PPAR gene family may emerge from twice times of gene duplication events. The conserved domains of HOLI (ligand binding domain of hormone receptors) domain and ZnF_C4 (C4 zinc finger in nuclear in hormone receptors) are essential for keeping basic roles of PPAR gene family, and the variant domains of LCRs may be responsible for their divergence in functions. The positive selection sites in HOLI domain are benefit for PPARs to evolve towards diversity functions. The evolutionary variants in the promoter regions and 3′ UTR regions of PPARs result into differential transcription factors and miRNAs involved in regulating PPAR members, which may eventually affect their expressions and tissues distributions. These results indicate that gene duplication event, selection pressure on HOLI domain, and the variants on promoter and 3′ UTR are essential for PPARs evolution and diversity functions acquired.


Introduction
Peroxisome proliferators-activated receptors (PPARs) are transcription factors belonging to the ligand-activated nuclear receptor superfamily, which play key roles in regulating metabolism, inflammation, and immunity. In vertebrates, the gene family of PPAR consisted of PPAR , PPAR (also called PPARb/d or PPAR ), and PPAR [1]. Recently, a considerable number of papers have reviewed their importance in functions within various physiological and biochemistry processes [2][3][4][5]. Their special effects and functional manners of depending on a ligand-activated way even have attracted some scientists to consider them as a drug target for therapy of some metabolic disorders, such as the type 2 diabetes mellitus and atherosclerosis [6].
It has been well established that the PPARs can be divided into five distinct functional regions, which include DBD (DNA-binding domain), LBD (ligand-binding domain), AF1 (activation function 1), AF2 (activation function 2), and a variable hinge region. The DBD and LBD consist of a highly conserved DNA-binding domain and a moderately conserved ligand-binding domain, respectively. The AF1 and AF2 are two ligand-independent activation function domains. All these regions except the variable hinge region are highly conserved among PPAR members and are responsible for keeping their functions [3]. Although the PPARs share high similarities with each other in structures, they exhibit distinct patterns of distribution in tissues and differ in functions [7]. It has been summarized that PPAR mainly is involved in the oxidation process of hepatocytes, PPAR mainly targets within the adipocyte proliferation, and PPAR plays essential roles in origination and fate determination of preadipocyte. In adult rat, it has shown that PPARs had different expression patterns [8]. Definitely, PPAR is highly expressed in hepatocytes, cardiomyocytes, enterocytes, and the proximal tubule cells of kidney, PPAR is expressed ubiquitously and often at higher levels than PPAR and PPAR , and PPAR is expressed predominantly in adipose tissue and the immune tissues [4].
It is interesting to investigate why PPARs exhibit distinct patterns of distribution in tissues and differ in functions even if they share high similarity of regions. There may be at least two main aspects of molecular reasons accounting for their differences. Firstly, it could be explained by the molecular evolutionary process, for example, the gene duplication event and the selective patterns. PPAR gene family as one of the nuclear hormone receptor (NHR) superfamilies evolves together with other NHR members. It has been demonstrated that a large number of NHR members are likely to result from two waves of gene duplication events. The first wave occurs before the arthropod/vertebrate divergence and has generated the ancestors of the NHR subfamilies, for instance, PPARs, RARs, and RXRs. The second wave of duplication is vertebrate-specific and leads to a diversification inside the subfamilies, with the emergence of the presently known isotypes such as PPAR , PPAR , and PPAR [3,7]. However, it is still unknown which one is the common ancestor gene in PPAR members, and what the impacts of PPARs divergence on their functions are. Secondly, the special transcriptions factors binging in the promoter regions and the miRNAs target at 3 UTRs of PPARs may be responsible for the distinct patterns of distribution in tissues. Numerous reports have established the basis for gene expression patterns in distribution by predicting and comprising the transcription factors and miRNAs of interested genes [9].
Therefore, in this present study, we took advantage of the availability of gene sequence data to analyze the PPAR gene family based on a view of molecular evolutionary relationship by deducing the possibility of evolution in PPAR gene family, as well as by predicting and comparing their transcription factors and miRNAs to primarily understand the reasons for diversity functions and distinct patterns of expressions in tissues of PPAR members. These analyses may contribute to a comprehensive understanding for the functions of PPAR gene family.

PPAR Gene Homology Sequence Collection. The Genomic
Blast function (http://blast.ncbi.nlm.nih.gov/Blast.cgi) was used for collecting homologous sequences of PPAR gene family members in species. The parameters were set as the default value. For the minority of the PPAR gene sequences unfound by blast, we separated supplement in the website of Nucleotide database (http://www.ncbi.nlm.nih.gov/nuccore/) by manually using keywords. Through blasting the homology sequences of PPAR , PPAR , and PPAR on NCBI, we finally obtained 63 homology sequences that belong to 31 species (Table S1 in Supplementary Material available online at http://dx.doi.org/10.1155/2015/613910). Most of these sequences were from mammals, and a few of them were obtained from fish and birds. These collected sequences were edited and aligned by the MegAlign in DNAStar (Madison, Wisconsin, USA).

Search for Protein Domains.
The open reading frames (ORF) of PPAR sequences in different species were predicted using online software (http://www.ncbi.nlm.nih.gov/gorf/or- fig.cgi). Next, these ORF sequences were confirmed by Pfam (http://pfam.sanger.ac.uk/). Only if there were homology amino acid sequences blasted, Pfam would show the ORF sequences being correctly predicted. Furthermore, the correct amino acid sequences were entered into SMART (http://smart.embl-heidelberg.de/) platform for a prediction of protein structure domain.

Construction of Phylogenetic
Tree. The format of each PPAR homologous protein sequence was edited by BioEdit software [10]. Then, the protein sequences were used for constructing phylogenetic tree through a model of maximum likelihood method (ML) by Mega 5.1 [11]. The topological stability of the maximum likelihood tree was evaluated by 1000 bootstrap replications. The Atlantic salmon PPAR protein sequence (NM 001123546.1) was selected as the outgroup of the protein phylogenetic tree.

Amino Acid Site Selection Pressure Analysis.
The sequences of two conserved protein domains (ZnF C4 and HOLI domains) were chosen and compared by BioEdit, and then they were classified and merged. According to the analysis of Bayesian tree phylogeny, we used the site model in PAML software package in Codeml program [12] to analyze these two domains.
The site model was constructed to test whether PPAR gene is subjected to positive selection ( > 1) or negative selection ( < 1) [13]. This model allows different sites to have different selection pressure, while there is no difference in different branches of the phylogenetic tree. The models named M1a (neutral) and M2a (selection) [13,14] in the current study were used twice the log-likelihood difference (2Δ ) following 2 distribution of likelihood ratio test (LRT), the difference degree of freedom for the two parameters of the model number.

Analysis of Transcription Factors.
By using Gene (http:// www.ncbi.nlm.nih.gov/gene/) of the NCBI, the location of the PPAR gene was determined on the chromosome corresponding species. And then, we confirmed the first exon of the PPAR gene transcription initiation site on a chromosome. Sequence about 1000 bp was selected to use as the predicted promoter regions from the upstream of the first exon. On the TRANSFAC, the Alibaba (http://www.gene-regulation.com/ pub/programs/alibaba2/index.html) can estimate transcription factor binding sites (TFBS) in unknown DNA sequences.  region of PPAR members of human was searched for miR-NAs. The search results were sorted in the miR2Disease Base (http://www.mir2disease.org/) for predicting functions of the predicted conservative miRNAs. Table S1, the coding regions of all PPAR nucleotides were in average length of about 1400 bp, which encode about 466 amino acids. The average length of nucleotides of PPAR coding domain is 1406 bp, whereas the average length of nucleotides of PPAR is 1284 bp which is lower than the average value of the entire PPAR family. The nucleotide of PPAR is 1479 bp which is obviously higher than the average value.

The Unique Homology and Conserved Domains in PPAR Gene Family. As it was shown in
The protein domains were predicted corresponding to each sequence in the coding region through SMART. The PPAR coding domain sequences in 7 representative species including human, xenopus, zebrafish, chicken, dog, pig, and mouse were obtained for a further analysis ( Figure 1). The data demonstrated that all PPARs family members contained the ZnF C4 and HOLI domains, which are conserved among species. In addition to the conserved domains, low complexity 1 and low complexity 2 regions (LCRs) were in great differences among PPAR members and species. In PPAR , it was found that LCR2 widely existed in most species, and LCR1 only existed in mice. It is also worth noticing that more than half of the studied species contained the LCRs domains in PPAR , except for the absence of LCR2 in xenopus. In PPAR , the LCR2 domain was only found in zebrafish, whereas the LCR1 domain was absent in all studied species.

The Phylogenetic Tree of PPAR Gene Family.
In order to investigate the homologous relationships among PPAR gene family members, we constructed phylogenetic tree based on the amino acid level. The phylogenetic tree was constructed based on the 63 amino acid sequences from 31 species (Table  S1), and the results were shown in Figure 2. The orthologs of PPAR members from fishes were placed at the base of the three branches of the tree. Furthermore, the PPAR genes were spitted into three lineages (support value = 100%). Through the branches and distances of the phylogenetic tree, PPAR and PPAR were clustered together. The branch of PPAR stood alone and was closer to the outgroup than the other two branches. PPAR might be the earliest ancestor form of the PPAR gene family. According to the classification, it suggested that the first independent duplication event may occur in bony fishes before separation from the birds and mammals during the whole evolutionary process of PPAR gene family. And after a second duplication event, the isolated types of PPAR and PPAR may emerge as the paralogs of PPAR .

Selection Pressure of Amino Acid Residues in PPAR Gene.
To determine the selection states of each amino acid site in conserved structure of PPARs during the evolution process, the tools of selective pressure were used for investigating the different selection patterns based on the conserved motifs of ZnF C4 domain and HOLI domain, which were widely included and conserved in PPAR gene family. In branchsite models (Table 1), we found the estimated value ≥ 1 with the M2a model for HOLI domain and ZnF C4 domain. It suggests that PPAR genes were under positive selection. By the LRT test, M1a and M2a were compared with their corresponding null models (M0), respectively. The results suggested that M2a ( < 0.05) was more in coincidence with the data than M1a ( > 0.05). What is more, the LRT tests of all PPAR members were different. The HOLI domain could be accepted by M2a, indicating a positive selection pressure of HOLI domain during the molecular evolution process, whereas the ZnF C4 domain was rejected.
In a 95% posterior probability, the results (Figures 3(a)  and 3(b)) showed that the positive selection sites in PPAR HOLI domain were 118G, 137S, and 143I, in PPAR HOLI domain 20S, 21S, 58S, and 117P, and in PPAR HOLI domain 16S and 75G, whereas in the ZnF C4 domain, there were no positive selection sites observed in all PPAR members, except for only one suspected amino acid residue with value between 0.5 and 1 observed in ZnF C4 domain of PPAR and PPAR , respectively. In PPAR ZnF C4 domain, there were no positive selection sites observed either.

Prediction of Transcription Factors.
The transcription factors and their binding sites in promoter regions of PPAR gene family were predicted in human and chicken, respectively, and the results were listed in Table S2. In chicken, 45, 44, and 39 transcription factors were predicted and targeted at the promoter regions of PPAR , PPAR , and PPAR , respectively. In human, only a total of 31, 36, and 40 transcription factors have been predicted at promoter regions of PPAR , PPAR , and PPAR , respectively, which were different from it in chicken.
Through comparing transcription factors, we found that numerous common transcription factors existed among PPAR members. Then they were compared pairwise among the three PPAR members, and the results were listed in Table 2. The PPARs shared 9 common transcription factors which were targeted at the promoter regions, including Sp1, CPE bind, CP1, Oct-1, GATA-1, AP-2 , NF-1, GR, and C/EBP in human, while in chicken, 11 common transcription factors were predicted and targeted at the promoter regions of chicken PPARs, which included CREB, SRF, ICSBP, Ftz, AP-1, Oct-1, GATA-1, AP-2 , NF-1, GR, and C/EBP . However, the binding sites for each common transcription factor were varied among PPAR members.
Finally, we quantified the coexisting transcription factors among PPAR members (Table 3). In human, the amount of the identical transcription factors between PPAR and PPAR was 18, while the amount between PPAR and PPAR is 16. The number of identical transcription factors of PPAR and PPAR was 12. In chicken, the group of PPAR / and PPAR / shared 20 and 15 identical transcription factors, respectively.

Prediction of miRNAs Target at the 3 UTR Region of PPAR
Members. The miRNAs in 3 UTR of PPAR members were predicted in human. The results (Table S3) showed that, in the 3 UTR region of PPAR , a total of 23 conserved binding sites of miRNAs were predicted in vertebrates, and 4 conserved sites of miRNA families were predicted in mammals. In the 3 UTR region of PPAR (Figure 4(b)) and PPAR (Figure 4(c)), 5 and 3 conserved sites of miRNA families were predicted in vertebrates, respectively. Notably, the miR-17 and miR-9 were predicted in both 3 UTR regions of PPAR and PPAR , and the miR-27abc and miR-128 were predicted in both 3 UTR regions of PPAR and PPAR (Figure 4(a)).
The functions of these miRNAs were enriched in PUB-MED online. Among the 27 miRNA families, the vast majority were closely related with cancer. For example, the miR-142-3p [15], miR-19a [16], and miR-124 [17] were reported to be involved in hepatocellular carcinoma; the miR-9 [18] targeting to the 3 UTR region of PPAR was associated with Hodgkin's lymphoma. In the 3 UTR region of PPAR , the miR-138 [19] were reported to be linked to anaplastic thyroid carcinoma; the miR-17 [20] was related to B-cell lymphoma; the miR-29c [21] was interrelated with chronic lymphocytic    Note: selection pressure on amino acid sites of the inspection is based on the calculation of / ( ), where is nonsynonymous coding sequences of each base mutation rate (nonsynonymous substitution rate) and is a synonymous mutation rate (synonymous substitution rate). When the > 1, the gene is by positive selection; = 1, no selection pressure; < 1, by purifying selection. leukemia. In the 3 UTR region of PPAR , the miR-128 [22] was associated with glioma.

Discussions
One new gene is mainly generated by the gene or genome duplication event [23]. PPARs as one of the NHR superfamilies evolve together with other NHR members, and after it has undergone twice time of gene duplication events, the vertebrate-specific PPAR is eventually diverged into three different isotypes [3,7]. The phylogenetic tree of PPARs in the present study demonstrated that PPAR gene family may have yielded a gene duplication event, which first occurs in bony fishes before separation from the birds and mammals during the whole evolution process. PPAR is closer to the outgroup than the other two branches, supporting that PPAR might be the original ancestor gene in PPAR gene family. After being firstly duplicated in fish, PPAR begins to divide into two subtypes, including the PPAR and the common ancestor of PPAR and PPAR . These findings are consistent with the previous studies by Michalik et al., which depicted an evolutionary process of PPARs. Moreover, PPAR and PPAR were clustered closer than others, supporting that they may originate from a homology ancestor gene, and their divergence may result from another gene duplication event in vertebrates; however, there is no sufficient evidence to support this hypothesis currently.
Following the gene duplication event in PPARs, the newly emerging receptors would have acquired the ligand binding capacities in an independent fashion [24]. Once such capacity was acquired, each receptor of PPARs may begin to further evolve and refine its specificity for a given ligand. Each PPAR isotype may then evolve by mutations, which lead to a more specific range of ligands across species. These hypotheses could be supported by the sequence variants among PPARs across species in the present study. Our results showed that all PPAR members contained the conserved HOLI and ZnF C4 domains, which are important for keeping the functions of PPAR gene family. HOLI domain located in N-terminal of the PPAR protein is also known as ligand binding domain of hormone receptors [25]. It belongs to the LBD region that acts in response to ligand binding, which caused a conformational change in the receptor to induce a response, thereby acting as a molecular switch to turn on transcriptional activity [26].
In addition, ZnF C4 domain is also called C4 zinc finger in nuclear hormone receptors. This domain was the DBD region, which recognizes specific sequences, connected via a linker region to a C-terminal LBD. Both HOLI and ZnF C4 domains are highly conserved among PPAR members and are responsible for keeping their basic functions for PPAR family members.
In addition to the two conserved domains, PPAR family contained low complexity regions (LCRs). LCRs located near the left of ZnF C4 domain are in great differences among PPAR members across species. Studies suggested that the positions of LCRs within a sequence might be important to both determine their binding properties and maintain biological functions [27]. There are no LCRs existing in PPAR , suggesting that PPAR might only keep the basic function of PPAR family. The number of LCRs in PPAR and PPAR is similar and obviously more than PPAR , indicating differential functions of PPAR and PPAR from PPAR . The results showed that the variants in LCRs might be involved in the diversity functions of PPAR members and supported a common origin of PPAR and PPAR .
Due to the reason that ZnF C4 and HOLI domain are important for keeping roles of PPAR members, we used patterns of selection pressure to analyze the adaptive evolution of the conserved protein sequences. The results showed that the HOLI domain was selected under a natural pressure in the evolutionary process, whereas the ZnF C4 domain was not. It showed that ZnF C4 domain was more conservative than HOLI domain in PPAR family, supporting a more important role of PPAR zinc finger in keeping PPARs' functions [28]. The HOLI domain in PPAR with the most amounts of positive selection sites among PPAR members suggested that the variations in these positive selection sites were more beneficial for PPAR phylogenetic towards diversity functions. Studies have confirmed that these chemical properties of amino acid residues were important to sustain normal protein folding and keep functions [29]. For instance, sulfhydryl groups of the peptide chain of two cysteines (cysteine, referred to as S) form two disulfide linkages with oxidation reaction. Whether it breaks or reshapes into a new one, it also could adjust protein to perform certain function [30]. Therefore, it can be inferred that the nucleotide variants in HOLI domain could be responsible for diversity functions of PPAR members. In a 95% posterior probability, the positive selection sites were 118G, 137S, and 143I in PPAR HOLI domain, were 20S, 21S, 58S, and 117P in PPAR HOLI domain, and were 16S and 75G in PPAR HOLI domain. It is interesting to point out that the positive selection sites in HOLI domain of PPAR and PPAR share more similarity in locations and amino acid residues, supporting a homology function of PPAR and PPAR .
The regulatory mechanism of gene expressions plays an important role in tissue distribution and distinct biological functions of genes. In eukaryotes, most genes are initiated and transcribed by lots of specific transcription factors targeting at their promoter regions [31]. Through predicting the transcription factors and their binding sites in promoter region of PPARs, we found that the transcription factors were varied among PPAR members in human and chicken, which may account for the specific tissue expression and distinct functions of PPARs. Some of these predicted transcription factors and their regulatory effects on PPARs are consistent   with the previous reports; for example, the transcription factors AP1 and NF-kB were proved to enhance the expression of PPAR activity [32]. Some of these transcription factors are also tissue specific, for example, the SP1 expressed in adenocarcinomas of the stomach [33], CP1 highly expressed in liver, kidney, and intestine but weakly expressed in adrenals and in lactating mammary glands [34,35], and NF-1 detected in brain, peripheral nerve, lung, colon, and muscle [36], and so forth. It can be speculated that the variants in the promoter regions of PPAR and PPAR result into differential transcription factors binding on them that eventually influence their expressions and tissues distributions. Additionally, there are 18 common transcription factors between PPAR and PPAR , whereas the PPAR shared the least amount of common transcription factors with the other two members, which may contribute to the similarity in expression characteristics between PPAR and PPAR .
The miRNA can combine with the target mRNA by base pair, which leads to degradation or inhibition of the quantity levels of the target mRNA, thereby regulating gene expressions [37]. The regulation of miRNA on gene expressions is another path shaping gene expression patterns and biological processes [38]. In the present study, the miRNAs and their targets sites in 3 UTR region of PPARs were predicted, and it was observed that the quantity of miRNAs was obviously differential in PPAR members. The number of miRNAs predicted in PPAR was significantly more than the other two members. Moreover, it was worth noticing that most of the miRNAs were predicted in PPAR , only a minority of them predicted in at least two PPAR isotypes; for example, only miRNA-128 was found in PPAR and PPAR and miRNA-9 was found in PPAR and PPAR . These differences may be correlated with the distinct functions of PPAR isotypes, and PPAR may be regulated by miRNAs in a much more complex way than the other two PPARs.

Conclusions
In the present study, the evolutionary pattern and regulation characteristics of PPARs were analyzed. The three isotypes of PPAR gene family may emerge from twice times of gene duplication events. PPAR might be the original ancestor gene in PPAR gene family. The conserved domains of HOLI domain and ZnF C4 are essential for keeping basic roles of PPAR gene family, and the variant domain of LCRs may be responsible for their divergences in functions. The positive selection sites in HOLI domain are beneficial for PPARs to evolve towards diversity functions. The variants in the promoter regions and 3 UTR of PPARs resulted into differential transcription factors and miRNAs involved in regulating PPAR members that may eventually influence their expressions and tissue distributions. Conserved sites for miRNA families broadly conserved among vertebrates miR-19ab miR-21/590-5p