Role of Positive Selection in Functional Divergence of Mammalian Neuronal Apoptosis Inhibitor Proteins during Evolution

Neuronal apoptosis inhibitor proteins (NAIPs) are members of the Nod-like receptor (NLR) protein family. Recent research demonstrated that some NAIP genes are strongly associated with both innate immunity and many inflammatory diseases in humans. However, no similar phenomena have been reported in other mammals. Furthermore, some NAIP genes have undergone pseudogenization or have been lost during the evolution of some higher mammals. We therefore aimed to determine whether functional divergence had occurred and whether natural selection had played an important role in the evolution of these genes. The results showed that NAIP genes have undergone pseudogenization and functional divergence, driven by positive selection. Positive selection has also influenced NAIP protein structure, resulting in further functional divergence.


Introduction
Increasing interest has recently focused on the nucleotide-binding and oligomerization domain (Nod)-like receptors (NLRs) as important pattern recognition receptors (PRRs) [1,2]. In humans, the NLR family is composed of 22 intracellular PRRs with carboxy-terminal leucine-rich repeat (LRR) regions [3][4][5]. The NLR family is divided into three subfamilies based on phylogenetic distribution, each characterized by a specific molecular structure [6,7]. NALPs represent the largest NLR subfamily, for which 14 genes have been identified in humans. NALP proteins harbor a NACHT domain, some LRRs, and an amino-terminal pyrin domain (PYD). The second subfamily comprises NLRs containing a caspase recruitment domain (CARD), such as NOD1, NOD2, NOD4, and CIITA. This subfamily also contains a slightly separated group, containing NOD3 and NOD5/NLRX1 [1,5]. NOD5/NLRX1 has no defined N-terminal domain, whereas at least one splice variant of CIITA was reported to harbor a CARD.
NAIP belongs to the third subfamily, which is designated baculovirus inhibitor of apoptosis repeat-containing protein 1 (BIRC1). This group is characterized by one or three baculovirus inhibitor of apoptosis repeat (BIR) domains. In humans, NAIP is a member of the inhibitor of apoptosis protein (IAP) family and has been cloned as a candidate gene for the neurodegenerative disorder spinal muscular atrophy (SMA) [8][9][10]. Some NAIPs, either alone or in combination with ICE-protease-activating factor (IPAF), have been shown to be involved in the formation of inflammasomes [11,12]. NAIPs also appear to be involved in the pathogenesis of Alzheimer's disease, Down's syndrome, multiple sclerosis, and Parkinson's disease [13][14][15][16]. NAIPs, but not IAPs in general, have been identified as cancer targets [17][18][19][20][21]. A potential role of some NAIPs in innate immunity has also been reported in the mouse, where NAIP polymorphisms determined whether macrophages restrict or support intracellular replication of Legionella pneumophila (L. pneumophila) and whether mice are resistant or (moderately) susceptible to Legionella infection [22].

Figure 1: Manually corrected alignments. (a) Alignments before manual correction; (b) alignments after manual correction. Alignments were constructed using Clustal X and shaded with Boxshade (http://www.ch.embnet.org/software/BOX_form.html). Functional domains are designated as BIR, P-loop NTPase, and LRP. "*" indicates highly conserved residues. Functional (f) or structural (s) residues are marked (f indicates highly conserved and exposed; s indicates highly conserved and buried); "#" indicates amino acids that changed during evolution.
Overall, human NAIP genes play an important role not only in innate immunity but also in many inflammatory diseases. However, although NAIP genes are strongly associated with many inflammatory diseases in humans, no similar situation has been found in other mammals [23][24][25]. Furthermore, some NAIP genes have undergone pseudogenization or been lost during the evolution of higher mammals. In this context, the present study aimed to determine whether functional divergence had occurred and whether natural selection had played an important role in the evolution of these genes.

Sequences.
The protein sequences of all known NAIPs were retrieved from GenBank (Reference Proteins, RefSeq protein) by PSI-BLAST, using the human NAIP isoform 1 protein sequence as the query [26]. Coding gene sequences for NAIP proteins were retrieved from GenBank (http://www.ncbi.nlm.nih.gov/).

Phylogenetic Analysis.
Redundant sequences were removed using DAMBE software. Multiple sequence alignments were performed using Clustal X [27] and then manually corrected, and the alignments were shaded using the Boxshade program (http://www.ch.embnet.org/software/BOX_form.html). All alignment gaps were removed before phylogenetic analysis. Phylogenetic trees were reconstructed using the neighbor-joining (NJ) and minimum-evolution (ME) methods with MEGA 4.0 [28]. Bootstrap values were estimated from 1000 replicates.
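The gap-removal step can be illustrated with a short sketch. This is not the procedure implemented in MEGA or Clustal X; it is a minimal Python example that assumes the alignment is held as a dictionary of equal-length strings, and the function name `remove_gap_columns` and the toy sequences are hypothetical.

```python
def remove_gap_columns(aligned, gap_chars="-."):
    """Drop every alignment column in which at least one sequence has a gap.

    `aligned` maps a sequence name to its aligned sequence (assumed layout).
    """
    if not aligned:
        return {}
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    names = list(aligned)
    # Transpose the alignment so that each tuple is one column.
    columns = zip(*(aligned[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    # Rebuild each sequence from the retained columns.
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Toy alignment (hypothetical residues, not NAIP data).
toy = {"human": "MAT-QE", "mouse": "MASKQE", "cow": "MA-KQD"}
print(remove_gap_columns(toy))
```

Removing whole columns rather than individual gap characters keeps the remaining positions homologous across all sequences, which is what distance-based methods such as NJ and ME require.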

Analysis of Functional Divergence of NAIP Genes.
The DIVERGE 2.0 program [29] was used to estimate type I functional divergence [30] between different clusters of NAIP genes. Type I sites are amino-acid residues that are conserved in one cluster but highly variable in another, which suggests that these residues have been subjected to different functional constraints. Statistically, the functional divergence between two clusters was measured as the coefficient of functional divergence, θ (ranging from 0 to 1). The null hypothesis of θ = 0 indicated that the evolutionary rates of two duplicate genes at each site were virtually identical [31,32]. If the null hypothesis was rejected, a site-specific profile was used to predict the critical amino-acid residues most likely to be responsible for the functional divergence detected.
The phylogenetic tree used for DIVERGE 2.0 was reconstructed by MEGA 4.0 with the ME method [28]. The coefficients of functional divergence (θ) between gene clusters were calculated by model-free estimation (MFE) and by maximum-likelihood estimation (MLE) under a two-state model to detect amino-acid residues reflecting functional divergence.
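The idea of a type I site (conserved in one cluster, variable in the other) can be illustrated with a simple column scan. This is a crude heuristic for intuition only and is not the θ estimator or the posterior site profile computed by DIVERGE 2.0; the function names, the thresholds, and the toy sequences are all assumptions.

```python
from collections import Counter


def column_conservation(column):
    """Fraction of sequences that share the most common residue in a column."""
    counts = Counter(column)
    return counts.most_common(1)[0][1] / len(column)


def candidate_type1_sites(cluster_a, cluster_b, conserved=1.0, variable=0.5):
    """Return 0-based positions conserved in one cluster and variable in the other.

    Both clusters are lists of gap-free aligned sequences of equal length.
    The thresholds are illustrative and are not DIVERGE parameters.
    """
    sites = []
    for i, (col_a, col_b) in enumerate(zip(zip(*cluster_a), zip(*cluster_b))):
        cons_a = column_conservation(col_a)
        cons_b = column_conservation(col_b)
        a_only = cons_a >= conserved and cons_b <= variable
        b_only = cons_b >= conserved and cons_a <= variable
        if a_only or b_only:
            sites.append(i)
    return sites


# Toy clusters (hypothetical residues, not NAIP data).
cluster_1 = ["MKLV", "MKLV", "MKLV", "MKLV"]
cluster_2 = ["MRLA", "MQLV", "MKLG", "MELS"]
print(candidate_type1_sites(cluster_1, cluster_2))
```

In this toy case, positions 1 and 3 are invariant in the first cluster and highly variable in the second, which is the pattern a type I analysis is designed to detect.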

Test for Selective Force.
Patterns of molecular evolution were assessed for the NAIP gene subfamily using MEGA 4.0 [33]. Clustal X alignments of nucleotide sequences were used as inputs in the analysis. The dN and dS values were calculated within every cluster, using the modified Nei and Gojobori method [34,35]. A dN/dS ratio > 1 suggested that the gene had undergone positive selection.
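The dN/dS criterion can be sketched as follows. The example assumes that the numbers of synonymous and nonsynonymous sites and differences have already been counted (the counting step, which is where the modified Nei and Gojobori method accounts for the transition/transversion ratio, is not reproduced here) and applies the Jukes-Cantor correction to each proportion. The function names and the input counts are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (dN, dS, dN/dS) from pre-counted differences and sites."""
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    if d_s == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n, d_s, d_n / d_s


# Hypothetical counts for one pair of sequences (not taken from the study).
d_n, d_s, omega = dn_ds(syn_diffs=5, syn_sites=100, nonsyn_diffs=30, nonsyn_sites=300)
print(f"dN={d_n:.4f} dS={d_s:.4f} dN/dS={omega:.2f}")
print("consistent with positive selection" if omega > 1 else "no evidence of positive selection")
```

A ratio above 1 means that nonsynonymous substitutions accumulate faster per site than synonymous ones, which is the signature the test looks for.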

Model Prediction for NAIPs.
Model prediction for NAIPs was conducted using the I-TASSER server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/).

After removal of redundant sequences using DAMBE (data not shown), the 16 remaining sequences were aligned using Clustal X. The alignments were manually corrected (Figure 1). The final dataset of 16 sequences comprised two from human, two from rat, one from horse, one from cow, seven from mouse, two from chimpanzee, and one from monkey (Table 1). Alignments of the 16 NAIP sequences showed that all NAIP sequences included three BIRs (except for human NAIP isoform 2, chimpanzee BIRC1 isoform 1, and monkey BIRC1), a NACHT domain, and several LRRs (Figure 1 and Table 1). NACHT belongs to the AAA+ NTPase superfamily, a sister group of another family of ATPases. The NACHT domain included seven conserved motifs: the ATP/GTPase-specific P-loop and the Mg2+-binding site (Walker A and B motifs, respectively), and five specific motifs (motifs I-V).

NAIP Genes
Phylogenetic trees for NAIPs are shown in Figure 2. The trees were reconstructed by two different methods (ME and NJ) with high bootstrap values. Two major clusters (I and II) of NAIPs were statistically supported. The plant NBS LRR was used as the root. Cluster I mainly included rodent NAIPs, namely, mouse NAIPs 1, 2, 5, 6, and 7 and rat NAIPs 2 and 5. Cluster II contained cow, horse, monkey, chimpanzee, and human NAIPs. NAIPs 1, 6, and 7 were mouse-specific and could be related to the reproduction-related NAIP 2 or NAIP 5, which indicated the possible origin of the ancestral NAIP 2 and 5 genes. Mouse NAIP 5 included three copies, NAIP 5, NAIP 6, and NAIP 7, each with high bootstrap values. This indicated that these gene copies were produced by gene duplication and shared similar functions. Mouse NAIP genes had been extensively duplicated, and the duplicates were located on mouse chromosome 13. No other instances of gene duplication were detected in higher mammals [36,37]. The present study aimed to determine whether the gene duplications were lost or pseudogenized in higher mammals. Human NAIP genes were used as queries against the human genome, which identified nine pseudogenes on chromosome 5.

Note: C-score is a confidence score for estimating the quality of models predicted by I-TASSER. It is calculated based on the significance of threading template alignments and the convergence parameters of the structure assembly simulations. C-score is typically in the range of [−5, 2], where a higher C-score signifies a model with higher confidence and vice versa. TM-score and RMSD are standard measures of structural similarity between two structures and are usually used to measure the accuracy of structure modeling. A TM-score > 0.5 indicates a model of correct topology.
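For readers unfamiliar with the TM-score mentioned in the note above, the following sketch shows how it is computed from per-residue distances. It assumes the two structures have already been superimposed and that `distances` holds the distance in angstroms for each aligned residue pair; the full TM-score additionally searches for the superposition that maximizes the score, which is not done here. The lower bound of 0.5 on the distance scale is included as an assumption for short chains, and the function names are hypothetical.

```python
def d0_scale(target_length):
    """Length-dependent distance scale used to normalize the TM-score."""
    if target_length <= 15:
        return 0.5  # assumed floor; the cube-root formula is not usable here
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    `distances` lists the distance (in angstroms) between each aligned residue
    pair; residues of the target with no aligned partner contribute zero.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = d0_scale(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical examples for a 100-residue target.
perfect = tm_score([0.0] * 100, 100)   # every pair coincides
partial = tm_score([0.0] * 40, 100)    # only 40 residues aligned
print(perfect, partial, partial > 0.5)
```

Because the score is normalized by the target length and by a length-dependent scale, it is less sensitive to protein size than RMSD, which is why the 0.5 threshold can be applied across proteins of different lengths.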

Functional Divergence of NAIP Genes during Evolution.
Functional divergence was estimated using the DIVERGE 2.0 program to determine whether functional divergence had occurred between different species during NAIP evolution. After alignment and removal of the gaps, a total of 1324 amino-acid sites were included in the analysis. Pairwise comparisons of the two NAIP clusters were conducted, and the rate of amino-acid evolution at each sequence position was estimated. The MFE θ was 0.69 ± 0.08, and the MLE θ was 0.67 ± 0.09. Functional divergence between the two clusters was significant (θ > 0 with P < 0.001), indicating that site-specific changes in selective constraints might contribute to the functional evolution of NAIP genes. Furthermore, the important amino-acid residues responsible for functional divergence were predicted by calculating the site-specific profiles based on posterior analysis of all pairs of clusters with functional divergence. Cutoff values were established to extensively reduce false positives, by progressively removing the highest scoring residues from the alignments until θ dropped to zero. The number of such residues (RFD) was 319, covering 24.1% of the total of 1324 aligned sites, with a cutoff of 0.93. RFD were generally detected in all functional domains of NAIPs, implying that shifts in functional constraints might have acted on every protein domain.
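The figures reported above can be checked with a few lines of arithmetic. The sketch below assumes that the ± values are standard errors and uses a normal approximation (θ divided by its standard error) as a rough sanity check; DIVERGE itself assesses significance differently, so the P values printed here are illustrative only.

```python
import math


def two_sided_p(theta, std_err):
    """Two-sided P value for theta = 0 under a normal approximation."""
    z = theta / std_err
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported in the text; the +/- terms are assumed to be standard errors.
estimates = {"MFE": (0.69, 0.08), "MLE": (0.67, 0.09)}
for name, (theta, se) in estimates.items():
    print(f"{name}: z = {theta / se:.2f}, P = {two_sided_p(theta, se):.1e}")

# Share of aligned sites predicted to be related to functional divergence.
rfd_sites, total_sites = 319, 1324
print(f"RFD fraction = {100 * rfd_sites / total_sites:.1f}%")
```

Both estimates lie more than seven standard errors from zero, which is consistent with the reported P < 0.001, and 319 of 1324 sites corresponds to the stated 24.1%.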

Functional Divergence of Mammalian NAIP Genes Driven by Positive Selection.
To determine whether positive selection drove the functional divergence, MEGA 4.0 was used to estimate the selective force acting on codons within each cluster during evolution, using the modified Nei and Gojobori method. The dN/dS value was 2.46 in cluster I and 2.47 in cluster II, indicating that the NAIP genes had undergone positive selection (Table 2).

Functional Divergence Led to Structural Changes in NAIPs.
To further investigate whether the functional divergence of NAIPs resulted from selective forces acting on structures, models of human and mouse NAIPs were predicted using the I-TASSER server. The C-scores, TM-scores, and RMSDs of the two models were as follows: human NAIP −1.24

Discussion
NAIPs belong to the IAP family, whose members are responsible for sequestering activated caspases [38]. Published data showed that NAIP genes were strongly associated with innate immunity. In addition, human NAIPs contribute to inflammatory diseases and cancer development. However, no similar results have been reported in mice. Furthermore, NAIP also has an expanded gene number in the mouse, with five tandem copies reported in C57BL/6J and at least seven in the 129 strain [39]. Reported data showed no incomplete functional overlap among mouse NAIP loci [40]. It was suggested that NAIP was encoded within a region undergoing rapid evolution. However, it was not clear whether the NAIP genes were driven by evolutionary forces, whether they underwent functional divergence during evolution, or whether functional divergence was caused by selective forces. Clarification of the evolutionary process driving changes in the NAIP genes is thus important for understanding their genetic trends and exploring their new functions. The present study analyzed 16 NAIP genes from different mammals using phylogenetic analysis and showed that NAIP genes experienced lineage-specific duplications in rodents. Mouse NAIP genes, especially, have been extensively duplicated, and the duplicates were located on mouse chromosome 13. The duplication of NAIP genes might have occurred before and after separation. Gene duplication is thought to be a major driving force facilitating the evolution of tissue specialization. However, lineage-specific duplications of NAIP genes appear not to be universal in mammals other than rodents. It was speculated that these duplications might have been lost or undergone pseudogenization during the evolution of higher mammals (Figure 1) [41]. To test this hypothesis, human NAIP genes were used as queries against the human genome, which identified nine pseudogenes on chromosome 5. However, the biological significance of the loss of NAIP genes remains unknown.
It was hypothesized that NAIP gene duplications generated functional redundancy in higher mammals. In other words, mutations disrupted the structure and function of one of the two genes. However, the duplications were not deleterious, and they were not removed by selection. Gradually, the genes containing mutations became pseudogenes, which were either unexpressed or functionless.
To determine whether the evolution of NAIP genes was driven by positive selection, MEGA 4.0 was used to estimate the selective force acting on NAIP genes during their evolution. The results suggested that the evolution of NAIP genes was driven by positive selection, in accordance with other reports indicating that genes in duplication blocks could be maintained by positive selection (Table 2) [42,43].
Gene duplication driven by positive selection has also been thought to be an essential source of novel genes with new or altered functions, as evidenced by the widespread existence of gene families. The present work shows that functional divergence between clusters of NAIP genes could have occurred, and the role of selective forces acting on the tertiary structure of NAIP proteins was also investigated. Predicted models of human and mouse NAIP were constructed using the I-TASSER server, and structural differences in binding sites suggest that the expressional divergence of NAIP during evolution may lead to functional specialization.

Conclusions
NAIP genes have undergone pseudogenization and functional divergence during evolution. Functional divergence has been driven by positive selection, which has also influenced NAIP protein structure, leading to further functional divergence.