Sequencing and Characterization of Mitochondrial Protein-Coding Genes for Schizothorax niger (Cypriniformes: Cyprinidae) with Phylogenetic Consideration

The present study was conducted to obtain more information about the genome of Schizothorax niger and to determine its taxonomic position within Schizothoracinae using the 13 mitochondrial protein-coding genes (PCGs). These PCGs were found to be 11409 bps in total length, ranging from 165 bps (ATPase 8) to 1824 bps (NADH dehydrogenase subunit 5), and encode 3801 amino acids. In these PCGs, 4 genes overlap on the same strand, while one is shown on the opposite one: ATPase 6+8 and NADH dehydrogenase subunit 4+4L overlap by 7 nucleotides, ND5-ND6 overlap by 4 nucleotides, and ATP6 and COIII overlap by 1 nucleotide. The four most commonly used amino acids in S. niger were Leu (15.6%), Ile (10.12%), Thr (8.12%), and Ala (8.7%). The results showed that COII, COIII, ND1, ND4L, and Cytb had substantial amino acid conservation as compared to the COI gene. Phylogenetic analysis showed that S. niger is closely related to S. progastus, S. labiatus, S. plagiostomus, and S. nepalensis with high bootstrap values. The present study provides additional genomic data for understanding the diversity of the mitochondrial genome and its molecular evolution in Schizothoracinae.


Introduction
Mitochondrial DNA (mtDNA) is now frequently used for species identification and for phylogenetic, evolutionary, and population studies [1][2][3]. The vertebrate mtDNA is 16-20 kb in size and consists of 37 genes (13 PCGs, 22 transfer RNAs, and 2 ribosomal RNAs) together with a d-loop region that controls its replication and transcription [4,5]. The mitochondrial genome and its gene content are quite conserved in fish, with few exceptions involving gene rearrangements [6,7]. Genome- or gene-based studies are helpful for a better understanding of evolutionary and phylogenetic relationships among fishes [3,8].
Recently, molecular phylogenetic studies have helped to resolve some phylogenetic questions and persistent discrepancies among teleosts, for example, in the Cyprinidae [9]. Similarly, the evolutionary background of higher teleosts has also been explored through mitogenome studies [10]. For the analysis of phylogenetic trees, the information obtained from a single gene is mostly insufficient [11].
In the cyprinids, phylogenetic approaches based on the whole mitochondrial genome or on the functional genes (13 PCGs) are helpful for a better understanding of speciation and divergence [12]. Schizothoracinae (Cyprinidae) fishes represent the largest and most diverse taxon, possessing more than 100 species and subspecies (http://www.fishbase.org) with a worldwide distribution [13][14][15]. Schizothoracinae are economically and commercially important species living in fast-flowing, snow-fed rivers and streams, including the Neelum and Jhelum Rivers in Azad Jammu and Kashmir [13,16]. Because of their tender flesh and delicious taste, Schizothorax fishes have become economically important and have been strongly targeted and overexploited by commercial fishermen, which has led to their decline [17]. Moreover, in recent years, Schizothorax species have suffered a dramatic decline due to overfishing, water pollution, and destruction of their spawning grounds, which have fragmented the habitat and impeded the migration of fishes [18,19].
To counter declining fish populations, releasing captive-bred individuals into the wild has been a helpful conservation strategy for enhancing natural populations [20]. It is a widely accepted method for enhancing the local populations of some fish species such as Percocypris pingi and Chinese sturgeon (Acipenser sinensis) [21]. Similarly, for threatened Schizothorax species, artificially propagated individuals from hatcheries have also been released into rivers to improve populations in the wild. However, discrepancies in morphology-based identification, molecular phylogeny, and evolutionary history have frequently been observed in this genus [22,23]. In addition, the available mitochondrial genome data are insufficient for Schizothorax species and especially for S. niger.
Therefore, genetic characterization of S. niger is an essential step for both fundamental science and its conservation strategy. The main purpose of the current study is to get more information about its genome and locate the taxonomic position of S. niger within Schizothoracinae. For this purpose, we characterize the mitochondrial protein-coding genes for S. niger and its phylogenetic relationship with other Schizothoracinae.

Materials and Methods
2.1. Sample Collection and DNA Extraction. The specimens of S. niger were collected from the Jhelum River (34°19′46.3″N, 73°30′44.8″E) with cast nets. All the fish samples were handled carefully to avoid damage during the study. The Board of Advanced Studies and Research, University of Azad Jammu and Kashmir, Muzaffarabad, granted permission to conduct this study in the Jhelum and Neelum Rivers of Muzaffarabad city. The collected fishes were anesthetized by immersion in 1% benzocaine in water and euthanized with an overdose of benzocaine. Following the analysis, the voucher specimens were preserved in 70% ethanol and deposited in the Zoological Museum Hall at the University of AJK, Pakistan. The specimens were identified following the classifications of Mirza [24] and Jhingran [25]. All the collected specimens were obtained in compliance with the animal welfare laws, national policy, and local guidelines in Azad Jammu and Kashmir, Pakistan.
Approximately 0.1 g of tissue was sterilized with ethanol and washed three times with distilled water. The total DNA was isolated by the standard phenol-chloroform extraction method of Sambrook and Russell [26]. The extracted DNA was run on 1% agarose gels and visualized with a UV transilluminator. The results were recorded with a gel documentation system and quantified with a spectrophotometer at the 260/280 nm wavelengths. Sixteen sets of overlapping primers (Table 1), designed with the Primer-3 program, were used to amplify the 13 protein-coding genes of the mitochondrial genome of S. niger. The mitochondrial DNA was amplified by polymerase chain reaction (PCR), performed in 25 μl reaction volumes containing 14 μl DMSO water, 3 μl template DNA, 2.5 μl Taq buffer, 0.5 μl dNTPs, 1 μl of each primer (forward and reverse), 2.5 μl magnesium chloride, and 0.5 μl Taq polymerase. For thorough mixing, the reaction mixture was vortexed and centrifuged for 30 s at 8000 rpm. The thermal cycling profile was 95°C for 3 min; 39 cycles of 95°C for 30 s (denaturation), 53°C for 30 s (annealing), and 72°C for 1 min (extension); followed by a final extension at 72°C for 10 min. The PCR products were purified with Exo-SapIT (Affymetrix purification kit) before cycle sequencing.
Bidirectional nucleotide (nt) sequencing was performed on an ABI Prism 3100 Genetic Analyzer (PE Applied Biosystems; Foster City, CA, USA) using gene-specific forward and reverse primers. Sequence editing was performed using the BioEdit program (http://www.mbio.ncsu.edu/BioEdit) to determine nucleotide and amino acid variants. The S. niger sequences were aligned by using the ClustalW algorithm of the MegAlign program in the LaserGene software package (DNAStar, Inc., Madison, WI). The sequence analyses were carried out using MEGA 6.06 [27] and DnaSP v5 [28] software. The nucleotide sequences with accession numbers (Table 2) are also available on GenBank. PCGs from 23 Schizothorax species retrieved from GenBank were concatenated and aligned using Sequencher and corrected by eye, yielding a total alignment of 11409-12 nucleotides to determine the sequence divergence among them. The nucleotide and amino acid composition, nucleotide substitutions, codon usage pattern, and relative synonymous codon usage (RSCU) of the 13 PCGs were examined with MEGA 6.0. Nucleotide compositional skew was calculated according to the formulas AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C) [29]. The DAMBE software (v7.2.14) was used to calculate the entropy-based substitution saturation and its critical value [30]. The transitions and transversions against the genetic distance were also calculated through this software.
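The skew formulas above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the study used MEGA); the function names and the toy sequence are assumptions introduced only for illustration.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, C, G and T in a nucleotide sequence.

    Ambiguous characters (N, gaps, etc.) are ignored; this is an assumption
    of the sketch, not a statement about how MEGA treats them.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def skew(x: float, y: float) -> float:
    """Compositional skew (x - y) / (x + y), e.g. AT-skew with x=A, y=T."""
    return (x - y) / (x + y)


def at_gc_skew(seq: str) -> tuple:
    """Return (AT-skew, GC-skew) for a nucleotide sequence."""
    comp = base_composition(seq)
    return skew(comp["A"], comp["T"]), skew(comp["G"], comp["C"])


# Hypothetical toy sequence: 4 A, 2 T, 3 C, 1 G
toy_at_skew, toy_gc_skew = at_gc_skew("AAAATTCCCG")
```

Because the skew is a ratio, `skew` can be applied either to raw base counts or to percentages. Applied to the overall composition reported in the Results (28.36% A, 26.01% T, 16.98% G, 28.64% C), it gives an AT-skew of roughly 0.04 and a GC-skew of roughly −0.26.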
The Maximum Parsimony (MP) method and BEAST v2.6.2 [31] were used to compute the phylogenetic tree. The BEAST XML input file was generated using BEAUti v2.6.2 (part of the BEAST v2.6.2 package) with a strict molecular clock approach and a Yule process as the tree prior. The MCMC chains were run for 10,000,000 generations, parameters were sampled every 1000 generations, and an initial 10% of the samples were discarded as burn-in. The tree results were visualized in FigTree v1.4.4.

Results and Discussions
In these PCGs, 4 genes overlap on the same strand, while one is shown on the opposite one: ATPase 6+8 and ND4+4L overlap by 7 nucleotides. Similarly, ND5-ND6 overlap by 4 nucleotides, while ATP6 and COIII overlap by 1 nucleotide. Among these, the overlap of the ATPase 6 and 8 genes is widespread in other vertebrate genomes; however, its size in mammals (40-46 bps) is larger than in fish (7-10 bps) [33].
Of the 13 PCGs, 12 genes used ATG as the start codon, while the COI gene started with GTG. In teleosts in particular, the start codon is ATG in all PCGs excluding the COI gene [34,35]. Similar findings were also observed in S. niger PCGs, suggesting that COI could be an ancient gene in the evolutionary process of mitochondria. Many studies also reported ATG as an initiation codon of COI gene in many animals, such as Collichthys niveatus, Larimichthys crocea, Charybdis feriata, and Collichthys lucidus [36][37][38][39]. In the case of the termination codon, five genes (ND1, ND4L, ND5, COI, and ATP8) were terminated with TAA, ATP8, and COIII with TA, while the remaining six genes (ND2-ND3, ND4, ATP6, COII, and Cytb) have incomplete stop codons (T-) (Table 3). Stop codons appear to be variable in fish mitogenomes, suggesting that they might undergo a rapid evolutionary process [5,40]. Incomplete stop codons are widespread in vertebrate mitochondrial PCGs, and it is suggested that they are likely due to posttranscriptional modifications such as in
Gene-wise codon usage patterns of S. niger are depicted in Table 5(a). For amino acids with a 4-fold degenerate third position, codons ending with A are more frequent in S. niger than codons ending with C or T. In 2-fold degenerate codons, the proportion of C is greater than that of T. Similarly, G is the least common third-position base, except in glycine and arginine (here, G is equal to T and C but much lower than A). These patterns are generally similar across vertebrate groups [10,33,48,49]. In S. niger, the total length of the PCGs is 11409 bps, similar to that of other Schizothoracinae (Table S2), indicating that the mtDNA is quite conserved in cyprinids.
The most frequently used codon in the 13 PCGs is CUA (5.41%). Furthermore, these PCGs code for twenty amino acids in S. niger; the most commonly used amino acid is leucine (596), whereas the least commonly used is tryptophan (17). Hydrophobic amino acids (Ala, Ile, Leu, Phe, and Val) were more abundant than polar amino acids (Tyr, Cys, Ser, Asp, and Glu) in the 13 protein-coding genes of S. niger. The RSCU analysis of the 13 PCGs showed a total of 3803 codons in S. niger; the most overused codon is GCC (1.54%), while the least frequently used codon is GCG (0.41%) (Table 5(b)).
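As an illustration of how RSCU is conventionally defined (the observed count of a codon divided by the count expected if all synonymous codons of that amino acid were used equally), a minimal Python sketch follows. The codon counts are hypothetical toy values, not the data of Table 5, and the function name is an assumption of this sketch; the study itself used MEGA.

```python
def rscu(codon_counts: dict, synonymous_families: dict) -> dict:
    """Relative synonymous codon usage for each codon.

    codon_counts: observed count per codon.
    synonymous_families: amino acid -> list of its synonymous codons.
    RSCU = observed count / (family total / number of codons in the family).
    """
    values = {}
    for amino_acid, codons in synonymous_families.items():
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined for its codons
        expected = total / len(codons)
        for c in codons:
            values[c] = codon_counts.get(c, 0) / expected
    return values


# Hypothetical counts for the four alanine codons (illustration only)
families = {"Ala": ["GCU", "GCC", "GCA", "GCG"]}
toy_counts = {"GCU": 20, "GCC": 40, "GCA": 35, "GCG": 5}
toy_rscu = rscu(toy_counts, families)
```

Under this definition, a value above 1 indicates a codon used more often than expected under equal usage, a value below 1 indicates an underused codon, and the values within a family sum to the number of synonymous codons.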

Combined Analysis.
In the present study, 11410 bps of sequence from the 13 protein-coding genes were obtained from 26 (3 in-group and 23 retrieved from NCBI) highly specialized Schizothoracinae fishes. Details regarding the Schizothorax species retrieved from NCBI are available in Table S1. The sequence alignment contained 24 haplotypes; 8095 of 11410 sites (71%) were conserved, while 3315 sites were variable. Of the variable sites, 1975 (60%) were parsimony-informative polymorphic sites, while 1340 were singletons. Transitions were more frequent than transversions, in agreement with other reports on mitochondrial DNA in teleost fish [51][52][53][54].
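The site categories above can be sketched in Python. The sketch assumes the usual definitions: a conserved site has one state in all sequences, a parsimony-informative site has at least two states that each occur in at least two sequences, and a singleton site is variable but not parsimony-informative. The function name and the toy alignment are hypothetical, and gaps or ambiguous bases are treated as ordinary states for simplicity.

```python
from collections import Counter


def classify_sites(alignment: list) -> dict:
    """Count conserved, singleton and parsimony-informative columns."""
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    summary = {"conserved": 0, "singleton": 0, "parsimony_informative": 0}
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) == 1:
            summary["conserved"] += 1
        elif sum(1 for n in counts.values() if n >= 2) >= 2:
            # at least two states, each present in two or more sequences
            summary["parsimony_informative"] += 1
        else:
            summary["singleton"] += 1
    return summary


# Hypothetical toy alignment of four sequences (illustration only)
toy_alignment = ["ACGTA", "ACGTC", "ACATC", "ATATA"]
toy_summary = classify_sites(toy_alignment)
```

The reported counts are internally consistent with this partition: 8095 conserved plus 3315 variable sites give 11410, and 1975 parsimony-informative plus 1340 singleton sites give 3315.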
The number of haplotypes ranged from one (in all the NCBI sequences) to three (in three studied sequences) within the species. The 0-fold degenerate sites are 7114 (62.34%), while the 4-fold degenerate sites are 1402 (12.28%) out of 11410. The average haplotype diversity (Hd) across all samples was 1.00, while the nucleotide diversity (Pi) was 0.232.
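Haplotype diversity and nucleotide diversity can be illustrated with a short Python sketch, assuming the standard estimators: Hd = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sequences, and Pi as the mean proportion of differing sites over all sequence pairs. The study used DnaSP; the function names and toy sequences here are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences: list) -> float:
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(sequences: list) -> float:
    """Pi = mean proportion of differing sites over all sequence pairs.

    Assumes aligned sequences of equal length without gaps.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")

    def p_distance(a: str, b: str) -> float:
        return sum(x != y for x, y in zip(a, b)) / len(a)

    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: four sequences, three distinct haplotypes
toy_seqs = ["ACGT", "ACGA", "ACGT", "TCGA"]
toy_hd = haplotype_diversity(toy_seqs)
toy_pi = nucleotide_diversity(toy_seqs)
```

With this estimator, Hd reaches 1 only when every sampled sequence is a distinct haplotype.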
A+T and G+C contents of S. niger protein-coding genes were calculated and then compared to other members of Schizothoracinae. The overall base composition was 28.36% A, 28.64% C, 26.01% T, and 16.98% G, with 54.38% AT, which is similar to the nucleotide composition of other members of the genus Schizothorax (Table S1). In addition, the proportion of conserved amino acids in the 13 PCGs of Schizothorax species was also calculated. Five genes (ND1, COII, COIII, ND4L, and Cytb) were found to have more conserved amino acid sites than the others. COI was less conserved than these five genes; 73.11% of its amino acid sites were invariable (Table S2).
The interspecific K2P distances ranged from 0.06 to 7.77% (mean 1.105%) for the protein-coding genes. Because the sequences were identical, the intraspecific distance was 0.00 among the studied samples of S. niger (SN-01 to SN-03), which may not represent the intraspecific distance of this species; these sequences show a 0.06% K2P distance from the S. niger sequence (NC-022866.1) retrieved from NCBI. The interspecific divergence of S. waltoni from the other species is the highest (5.82-7.77%). The details of the genetic distances between species are displayed in Table 6.
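For readers unfamiliar with the K2P (Kimura 2-parameter) distance, a minimal Python sketch is given here. It uses the standard formula d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name and toy sequences are hypothetical; the study computed the distances in MEGA.

```python
import math

# Purine-purine and pyrimidine-pyrimidine changes are transitions
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a non-ACGT character are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same aligned length")
    valid = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        valid += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if valid == 0:
        raise ValueError("no comparable sites")
    p = transitions / valid
    q = transversions / valid
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy pair differing by one transition (A<->G) in 10 sites
toy_d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

The correction makes the K2P distance slightly larger than the raw proportion of differing sites, and the gap widens as sequences diverge; identical sequences give a distance of zero, as observed for the three studied S. niger samples.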
The best-fit model of sequence evolution, selected by the corrected Akaike information criterion (AICc = 90532.960), was GTR+I+G. Most of the mutation events were transitions (Table S3). Similarly, the transition/transversion bias (R) is 11.21.

Mitochondrial Genome Evolution.
Phylogenetic analysis was used to estimate relationships among S. niger, 23 other Schizothorax species, and one outgroup (retrieved from NCBI) to assess the historical information content of mitochondrial genomes. [54] and Bibi and Khan [55]. The tree generated in BEAST v2.6.2 and viewed in FigTree v1.4.4 (Figure 2) was not identical to the MP tree in the branching order of Schizothoracinae fishes. As shown in Figure 2, S. richardsonii and S. esocinus form a group separate from S. plagiostomus, S. progastus, S. labiatus, and S. nepalensis and remain at the base of S. niger.
Using DAMBE, the substitution saturation was assessed for the protein-coding genes of S. niger. No saturation was observed in these genes, as shown by a linear correlation when the transitions and transversions were plotted against genetic distance (Figure 3). Similarly, the rate of transitional substitutions was higher than that of transversions; similar findings were also reported by Barik et al. [56]. This was also confirmed by significantly higher (P < 0.001) Iss.c values for symmetrical (0.849) and asymmetrical (0.631) topologies as compared to the Iss value (0.085). These results indicate that the nucleotide substitutions are not saturated and that the data are suitable for phylogenetic study, as also reported by Li et al. [57] while studying the problematic Cytb gene sequences of fishes from NCBI.
Through the molecular analysis, the branching time was also estimated along with the branching order. The molecular clock estimates of Schizothoracinae fishes are based on a sequence divergence rate of approximately 2.0% per MY [46]. Applying the abovementioned sequence divergence rate, the divergence between S. niger and other Schizothorax species (S. plagiostomus, S. progastus, S. labiatus, and S. nepalensis) is 0.04%. Similarly, other Schizothoracinae species also show the sequence divergence rate ranging from 0.04 to 0.06 MY as shown in Figure 4. Bars around each node represent 95% confidence intervals, which were computed using the method described in Tamura et al. [27]. Our study did not show any conflict with the geological event that caused the uplift of the Himalaya in the late Pliocene to middle Pleistocene (0.5 MY BP) [58].
This is the first study to report genetic data on S. niger from the cold water bodies of Azad Jammu and Kashmir, where there is a need to devise conservation and management plans for the exploited cold-water fish species. This study describes the genetic diversity and phylogenetic relationships among cyprinids, especially those of the focal endemic S. niger. The low intraspecific divergence of this species may put it at risk, and policy actions promoting conservation must be taken immediately. Natural populations need to be maintained at a size sufficient to retain genetic diversity, as this helps to minimize their risk of extinction. Hence, the conservation of genetic diversity needs to be highlighted. It is essential to prevent overfishing, particularly by prohibiting fishing throughout the reproductive season. The authors recommend using protein-coding and other mtDNA regions to investigate the genetic divergence and phylogeny of other fish species in freshwater ecosystems around the world.

Conflicts of Interest
The authors alone are responsible for the content and writing of the paper. The authors report no conflicts of interest.

Table S1: composition and skewness in the PCGs of Schizothorax mitogenomes.
Table S2: proportion of conserved amino acid sites in 13 PCGs of Schizothorax species.