miR-1322 Binding Sites in Paralogous and Orthologous Genes

We searched for 2,563 microRNA (miRNA) binding sites in 17,494 mRNA sequences of human genes. miR-1322 has more than 2,000 binding sites in 1,058 genes with a ΔG/ΔGm ratio of 85% or more. Of these, 1,889 binding sites are in CDSs, 215 in 5′ UTRs, and 160 in 3′ UTRs. Between two and 28 binding sites are arranged in series, with the start position of each successive site shifted by three nucleotides. The nucleotide sequences of these sites in CDSs encode oligopeptides with the same and/or different amino acid sequences. We found that 33% of the target genes encode transcription factors. miR-1322 has arranged binding sites in the CDSs of the orthologous MAMLD1, MAML2, and MAML3 genes. These sites encode polyglutamine oligopeptides ranging from six to 47 amino acids in length. The properties of miR-1322 binding sites in orthologous and paralogous target genes are discussed.


Introduction
Interest in microRNAs (miRNAs) is constantly growing, and new data supplement existing knowledge about the role of these molecules in key biological processes. The main objective of these studies is to identify miRNA binding sites and evaluate their binding affinities. The characteristics of binding sites shed light on the biological role of miRNAs and have practical applications. Interactions between miRNAs and mRNAs, and the properties of these interactions, can be predicted by computational methods [1]. It has been established that miRNAs bind to mRNAs predominantly in 3′-untranslated regions (3′ UTRs) [2]. They can also bind to 5′-untranslated regions (5′ UTRs) and coding domain sequences (CDSs) [3,4]. Moreover, some miRNAs have binding sites in 5′ UTRs, CDSs, and 3′ UTRs [5]. For example, miR-3960 binding sites are mainly in CDSs, and many are positioned adjacent to each other (separated by one, two, three, or more nucleotides) [6]. Such mRNA fragments can consist of 2-17 binding sites. This paper discusses miR-1322, which also has multiple binding sites in CDSs. Clusters of miRNA binding sites located in the CDSs of genes are unexpected because proteins have specific amino acid sequences that are evolutionarily conserved. The presence of multiple binding sites in close proximity significantly increases the probability of interactions between miRNAs and mRNAs, even if mutations occur. Many miRNAs regulate the expression of genes involved in tumorigenesis [7-11]. For example, changes in miRNA concentrations are observed during the development of lung cancer [7,8], breast cancer [9], gastrointestinal cancer [10], and other cancers [11]. The serum level of miR-1322 is a potential diagnostic biomarker for squamous cell carcinoma of the esophagus [12]. We studied the arrangement and evolution of miR-1322 binding sites in genes involved in disease.
Target genes for miR-1322 were determined using the MirTarget program [6], which was developed in our laboratory. This program defines the following features of binding sites: (a) the start position of an miRNA binding site with respect to the mRNA sequence; (b) the localization of miRNA binding sites in 5′ UTRs, CDSs, and 3′ UTRs of genes; (c) the free energy of hybridization (ΔG, kJ/mole); and (d) the schemes of nucleotide interactions between miRNAs and mRNAs. The ratio ΔG/ΔGm (%) was estimated for each binding site, where ΔGm is equal to the free energy of an miRNA binding with its perfectly complementary nucleotide sequence. Members of a single miRNA family have nucleotide sequences with a level of homology of 85% or more. Therefore, we used ΔG/ΔGm ratios of 85% or more. We also noted the positions of the binding sites on the mRNA, counting from the first nucleotide of the 5′ UTR. The MirTarget program predicts interactions between the nucleotides of miRNAs and those of target gene mRNAs. It finds bonds between adenine (A) and uracil (U), guanine (G) and cytosine (C), and G and U, as well as between A and C via a hydrogen bond [14]. The TmiROSite program was used to identify mRNA fragments that have miRNA binding sites and to define the corresponding amino acid sequences [15].
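The selection criterion described above can be illustrated with a minimal Python sketch. It is not the MirTarget implementation: the `BindingSite` record, the function names, and the example sites are hypothetical, and the only values taken from the text are the 85% cutoff and the maximum free energy of −101.9 kJ/mole reported for miR-1322.

```python
from dataclasses import dataclass

THRESHOLD_PERCENT = 85.0   # cutoff used in the text
DELTA_G_MAX = -101.9       # kJ/mole, value reported for miR-1322


@dataclass
class BindingSite:
    """Hypothetical record for one predicted site (not a MirTarget type)."""
    gene: str
    start: int      # position counted from the first nucleotide of the 5' UTR
    region: str     # "5'UTR", "CDS", or "3'UTR"
    delta_g: float  # free energy of hybridization, kJ/mole (negative)


def ratio_percent(delta_g: float, delta_g_max: float = DELTA_G_MAX) -> float:
    """Return the ratio of site energy to perfect-complement energy, in percent."""
    return 100.0 * delta_g / delta_g_max


def select_sites(sites, threshold: float = THRESHOLD_PERCENT):
    """Keep only the sites whose energy ratio reaches the threshold."""
    return [s for s in sites if ratio_percent(s.delta_g) >= threshold]


# Illustrative, made-up sites; the energies are not from the paper.
example_sites = [
    BindingSite("GENE_A", 1200, "CDS", -97.6),    # about 95.8% of the maximum
    BindingSite("GENE_B", 310, "5'UTR", -87.0),   # about 85.4% of the maximum
    BindingSite("GENE_C", 2500, "3'UTR", -80.0),  # about 78.5%, rejected
]

selected = select_sites(example_sites)
print([s.gene for s in selected])
```

Under these assumptions, a site is retained when its hybridization energy is at least 85% of the energy of the perfectly complementary duplex.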

Features of miR-1322-3p Binding Sites
miR-1322 has a length of 19 nucleotides (nt) and a GC-content of 53%. The maximum free energy of miR-1322 binding with mRNAs is −101.9 kJ/mole. We found that miR-1322 has 2,264 binding sites on 1,058 target mRNAs with a ΔG/ΔGm ratio of 85% or more. Of those, 160 miR-1322 binding sites are located in the 3′ UTRs of 130 genes, 215 binding sites are located in the 5′ UTRs of 109 genes, and 1,889 binding sites are located in the CDSs of 819 genes. The average number of binding sites in the CDS of a single gene is 2.3, which is almost two times higher than the average number of binding sites in 3′ UTRs.
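The averages quoted above follow directly from the reported counts. The short Python check below reproduces the arithmetic; the dictionary layout and variable names are illustrative only.

```python
# (binding sites, genes) per mRNA region, as reported in the text
counts = {
    "CDS": (1889, 819),
    "5'UTR": (215, 109),
    "3'UTR": (160, 130),
}

total_sites = sum(sites for sites, _ in counts.values())
total_genes = sum(genes for _, genes in counts.values())

# Average number of binding sites per target gene in each region
averages = {region: sites / genes for region, (sites, genes) in counts.items()}

for region, avg in averages.items():
    print(f"{region}: {avg:.2f} sites per gene")

# Ratio behind the "almost two times higher" statement
cds_to_3utr = averages["CDS"] / averages["3'UTR"]
print(f"total sites: {total_sites}, CDS / 3'UTR ratio: {cds_to_3utr:.2f}")
```

The per-region site counts sum to the 2,264 sites reported, and the per-region gene counts sum to 1,058.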
The maximum number of sites observed in a 3′ UTR is eight in CACN1A and five in PDYN and S100A16. The maximum number of sites in a 5′ UTR is 13 in MAB21L1, and the AMOT, BACH2, CAPNG, PIM1, RBM39, and STC1 genes each have five sites. Characteristics of the clusters of five or more binding sites located in CDSs are shown in Table 1. The start points of several miR-1322 binding sites are located three nucleotides apart. Several such sites in an mRNA form a cluster, which increases the probability of binding and the ability to inhibit protein synthesis. Oligonucleotides of binding sites located in CDSs can encode polyglutamine, polyalanine, or polyserine, depending on the open reading frame (Table 1). These data indicate that the nucleotide sequences of miR-1322 binding sites are themselves conserved, and not only the amino acid sequences of the corresponding oligopeptides in the encoded protein.
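The dependence of the encoded oligopeptide on the reading frame can be illustrated with a small Python sketch. It assumes a pure CAG repeat as the template (CAG repeats are the form named in this paper for the polyglutamine-encoding sites); the repeat length and the function name are illustrative, and the codon assignments are from the standard genetic code.

```python
# Subset of the standard genetic code needed for a CAG repeat
CODON_TABLE = {
    "CAG": "Q",  # glutamine
    "AGC": "S",  # serine
    "GCA": "A",  # alanine
}


def translate_frame(seq: str, frame: int) -> str:
    """Translate seq from offset `frame` (0, 1, or 2), ignoring a trailing partial codon."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # "X" marks any codon outside the small table above
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


repeat = "CAG" * 7  # illustrative 21-nt fragment, not a sequence from the paper

for frame in range(3):
    print(frame, translate_frame(repeat, frame))
```

Shifting the frame by one or two nucleotides turns the same repeat from a polyglutamine-encoding sequence into one that encodes polyserine or polyalanine.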
The arranged nucleotide sequences of the CDSs contain binding sites for miR-1322 (Figure 1). The conservation of binding sites relative to the adjacent regions of CDSs is shown in Figure 2. It is important to establish whether miR-1322 binding sites are present in paralogous and orthologous mRNA sequences. Therefore, the properties of binding sites were studied in the mRNA sequences of both human and other animal species.
The ΔG/ΔGm ratio for all miR-1322 binding sites of the ANO2 gene is 95.8%. The nucleotide fragment alignments of the CDSs containing miR-1322 binding sites for 38 genes are shown in Figure 1. Characteristics of the binding sites with start points located three nucleotides apart in 5′ UTRs and 3′ UTRs are shown in Table 2. The number of binding sites in 5′ UTRs ranged from five to 13. Consequently, these untranslated regions have an increased probability of binding with miR-1322. The ΔG/ΔGm ratio ranged from 85.4% to 91.7% (Table 2). Therefore, expression of these genes can be controlled extensively by miR-1322.
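Clusters of this kind can be recovered from a list of predicted start positions. The Python sketch below assumes that "arranged" sites are those whose start positions differ by exactly three nucleotides from the previous site; the function name, parameters, and example positions are hypothetical and are not taken from MirTarget or from Tables 1 and 2.

```python
def find_clusters(starts, step=3, min_sites=2):
    """Group start positions into runs in which consecutive sites are `step` nt apart."""
    clusters = []
    current = []
    for pos in sorted(set(starts)):
        if current and pos - current[-1] == step:
            # Site continues the current run
            current.append(pos)
        else:
            # Run is broken: store it if long enough, then start a new one
            if len(current) >= min_sites:
                clusters.append(current)
            current = [pos]
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters


# Made-up start positions for a single mRNA
example_starts = [100, 103, 106, 250, 400, 403]
print(find_clusters(example_starts))
```

Raising `min_sites` to five would restrict the output to clusters of the size summarized in the tables.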
Transcription factors represent 33% of all target genes in this study (Figure 1 and Tables 1 and 2). Inhibition of the synthesis of proteins can cause diseases, including cancer. Unfortunately, experimental data on miR-1322 binding sites are insufficient; however, some previous studies confirm the high efficacy of the predictions of the MirTarget program developed in our laboratory. For example, downregulation of ECRG2 and TCA3 is associated with squamous cell carcinoma of the esophagus (ESCC) via miR-1322 [12]. ECRG2 can act as a tumor suppressor, regulating protease cascades during carcinogenesis and the migration and invasion of esophageal cancer cells [16].

Binding Sites in Paralogous and Orthologous mRNAs of the MAML Gene Family
The relationship between paralogous and orthologous mRNAs of the MAML gene family was considered as an example of adaptation of gene expression to the action of miR-1322. MAMLD1 encodes a mastermind-like domain-containing protein, which can act as a transcriptional coactivator [17]. Both MAML2 and MAML3 stabilize the DNA-binding complex RBP-J/CBF-1 and the Notch intracellular domains that are signaling intermediates [18]. MAML2 expression is higher in several B cell-derived lymphoma types, including classical Hodgkin's lymphoma cells, than in normal B cells [19].
Various paralogous genes are targets for miR-1322. Two regions of MAMLD1 contain multiple miR-1322 binding sites (Figure 3). The first region consists of eight sites and the second region consists of four sites. These regions lie in domains (oligopeptides) consisting of 11 and 10 glutamine residues in the corresponding protein, respectively.
The number of amino acids in orthologous proteins depends on the species (Figure 3). For example, in the first region, there are 28 glutamine residues in Ggo and nine residues in Hgl. In the second region, the number of glutamine residues ranges from ten in Hsa, Ggo, and Ptr to six in Eca. In the latter case, the binding site of the horse mRNA encodes proline in the corresponding protein.
miR-1322 binding sites in orthologous MAML mRNAs are highly conserved. Orthologous MAML proteins have conserved amino acid sequences containing polyglutamine (Figures 3-5). Orthologous miRNAs have not been identified in most animals, with the exception of Pan troglodytes (chimpanzee) and Pongo pygmaeus (orangutan); however, some other miRNAs are identical or very similar to the corresponding human miRNAs. Therefore, human miRNAs were used for the subsequent identification of conserved binding sites. Oligonucleotides containing CAG repeats represent the miR-1322 binding sites of the mRNA and encode a long polyglutamine sequence in the corresponding protein. Oligonucleotides encoding polyglutamine are located in the conserved protein domain.
The CDS of the human MAML2 gene also has two regions with miR-1322 binding sites and encodes oligopeptides containing 47 and 27 glutamine residues (Figure 4). The number of glutamine residues in the oligopeptides varies depending on the species. For example, there are six glutamine residues in the first oligopeptide region of the cow protein and 24 residues in the second region of the rat protein.
The CDS of the human MAML3 gene has three regions that contain miR-1322 binding sites, and it encodes oligopeptides containing 21, 18, and eight glutamine residues. Some amino acids were lacking in the domains of MAML3, depending on the species (Figures 5(a)-5(c)).
The presence of multiple miR-1322 binding sites in MAMLD1, MAML2, and MAML3 indicates that these genes interact with miR-1322. Regulation of the expression of these genes appears to have remained important over a long evolutionary period, because the studied organisms diverged tens of millions of years ago. The presence of multiple regions containing miR-1322 binding sites in the MAMLD1, MAML2, and MAML3 genes shows a strong dependence of their expression on miR-1322.
The glutamine-containing regions play an important role in the development of different diseases, according to previous literature. It is possible that changes in the dependence of the interactions between miR-1322 and MAMLD1, MAML2, and MAML3 are interconnected.