Partial Characterization of Immunoglobulin Cμ Gene of Water Buffalo (Bubalus bubalis) Predicts Distinct Structural Features of C1q-Binding Site in Cμ3 Domain

1 Department of Molecular and Cellular Biology, SCIE4248, University of Guelph, 50 Stone Road E, Guelph, ON, Canada N1G 2W1
2 National Specialist Meat Processing and Microbiology, Meat Programs Division, Canadian Food Inspection Agency, 1400 Merivale Road, Ottawa, ON, Canada K1A 0Y9
3 Department of Veterinary Microbiology, Punjab Agricultural University, Ludhiana 141 004, India
4 Microbiology Section, Division of Fish Health Management, Central Institute of Freshwater Aquaculture, Bhubaneswar, Orissa 751 002, India


Introduction
The water buffalo (Bubalus bubalis), a member of the family Bovidae domesticated approximately 5000 years ago in Asia, is raised for milk, meat, and draught purposes. Approximately 170 million water buffaloes exist worldwide, mainly in Asia (97%), but their number is growing across Africa, Australia, Europe, and South America [1]. India possesses the best dairy breeds (Murrah, Nili-Ravi, and Surti), which produce 72 million tonnes of milk annually, 5% of the world's total milk output [1,2]. Buffalo milk is rich in fat, protein, and minerals but low in cholesterol [3] and is thus a good source of quality dairy products, especially the traditional Italian mozzarella cheese [4]. The demand for buffalo meat is high because it is relatively lean, with low fat and high mineral content compared with beef or pork. Buffaloes provide an excellent source of draught power in more than 50 countries [1,2]. Buffaloes utilize poorly digestible feeds better than cattle and can therefore be maintained on low-quality fodder and crop residues [1]. Importantly, buffaloes are resistant to common diseases, ticks, and external parasites that commonly afflict cattle [5].
Little is known about the structural and functional features of the immune system of this economically important species. Immunoglobulin genetics has been extensively studied in other domestic species [6] as well as in humans and mice [7]. Limited sequence divergence is noted in the phylogenetically close cattle [8-10], where somatic hypermutation [11,12] and the generation of exceptionally long third complementarity determining regions of heavy chains (CDR3H) [12-14] provide the required antibody diversity. Within the preponderant λ-light chain expression in cattle, a restricted Vλ1-Jλ3-Cλ3 recombination encodes most of the λ-light chain repertoire [15,16]. Immunoglobulin heavy chain constant region genes that encode the IgM, IgD, IgG, IgA, and IgE isotypes have been analyzed in many species [6], including cattle [16-20]. The immunoglobulin gamma heavy chain gene has been mapped to buffalo chromosome 20q23-q25 by in situ hybridization [21]. The buffalo IgG, IgM, and IgA isotypes have been serologically characterized [22], and two subclasses of buffalo IgG (IgG1 and IgG2) have been identified [23]. To advance the genetic and structural understanding of buffalo immunoglobulins, we partially characterized the buffalo germline Cμ gene that encodes IgM, the immunoglobulin that appeared first during vertebrate evolution and is the first to be expressed on developing B lymphocytes. The buffalo germline Cμ gene sequence from the Nili-Ravi breed shares high amino acid sequence similarity with that of cattle, as well as the predicted distinct structural characteristics of the C1q-binding site.

Materials and Methods
2.1. Genomic DNA. Peripheral blood collected from a water buffalo of the Nili-Ravi breed, kept at the dairy farm of Punjab Agricultural University, Ludhiana, India, was used to extract genomic DNA as described [9].

PCR and Sequencing.
The buffalo germline Cμ gene, spanning codons 201 to 550, was PCR amplified using sense (5′-GTGTGCGAAGTCCAGCA-3′) and antisense (5′-AGACTAGTTACCGGTGGACTTGTCC-3′) primers from conserved Cμ1 and Cμ4 exon sequences, respectively [17,24], under conditions that did not permit PCR artifacts [18]. The PCR steps involved a hot start at 95 °C for 2 min, denaturation at 95 °C for 1 min, annealing at 65 °C for 1 min, and extension at 72 °C for 1 min, for a total of 30 cycles. The PCR conditions included 1.5 mM MgCl2, 0.8 μM of each primer, and 2.5 U of Taq polymerase (Perkin-Elmer, Branchburg, NJ, USA) in a 100 μL volume. The PCR product (∼1.5 kb) was gel fractionated, purified using GeneClean II (Bio 101, Vista, CA, USA), and subjected to automated DNA sequencing in both directions (MOBIX, McMaster University, Hamilton, ON, Canada). The internal sequencing primer (5′-TGAGGCCTCGGTCTGCT-3′), corresponding to codons 401 to 407, was synthesized from the determined buffalo Cμ gene sequence. The buffalo Cμ codons are numbered according to [7] following the Ou index [31]. The DNA sequence was analyzed using the Geneious Pro 5.6.4 program (http://www.geneious.com/), and the protein secondary structure was predicted using the original Garnier-Osguthorpe-Robson algorithm (GOR I) provided by the EMBOSS suite [30].
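The primer sequences given above can be checked with a few lines of code. The following is a minimal sketch, not part of the original analysis, that reports the length, GC fraction, and reverse complement of each primer; the function and variable names are illustrative assumptions.

```python
# Primer sequences as quoted in the Methods (5' to 3').
SENSE = "GTGTGCGAAGTCCAGCA"
ANTISENSE = "AGACTAGTTACCGGTGGACTTGTCC"
INTERNAL = "TGAGGCCTCGGTCTGCT"

# Translation table for Watson-Crick complements of unambiguous bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, primer in [("sense", SENSE), ("antisense", ANTISENSE), ("internal", INTERNAL)]:
        print(f"{name}: {len(primer)} nt, GC = {gc_fraction(primer):.1%}, "
              f"reverse complement = {reverse_complement(primer)}")
```

The reverse complement of the antisense primer is the sequence expected on the coding strand at the 3′ end of the amplicon, which is useful when locating the primer site in the determined sequence.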

Results and Discussion
The nucleotide sequence and the deduced amino acid sequence of the water buffalo germline Cμ gene, spanning codons 201 to 550, are presented in Figure 1. The water buffalo Cμ gene shares high nucleotide (95.52%) and amino acid (94.28%) sequence similarity with Cμ of cattle, the closest ruminant species in the family Bovidae. Analysis of the buffalo germline Cμ gene sequence revealed that it encodes part of the Cμ1 domain (codons 201-221) and all of the Cμ2 (codons 221-333), Cμ3 (codons 334-438), and Cμ4 (codons 439-549) domains of IgM. Compared with other species, water buffalo IgM shared the highest overall amino acid identity with sheep (91.71%), followed by pig (64.00%), rabbit (63.14%), human (61.71%), horse (60.57%), and mouse (56.28%). The high amino acid sequence similarity of buffalo IgM with cattle (94.28%) and sheep (91.71%) is expected given the close phylogenetic relationship among ruminant species. Like cattle and sheep IgM, buffalo IgM has unique amino acid substitutions at 10 positions (Leu-239, Ser-246, Ile-274, Glu-279, Arg-303, Lys-319, Ser-367, Gly-370, Ala-421, and Lys-442) that are conserved in nonruminant species (Figure 2). Buffalo IgM has four distinct amino acid replacements (Met-301, Val-310, Asn-331, and Thr-432), spread across Cμ2, Cμ3, and Cμ4, that diverge from the amino acids conserved in cattle and sheep IgM. Compared with cattle, the buffalo Cμ gene has a codon deletion at position 507 (GTG encoding valine, present in cattle) and an insertion of GGC encoding glycine at position 532 in the Cμ4 domain (Figure 1). Nucleotide deletions, insertions, and substitutions are also noted in the intron sequences between the buffalo Cμ exons.
The conserved cysteines in buffalo IgM, essential for domain structure formation via intrachain disulfide bridges, are noted at position 202 (Cμ1 domain; this residue would interact with another cysteine within the Cμ1 domain that was not investigated here), positions 252-313 (Cμ2 domain), 360-418 (Cμ3 domain), and 466-528 (Cμ4 domain; Figure 2). Similarly, the cysteines responsible for interchain disulfide bridges between the heavy chains of monomeric (position 330) or polymeric (position 406) IgM [31] are conserved. Like most other species, buffalo IgM has two tryptophan residues in each of the Cμ2, Cμ3, and Cμ4 domains (Figure 2). These findings are consistent with the critical role of conserved cysteine and tryptophan residues in maintaining the domain structure of immunoglobulins [32].
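The identity percentages above are the kind of value obtained by counting matching positions in a pairwise alignment. The sketch below illustrates that calculation under stated assumptions: the sequences are already aligned, gaps are written as dots as in the figure legend, and identity is counted over the columns where the reference sequence has a residue. The paper does not state which denominator was used, so this convention and the function name are assumptions. The reported values are nonetheless consistent with counts over the 350 codons analyzed (for example, 330/350 is about 94.29%, close to the reported 94.28%).

```python
def percent_identity(reference: str, other: str, gap: str = ".") -> float:
    """Percent of reference residues matched by `other` in a pairwise alignment.

    Both strings must be pre-aligned and of equal length. Columns where the
    reference carries a gap are skipped (an assumed convention).
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for ref_res, other_res in zip(reference, other):
        if ref_res == gap:
            continue  # no reference residue in this column
        compared += 1
        if ref_res == other_res:
            matches += 1
    if compared == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    ref = "TSSPAPEPQD"
    alt = "TSSPVPEPQD"
    print(f"{percent_identity(ref, alt):.2f}% identity")
```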

[Figure: alignment of nucleotide and deduced amino acid sequences; the alignment body is not recoverable here.] The dash (-) indicates an identical nucleotide or amino acid, and the dot (⋅) indicates the lack of a nucleotide or amino acid. Gaps are inserted to optimize homology.
Similar to cattle, sheep, and goat, buffalo IgM has five prolines in the Cμ2 domain, which acts as a hinge. This is the lowest number of prolines in this region, in contrast to other species such as pig and rabbit (7), human (8), and horse and mouse (9). Buffalo IgM has only six hydrophilic threonine residues in the Cμ2 domain, the lowest number in this region, unlike cattle (7), sheep (8), rabbit (9), pig and human (10), horse (12), and mouse (13). Similar to cattle (19) and sheep (20), buffalo IgM is rich in serine (19) in the Cμ2 domain, whereas other species such as mouse (9), human (12), and horse (13) have fewer serine residues in this domain. It seems that the lower number of hydrophilic threonine residues in the Cμ2 domain of ruminant species is compensated by a higher number of hydrophilic serine residues. The presence of fewer prolines in the Cμ2 domain would provide structural rigidity that may restrict the segmental flexibility of the Fab arms. The high combined number of hydrophilic threonine and serine residues in Cμ2 of buffalo IgM is, however, likely to augment its ability to extend into the solvent. We earlier reported similar findings for cattle IgM, where the structural constraints imposed by the restricted segmental flexibility of the Fab arms are compensated by an exceptionally long CDR3H region (>50 amino acids) [13,18]. It is possible that such a long CDR3H exists in buffalo antibodies as well.
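The residue tallies discussed above (proline, threonine, and serine per domain) amount to a simple count over a domain's amino acid sequence. A minimal sketch follows; the function name is an illustrative assumption, and the example string is an arbitrary short fragment used only to show the call, not the full Cμ2 domain of any species.

```python
from collections import Counter


def residue_counts(domain_seq: str, residues: str = "PTS") -> dict:
    """Count selected one-letter amino acids in a domain sequence.

    `residues` defaults to proline (P), threonine (T), and serine (S),
    the residues compared across species in the text.
    """
    counts = Counter(domain_seq.upper())
    return {res: counts.get(res, 0) for res in residues}


if __name__ == "__main__":
    # Arbitrary illustrative fragment (not a complete domain).
    fragment = "TSSPAPEPQDPSV"
    print(residue_counts(fragment))
```

Applied to the full Cμ2 domain sequence of each species (codons 221-333 in the numbering used here), the same call would reproduce the per-species comparison, provided the domain boundaries are extracted consistently.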
The C1q-binding site of buffalo IgM, spanning positions 408-428 in the Cμ3 domain, has 12 residues that are conserved across species (408-Glu, 409-Asp, 410-Trp, 411-Ser, 418-Cys, 419-Thr, 420-Val, 422-His, 424-Asp, 425-Leu, 426-Pro, and 428-Pro). Among these twelve conserved amino acids, three notable exceptions exist: in humans (Glu replaced by Asp at position 408), mouse (Ser replaced by Asn at position 408), and goat (Trp replaced by Arg at position 409) (Figure 3). The conserved motif "Thr-Cys-Thr-Val-Ala-His" provides a protein signature of the C1q-binding region in ruminant species. The predicted secondary structure of the C1q-binding site reveals distinct structural features in buffalo and cattle IgM, where, unlike in other species, a long alpha-helical structure is predominant (Figure 3), followed by a short turn together with a coiled structure common to all species. The C1q-binding site in cattle and buffalo IgM also lacks beta-strands altogether, unlike that of other species. These structural features deviate from those of IgM from other ruminant species, such as sheep and goats, where turns and/or coils are evident in this region, as in other species. By contrast, the alpha-helical structure is altogether absent from the C1q-binding site of human IgM. These configurational differences in the conserved C1q-binding region of IgM across species appear to be relevant to complement fixation and activation by the classical pathway. It is possible that increased structural flexibility in the C1q-binding site compensates for the structurally rigid Cμ2 domain of buffalo and cattle IgM.
Overall, the buffalo Cμ domain shares high amino acid sequence similarity with Cμ of other ruminant species such as cattle and sheep. Buffalo IgM has fewer proline residues in the Cμ2 domain, which acts as a hinge, and this would restrict the segmental flexibility of the Fab arms. The high hydrophilic threonine and serine content of the Cμ2 domain will likely enhance its ability to extend into the solvent. The predicted secondary structure of the C1q-binding site reveals distinct structural features in buffalo and cattle IgM, where, unlike in other species, a long alpha-helical structure is predominant (Figure 3), which seems to be of functional significance.