Allelic Variation of Wheat Flour Allergens in a Collection of Wheat Genotypes

Wheat is the most widely grown crop in the world and provides 20% of the daily protein and food calories for 4.5 billion people. Together with rice, it is the most important food crop in the developing world. In recent decades, various symptoms linked to the consumption of wheat products have been recorded across the population and are collectively referred to as “wheat allergy.” Wheat allergy is usually reported as a food allergy but can also be a contact allergy resulting from exposure to wheat. Several important wheat allergens have been characterized in recent years through biochemical, immunological, and molecular biological techniques. The present work reports the identification of allelic variation in genes involved in wheat allergy. A collection of wheat genotypes was screened in order to identify new alleles. A total of 14 new alleles were identified for the profilin, triosephosphate-isomerase, dehydrin, glyceraldehyde-3-phosphate-dehydrogenase, α/β gliadin, GluB3-23, and glutathione transferase allergen genes (located on chromosomes 1B, 3B, and 6A and on homoeologous groups 5 and 7); these alleles are potentially related to lower allergenicity and could be useful in breeding programs.


Introduction
Wheat is one of the most important crops for the human diet because of its high nutritional value, its technical properties, and the long shelf life of the kernels. Nevertheless, it is a potent allergen source, causing a series of different clinical manifestations of IgE-mediated allergy [1]. These manifestations include wheat food allergy, respiratory allergy to wheat pollen, and sensitization to inhaled wheat flour among bakers and persons processing wheat flour [2,3]. Usually, the allergy is limited to the wheat seed storage proteins.
As reviewed by Shewry et al. [4], mature wheat grains contain 8-20% proteins. The gluten proteins, gliadins and glutenins, constitute up to 80-85% of total flour protein. These two classes of proteins are responsible for dough elasticity and extensibility, which are essential for the functionality of wheat flours. On the basis of their solubility, Osborne [5] classified wheat grain proteins into four groups: albumins (soluble in water), globulins (soluble in salt solutions), gliadins (soluble in aqueous alcohol), and glutenins (soluble in dilute acid or alkali). In the following years, the fractions identified by Osborne were found to be heterogeneous and to contain overlapping protein types, which led to improved protein fractionation methods [6]. The protein classification system now in use is based on the biological characteristics of the proteins together with their chemical and genetic relationships [7,8].
Wheat allergy may be an inaccurate term since there are many allergenic components in wheat, for example, serine protease inhibitors, glutelins and prolamins, and different responses are often attributed to different proteins.
The IgE binding in the salt insoluble fractions (gliadin and glutenin) of wheat flour has been studied for a long time [9][10][11]. In particular, low-molecular-mass glutenin, alpha-gliadin, and -gliadin have been recognized as allergens for patients with clinical histories of allergies after expression in Escherichia coli [12].
To date, 27 potential wheat allergens have been successfully identified [13]. The most severe response to a wheat allergen is exercise/aspirin induced anaphylaxis, attributed to one ω-gliadin that is a relative of the protein that causes celiac disease [14]. Other more common symptoms include nausea, urticaria, and atopy [15]. One of the most common occupational diseases due to wheat allergens is baker's asthma [16][17][18].
As reported by Sander et al. [19], the most relevant allergenic wheat fraction for baker's asthma consists of water/salt-soluble albumins and globulins, which exhibit about 70% to 80% of the specific IgE-binding activity. They identified some of the most important wheat allergens and found that IgE recognition of certain wheat allergens seems to be associated with defined clinical manifestations of wheat allergy.
The aim of this work was to study some of the most important wheat allergen genes (Thioredoxin, Peroxiredoxin, Profilin, Triosephosphate-isomerase, Dehydrin, Glyceraldehyde-3-phosphate-dehydrogenase, α/β gliadin, low molecular weight glutenin GluB3-23, and Glutathione transferase genes) in a collection of wheat genotypes, in order to look for allelic variations that are potentially related to different allergenicity levels and useful in breeding programs.

Plant Material.
A collection of 230 tetraploid wheat genotypes (Triticum turgidum ssp.), including durum wheat cultivars, landraces, and wild accessions [20], was screened for allelic variants of allergen coding genes. The collection was grown in the experimental field of the University of Bari at Valenzano (Bari, Italy) in 2009 and 2010 and in Foggia (Italy) in 2009, in a randomized complete block design with three replications and plots consisting of 1-m rows, 30 cm apart, with 50 germinating seeds per plot.
Genomic DNA was isolated from fresh leaves using the methods previously described [21] and subsequently purified by phenol-chloroform extraction.
Isoallergen protein and cDNA sequences of some of the wheat allergens, available in public databases, were used to design random primer combinations, which were amplified in the wheat collection described above.
Single PCR fragments were purified directly with the Euro-Gold Cycle Pure Kit before digestion, and the polymorphic ones were sequenced by BMR Genomics Srl (http://www.bmrgenomics.it/).
Polymorphisms in the wheat collection genotypes were examined using conventional PCR analysis. DNA amplifications were carried out in 25 μL reaction mixtures, each containing 100 ng template DNA, 2 μM of SSR primer, 200 μM of dNTP, 2.5 mM MgCl2, 1x PCR buffer (10 mM Tris-HCl, pH 8.3, 10 mM KCl), and 1 U of Taq DNA-polymerase. The following PCR protocol was used in a Perkin Elmer DNA Thermal Cycler: initial denaturation at 95 °C for 3 min, followed by 35 cycles of 95 °C for 1 min, 50 °C/65 °C for 1 min, and 72 °C for 2 min, with a final extension at 72 °C for 10 min. The amplification products were resolved on 2% agarose gels and stained with GelRed (Biotium).
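As a quick check of the reaction set-up above, the following minimal Python sketch converts the stated final concentrations into per-reaction amounts and sums the programmed hold times. The function names are illustrative and not part of the original protocol, and ramping time between steps is ignored.

```python
# Per-reaction amounts and programmed hold time for the PCR described above.
# Function names are illustrative; ramp times between steps are ignored.

def amount_pmol(conc_um, volume_ul):
    """Amount of a component in pmol (uM x uL = pmol)."""
    return conc_um * volume_ul

REACTION_UL = 25.0
# Final concentrations from the text, expressed in uM (2.5 mM MgCl2 = 2500 uM).
components_um = {"primer": 2.0, "dNTP": 200.0, "MgCl2": 2500.0}
amounts = {name: amount_pmol(c, REACTION_UL) for name, c in components_um.items()}

def program_minutes(initial, cycles, steps, final):
    """Total hold time: initial step + cycles x (sum of steps) + final step."""
    return initial + cycles * sum(steps) + final

# 3 min, then 35 cycles of (1 + 1 + 2) min, then 10 min
hold_time = program_minutes(3, 35, (1, 1, 2), 10)

for name, pmol in amounts.items():
    print(f"{name}: {pmol:g} pmol per reaction")
print(f"programmed hold time: {hold_time} min")
```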

Digestion with CEL1 and Revelation of Fragments.
In order to discover mutations in the allergen genomic sequences, single nucleotide polymorphisms (SNPs) were detected using the Surveyor nuclease kit, following the manufacturer's instructions as reported in Nigro et al. [22]. Surveyor nuclease cleaves with high specificity at the 3′ side of any mismatch site in both DNA strands, including all base substitutions and insertions/deletions of up to at least 12 nucleotides.

Heteroduplex Formation and CEL1 Digestion.
Allergen specific primer pairs were used to amplify target DNA from the 230 genotypes of the collection using the PCR combinations reported in Table 2. Svevo was chosen as the reference cultivar (control) for comparison with the other genotypes; those showing a polymorphism compared to it were defined as "mutants." An aliquot of each PCR product was used for hybridization to form heteroduplexes between Svevo and each of the other single lines.
In order to confirm the polymorphisms within genome specific genes, the heteroduplex hybridization digestion pattern was compared to the ones obtained in each single genotype. Moreover, the PCR product giving a digestion pattern after CEL1 treatment was reamplified and sequenced (http://www.bmr-genomics.it/).

Gel Analysis.
Bromophenol blue was the preferred loading dye as it runs out of the polyacrylamide gel. We ran 3% polyacrylamide gels for PCR fragments of over 1000 bp. This allowed sufficient resolution of PCR products of ∼1500 bp. The 3% polyacrylamide gel was made with 15 mL 5x TBE, 120 mL dH2O, 11 mL 40% bisacrylamide, 110 μL TEMED, and 1 mL 10% APS.
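The gel recipe can be checked with simple dilution arithmetic. The sketch below (an illustrative helper, not part of the original protocol) computes the final acrylamide concentration from the listed volumes, assuming the volumes are additive.

```python
# Dilution check for the polyacrylamide gel recipe given in the text.
# Assumes volumes are additive; the helper name is illustrative.

def final_percent(stock_percent, stock_ml, other_ml):
    """Final concentration (%) after diluting a stock into the other volumes."""
    total_ml = stock_ml + sum(other_ml)
    return stock_percent * stock_ml / total_ml

# 15 mL 5x TBE, 120 mL dH2O, 0.110 mL TEMED, 1 mL 10% APS
other_volumes_ml = [15.0, 120.0, 0.110, 1.0]
gel_percent = final_percent(40.0, 11.0, other_volumes_ml)

print(f"{gel_percent:.2f}% acrylamide")  # prints "2.99% acrylamide"
```

The result, about 2.99%, agrees with the nominal 3% gel.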

Physical Mapping.
Nulli-tetrasomic lines (NTs) of Triticum aestivum cv. Chinese Spring (Sears, 1954; 1966) were used to physically localize markers on wheat chromosomes. These lines were amplified with the same primer combinations reported in Table 2 and then sequenced.
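The usual reasoning behind nulli-tetrasomic mapping can be sketched as follows. This is a minimal illustration that assumes the classic presence/absence reading (a genome-specific amplicon is missing only in the line that lacks the chromosome carrying it); the study itself also sequenced the amplified products. The function name and the scores are hypothetical.

```python
# Presence/absence logic for nulli-tetrasomic (NT) mapping.
# Hypothetical data; the study also relied on sequencing of the NT amplicons.

def assign_chromosome(amplification):
    """Return the chromosomes whose nullisomic line failed to amplify.

    `amplification` maps the missing chromosome of each NT line (e.g. "5A")
    to True if the marker amplified in that line, False otherwise.
    """
    return sorted(chrom for chrom, amplified in amplification.items()
                  if not amplified)

# Hypothetical scoring of one genome-specific marker on group 5 NT lines.
scores = {"5A": False, "5B": True, "5D": True}
located_on = assign_chromosome(scores)
print(located_on)  # prints ['5A']
```

A marker that amplifies in every line would return an empty list, meaning it cannot be assigned by presence/absence alone.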

Results and Discussion
Owing to its extensive use in the human diet, wheat is among the most common causes of food-related allergies and intolerances. Although several wheat flour allergens have been identified in the last 25 years and well characterized by biochemical, immunological, and molecular biological methods and for their clinical manifestations, little information is available about the allelic variation of the allergen coding genes. The aim of the present work was to identify genotypes with differences in the alleles coding for proteins with toxic epitopes, in order to design strategies for wheat breeding.
For this purpose, a collection of 230 wheat genotypes (Triticum turgidum ssp.), including durum wheat cultivars, landraces, and wild accessions, was screened for allergen allelic variants. Tetraploid wheat is genetically and morphologically diverse and its evolution under domestication has not been fully elucidated. Almost all of the studies conducted to date have considered the subspecies of tetraploid wheat (Triticum turgidum L.) separately for the analysis of genetic diversity. So far, durum ssp., dicoccum ssp., polonicum ssp., and dicoccoides ssp. have only rarely been analyzed together for allergen gene variation. The tetraploid wheat (T. turgidum L., 2n = 4x = 28; AABB genome) collection was classified according to van Slageren and MacKey, who considered all forms as subspecies of T. turgidum. In particular, the collection was described by Laidò et al. [20] and consists of 230 accessions classified into seven subspecies: durum ssp. (128), turanicum ssp. (20), turgidum ssp. (19), polonicum ssp. (20), carthlicum ssp. (12), dicoccum ssp. (19), and dicoccoides ssp. (12). The collection was analyzed with a set of primer combinations designed on the allergen genes reported in Table 2; DNA was amplified and single fragments were digested with the CEL1 enzyme in order to find polymorphisms and detect allelic variations.
Molecular markers developed on polymorphisms such as SSRs, indels, and SNPs could be very useful in breeding programs. In particular, SNPs are nucleotide variations in the DNA sequence of individuals and constitute the most abundant molecular markers in the genome. SNPs are widely distributed throughout genomes [23], vary in occurrence and distribution among species, and are usually more prevalent in the noncoding regions of the genome. There are several methods to discover SNPs, for example, Sanger sequencing, DGCE (denaturing gradient capillary electrophoresis), denaturing-HPLC, and enzymatic mismatch cleavage. One of the most efficient and reliable enzymatic cleavage based methods is TILLING (targeting induced local lesions in genomes) [24,25]. This approach requires treating the amplified DNA with CEL1 endonuclease, or any of a number of single strand endonucleases, after heteroduplex formation between the DNA of the lines to be investigated. CEL1 is a glycoprotein found in celery and many other green plants. It cuts a DNA heteroduplex that contains a base substitution or a DNA loop at the 3′-most phosphodiester bond of the mismatched nucleotides and produces two fragments of complementary size from the original amplified product (amplicons of up to ∼1,500 bp). Size separation by electrophoresis on polyacrylamide or agarose gels is required to visualize any cleaved fragment (Figure 1).
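Because a single mismatch cut yields two fragments whose sizes add up to the amplicon length, candidate bands on a gel can be checked against that rule. The sketch below is a minimal illustration; the function names, the 20 bp sizing tolerance, and the example numbers are assumptions rather than values from the study.

```python
# Expected band sizes after one mismatch-specific cut of an amplicon.
# Names, tolerance, and example values are illustrative assumptions.

def cleavage_fragments(amplicon_bp, mismatch_pos):
    """Sizes of the two fragments from one cut at mismatch_pos (bp from the 5' end)."""
    if not 0 < mismatch_pos < amplicon_bp:
        raise ValueError("mismatch position must lie inside the amplicon")
    return mismatch_pos, amplicon_bp - mismatch_pos

def consistent_with_single_cut(amplicon_bp, band_a_bp, band_b_bp, tolerance_bp=20):
    """True if two observed bands sum to the amplicon size within a sizing tolerance."""
    return abs((band_a_bp + band_b_bp) - amplicon_bp) <= tolerance_bp

fragments = cleavage_fragments(1200, 450)
print(fragments)
print(consistent_with_single_cut(1200, 440, 765))
```

Two bands that do not sum to roughly the full-length product would suggest either more than one mismatch or a nonspecific product.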
With this strategy, we were able to quickly analyze partial gene sequences looking for SNPs. Mutations were confirmed by sequencing. Two of the ten analysed genes, the Thioredoxin and Peroxiredoxin genes, showed no polymorphism in the coding region, in the introns, or in the UTRs.
On the contrary, we identified several mutations in the other analysed genes. Profilin, Triosephosphate-isomerase, Dehydrin, and Glyceraldehyde-3-phosphate-dehydrogenase showed polymorphisms in their sequences, although none of them altered the predicted protein sequence. On the other hand, the mutations in the α/β gliadin, low molecular weight glutenin GluB3-23, and Glutathione transferase genes were localized in exon regions and resulted in an amino acid substitution in the predicted protein sequence (Figures 2 and 3). The Profilin gene showed four different single nucleotide polymorphisms, two of them located on homoeologous chromosome 7A and two on homoeologous chromosome 7B. A C > T transition and a G > T transversion were localized in the 3′ UTR region of the 7A homoeologous profilin gene; two transitions, A > G and C > T, respectively, were located in the intronic region of the 7B homoeologous profilin gene. The only SNP identified in the Triosephosphate-isomerase gene was a C > T transition located in the 3′ UTR region.
In the Dehydrin gene sequence we identified two SNPs, both transitions (G > A and C > T), located in the intronic region. The Glyceraldehyde-3-phosphate-dehydrogenase gene showed one SNP in the coding region, a C > T transition, which was a silent mutation as it did not result in any amino acid substitution.
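The transition/transversion labels used above follow from base chemistry: a change within purines (A, G) or within pyrimidines (C, T) is a transition, and any purine/pyrimidine exchange is a transversion. A minimal sketch, with an illustrative function name:

```python
# Classify a single-base change as a transition or a transversion.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for a ref > alt base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Base changes mentioned in the text.
for change in ["C>T", "G>T", "A>G", "G>A", "A>C"]:
    ref, alt = change.split(">")
    print(change, classify_snp(ref, alt))
```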
The most interesting situation was found for the α/β gliadin, low molecular weight glutenin GluB3-23, and Glutathione transferase genes. In the α/β gliadin gene, the SNP identified in the coding region (an A/C transversion) resulted in a change in the amino acid sequence, namely a Lysine (A)/Glutamine (C) substitution at position 26 of the mature protein sequence of 281 aa.
The SNP identified in the coding region of the low molecular weight glutenin GluB3-23 gene was instead a G > A transition, resulting in an amino acid substitution at position 267 of the mature protein sequence of 369 aa: Valine (G)/Isoleucine (A).
Finally, in the Glutathione transferase gene, we identified three SNPs, physically mapped on chromosome 5A. One was an A/C transversion, resulting in a Leucine/Phenylalanine substitution at position 155 of the mature allergen protein (222 aa). The other two SNPs, both G > A transitions, were localized in the intronic region. The set of nulli-tetrasomic lines described in the materials and methods section was amplified and then sequenced. The identification of the mutations in these lines allowed us to physically map the mutations identified in the collection. The principal chromosomes found to be involved in wheat allergens were those of homoeologous groups 7 and 5, although allergen genes were also located on chromosomes 1B, 3B, and 6A (Table 3). Allelic variation in these genes will allow further investigation in order to relate them to different allergenicity levels. The identification of SNPs in candidate genes for wheat allergenic proteins, integrated with proteomic data analysis, will permit us to identify genotypes with fewer toxic epitopes.
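How a single base change translates into the amino acid substitutions reported above can be illustrated with the standard genetic code. In the sketch below the specific codons (AAA, GTT, TTA, GGC) are hypothetical, chosen only because they reproduce the reported base and residue changes; the actual codons are not given in the text.

```python
# Effect of a single-base change on a codon, using the standard genetic code.
# The example codons are hypothetical; the real codons are not stated in the text.
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_effect(codon, position, alt_base):
    """Return (ref_aa, alt_aa, kind) for replacing codon[position] with alt_base."""
    codon = codon.upper()
    mutated = codon[:position] + alt_base.upper() + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        kind = "silent"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return ref_aa, alt_aa, kind

examples = {
    "A>C, Lys/Gln": codon_effect("AAA", 0, "C"),
    "G>A, Val/Ile": codon_effect("GTT", 0, "A"),
    "A>C, Leu/Phe": codon_effect("TTA", 2, "C"),
    "C>T, silent": codon_effect("GGC", 2, "T"),
}
for label, result in examples.items():
    print(label, result)
```

Third-position changes are often silent, which is consistent with the silent C > T transition described for Glyceraldehyde-3-phosphate-dehydrogenase.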

Conclusions
The current report describes the analysis of some of the most important wheat allergen genes in a collection of 230 wheat genotypes. The Thioredoxin, Peroxiredoxin, Profilin, Triosephosphate-isomerase, Dehydrin, Glyceraldehyde-3-phosphate-dehydrogenase, α/β gliadin, low molecular weight glutenin GluB3-23, and Glutathione transferase genes were analysed in order to find mutants potentially related to a lower allergenicity level.
In order to quickly screen the gene sequences for SNPs between our durum cultivars, we followed a "PCR/CEL1 strategy" similar to a TILLING approach. In this approach, CEL1 cleaves heteroduplexes at the sites of sequence mismatches. This allowed us to target only the SNPs between our cultivars and avoid sequencing the complete genes. A total of fourteen SNPs were found in 7 of the genes considered.
This study identified several allelic variants, especially for the α/β gliadin, low molecular weight glutenin GluB3-23, and Glutathione transferase genes: the single nucleotide polymorphisms detected in these genes resulted in amino acid substitutions in the mature protein sequences. These mutations in the allergen proteins could have some effect on their allergenicity, so clinical trials and allergenicity tests are needed to further investigate their potential role. Breeding programs could then be established in order to obtain new varieties with lower or no allergenicity.
Heteroduplexes between Svevo and the other single lines were formed using the following thermal cycler program: 95 °C for 2 min; loop 1 for 8 cycles (94 °C for 20 s, 73 °C for 30 s, reduce temperature 1 °C per cycle, ramp to 72 °C at 0.5 °C/s, 72 °C for 1 min); loop 2 for 45 cycles (94 °C for 20 s, 65 °C for 30 s, ramp to 72 °C at 0.5 °C/s, 72 °C for 1 min); 72 °C for 5 min; 99 °C for 10 min; loop 3 for 70 cycles (70 °C for 20 s, reduce temperature 0.3 °C per cycle); hold at 8 °C. After annealing, the DNA was treated with Surveyor nuclease to cleave heteroduplexes by adding 0.2 μL of enzyme and 1.3 μL of 1x buffer. The digestion step was carried out at 5 °C for 90 minutes and stopped immediately by adding 5 μL of 0.225 M EDTA and 2 μL of bromophenol blue loading dye per sample and mixing thoroughly.
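The two step-down loops of this program can be written out explicitly. The sketch below assumes that the first cycle of each loop runs at the stated starting temperature and that the decrement applies from the second cycle onward; the function name is illustrative.

```python
# Temperature schedules for the step-down loops of the thermal program.
# Assumes the first cycle runs at the starting temperature.

def step_down(start_c, step_c, cycles):
    """Per-cycle temperatures when the temperature drops by step_c each cycle."""
    return [round(start_c - step_c * i, 1) for i in range(cycles)]

# Loop 1: touchdown annealing, 73 C reduced by 1 C per cycle for 8 cycles.
touchdown = step_down(73.0, 1.0, 8)
# Loop 3: slow re-annealing, 70 C reduced by 0.3 C per cycle for 70 cycles.
reanneal = step_down(70.0, 0.3, 70)

print("touchdown annealing:", touchdown)
print("re-annealing ends at:", reanneal[-1], "C")
```

Under this reading, the touchdown loop ends at 66 °C, one degree above the 65 °C annealing step of loop 2, and the slow re-annealing loop cools the products to about 49.3 °C before the 8 °C hold.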

Figure 1: Electrophoretic pattern on a 3% acrylamide gel of Tri a 21 M1 and Tri a 31 M1 allergen fragments amplified in the two different genotypes and digested with Surveyor nuclease for SNP detection. (M: Marker 100 bp; Mt: mutant; C: control (referred to Svevo cv.); Mt + C: heteroduplex from both mutant and control.)

Figure 2: Partial cDNA (a) and complete protein sequence (b) alignments for Glutathione transferase (EU584497) genes. Highlighted in yellow, the SNP mutation in the genomic sequence and the amino acid substitution.

Table 1: Wheat allergens from the SDAP website. Highlighted in bold, the allergens selected for molecular screening.

Table 2: Wheat allergen accession numbers and primer pair combinations used for allelic variation detection.

Table 3: Polymorphisms detected for wheat allergen alleles in a collection of 230 wheat genotypes. Chromosomal localization, SNP type, number of lines carrying each allele, SNP localization, and amino acid substitution are reported.