Expression Pattern of the Alpha-Kafirin Promoter Coupled with a Signal Peptide from Sorghum bicolor L. Moench

Regulatory sequences with endosperm specificity are essential for foreign gene expression in the desired tissue for both grain quality improvement and molecular pharming. In this study, promoters of seed storage α-kafirin genes coupled with signal sequence (ss) were isolated from Sorghum bicolor L. Moench genomic DNA by PCR. The α-kafirin promoter (α-kaf) contains endosperm specificity-determining motifs, prolamin-box, the O2-box 1, CATC, and TATA boxes required for α-kafirin gene expression in sorghum seeds. The constructs pMB-Ubi-gfp and pMB-kaf-gfp were microprojectile bombarded into various sorghum and sweet corn explants. GFP expression was detected on all explants using the Ubi promoter but only in seeds for the α-kaf promoter. This shows that the α-kaf promoter isolated was functional and demonstrated seed-specific GFP expression. The constructs pMB-Ubi-ss-gfp and pMB-kaf-ss-gfp were also bombarded into the same explants. Detection of GFP expression showed that the signal peptide (SP)::GFP fusion can assemble and fold properly, preserving the fluorescent properties of GFP.


Introduction
Bioengineering cereal plants as biofactories for the production of valuable proteins in seeds has been widely reported [1]. This has been encouraged by the rapid development of reproducible and efficient transformation systems coupled with extensive research investigating the potential of seed-specific promoters. Endosperm tissue represents an ideal platform for the production of recombinant proteins. Availability of simple seed proteome offers the advantage of easier recombinant protein purification [2]. Presence of chaperones and disulfide isomerases in the developing seed and absence of proteases especially in the endosperm tissue facilitate proper protein folding [3]. As a consequence, the proteins expressed in seeds are highly stable. For example, single-chain antibodies expressed in seeds of rice and wheat showed high biological activities and remained stable for several years [1]. Only 50% loss of functional antibodies after eighteen months in storage was reported [4]. Long-term storage and easy transportability of seeds are possible due to the desiccated nature of the mature seeds. Finally, proteins restricted to the seed facilitate biological containment as they limit adventitious contact with nontarget organisms such as microbes and leaf-eating herbivores, while not normally interfering with vegetative plant growth [5]. The recombinant seeds also extend the possibility for direct use as an edible vaccine [2].
The genes that encode the prolamin storage proteins are an ideal source for the isolation of seed-specific promoters, as these proteins are exclusively synthesized in the endosperm and are expressed at high levels during seed development in most cereals. Like other prolamins, kafirins are produced in the developing sorghum endosperm, are cotranslationally transported into the lumen of the rough endoplasmic reticulum (ER), with simultaneous cleavage of the signal peptide, and are ultimately deposited into protein bodies [6]. Kafirin is the most abundant sorghum seed protein, constituting 70-80% of the total protein. Among all the kafirin subunits, α-kafirin is expressed at high levels, accounting for approximately 80% of the total kafirins in sorghum seeds [7]. The α-kafirin promoters therefore have potential for directing high levels of seed-specific protein expression in sorghum.
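The two percentages quoted above can be combined into a rough estimate of how much of the total seed protein is α-kafirin. The short Python sketch below does only that multiplication; the function name is hypothetical, and treating the two figures as independent fractions is an assumption made for illustration, not a measurement from this study.

```python
def alpha_kafirin_share(kafirin_fraction, alpha_fraction=0.80):
    """Estimate alpha-kafirin as a fraction of total seed protein.

    kafirin_fraction: kafirin as a fraction of total seed protein (0.70-0.80 in the text).
    alpha_fraction: alpha-kafirin as a fraction of total kafirins (about 0.80 in the text).
    """
    return kafirin_fraction * alpha_fraction


# Lower and upper bounds implied by the 70-80% kafirin range.
low = alpha_kafirin_share(0.70)
high = alpha_kafirin_share(0.80)
print(f"alpha-kafirin is roughly {low:.0%} to {high:.0%} of total seed protein")
```

Under these assumptions, α-kafirin alone would account for more than half of the total seed protein, which is consistent with the interest in its promoter.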
Protein bodies of sorghum seeds have great potential for the deposition and storage of large amounts of recombinant proteins. Sorghum seeds have a higher protein content than other major cereals such as maize [8], and more than 80% of the total grain proteins are deposited as storage protein in protein bodies [9]. Structurally, α-kafirin is deposited into the inner core of protein bodies, surrounded by β- and γ-kafirin [9], and is therefore protected from degradation by proteolytic enzymes [6]. It is anticipated that targeting the protein of interest (POI) to the same compartment will prevent its proteolytic degradation, hence ensuring abundant production of the protein in sorghum seeds. This can be facilitated by the use of signal sequences: an SP::POI fusion targets the POI to the ER, so that it is deposited in the inner core compartments of the protein bodies [6,10].
Repeated transformation or crossing of independent transformants has been the conventional strategy for the production of transgenics. This strategy is time consuming and labour intensive. Recently, a multigene transformation system was used for simultaneous introduction of several genes through the construction of one expression vector [11,12]. To keep pace with this technology, suitable promoters are required to drive multiple gene expression. Lack of suitable promoters is a critical limiting factor for such research. Hence, sourcing promoters with the desired specificity is important for transgene regulation, as well as for reducing the possibility of homology-based transgene silencing [13].
Cauliflower mosaic virus 35S and maize Ubiquitin-1 are strong constitutive promoters, widely used in plant genetic engineering [14,15]. These promoters continuously express high levels of a foreign gene in all tissues throughout development. This is wasteful in terms of the host plant's energy and may be detrimental to the host plant [16]. Moreover, levels of target gene expression in the desired tissue are frequently unsatisfactory [17]. Use of an endosperm-specific promoter can therefore overcome this situation. These types of promoters regulate gene expression from the mid to late stage of seed maturation, and there is either no or much lower expression in other tissues. Furthermore, endosperm-specific promoters drive the expression of stable foreign protein better than ubiquitous promoters [18]. Endosperm-specific promoters have been isolated and characterised from other cereals including rice (Oryza sativa L.) [19], wheat (Triticum aestivum L.) [20], maize (Zea mays L.) [21], and barley (Hordeum vulgare L.) [18].
Expression patterns of an α-kafirin promoter have previously been demonstrated in stably transformed transgenic tobacco [22], while transient expression of the γ-kafirin promoter was investigated in sorghum, coix (Coix lacryma-jobi), and maize tissues [23,24]. In all studies, the isolated kafirin promoter was translationally fused with a coding sequence from uidA (GUS). GFP is another popular reporter gene used in plant transformation for assessing promoter activity [25].
The aim of this paper was to isolate an α-kafirin promoter and ss from Sorghum bicolor and evaluate its ability to direct GFP expression into sorghum seeds. It is hoped that this will serve as a platform for the future seed-specific expression of foreign genes. The results from this study will be useful for providing alternative choices of promoters for the production of high-value recombinant proteins in sorghum and other cereal crops.

Plant Material and DNA Isolation.
Seeds of the Sorghum bicolor Indian inbred line 296B were provided by the Queensland Department of Employment, Economic Development and Innovation (DEEDI) breeding program. Sweet corn cobs (Zea mays L.) were purchased from a local supermarket. Genomic DNA was isolated from the etiolated leaves of sorghum using a modified CTAB protocol [26].

Cloning and Sequencing of the α-Kafirin Gene Promoter and Signal Sequence.
Amplified putative α-kafirin gene promoters (α-kaf) and signal sequence (ss) were purified using Quantum Prep™ Freeze 'N Squeeze DNA Gel Extraction Spin Columns (Bio-Rad), cloned into the pCR4 TOPO sequencing vector (Invitrogen, USA), and used to transform E. coli TOP10 competent cells. Transformants were screened via colony PCR with F kaf and R kaf primers. Recombinant plasmids were isolated using a QIAprep Spin Miniprep Kit from Qiagen (Valencia, Calif) and sequenced at the Australian Genome Research Facility (AGRF), University of Queensland, using M13 forward and reverse primers (Invitrogen, USA). Sequences were analysed using Geneious Pro 5.1 beta software and submitted to NCBI. Promoter regulatory DNA motifs were identified using the PLACE database (http://www.dna.affrc.go.jp/PLACE/signalscan.html). The putative signal sequence was verified using the localization prediction program TargetP 1.1 (http://www.cbs.dtu.dk/services/TargetP/).

Construction of Promoter-Signal Sequence-GFP Chimeric Genes.
The superbinary vector pMB-Ubi-gfp was used as a backbone for the generation of all constructs used in transformation of sorghum and sweet corn explants. The maize Ubiquitin promoter [27] was included as a positive control for constitutive expression. Overlapping PCR [28] with Pwo SuperYield DNA Polymerase (Roche) was used to generate transcriptional fusions between α-kaf (with or without the ss) and the gfps65T gene sequence [29,30], referred to as gfp. The promoter-gfp and promoter-ss-gfp chimeric genes were directionally cloned into the AvrII and SbfI sites, replacing the Ubi-gfp fragment of pMB-Ubi-gfp. This generated the pMB-α-kaf-gfp, pMB-α-kaf-ss-gfp, and pMB-Ubi-ss-gfp constructs suitable for microprojectile-mediated transformation. All constructs were assessed by restriction enzyme digestion analysis and DNA sequence analysis [31].

Transient Expression Assays by Microprojectile Bombardment.
Explants from sorghum used for microprojectile bombardment-mediated transformation were (1) immature embryos (IEs), which were excised from seeds at 15 DPA (0.8-1.4 mm size), (2) young leaf segments, approximately 20-25 mm in size, which were harvested 2 to 3 weeks after seed sowing, and (3) seeds that were obtained at 20 DPA. Explants from sweet corn cobs were (1) IEs excised from seeds and (2) seeds longitudinally sliced in half. Explants were bombarded using a particle inflow gun [32]. Each plasmid construct was precipitated onto 1.0 μm gold particles. To prepare microprojectiles for bombardment, 25 μL of gold particle suspension was vortexed for 30 s and then mixed with 2.0 μL of 1.0 μg/μL plasmid DNA, 25 μL of 2.5 M CaCl2, and 5.0 μL of 0.1 M spermidine free base. All solutions were kept on ice. The mixture was kept in suspension for 5 min by vortexing every 20-30 s, allowed to precipitate for 10 min on ice, and then 22 μL of the supernatant was removed. The remaining suspension was vortexed immediately prior to using 5.0 μL aliquots of the mixture for each bombardment. The particle/DNA mixture was placed in the centre of the syringe filter unit. In each bombardment, target explants were arranged in a 3 cm diameter circle at the centre of a 9 cm petri dish. Target explants were placed 12.5 cm from the point of particle discharge and covered with a stainless steel mesh baffle with a mesh size of 50 μm. Helium pressure was either 1500 or 2200 kPa, and the chamber vacuum was −85 kPa. Following bombardment, the explants were kept for 24 h in the dark. GFP expression was observed using
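As a rough check on the volumes given for microprojectile preparation, the arithmetic can be sketched in Python. The function name and dictionary keys are hypothetical, and the DNA-per-shot figure rests on the assumption (not stated in the protocol) that all plasmid DNA binds the gold and is evenly distributed in the remaining suspension.

```python
def bombardment_mix(
    gold_ul=25.0,          # gold particle suspension
    dna_ul=2.0,            # plasmid DNA volume
    dna_ug_per_ul=1.0,     # plasmid DNA concentration
    cacl2_ul=25.0,         # 2.5 M CaCl2
    spermidine_ul=5.0,     # 0.1 M spermidine free base
    removed_ul=22.0,       # supernatant removed after precipitation
    aliquot_ul=5.0,        # volume used per bombardment
):
    """Return the volumes and DNA load implied by the protocol text."""
    total_ul = gold_ul + dna_ul + cacl2_ul + spermidine_ul
    remaining_ul = total_ul - removed_ul
    shots = int(remaining_ul // aliquot_ul)
    # Assumption: all DNA is precipitated onto gold and evenly suspended.
    dna_per_shot_ug = dna_ul * dna_ug_per_ul * aliquot_ul / remaining_ul
    return {
        "total_volume_ul": total_ul,
        "remaining_volume_ul": remaining_ul,
        "shots": shots,
        "dna_per_shot_ug": dna_per_shot_ug,
    }


mix = bombardment_mix()
for key, value in mix.items():
    print(f"{key}: {value:.3g}")
```

With the stated volumes, one preparation leaves 35 μL of suspension, enough for seven 5.0 μL bombardments.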
The cis-elements from the putative promoter which could be involved in the regulation of endosperm-specific expression were identified using the PLACE database [33]. The motifs were identified in a 0.5 kb proximal region of the promoter (Table 1). The putative α-kafirin ss was also investigated (Figure 1). The putative ss, which precedes 237 bp of partial α-kafirin encoding region (DNA sequence obtained from the Phytozome v5 database), was translated into the corresponding predicted amino acid sequence using Geneious Pro 5.1 beta software. The deduced amino acid sequence was used as input for the plant version of TargetP [34], a predictor of subcellular localization from amino acid sequence. MAAKIFSLIMLLALFASAATA was recognized as an SP, targeting the adjacent α-kafirin protein into a secretory pathway. Further, SignalP-HMM (version 2.0) [35] confirmed that the SP was cleavable and that the cleavage site was between amino acid positions 21 and 22 (Figure 2).
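The reported signal peptide can be sanity-checked with a few lines of Python: its length fixes both the size of the coding sequence and the position of the predicted cleavage site. This is a minimal sketch, not a substitute for TargetP or SignalP; the function name and the set of residues counted as hydrophobic are assumptions chosen for illustration.

```python
# Signal peptide sequence as reported in the text.
SIGNAL_PEPTIDE = "MAAKIFSLIMLLALFASAATA"

# Assumption: a simple illustrative set of hydrophobic residues.
HYDROPHOBIC = set("AVLIMFWC")


def summarize_signal_peptide(peptide):
    """Return basic properties derived from the peptide sequence alone."""
    length = len(peptide)
    hydrophobic = sum(residue in HYDROPHOBIC for residue in peptide)
    return {
        "length_aa": length,
        "coding_length_bp": 3 * length,            # three nucleotides per codon
        "cleavage_between": (length, length + 1),  # last SP residue, first mature residue
        "hydrophobic_fraction": hydrophobic / length,
    }


summary = summarize_signal_peptide(SIGNAL_PEPTIDE)
print(summary)
```

The 21-residue length corresponds to a 63 bp coding sequence and places the cleavage site between positions 21 and 22, matching the values reported for the isolated ss.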

Transient GFP Expression in Sorghum and Sweet Corn Explants.
As anticipated, the Ubi promoter, through the pMB-Ubi-gfp construct, constitutively drives GFP expression in all explants (Table 2). Images corresponding to Ubi-directed GFP expression ( pMB-Ubi-ss-gfp showed GFP expression in the endosperm and the embryos of sweet corn (Figure 4, (r) and (v)) and sorghum (Figure 4, (b) and (f)) as well as callus derived from sorghum IE (Figure 4, (j)). However, no GFP spots were detected on leaf transformed with this construct.

Discussion
The α-kaf promoter was shown to be functional in regulating seed-specific GFP expression. GFP foci were observed on the seed tissues (embryo and endosperm) of sweet corn and sorghum when bombarded with pMB-α-kaf-gfp and pMB-α-kaf-ss-gfp but not in sorghum leaf and sorghum IE-derived callus. These results suggest that the isolated 1.17 kb 5′-flanking region of the α-kafirin gene was sufficient for seed specificity.
To date, most endosperm-specific promoter studies have focused on identifying the cis-elements in promoters. Studies of promoter regions from prolamin genes have shown multiple cis-acting elements and transcriptional activators. Essential motifs for prolamin gene expression were mostly reported within the 0.5-1.5 kb upstream of the translation start codon [36,37], which is in agreement with our findings. Using the PLACE database [33], we identified the P-box, O2-box, CAAT-box, and TATA-box and their positions relative to the start codon of the α-kafirin gene. Based on this analysis, the fragments contained regulatory motifs characteristic of sorghum prolamin promoters [22,24,38]. The P-box, located about 300 bp upstream of the translation start codon, was conserved in all promoter elements of cereal prolamin genes reported so far [39-41]. It comprises an endosperm motif (TGTAAAG) and a GCN4-like motif (GLM), (A/G)TGAGTCAT [42]. Though present in many cereals, little is known about the role of the P-box in the regulation of gene expression. Functional analysis of the −300 bp region of the zein promoter has indicated that the P-box can stimulate the expression of prolamin genes, responsible for endosperm-specific expression, and that this effect is dependent upon the position and orientation of the promoter [43-45]. Unlike the Opaque-2 (O2) box, which is present in certain classes of the zein gene, the prolamin-box mediates and coordinates the activation of all classes of zein genes during endosperm development as well as many storage protein genes from related cereals [46], which exemplifies the significant role of these promoter motifs in regulating seed-specific expression. Besides the P-box [41,47], O2-boxes 1 and 2 [48,49] were also found within −300 bp upstream of the coding regions for 22 kDa zein genes in maize, 22 kDa coixin genes in Coix lacryma-jobi [50], and 22 kDa kafirin genes in sorghum [51,52].
Our results show that the O2-box 1 (TAACATGTGT) [38] is located adjacent to and upstream of the P-box motif of the α-kaf fragment. The P-box motif is recognized by P-box binding factors (PBFs) that regulate prolamin gene expression. Due to the short spacing between the two motifs, PBF may interact with the O2 protein as well to activate prolamin gene expression [41,46]. Unlike γ- and β-zein/kafirin gene promoters, α-zein/kafirin promoters do not have GCN4-like motifs [42]. This is in agreement with our results, which show an absence of GCN4-like motifs in α-kaf. Our results indicate that the sequence displayed homology with promoters from the 19 kDa α-zein of maize. This result is not surprising since many features of sorghum suggest it is closely related to maize. Genes encoding kafirins are related to those encoding zeins in terms of sequence and size [52]. Further, promoter motifs that regulate α-kafirin/zein genes share common conserved P-box promoter elements [42].
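The kind of motif search described in this discussion can be illustrated with a small Python sketch that scans a promoter fragment for the three consensus sequences quoted in the text. It is not the PLACE algorithm: it searches the forward strand only, the function name is hypothetical, and the test fragment is a made-up toy sequence, not the isolated α-kaf promoter.

```python
import re

# Consensus sequences as quoted in the text; (A/G) is written as a character class.
MOTIFS = {
    "P-box endosperm motif": "TGTAAAG",
    "O2-box 1": "TAACATGTGT",
    "GCN4-like motif": "[AG]TGAGTCAT",
}


def scan_motifs(sequence, motifs=MOTIFS):
    """Find motif occurrences on the forward strand.

    Each hit is (start, upstream), where start is the 0-based index and
    upstream is the position relative to the 3' end of the fragment,
    assuming the fragment ends immediately before the start codon.
    """
    seq = sequence.upper()
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead lets overlapping occurrences be reported as well.
        starts = [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]
        hits[name] = [(start, start - len(seq)) for start in starts]
    return hits


# Hypothetical toy fragment: an O2-box 1 placed upstream of a P-box motif.
toy_fragment = "CCGG" + "TAACATGTGT" + "ACGTAC" + "TGTAAAG" + "GGCC"
for name, found in scan_motifs(toy_fragment).items():
    print(name, found)
```

In the toy fragment the O2-box 1 is reported upstream of the P-box motif and no GCN4-like motif is found, mirroring the arrangement described for α-kaf.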
Prolamins of rice, maize, sorghum, and millets are produced by the secretory pathway. They accumulate within the lumen of the ER, giving rise to the formation of discrete protein bodies surrounded by a membrane of ER origin [53]. By default, proteins tagged with an N-terminal transient SP and/or transmembrane domains will be targeted to the ER [54], move to the compartments downstream of the ER along the secretory pathway, particularly the Golgi, and finally either move to the vacuole or are secreted from the cell. This will in turn help to protect the fusion protein from exposure to proteolytic enzymes and from degradation that can lead to the formation of nonfunctional truncated protein or, in the extreme case, no production of the desired protein. N-terminal signal peptides are typically 15-30 amino acids long and are cleaved off during translocation of the protein across the membrane [34]. In this study, 63 bp of putative α-kafirin ss was isolated.
Reports relating to kafirin gene expression patterns are rare. De Rose et al. [22] evaluated α-kafirin promoter efficiencies in dicot tissue. An 855 bp 5′ promoter region and the signal sequence of a 22 kDa α-kafirin seed protein from sorghum were investigated. Constructs containing a translational fusion between the α-kafirin promoter of sorghum and the β-glucuronidase (gus) coding region from the E. coli gene uidA were evaluated in transgenic tobacco seeds. The promoter drove seed-specific GUS expression over the period of 10-15 DPA. Dissected endosperm tissue and embryos were positive for GUS expression, with slightly greater expression in endosperm. No expression was detectable in dissected seed coats or vegetative tissues. Mishra et al. [24] investigated 575 bp of γ-kafirin promoter sequence in directing transient seed-specific GUS expression in various sorghum explants. Blue foci were observed in endosperm tissue obtained at 20 days post anthesis (DPA) and in the embryo axis. No GUS expression was noted on other seed parts or callus. Further, the absence of GUS expression in endosperm tissue obtained at 40 DPA indicates potential temporal regulation from this promoter. Freitas et al. [23] demonstrated the ability of γ-kafirin promoters to drive GUS expression in sorghum, maize, and coix. The 1.19 kb intact γ-kafirin promoter was shown to be endosperm specific. However, regulation of temporal expression from the promoter was not reported. Further, histochemical analysis of GUS activity in different tissues indicates that the element(s) responsible for tissue specificity is probably located in the 285 bp proximal region of the promoter, while the remaining promoter sequence seems to carry the element(s) responsible for the quantitative response.
From the GFP expression profile of the pMB-Ubi-ss-gfp and pMB-α-kaf-ss-gfp constructs, we can suggest that SP::GFP fusion proteins can assemble and fold properly while preserving the properties of GFP. Drakakaki et al. [10] demonstrated in rice that the N-terminal SP::Phytase fusion was directed to different compartments in the cells as a function of the tissue in which it was expressed. Fluorescence and electron microscopy were used to analyse subcellular localization of the recombinant phytase in stably transformed transgenics. Phytase was present in the apoplast of callus tissues, while intercellular localization of phytase was observed in leaf. On the other hand, the protein was restricted to ER-derived protein bodies of the endosperm tissue and was absent in the intercellular space. In this study, no GFP expression was seen in leaves using pMB-Ubi-ss-gfp. This may suggest that the α-kafirin ss has a role in either inter- or intracellular localization of SP::GFP in the leaf, callus, and endosperm cells of sorghum. It is possible that the SP facilitates cotranslational import of GFP into the secretory pathway. In leaf cells, GFP may then be secreted to the cell surface, while in seeds, GFP may be directed into intracellular domains of the endomembrane system. However, there is also the possibility that the putative SP contributes to the degradation of the protein in nontarget tissues; therefore, no GFP foci are detected in the leaf tissue when the SP is tagged. The main focus here was to identify seed-specific expression of GFP under the control of the α-kafirin promoter, with or without the SP. The exact mechanism of this expression was beyond the scope of the present study.
In conclusion, our results show that the α-kaf promoter drives endosperm-specific expression. The SP::GFP fusion preserves the fluorescing properties of GFP, and we hypothesize that the ss of the α-kafirin gene influences localization of GFP in a tissue-dependent mode for leaf, callus, and endosperm in sorghum. This will be determined in stable sorghum transgenics and transgenic progenies we have regenerated with the different constructs used in this study. Consequently, the α-kaf promoter has potential in biotechnological applications for seed-specific protein expression. Proteins fused with the sorghum α-kafirin SP may potentially be targeted to intracellular domains of the endomembrane system in a given cell.

Journal of Biomedicine and Biotechnology