Selection of Housekeeping Genes for Transgene Expression Analysis in Eucommia ulmoides Oliver Using Real-Time RT-PCR

In order to select appropriate housekeeping genes for accurate calibration of experimental variations in real-time (RT-) PCR results in transgene expression analysis, particularly with respect to the influence of transgene on stability of endogenous housekeeping gene expression in transgenic plants, we outline a reliable strategy to identify the optimal housekeeping genes from a set of candidates by combining statistical analyses of their (RT-) PCR amplification efficiency, gene expression stability, and transgene influences. We used the strategy to select two genes, ACTα and EF1α, from 10 candidate housekeeping genes, as the optimal housekeeping genes to evaluate transgenic Eucommia ulmoides Oliver root lines overexpressing IPPI or FPPS1 genes, which are involved in isoprenoid biosynthesis.


Introduction
Genetic transformation of plants is widely used to study plant physiology (biochemical pathways, resistance to pathogens, reaction to stresses) and to obtain commercial crops with improved agronomic characters (herbicide tolerance, insect resistance, etc.). More recently, it has also been used to develop new types of plants as bioreactors (pharmaceuticals, vaccines, nutraceuticals, etc.) [1]. Regardless of the transformation's purpose, when new transgenic plants are obtained, an early and essential step is to evaluate the transgene expression [1]. For many years the vast majority of gene expression studies have used nonquantitative or semiquantitative RNA gel blots and RT-PCR analysis [2]. Recent advances in PCR instrumentation and fluorescence chemistry have made the precise quantification of specific amplification products possible. Quantitative real-time (RT-) PCR technology quantifies products by detecting fluorescence emitted from specific double-stranded DNA binding dyes or fluorophore-labeled probes that hybridize with target sequences during the exponential phase of the PCR reaction [3]. However, this process can be affected by various experimental variations, including the amount of starting material, efficiencies of enzymes in the (RT-) PCR reaction, and differences between tissues or cells in overall transcriptional activity. It is crucial, therefore, to amplify a housekeeping gene (also known as an "endogenous reference" gene) alongside the target gene to calibrate for experimental variability [4,5]. In these expression assays, the target concentration in each sample is calculated relative to the housekeeping gene and the result is expressed as the target/housekeeping ratio.
Many studies on housekeeping gene expression have dealt mainly with human tissues, bacteria, and viruses [6]. Only a few studies have focused on plants' vegetative and floral tissues, at different stages of development or under biotic or abiotic stress, for example, barley [7], rice [8], poplar [2], potato [6], Arabidopsis, and tobacco [9]. Except for a study which reported that NelF-4A is an ideal constitutively expressed control gene in the analysis of transgenic tobacco expressing NelF-4A fused to the GUS reporter gene (the transgene expression of GUS matched the expression patterns of NelF-4A mRNA and protein) [10], there have been no reports on the evaluation of housekeeping genes for calibration of transgene expression, particularly with respect to the influence of transformation with the foreign gene on stability of endogenous housekeeping gene expression in transgenic plants.
In this study, we chose 10 commonly used housekeeping genes and analyzed their expression levels in several transgenic root lines of Eucommia ulmoides Oliver using real-time RT-PCR. By combining statistical analyses of their (RT-) PCR amplification efficiency, expression stability, and transgene influences, we outlined a reliable strategy to select the optimal housekeeping genes for accurate calibration of real-time (RT-) PCR results in this E. ulmoides system.
E. ulmoides is a deciduous, dioecious woody plant, and is a Tertiary species that survives only in China [11]. It produces a trans-polyisoprene known as Eu-rubber in the leaves, root, bark, and pericarp [12,13]. Eu-rubber has several specific properties that differ from those of natural rubber (cis-polyisoprene), including hard "plasticity." It is an excellent nonconductor and has an extremely low coefficient of thermal expansion/contraction that could be exploited in the manufacture of insulated cables, moulds, shoe soles, adhesives, medical or scientific appliances, and sports goods. Our ultimate goals were to isolate and characterize the genes related to Eu-rubber biosynthesis so that we can enhance or improve the quantity or quality of Eu-rubber products using gene transformation techniques. Genetic transformation and transgene expression analysis protocols developed in this study provide the basis for further genetic alteration of E. ulmoides.

Sample Preparation.
A root line proliferated by suspension culture from a 4-week-old germ-free seedling of E. ulmoides was infected separately with three lines of Agrobacterium tumefaciens LBA4404 [14], each harboring one of three types of binary vector, namely, pOEB1, pOEB5, or pOEB9 (Figure 1), all derived from pMSH1 [15]. The full-length cDNAs of IPPI and FPPS1 (described in Results and Discussion) constructed in pOEB5 and pOEB9 were amplified by RT-PCR from E. ulmoides mRNA. After selection and differentiation, three transgenic root lines transformed with each type of Ti-plasmid were obtained. The nine transgenic root lines (3 transgenic types × 3 lines) and the proliferated root line (wild-type, nontransformed negative control) were sampled in triplicate (3 transgenic types × 3 lines × 3 repeats + 1 wild type × 1 line × 3 repeats = 30 samples).

Total RNA Extraction.
Total RNA was extracted from each sample using the RNeasy Plant Mini Kit (Qiagen) according to the manufacturer's instructions. To eliminate residual genomic DNA in the preparation, RNA samples were treated with RNase-Free DNase I (Qiagen) and were tested by real-time PCR using 50 ng RNA as template under the same conditions as described below. Then, the RNA was adjusted to 20 ng/μL (the quality of all samples was assessed by A260/280 ratios, all >1.9). Six dilutions of RNA (400, 100, 25, 6.25, 1.56, 0.39 ng/μL) were prepared and used to construct a standard curve.
2.3. Primer Design. We selected 10 candidate housekeeping genes as follows: ACTα, ARPT, CYP, EF1α, EIF1α, GAPD, rbcL, TUBα, TUBβ, and UBQ. The IPPI and FPPS1 genes, which are involved in trans-polyisoprene (Eu-rubber) biosynthesis (described in Results and Discussion), were chosen as the target genes of interest. Primers were designed according to E. ulmoides EST sequences using Primer Express (Applied Biosystems) with melting temperatures of 59-60 °C. All primer pairs (Table 1) were initially tested by standard RT-PCR using the conditions described below for real-time RT-PCR. Amplification of single products of expected size was verified by electrophoresis on 3% agarose-LE (Nacalai Tesque).

Two-Step Real-Time RT-PCR. Sample cDNA (including the samples for standard curves) was synthesized from 10 μL total RNA in a 20 μL volume using the High Capacity Reverse Transcription Kit (Applied Biosystems). Real-time PCR was performed in a 25 μL volume containing 150 nM of each primer, 5 μL cDNA sample (≈10 ng/μL) and 1× SYBR Green PCR Master Mix (Applied Biosystems) on the ABI Prism 7300 Sequence Detection System (Applied Biosystems). PCR reactions were carried out in a 96-well reaction plate using the parameters recommended by the manufacturer (50 °C for 2 minutes, 95 °C for 10 minutes, 40 cycles of 95 °C for 15 seconds and 60 °C for 1 minute, and a dissociation stage of 95 °C for 15 seconds, 60 °C for 1 minute, and 95 °C for 15 seconds, 60 °C for 15 seconds). Each PCR reaction was performed in triplicate and a no-template control was included.

Data Acquisition.
The Ct value was defined as the cycle in which there is a significant increase in the amount of PCR product. Relative quantities were determined by interpolation from standard curves to create linear values for each sample. Both Ct value and relative quantity of each sample were acquired from the ABI Prism 7300 Sequence Detection System.
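The interpolation step can be illustrated with a minimal sketch. It assumes the standard curve is a straight line of Ct against log10 of the input concentration; the Ct values and the helper names `fit_line` and `relative_quantity` are hypothetical and are not taken from the instrument software.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def relative_quantity(ct, slope, intercept):
    """Invert the standard curve Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)

# Dilution series from the text (ng/uL); the Ct values are invented for illustration.
concentrations = [400, 100, 25, 6.25, 1.56, 0.39]
hypothetical_ct = [16.1, 18.1, 20.2, 22.2, 24.1, 26.2]

slope, intercept = fit_line([math.log10(c) for c in concentrations], hypothetical_ct)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"quantity at Ct 21.0 = {relative_quantity(21.0, slope, intercept):.2f} ng/uL")
```

Because the curve is fitted on a log scale, the interpolated quantity is on a linear scale, which is what the later stability and variance analyses take as input.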

Statistical Analyses. The variability of absolute Ct value of each target (IPPI, FPPS1) and candidate housekeeping gene was calculated from all tested samples (total data number of each gene = 3 transgenic types × 3 lines × 3 repeats + 1 wild type × 1 line × 3 repeats = 30). The PCR amplification efficiency of each target and candidate housekeeping gene was calculated from the slope of the standard curve (total data number of each gene = 6 standard curve concentrations × 3 repeats = 18) according to the equation: PCR efficiency = (10^(−1/slope) − 1) × 100%.
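As a worked illustration of the efficiency equation, the following sketch converts a standard curve slope into a percentage. The function names and example slopes are hypothetical; the 80% to 120% window used as the default corresponds to the range reported in the Results.

```python
import math

def pcr_efficiency_percent(slope):
    """PCR efficiency = (10^(-1/slope) - 1) * 100%, with slope in Ct per log10 unit."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0

def within_window(efficiency, low=80.0, high=120.0):
    """Check an efficiency against a tolerance window (defaults follow the Results)."""
    return low <= efficiency <= high

# A slope of -1/log10(2), about -3.32, corresponds to perfect doubling per cycle.
ideal_slope = -1.0 / math.log10(2)
for slope in (ideal_slope, -3.6, -4.1):  # the last two slopes are invented examples
    eff = pcr_efficiency_percent(slope)
    print(f"slope {slope:.3f}: efficiency {eff:.1f}%, within window: {within_window(eff)}")
```

Steeper (more negative) slopes give lower efficiencies, and shallower slopes give values above 100%.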
To validate the relative equivalence of (RT-) PCR efficiencies between the target and the candidate housekeeping gene, the Ct values of a target gene and a candidate housekeeping gene from the standard curve samples were used (total data number of one gene pair (a target gene and a housekeeping gene) = 6 standard curve concentrations × 3 repeats × 2 genes = 36). The concentrations of the standard RNAs were converted to logarithmic values (log 400, log 100, log 25, log 6.25, log 1.56, log 0.39) and the ΔCt values (Ct of target gene − Ct of housekeeping gene for each sample) were plotted versus log concentrations to create a semi-log regression line, from which the slope was calculated.
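A minimal sketch of this validation follows, assuming base-10 logarithms and an ordinary least-squares regression. The Ct values are invented, the function names are hypothetical, and the 0.1 threshold is the one given in the caption of Table 2.

```python
import math

def fit_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def delta_ct_slope(concentrations, ct_target, ct_reference):
    """Slope of (Ct target - Ct housekeeping) versus log10(concentration)."""
    log_conc = [math.log10(c) for c in concentrations]
    delta_ct = [t - r for t, r in zip(ct_target, ct_reference)]
    return fit_slope(log_conc, delta_ct)

def efficiencies_comparable(slope, threshold=0.1):
    """Treat the two efficiencies as equivalent when |slope| is below the threshold."""
    return abs(slope) < threshold

concentrations = [400, 100, 25, 6.25, 1.56, 0.39]   # ng/uL, from the text
ct_target = [19.0, 21.0, 23.1, 25.0, 27.1, 29.0]    # invented values
ct_reference = [16.1, 18.1, 20.1, 22.1, 24.1, 26.1]  # invented values

slope = delta_ct_slope(concentrations, ct_target, ct_reference)
print(f"slope of delta-Ct line = {slope:.3f}, comparable: {efficiencies_comparable(slope)}")
```

A slope near zero means ΔCt does not drift with template amount, which is what equal amplification efficiencies would produce.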
The stability of candidate housekeeping gene expression was evaluated using the geNorm method [16]. The relative quantities of all candidate housekeeping genes in all samples (total data number = (3 transgenic types × 3 lines × 3 repeats + 1 wild type × 1 line × 3 repeats) × 10 candidate housekeeping genes = 300) were used to calculate M and V pairwise variation parameters (described in Results and Discussion).
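The M parameter can be sketched as follows, assuming the usual geNorm definition: for each gene, the standard deviation of the log2 expression ratio against every other gene is computed across samples, and M is the mean of those standard deviations. The gene names and quantities are hypothetical, and the stepwise exclusion of the least stable gene performed by the geNorm software is omitted.

```python
import math
import statistics

def genorm_m(quantities):
    """Return {gene: M}; quantities maps gene -> relative quantities, one per sample."""
    m_values = {}
    for gene, values in quantities.items():
        pairwise_sds = []
        for other, other_values in quantities.items():
            if other == gene:
                continue
            # Log2 ratio of the two genes in each sample; constant for ideal controls.
            log_ratios = [math.log2(a / b) for a, b in zip(values, other_values)]
            pairwise_sds.append(statistics.stdev(log_ratios))
        m_values[gene] = statistics.mean(pairwise_sds)
    return m_values

# Invented quantities for three hypothetical genes across four samples.
example = {
    "gene_a": [1.0, 2.0, 4.0, 8.0],
    "gene_b": [2.0, 4.0, 8.0, 16.0],  # constant ratio to gene_a
    "gene_c": [1.0, 1.0, 1.0, 1.0],   # ratio to the others varies
}
for gene, m in sorted(genorm_m(example).items(), key=lambda item: item[1]):
    print(f"{gene}: M = {m:.3f}")
```

Genes whose ratios to the rest of the panel stay constant receive the lowest M and rank as the most stable.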
Variance analysis of each candidate housekeeping gene expression was performed using variance components and mixed model ANOVA/ANCOVA (Statsoft). The relative quantities of each candidate housekeeping gene in the samples (total data number of each candidate housekeeping gene = 3 transgenic types × 3 lines × 3 repeats + 1 wild type × 1 line × 3 repeats = 30) were used to compare the variation of each candidate gene expression among the 4 transgenic types (3 transgenic types + 1 wild type) and the variation among the transgenic lines within each transgenic type.

Evaluation of Target and Candidate Housekeeping Genes' PCR Primer Specificity and Variability of RNA Expression Level among All Tested Samples. By comparison to E. ulmoides EST sequences, primers were designed to be as specific as possible for the selected gene family member. All primer pairs were initially tested by standard RT-PCR using the same conditions as for real-time RT-PCR and by addition of a dissociation stage (95 °C for 15 seconds, 60 °C for 1 minute, and 95 °C for 15 seconds, 60 °C for 15 seconds) after real-time RT-PCR. The RT-PCR product of each gene, verified by electrophoresis on 3% agarose gel, showed only a single band (Figure 2). Dissociation curve analysis also revealed that the real-time RT-PCR product of each gene had a unique melting peak (Tm, Table 1). The results indicated that each primer pair was specific and had no mismatch or false priming to the selected gene. Except for GAPD (75.07%) and TUBβ (124.50%), all PCRs displayed amplification efficiency between 80% and 120% (Table 1).
To compare the RNA expression levels of target and candidate housekeeping genes over all tested samples, the variability of absolute Ct value was calculated. The results (Table 1, Figure 3) revealed that all genes had median Ct values between 15 and 25, except rbcL (median Ct <15) and CYP (median Ct >25). The lowest RNA expression range (the difference between the maximum and minimum Ct values) was observed for ARPT, followed by UBQ and rbcL (Table 1). The coefficient of variation (CV) was <10% for all target and candidate housekeeping genes (Table 1).
experimental conditions. However, this can lead to incorrect interpretation of results, as often the gene with low variations in expression is low in abundance compared to target mRNA transcripts. Also, the use of 18S or 28S rRNA as housekeeping genes makes it difficult to accurately calibrate experimental variations as rRNA molecules are almost absent from purified mRNA samples, but make up the bulk of total RNA samples [16]. In practice, all genes will show some variation in expression under different conditions. Our analysis of absolute Ct value also revealed different variations over all target and candidate housekeeping genes (Table 1, Figure 3). In order to bypass this potential source of variation and more accurately evaluate the housekeeping gene expression stability, we decided to use the geNorm method. This method relies on the principle that the expression ratio of two perfect housekeeping genes would be identical in all samples in all experimental conditions. Variation in the expression ratios between different samples reflects the fact that one or both of the genes are not stably expressed. Vandesompele et al. [16] defined two parameters to quantify housekeeping gene stability: M (the average pairwise variation of a particular gene compared with all other tested housekeeping genes; genes with the lowest M values have the most stable expression) and V (the pairwise variation Vn/Vn+1 between 2 sequential normalization factors, NFn and NFn+1. A large V value means that the added gene had a significant effect and should probably be included for calculation of the normalization factor). In the present research, the relative quantities of all candidate housekeeping genes in all samples were used to calculate the two parameters. As shown in Figure 4, ACTα and EF1α were the most stable genes with the lowest M values. Since all the pairwise variations Vn/Vn+1 (Figure 5) were below the 0.15 cutoff value [16], an additional housekeeping gene did not contribute significantly to improving accuracy. Therefore, using the average of ACTα and EF1α was sufficient for accurate calibration, and a third housekeeping gene (V2/3 = 0.143 < 0.15) was not required as an internal control. The geNorm analysis is independent of the difference in abundance between the genes and independent of variation among samples. It is equally affected by any outlying or extreme ratio (i.e., outliers for a sample with low or high overall expression, or outliers caused by an upregulated or downregulated gene that have an equivalent increase in pairwise variation, V) [16]. Therefore, the geNorm analysis has been used in several recent studies, as it is a robust method to evaluate stability of gene expression and to determine optimal housekeeping genes for calibration of the experimental variations [4,5,17].
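The pairwise variation V can be sketched in the same spirit. The sketch assumes, following the usual geNorm convention, that each normalization factor is the geometric mean of the n most stable genes in a sample and that V is the standard deviation across samples of the log2 ratio of consecutive normalization factors. The gene names, quantities, and function names are hypothetical.

```python
import math
import statistics

def normalization_factors(quantities, genes):
    """Per-sample geometric mean of the relative quantities of the given genes."""
    n_samples = len(quantities[genes[0]])
    return [
        statistics.geometric_mean([quantities[g][i] for g in genes])
        for i in range(n_samples)
    ]

def pairwise_variation(quantities, ranked_genes, n):
    """V for going from n to n + 1 genes; ranked_genes is ordered most stable first."""
    nf_n = normalization_factors(quantities, ranked_genes[:n])
    nf_next = normalization_factors(quantities, ranked_genes[:n + 1])
    log_ratios = [math.log2(a / b) for a, b in zip(nf_n, nf_next)]
    return statistics.stdev(log_ratios)

# Invented quantities for three hypothetical genes across four samples.
example = {
    "gene_a": [1.00, 1.10, 0.95, 1.05],
    "gene_b": [2.00, 2.25, 1.85, 2.10],
    "gene_c": [0.50, 0.90, 0.40, 0.70],
}
v_2_3 = pairwise_variation(example, ["gene_a", "gene_b", "gene_c"], 2)
print(f"V2/3 = {v_2_3:.3f}; third gene needed: {v_2_3 >= 0.15}")
```

When adding the next gene barely changes the normalization factor, V stays below the 0.15 cutoff and the smaller gene set is kept.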

Examination of Transgene Influence on Housekeeping Gene Expression by ANOVA F-Test. In the pairwise comparison method (geNorm), special attention must be paid not to select housekeeping genes whose transcript expression can be influenced (or regulated) by the transgene in the transgenic plant, which theoretically should influence the relative target gene's expression (target/housekeeping ratio) if the housekeeping gene and transgene are coregulated [18]. In the present research, we used variance components and mixed model ANOVA/ANCOVA analyses (Statsoft) to compare variations in expression levels of each housekeeping gene in different transgenic types and lines (Table 3). The results of ANOVA F-tests showed that four of the candidate housekeeping genes, ACTα, EF1α (the two top-ranked candidates in the pairwise comparison approach), CYP, and TUBβ, had no significant variation in expression among the three transgenic types and the wild-type, as well as among the transgenic lines within each transgenic type (P > .05). This result indicated that expression of these four genes was not influenced by gene transformation. In contrast, the other genes, ARPT, EIF1α, GAPD, rbcL, TUBα, and UBQ, exhibited significant or highly significant variations (P < .05 or P < .01) in expression among the three transgenic types and the wild-type, indicating that the gene transformation had influenced their expression.
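The structure of this comparison can be illustrated with a simplified nested ANOVA, in which lines are nested within transgenic types. This is only a sketch and does not reproduce the variance components and mixed model procedure of the Statsoft software: it tests types against the line-within-type mean square and lines against the residual, ignores the imbalance caused by the single wild-type line, and reports F statistics without P values. The line labels and quantities are invented.

```python
from statistics import mean

def nested_anova(data):
    """data maps type -> line -> list of replicate quantities."""
    all_values = [x for lines in data.values() for reps in lines.values() for x in reps]
    grand_mean = mean(all_values)
    ss_type = ss_line = ss_error = 0.0
    n_lines = 0
    for lines in data.values():
        type_values = [x for reps in lines.values() for x in reps]
        type_mean = mean(type_values)
        ss_type += len(type_values) * (type_mean - grand_mean) ** 2
        for reps in lines.values():
            line_mean = mean(reps)
            ss_line += len(reps) * (line_mean - type_mean) ** 2
            ss_error += sum((x - line_mean) ** 2 for x in reps)
            n_lines += 1
    df_type = len(data) - 1                # among transgenic types
    df_line = n_lines - len(data)          # among lines within types
    df_error = len(all_values) - n_lines   # among replicates within lines
    ms_type = ss_type / df_type
    ms_line = ss_line / df_line
    ms_error = ss_error / df_error
    return {
        "df": (df_type, df_line, df_error),
        "F_type": ms_type / ms_line,
        "F_line": ms_line / ms_error,
    }

# Same layout as the experiment (3 types x 3 lines + 1 wild-type line, triplicates).
example = {
    "wild_type": {"WT": [1.00, 1.05, 0.95]},
    "pOEB1": {"L1": [1.02, 0.98, 1.01], "L2": [0.97, 1.03, 1.00], "L3": [1.04, 0.99, 0.96]},
    "pOEB5": {"L1": [1.10, 1.06, 1.08], "L2": [1.01, 1.05, 0.99], "L3": [1.07, 1.02, 1.04]},
    "pOEB9": {"L1": [0.95, 0.99, 0.97], "L2": [1.03, 0.98, 1.00], "L3": [0.96, 1.01, 0.94]},
}
result = nested_anova(example)
print(result["df"], round(result["F_type"], 2), round(result["F_line"], 2))
```

Each F statistic would then be compared with the F distribution for the corresponding degrees of freedom to obtain the P values reported in Table 3.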
It is not surprising that overexpression of IPPI and FPPS1 in E. ulmoides influences expression of some candidate housekeeping genes. In the plant isoprenoid biosynthesis pathway, IPPI catalyzes the interconversion of IPP to DMAPP, which is an essential starter moiety for the condensation reactions. IPP is sequentially condensed to DMAPP to yield the short-chain isoprenoid precursors GPP, FPP, and GGPP, which are further metabolized to monoterpenes (C10), sesquiterpenes (C15), diterpenes (C20), and polyisoprenes (C > 5000) [13,19]. Plant isoprenoids are essential for numerous physiological and developmental processes in plants (photosynthesis, respiration, membrane fluidity, pathogen defense, and modulation of growth and stress responses via isoprenoid-derived plant hormones) [20]. Therefore, the changes of IPPI and FPPS1 expression levels in transgenic E. ulmoides are likely to influence the expression of other genes. In this case it may be prudent to examine the transgene influence on housekeeping gene expression rather than only ranking their expression stability by the geNorm pairwise comparison approach.
In conclusion, we outline a reliable strategy to identify the optimal housekeeping genes for calibrating our real-time (RT-) PCR system by combining analyses of candidate housekeeping genes' (RT-) PCR amplification efficiency, expression stability, and transgene influences. We used the strategy to select ACTα and EF1α as the optimal housekeeping genes in our analysis of transgene expression in transgenic E. ulmoides root lines overexpressing IPPI or FPPS1 genes, which are involved in isoprenoid biosynthesis. ACT is one of the major components of cytoplasmic microfilaments in eukaryotic cells. It plays an important role in diverse cellular functions, such as cytoplasmic streaming, changes in cytoarchitecture, and distribution of plasma membrane proteins in response to internal and external signals [21]. EF1α is a ubiquitous protein that binds aminoacyl-transfer RNA to ribosomes during protein synthesis [22]. Both are housekeeping genes that are frequently used to calibrate target gene expression level [6,23]. This study provides a more reliable strategy to evaluate the appropriate housekeeping genes in transgene expression analysis.

Figure 1: Schematic structure of T-DNA regions of the binary vectors pOEB1, pOEB5, and pOEB9. RB: right border; LB: left border; I: intron of castor bean catalase gene CAT-1.

Figure 3: The variability of absolute Ct value in each target (IPPI, FPPS1) and candidate housekeeping gene among all tested samples. Grey bars indicate the 25/75 percentiles, whisker caps indicate the maximum and minimum, the line marks the median.

Figure 4: Average pairwise variation (M) of candidate housekeeping genes plotted from least stable (left) to most stable (right).

Figure 5: Pairwise variation (Vn/Vn+1) between the normalization factors NFn and NFn+1 to determine optimal number of housekeeping genes.

Table 1: Primers for real-time RT-PCR and their characteristics.
Data are based on analyses of absolute Ct values from all tested samples and standard curve of each target (IPPI, FPPS1) and candidate housekeeping gene. Tm: melting temperature of real-time RT-PCR product; SD: standard deviation; Range: difference between the maximum and minimum Ct values; CV: coefficient of variation; Slope: slope of standard curve; R²: correlation coefficient of standard curve; PCR efficiency = (10^(−1/slope) − 1) × 100%.
ACTα and EF1α, had absolute values of the slope of ΔCt versus log concentrations <0.1, indicating that ACTα or EF1α and the target genes IPPI or FPPS1 had similar or relatively equivalent (RT-) PCR efficiencies. In other words, using ACTα or EF1α as housekeeping genes would calibrate the experimental variations in the IPPI or FPPS1 expression analyses more reliably. Another advantage of the housekeeping gene and the target gene having equivalent (RT-) PCR efficiencies is that we can calibrate experimental variations by the comparative Ct method (also known as the ΔΔCt method) without standard curves. This is particularly useful when only a few target genes are being studied, or when limited amounts of RNA are available.
3.3. Ranking of Candidate Housekeeping Genes with Respect to Expression Stability. Many studies have proposed that the optimal housekeeping gene to calibrate for experimental variation is one with the lowest variation in expression or one whose expression remains constant under different
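For readers unfamiliar with the comparative Ct method, a minimal sketch follows. It assumes that both amplicons roughly double each cycle (efficiency close to 100%), which is the usual basis for the factor of 2 and is an assumption not stated in the text. The function name and Ct values are hypothetical, and the calibrator would typically be the wild-type sample.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of the target in a sample relative to a calibrator: 2^(-ddCt)."""
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

# Invented Ct values: the target crosses threshold 2 cycles earlier in the
# transgenic sample than in the calibrator, at the same housekeeping Ct.
fold_change = relative_expression(
    ct_target_sample=20.0, ct_ref_sample=18.0,
    ct_target_calibrator=22.0, ct_ref_calibrator=18.0,
)
print(f"fold change = {fold_change:.1f}")
```

The shortcut holds only when the ΔCt line is flat across template amounts, which is why the slope criterion is checked before the standard curves are dropped.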

Table 2: Validation of the relative equivalence of (RT-) PCR amplification efficiencies between the target and the candidate housekeeping gene. Data are based on linear regression analyses using ΔCt values (Ct of target gene − Ct of housekeeping gene) from the standard curve samples versus log concentrations of the standard curve; *: slope of semi-log regression line < 0.1.

Table 3: ANOVA F-test of candidate housekeeping gene variances among three transgenic types and wild-type as well as among transgenic lines within each transgenic type. Data are based on variance components and mixed model ANOVA/ANCOVA using relative quantities of each candidate housekeeping gene in all samples; *, P < .05; **, P < .01.