The threat that heat stress poses to crop production has increased dramatically with global warming, raising the demand for heat-tolerant crops and for a better understanding of their tolerance. The leguminous forage crop Guar [
Extremely high temperatures, together with other changes in the global climate, have severely damaged crop production [
Heat stress damages plant cellular components through different mechanisms as a consequence of temperatures that far exceed ideal growth conditions [
Breakthroughs in next-generation sequencing, especially Illumina RNA-Seq, have offered new opportunities for comprehensive transcriptomic analyses in nonmodel plants, and RNA-Seq is rapidly becoming the method of choice for transcriptional profiling experiments. RNA-seq has been widely used to research the responses of plants to abiotic stress [
Guar [
The molecular mechanisms involved in the plant heat stress response need to be determined in order to understand how plants react and adapt to heat stress and to produce crops with improved thermotolerance that can survive and grow under heat stress conditions. In the current study, RNA-seq gene expression profiling was conducted on leaves of the heat-tolerant guar accession “PWP 5595” at the preflowering stage, comparing heat treatment (42°C) with the control (25°C), to investigate metabolic adjustment and identify genes that may play a vital role in heat adaptation in guar. The present work provides important data for understanding the heat tolerance mechanism of this crop plant and establishes an important transcriptomic database for further study.
Seeds of
Total RNA was extracted from leaf samples using the RNeasy® Plant Mini Kit (QIAGEN) following the manufacturer’s guidelines. DNase I from bovine pancreas (Biomatik) was used to remove DNA contamination. An Agilent 2100 Bioanalyzer was used to assess the RNA integrity number (RIN). RNA samples with the recommended purity and integrity were shipped to Macrogen Inc. (
The quality of the raw sequence data was checked using FASTQC V 0.11.5 [
In our previous work [
Correlations among the stress conditions were investigated with the “PtR” (Perl-to-R) script, which compares the replicates of each condition and generates a correlation matrix. Principal Component Analysis (PCA) (
Differentially expressed genes were identified and clustered according to their expression profiles using the edgeR v3.08 (Empirical Analysis of Digital Gene Expression Data in R) Bioconductor package [
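The replicate comparison described above can be illustrated with a minimal Python sketch. It is not the PtR script itself: the log2(CPM + 1) transformation, the function names, and the toy count vectors are all assumptions made for illustration.

```python
import math


def log2_cpm(counts):
    """Convert raw counts to log2(CPM + 1); this transformation is an assumption."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + 1) for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(samples):
    """Return a nested dict of pairwise Pearson correlations between samples."""
    logged = {name: log2_cpm(counts) for name, counts in samples.items()}
    return {a: {b: pearson(logged[a], logged[b]) for b in logged} for a in logged}


# Made-up counts for five genes in three samples (not data from this study).
TOY_COUNTS = {
    "C1": [10, 200, 35, 0, 900],
    "C2": [12, 180, 40, 1, 950],
    "H1": [300, 20, 35, 50, 100],
}

matrix = correlation_matrix(TOY_COUNTS)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

In this toy input the two control-like samples correlate more strongly with each other than with the heat-like sample, which is the pattern a replicate quality check looks for.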
The professional version of Blast2GO software suite v4.1 (
The TF database PlantTFDB v4.0 (
High-quality RNAs of heat stress and control conditions were sequenced. In total, 110.7 million paired-end raw reads (~22 Gbp) with an average read length of 100 bp were generated from the targeted samples (Table
Raw data statistics and quality assessment.
Treatment/replicate | Total read bases (bp) | Total reads (pairs) | GC (%) | Q30 (%) |
---|---|---|---|---|
C1 | 3,682,737,548 | 18,231,374 | 44.77 | 95.08 |
C2 | 3,746,599,646 | 18,547,523 | 44.55 | 94.58 |
C3 | 3,881,717,446 | 19,216,423 | 44.49 | 94.98 |
H1 | 3,589,147,716 | 17,768,058 | 44.2 | 94.97 |
H2 | 3,541,905,774 | 17,534,187 | 44.11 | 94.93 |
H3 | 3,922,224,910 | 19,416,955 | 43.98 | 95.3 |
GC (%): GC content; Q30 (%): ratio of reads having a Phred quality score of over 30; C: control; H: heat treatment.
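The totals quoted in the text can be cross-checked against the per-sample values in the table. The short Python sketch below does only that; the variable and function names are illustrative and not part of the original analysis.

```python
# Per-sample values copied from the raw data statistics table:
# sample -> (total read bases in bp, total read pairs)
RAW_STATS = {
    "C1": (3_682_737_548, 18_231_374),
    "C2": (3_746_599_646, 18_547_523),
    "C3": (3_881_717_446, 19_216_423),
    "H1": (3_589_147_716, 17_768_058),
    "H2": (3_541_905_774, 17_534_187),
    "H3": (3_922_224_910, 19_416_955),
}


def summarize(stats):
    """Sum read bases and read pairs over all samples."""
    total_bases = sum(bases for bases, _ in stats.values())
    total_pairs = sum(pairs for _, pairs in stats.values())
    return total_bases, total_pairs


total_bases, total_pairs = summarize(RAW_STATS)
print(f"Total read pairs: {total_pairs / 1e6:.1f} million")
print(f"Total read bases: {total_bases / 1e9:.1f} Gbp")
```

The sums come to about 110.7 million read pairs and 22.4 Gbp, in line with the figures reported in the text.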
Quality control of the biological replicates showed high homogeneity among replicates in both the control and the heat stress conditions, as shown in the Supporting Figures
The DE analysis output reported logFC, logCPM,
A total of 1551 upregulated genes were subjected to analysis using Blast2GO. Examples of upregulated DE heat-responsive genes in guar are listed in Table
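As a hedged illustration of how such an output table can be split into upregulated and downregulated genes, the Python sketch below filters rows by fold change and false discovery rate. The column names `PValue` and `FDR` follow the usual edgeR output and are assumed here; the cutoffs, the gene names, the toy table, and the convention that positive logFC means higher expression under heat are all hypothetical and do not come from this study.

```python
import csv
import io

# Hypothetical cutoffs, for illustration only.
LOGFC_CUTOFF = 2.0
FDR_CUTOFF = 0.001


def partition_degs(rows, logfc_cutoff=LOGFC_CUTOFF, fdr_cutoff=FDR_CUTOFF):
    """Split DE rows into up- and downregulated gene lists.

    Assumes positive logFC means higher expression under heat stress.
    """
    up, down = [], []
    for row in rows:
        logfc = float(row["logFC"])
        fdr = float(row["FDR"])
        if fdr > fdr_cutoff or abs(logfc) < logfc_cutoff:
            continue  # not significant under the chosen cutoffs
        (up if logfc > 0 else down).append(row["gene"])
    return up, down


# Made-up toy table in the assumed column layout.
EXAMPLE_TSV = (
    "gene\tlogFC\tlogCPM\tPValue\tFDR\n"
    "gene_a\t5.2\t6.1\t1e-12\t1e-9\n"
    "gene_b\t-3.4\t4.8\t2e-8\t5e-6\n"
    "gene_c\t0.4\t7.0\t0.3\t0.6\n"
    "gene_d\t2.5\t3.2\t0.01\t0.04\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
up, down = partition_degs(rows)
print("up:", up)
print("down:", down)
```

With the toy table, only `gene_a` and `gene_b` pass both cutoffs; `gene_d` has a large fold change but fails the FDR filter.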
Examples of upregulated heat-responsive genes in guar.
Putative gene name | Description | Length (bp) |
---|---|---|
TRINITY_DN35257_c0_g1 | Heat shock 70 kDa | 2645 |
TRINITY_DN13155_c0_g1 | Heat-stress-associated 32 | 2460 |
TRINITY_DN25460_c0_g1 | kDa class II heat shock-like | 908 |
TRINITY_DN19883_c1_g15 | Heat stress transcription factor A-4c-like | 1740 |
TRINITY_DN8056_c0_g1 | kDa class III heat shock | 1364 |
TRINITY_DN5986_c0_g1 | kDa class I heat shock-like | 1098 |
TRINITY_DN18122_c0_g2 | kDa class I heat shock-like | 896 |
TRINITY_DN18254_c0_g4 | kDa class IV heat shock | 1178 |
TRINITY_DN4388_c0_g1 | Small heat shock chloroplastic-like | 1126 |
TRINITY_DN14463_c0_g1 | Heat shock cognate 80 | 2759 |
TRINITY_DN15735_c0_g2 | Small heat shock chloroplastic | 2292 |
TRINITY_DN11506_c0_g1 | Heat shock cognate 70 kDa 2 | 2942 |
TRINITY_DN3531_c0_g1 | kDa class IV heat shock | 1024 |
TRINITY_DN20694_c0_g1 | Heat shock 83 | 2644 |
TRINITY_DN2570_c0_g1 | Class I heat shock-like | 812 |
TRINITY_DN16702_c1_g1 | Heat shock factor HSF30 | 2237 |
TRINITY_DN15088_c0_g1 | kDa class I heat shock | 2499 |
TRINITY_DN19733_c0_g1 | Heat shock 83 | 2536 |
TRINITY_DN25024_c0_g1 | Heat stress transcription factor B-2a-like | 423 |
TRINITY_DN12431_c0_g1 | Small heat shock | 1291 |
TRINITY_DN15088_c1_g1 | kDa class I heat shock | 562 |
TRINITY_DN12794_c0_g2 | Heat stress transcription factor B-2b | 2433 |
TRINITY_DN38964_c0_g1 | kDa class II heat shock | 895 |
TRINITY_DN18629_c0_g2 | Heat shock HSP 90-beta | 2381 |
TRINITY_DN8034_c0_g1 | Class I heat shock | 849 |
TRINITY_DN19698_c0_g1 | Heat shock 90- chloroplastic | 3176 |
TRINITY_DN9349_c0_g1 | kDa class II heat shock-like | 940 |
TRINITY_DN10493_c0_g1 | Heat shock cognate 71 kDa | 2054 |
TRINITY_DN7499_c0_g1 | kDa class I heat shock-like | 834 |
TRINITY_DN26877_c0_g1 | Heat stress transcription factor B-2a-like | 743 |
TRINITY_DN24551_c0_g1 | Activator of 90 kDa heat shock ATPase homolog | 1771 |
TRINITY_DN8550_c0_g1 | Heat shock factor HSF24-like | 1680 |
TRINITY_DN11358_c0_g2 | Heat stress transcription factor C-1 | 884 |
TRINITY_DN10430_c0_g1 | Heat shock 90- mitochondrial | 2982 |
TRINITY_DN12677_c0_g1 | kDa heat shock peroxisomal | 1302 |
TRINITY_DN35298_c0_g1 | kDa class I heat shock-like | 1156 |
TRINITY_DN16768_c1_g1 | Heat stress transcription factor A-6b-like | 1616 |
TRINITY_DN40620_c0_g1 | Heat shock 70 kDa | 2605 |
TRINITY_DN18528_c0_g5 | Heat stress transcription factor B-3 | 1280 |
TRINITY_DN19073_c1_g2 | Heat stress transcription factor B-2a-like | 1936 |
TRINITY_DN18254_c0_g3 | Activator of 90 kDa heat shock ATPase homolog | 2328 |
TRINITY_DN16913_c0_g5 | DNAJ heat shock N-terminal domain-containing | 749 |
TRINITY_DN37813_c0_g1 | kDa class II heat shock-like | 239 |
TRINITY_DN14498_c0_g1 | Heat shock factor HSF24 | 2074 |
TRINITY_DN20386_c0_g11 | kDa heat shock mitochondrial | 1530 |
TRINITY_DN19503_c0_g3 | Heat shock 70 kDa mitochondrial | 3512 |
TRINITY_DN7426_c0_g1 | Heat stress transcription factor A-3-like | 3147 |
TRINITY_DN19828_c0_g2 | kDa class IV heat shock | 1262 |
TRINITY_DN19733_c0_g4 | Heat shock 83 | 3248 |
TRINITY_DN9818_c0_g1 | Heat shock 70 kDa 8 | 3021 |
Most of the mapping data (88.37% of unigenes with mapping data) were extracted from the UniProt Knowledgebase (UniProtKB), followed by The Arabidopsis Information Resource (TAIR; 2.78%), the Protein Data Bank (PDB; 0.06%), and GR_protein (0.01%).
Of the 1551 upregulated DEGs, 1192 (76.80%) had InterProScan (IPS) hits and 668 (43.04%) had GO terms. A total of 301 protein families were found (Figure
Top 10 IPS families, domains, repeats, and sites for upregulated DEGs responsive for heat stress.
A total of 551 protein domains were detected (Figure
A total of 15 IPS repeats were detected. (IPR001611) Leucine-rich repeat matched with the largest number of unigenes (8), followed by (IPR002885) Pentatricopeptide repeat (6 unigenes), and (IPR019734) Tetratricopeptide repeat (6 unigenes). Three detected IPS sites were (IPR000048) IQ motif, EF-hand binding site (2 unigenes), (IPR006311) Twin-arginine translocation pathway, signal sequence (1 unigene), and (IPR018467) CO/COL/TOC1, conserved site (1 unigene).
Of the three core GO annotation categories, biological processes (BP) comprised 37.88% of the total assigned annotations, whereas molecular functions (MF) and cellular components (CC) comprised 37.96% and 24.16%, respectively. The GO terms with the largest number of assigned unigenes in the biological process (BP) category were biosynthetic process (206; 14.30%), nucleobase-containing compound metabolic process (145; 10.06%), cellular process (139; 9.65%), cellular protein modification process (108; 7.49%), metabolic process (105; 7.29%), and response to stress (90; 6.25%) (Figure
GO classification of the upregulated genes under heat stress.
The KEGG pathways-based analysis indicated that 225 (14.49%) unigenes of the 1551 upregulated unigenes under heat stress obtained hits in the KEGG database, and those unigenes were associated with 158 enzymes and 102 KEGG pathways (Supporting Table
The top 20 KEGG pathways assignments in upregulated heat stress-responsive unigenes. The number of unigenes predicted to belong to each category is shown.
The 158 enzymes were further categorized into 6 main classes. As illustrated in Figure
Enzyme code distribution of heat stress-responsive upregulated unigenes.
Transcription factors are important regulators that participate in the response to biotic and abiotic stresses. To better understand the molecular mechanism which regulates the heat stress response in
Distribution of upregulated transcription factors (TFs) under heat stress.
The results and description of the best hit of these TFs against
A total of 1467 downregulated unigenes were subjected to analysis using Blast2GO. Of those unigenes, 1467 (100%) were processed with InterProScan, 1395 (95%) had BLAST hits, 1172 (84%) were mapped, and 1135 (77.37%) were annotated. The remaining 72 unigenes without BLAST hits could be considered new genes exclusive to guar that are significantly downregulated in response to heat stress. These new genes might be useful material for future research.
Of the 1467 unigenes, 1298 (88.48%) had InterProScan (IPS) hits and 815 (55.56%) had GO terms. A total of 353 IPS families were found (Figure
Top 10 IPS families, domains, repeats, and sites for heat stress-responsive downregulated unigenes.
A total of 109 IPS repeats were detected. (IPR001611) Leucine-rich repeat matched with the largest number of unigenes (42; 38.53%), followed by (IPR003591) Leucine-rich repeat, typical subtype (27; 24.77%). Five IPS sites were detected: (IPR000048) IQ motif, EF-hand binding site (6 unigenes), (IPR000047) Helix-turn-helix motif (2 unigenes), (IPR008918) Helix-hairpin-helix motif, class 2 (2 unigenes), (IPR017956) AT hook, DNA-binding motif (1 unigene), and (IPR018467) CO/COL/TOC1, conserved site (1 unigene).
Of the three core GO annotation categories, biological processes (BP) comprised 36.94% of the total assigned annotations, whereas molecular functions (MF) and cellular components (CC) comprised 36.79% and 26.27%, respectively. The GO terms with the largest number of assigned unigenes in the biological process (BP) category were biosynthetic process (201; 13.99%), cellular protein modification process (139; 9.67%), cellular nitrogen compound metabolic process (103; 7.17%), carbohydrate metabolic process (89; 6.19%), and response to stress (77; 5.36%) (Figure
GO classification of the downregulated genes detected in leaf tissues of the guar accession under heat stress.
The KEGG pathways-based analysis indicated that 310 (21.13%) unigenes of the 1467 downregulated unigenes under heat stress obtained hits in the KEGG database, and those unigenes were associated with 183 enzymes and 100 KEGG pathways (Supporting Table
The top 20 KEGG pathways assignments in heat stress-responsive downregulated unigenes. The number of unigenes predicted to belong to each category is shown.
The 183 enzymes were further categorized into 6 main classes. As illustrated in Figure
Enzyme code distribution of heat stress-responsive downregulated unigenes.
A total of 76 downregulated TFs were identified from the DEGs. These downregulated TFs belong to 27 TF families. The bHLH family accounted for the most TFs (14), followed by GRAS (9), MYB (6), C2H2 (5), HD-ZIP (4), and MYB_related (4); the remaining families each included three TFs or fewer (Figure
Distribution of downregulated transcription factors (TFs) under heat stress.
Heat stress influences plant growth and development and can reduce crop yield [
RNA-Seq is a robust technology that has been used to obtain genome-wide estimates of the relative expression of genes, as well as to identify the genes, hormones, and processes that participate in the response of leguminous plants to heat stress such as
The differential expression analysis of the RNA-seq data showed that cutting the hierarchically clustered gene tree at 60% of its height partitioned the genes into two clusters, representing 1551 upregulated and 1466 downregulated DEGs. Our results fall within the range reported in previous studies of heat stress-responsive DEGs. For instance, the range of DEGs varied from 607 [
Gene ontology (GO) is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species [
InterProScan (IPS) is a protein sequence analysis and classification tool that allows sequences (protein and nucleic) to be scanned against InterPro’s signatures. Our collection of heat-responsive DEGs was searched against the 14 databases of the InterPro consortium. Among the upregulated DEGs, we found 301 protein families, 551 domains, 15 repeats, and 3 sites. In contrast, the downregulated DEGs matched 353 families, 647 domains, 109 repeats, and 5 sites. Cytochrome P450 enzymes are a superfamily of haem-containing monooxygenases that are found in all kingdoms of life [
Heat shock proteins (HSPs) are a group of heat shock-induced proteins found in virtually every living organism, from bacteria to humans [
Under heat stress, the upregulated DEGs also matched a considerable number of heat shock protein (HSP) domains, including (IPR020575) Heat shock protein Hsp90, N-terminal (7 unigenes), (IPR029047) Heat shock protein 70kD, peptide-binding domain (6 unigenes), (IPR000232) Heat shock factor (HSF)-type, DNA-binding (6 unigenes), (IPR029048) Heat shock protein 70kD, C-terminal domain (5 unigenes), (IPR001305) Heat shock protein DnaJ, cysteine-rich domain (2 unigenes), and (IPR006636) Heat shock chaperonin-binding (1 unigene). Two upregulated DEGs in our collection matched the activator of Hsp90 ATPase homologue family. This family includes eukaryotic, prokaryotic, and archaeal proteins that are similar to the C-terminal region of the human activator of 90 kDa heat shock protein ATPase homologue 1 (AHSA1/p38, O95433). This protein is reported to interact with the middle domain of Hsp90 and enhance its ATPase activity [
Ethylene is an endogenous plant hormone that influences many aspects of plant growth and development. Some defense-related genes triggered by ethylene include a cis-regulatory element identified as the Ethylene-Responsive Element (ERE) [
The UDP-glucuronosyl/UDP-glucosyltransferase family includes plant flavonol 3-O-glucosyltransferase (EC 2.4.1.91), an enzyme that catalyzes the transfer of glucose from UDP-glucose to a flavonol. This reaction is essential and is one of the last steps in anthocyanin pigment biosynthesis [
The light-harvesting complex (LHC) consists of chlorophylls A and B and the chlorophyll A-B binding protein [
Transcription factors (TFs) (or sequence-specific DNA-binding factor) are proteins that control the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence [
In the current study, we investigated the genes that were differentially expressed in response to heat stress and their metabolic pathways. Our results uncovered 1551 upregulated and 1466 downregulated differentially expressed genes responsive to heat stress. Of those, 200 upregulated and 72 downregulated genes could be considered new genes exclusive to guar that respond to heat stress. Cytochrome P450, small heat shock protein HSP20, heat shock transcription factor family, heat shock protein Hsp90 family, and heat shock protein 70 family were the most upregulated protein families. Heat shock factor 4, heat shock transcription factor A6B, heat shock transcription factor B3, heat shock transcription factor A4A, heat shock transcription factor B2A, and heat shock factor 6 were upregulated in response to heat stress. The resulting data will help in understanding the molecular behaviour of plants under heat stress. The new putative genes and the membrane genes may be useful for future research.
The raw sequence data have been deposited in the NCBI Sequence Read Archive (SRA) under accession numbers SRR10120601, SRR10120602, SRR10120603, SRR10120610, SRR10120611, and SRR10120612.
The authors declare that there is no conflict of interest regarding the publication of this paper.
Researchers Supporting Project number (RSP-2020/73), King Saud University, Riyadh, Saudi Arabia.
Supporting Figure 1: pairwise comparisons of replicate log (CPM) values in the control (GC). Supporting Figure 2: pairwise MA plots in control (GC). Supporting Figure 3: the sum of mapped genes in the control (GC). Supporting Figure 4: a replicate Pearson correlation heatmap in the control (GC). Supporting Figure 5: pairwise comparisons of replicate log (CPM) values in the heat stress (GH). Supporting Figure 6: pairwise MA plots in heat stress (HC). Supporting Figure 7: the sum of mapped genes in the heat stress (GH). Supporting Figure 8: a replicate Pearson correlation heatmap in heat stress (GH). Supporting Figure 9: sample correlation matrix between the control (GC) and heat stress (GH). Supporting Figure 10: principal component analysis of the control (GC) and heat stress (GH). Supporting Figure 11: distribution of genes according to fold-change (FC), counts, and FDR. Supporting Figure 12: the clustered heatmap showing the correlation matrix of DEGs between the heat stress and the control at