A large number of EBV immortalized LCLs have been generated and maintained in genetic/epidemiological studies as a perpetual source of DNA and as a surrogate
Epstein-Barr virus (EBV) immortalized lymphoblastoid cell lines (LCLs) have been routinely used as surrogate
The molecular events leading to the maintenance of pluripotency in embryonic stem cells (ESCs) and reacquisition of a stem-like state in iPSCs during somatic reprogramming represent mechanistically distinct processes that converge on a set of remarkably similar transcriptional events that underpin the pluripotent state [
Although some success in developing a more efficient LCL-to-iPSC reprogramming protocol has been achieved [
In this study, we sequenced the miRNome (microRNA/miRNA) and transcriptome (mRNA) of six LCLs and their reprogrammed iPSCs. We analyzed these datasets to identify the functional changes in global gene expression that occur during LCL-to-iPSC reprogramming. A differential gene expression analysis was performed between LCLs and their reprogrammed iPSCs, in combination with functional annotation and Ingenuity® Pathway Analysis (IPA).
We have a rich resource of LCLs established from peripheral blood mononuclear cells (PBMCs) collected from more than 1400 Mexican American participants of our San Antonio Family Heart Study (SAFHS). Whole genome sequence data and extensive phenotype data for common complex human diseases are available for most of these SAFHS participants. Our large, well-characterized LCL resource provides a unique opportunity to generate pluripotent stem cells from any of these individuals in the context of their own genetic identity for disease modeling, particularly by differentiating the generated iPSCs into specific cell or tissue types to experimentally test hypotheses developed by statistical genetics methods.
The six human lymphoblastoid cell lines used in this study were previously established
The six LCLs were thawed from the SAFHS cell line repository. The thawed cell lines were cultured in RPMI 1640 complete media (i.e., RPMI 1640 media containing 15% heat inactivated fetal bovine serum, 1% MEM nonessential amino acids, 1 mM sodium pyruvate, and 10 mM HEPES buffer, all from Life Technologies) at 37°C, 5% CO2, and atmospheric O2 for 1-2 passages to obtain the appropriate number of viable cells.
On day 0, that is, 24 hours before nucleofection, LCL cultures were split at a 1:2 ratio to keep the LCLs in log growth phase. On day 1, about one million cells were nucleofected with 2.5
LCL-to-iPSC reprogramming and characterization. (a) Schematic diagram of LCL-to-iPSC reprogramming. (b) Morphology of a reprogrammed iPSC colony at 5x, 10x, and 40x original magnifications, respectively. (c) Immunocytochemistry analysis of generated iPSCs showing expression of pluripotency markers. (d) The graphs showing gene expression of core pluripotency markers in LCLs and their reprogrammed iPSCs. (e) PCR analysis of genomic DNA confirms no integration or retention of plasmid genome/transgene in the LCL reprogrammed iPSCs at passages 17–20. (f) Image showing immunocytochemistry analysis of the cells of three embryonic germ layers differentiated from reprogrammed iPSCs using monolayer differentiation protocol. (g) Image showing normal karyotype of an iPSC line. Karyotype analyses of each reprogrammed iPSC line were found to be normal. (h) The differential gene expression graph showing significant downregulation of LCL specific genes.
Using this newly developed protocol, we achieved a consistently higher reprogramming efficiency (~50–200 colonies per million nucleofected cells) than the previously published LCL-to-iPSC reprogramming protocols. This enabled us to downsize our reprogramming experiments to approximately a third of the original scale (i.e., from ~1 million nucleofected cells in a 6-well format to only ~0.3 million cells in a 2-culture-well format), thereby considerably reducing the cost of reprogramming media and other culture reagents. This constitutes a significant step towards the use of this technology in modeling human disease at a population scale.
The reprogrammed iPSC lines were confirmed by immunocytochemistry (Figure
Total cellular and plasmid DNA from snap-frozen cell pellets (~5 × 10^6 cells) of the six stable reprogrammed iPSC lines was isolated using the DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer’s instructions. PCR to detect episomal plasmid DNA was performed for 30 cycles using primer pairs 5′-GGCGTAATCATGGTCATAGC-3′ and 5′-ACGACAGGTTTCCCGACT-3′ and Maxima Hot Start master mix (Thermo Scientific) following the manufacturer’s instructions. The PCR primers were designed to amplify a region common to all episomal plasmids used in our LCL-to-iPSC reprogramming method but not complementary to the human genome. Purified plasmid DNA was used as positive control. A genomic DNA fragment from a human single copy gene (
RNA was extracted from cell pellets (~5 × 10^6 cells) snap-frozen from LCLs immediately before nucleofection with reprogramming factors and from their stable reprogrammed iPSC lines. Total RNA was extracted from these LCL and iPSC lines (six each) using TRIzol reagent (Life Technologies) following the manufacturer’s protocol with minor modifications. RNA quality and quantity were assessed using a NanoDrop 2000 Spectrophotometer (Thermo Scientific) and an Agilent 2100 Bioanalyzer (Agilent Technologies).
The Illumina® TruSeq® Small RNA Sample Preparation Kit was used to prepare small RNA sequencing libraries from 1
The Illumina TruSeq RNA Sample Preparation Kit v2 was used to prepare cDNA sequencing libraries from 1
Raw fastq sequence files were generated and demultiplexed using the Illumina CASAVA v1.8 pipeline. After prealignment QC, sequences were aligned to human genome build 19 (hg19) and mapped to UCSC transcripts using Strand NGS software v2.1 (Strand Genomics Inc.) with default settings. The small RNA reads were also mapped to small RNA annotations as implemented in Strand NGS v2.1 (Strand Genomics Inc.). The aligned reads were then filtered on read quality metrics (i.e., quality threshold ≥ 20; N’s allowed in read ≤ 1; mapping quality threshold ≥ 40; read length ≥ 20) so that only good alignments were retained, and quantification was then performed. The expression values (read counts) were log transformed and “DESeq” normalization was applied. Only known mRNAs and miRNAs with a normalized read count (NRC) ≥ 20 in all samples of at least one of the two cell types (i.e., LCLs, iPSCs, or both) were selected for differential expression analysis.
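The expression filter described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (NRC ≥ 20 in all samples of at least one cell type), not the Strand NGS implementation; the function name, gene names, and read counts are hypothetical. The sketch also computes log2(20), on the assumption that the log transformation is base 2.

```python
import math

NRC_THRESHOLD = 20  # normalized read count cutoff stated in the text


def passes_expression_filter(lcl_counts, ipsc_counts, threshold=NRC_THRESHOLD):
    """Keep a gene if NRC >= threshold in ALL samples of at least one cell type."""
    in_all_lcls = all(c >= threshold for c in lcl_counts)
    in_all_ipscs = all(c >= threshold for c in ipsc_counts)
    return in_all_lcls or in_all_ipscs


# Hypothetical normalized read counts (six LCLs, six iPSCs), for illustration only.
example = {
    "gene_A": ([150, 210, 180, 175, 190, 160], [2, 0, 5, 1, 3, 0]),       # LCL-specific
    "gene_B": ([25, 30, 19, 40, 22, 28], [10, 12, 8, 15, 9, 11]),         # one LCL below cutoff
    "gene_C": ([0, 1, 0, 2, 0, 1], [55, 60, 48, 70, 52, 66]),             # iPSC-specific
}

kept = [gene for gene, (lcl, ipsc) in example.items()
        if passes_expression_filter(lcl, ipsc)]

# Cutoff on a log2 scale (assumed base), rounded to four decimals.
log2_cutoff = round(math.log2(NRC_THRESHOLD), 4)

print(kept, log2_cutoff)
```

Requiring the threshold in every sample of one cell type, rather than in every sample overall, keeps genes that are switched on in only one of the two cell types, which are the genes of most interest in a reprogramming comparison.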
For both miRNA-seq and mRNA-seq datasets, we performed moderated
To identify biological functions that were most significant to our dataset, functional annotation enrichment analysis of significantly DE mRNAs and miRNAs was performed using IPA. Right-tailed Fisher’s exact test
Given the potential utility of the extensive LCL bioresource, we attempted reprogramming six LCLs from our SAFHS Mexican American participants into iPSCs using two previously published methods that demonstrated successful reprogramming of LCLs [
iPSC induction efficiency from six LCL lines.
Cell line | Cell number nucleofected | Reprogramming efficiency (%): EN2L + ET2K + EM2K after Choi et al. [ | Reprogramming efficiency (%): OSNK + OSTK + L-mL after Rajesh et al. [ | Reprogramming efficiency (%): our optimized protocol (see Materials and Methods) |
---|---|---|---|---|
LCL-1 | 1 × 10^6 | 0.0 | 0.0003 | 0.0102 |
LCL-2 | 1 × 10^6 | 0.0 | 0.0 | 0.0204 |
LCL-3 | 1 × 10^6 | 0.0 | 0.0001 | 0.0054 |
LCL-4 | 1 × 10^6 | 0.0 | 0.0 | 0.0108 |
LCL-5 | 1 × 10^6 | 0.0 | 0.0 | 0.0132 |
LCL-6 | 1 × 10^6 | 0.0 | 0.0 | 0.0216 |
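To relate the percentages in the table above to colony counts, the short sketch below converts each efficiency for the optimized protocol into colonies per million nucleofected cells. It assumes that reprogramming efficiency is defined as colonies obtained divided by cells nucleofected, times 100; that definition is an assumption here, and the helper name is hypothetical.

```python
def colonies_per_million(efficiency_percent):
    """Convert an efficiency in percent to colonies per 1e6 nucleofected cells.

    Assumes efficiency (%) = colonies / cells nucleofected * 100.
    """
    return round(efficiency_percent / 100 * 1_000_000)


# Efficiencies (%) for the optimized protocol, copied from the table.
optimized_efficiency = {
    "LCL-1": 0.0102,
    "LCL-2": 0.0204,
    "LCL-3": 0.0054,
    "LCL-4": 0.0108,
    "LCL-5": 0.0132,
    "LCL-6": 0.0216,
}

colonies = {line: colonies_per_million(e) for line, e in optimized_efficiency.items()}
lowest, highest = min(colonies.values()), max(colonies.values())

print(colonies)
print(lowest, highest)
```

Under this assumption the six lines span 54 to 216 colonies per million cells, which is in line with the range of roughly 50–200 colonies per million quoted in Materials and Methods.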
LCLs show high expression of the B-cell activation markers (FCER2/CD23, CD70, TNFRSF8/CD30, and ENTPD1/CD39) and cellular adhesion molecules (ITGAL/CD11a, LFA3/CD58, and ICAM1/CD54) [
To investigate the mechanistic gene expression changes that occurred during LCL-to-iPSC reprogramming, we performed a parallel genome-wide miRNA and mRNA expression analysis in six LCLs and their reprogrammed iPSCs. A total of 5.5 and 8.3 million small RNA 40 bp single-end reads and 28.4 and 29.9 million mRNA 100 bp paired-end reads were obtained for LCLs and their reprogrammed iPSCs, respectively, from 24 cDNA libraries (12 each for small RNA and mRNA) sequenced on an Illumina HiSeq 2500 platform. Only known miRNA and mRNA genes/transcripts with NRC ≥ 20 (i.e., normalized value ≥ 4.3219 on
Reproducibility of the expressed transcriptomic profiles in the biological replicates of each cell type was evaluated by calculating the correlation coefficient (
Differential gene expression statistics in LCL-to-iPSC reprogramming. (a) Expressed genes (NRC ≥ 20) correlation coefficient (
To identify the unique transcriptomic signature of the LCL-to-iPSC reprogramming, we performed moderated
Principal component analysis (PCA) of differentially expressed mRNAs and miRNAs during LCL-to-iPSC reprogramming is shown in Figure
Principal component analysis (PCA) based on DE genes. (a) DE mRNAs during LCL-to-iPSC reprogramming. (b) DE miRNAs during LCL-to-iPSC reprogramming.
These results suggest discrete and uniform resetting of both mRNA and miRNA expression during iPSC reprogramming, each cell type expressing a unique set of genes and miRNAs.
Hierarchical clustering based on the expression profiles of significantly DE genes and miRNAs during LCL-to-iPSC reprogramming is shown in Figure
Gene expression patterns in LCLs, reprogrammed iPSCs, and ESCs. (a) Expression pattern of all DE mRNAs and miRNAs in LCLs and their reprogrammed iPSCs. (b) Expression profiles of LCL specific mRNAs and miRNAs in LCLs and their reprogrammed iPSCs. (c) Expression profiles of mRNAs and miRNAs known from the literature to be involved in maintenance of pluripotency and stemness in human ESC/iPSCs. (d) PCA on gene expression profiles of six LCLs and their reprogrammed iPSCs generated in this study and three ESC and four iPSC gene expression profiles downloaded from the GEO database.
During
The genes and miRNAs expected to be enriched in iPSCs/ESCs, from the literature [
To further assess and compare the gene expression profile of our LCL reprogrammed iPSCs with other ESC and iPSC line gene expression profiles, we downloaded whole genome RNA sequencing data of three ESC lines, that is, GSM1888661 (H9ESC), GSM1888664 (HUES1), and GSM1888680 (HUES3), and four iPSC lines, that is, GSM1888662 (iPS11b), GSM1888660 (iPS15b), GSM1888679 (iPS18c), and GSM1888663 (iPS20b), from GEO public database submitted by Choi et al. [
To better understand what biological functions were affected and how these were affected by differentially expressed mRNAs and miRNAs during LCL-to-iPSC reprogramming, we performed functional annotation enrichment analysis of downregulated and upregulated mRNAs and miRNAs using IPA. The significantly enriched (FDR ≤ 0.001) functions that were also either significantly upregulated (activation
Graphical presentation of the top 15 upregulated and downregulated cellular functions found to be enriched during LCL-to-iPSC reprogramming.
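The enrichment significance behind such a ranking rests on a right-tailed Fisher's exact test, as noted in the methods. The sketch below shows the generic form of that test as a hypergeometric upper tail, using only the Python standard library. It is not IPA's implementation, whose reference set and internal details are not described here; the function name and all counts in the example are hypothetical.

```python
from math import comb


def fisher_right_tailed(k, n, K, N):
    """Right-tailed Fisher's exact p-value, P(X >= k).

    N: genes in the reference set
    K: reference genes annotated to the function
    n: differentially expressed (DE) genes
    k: DE genes annotated to the function
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum hypergeometric probabilities for overlaps of k or more genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical toy numbers: 20 reference genes, 5 annotated, 5 DE, 3 in the overlap.
p_value = fisher_right_tailed(k=3, n=5, K=5, N=20)
print(p_value)
```

A small p-value indicates that the overlap between the DE genes and the annotated genes is larger than expected by chance; across many functions, such p-values would then be corrected for multiple testing before applying an FDR cutoff.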
Further, we explored the effects of iPSC reprogramming on key LCL and human iPSC related canonical pathways using the IPA platform.
Previous studies of the molecular genetics and pathogenesis of EBV induced B-cell growth support a model where EBV encoded nuclear antigens (EBNA1, EBNA2, and EBNA3A-C) and integral membrane proteins (LMP1 and LMP2) utilize intrinsic B-cell receptor (BCR) signaling pathways to support rapid growth and survival of latently infected B-cells/LCLs [
The EBV principal oncoprotein LMP1 along with LMP2A mimics CD40 and B-cell receptor (BCR) signaling, respectively, and activates NF-
Diagram showing key LCL specific canonical pathways. (a) Pathway genes expressed in LCLs. (b) Differential expression of pathway genes during LCL-to-iPSC reprogramming.
Human iPSCs have similar properties to human ESCs (hESCs), such as self-renewal and differentiation capacity [
Diagram showing key human pluripotency pathways in ESCs/iPSCs. (a) Pathway genes expressed in reprogrammed iPSCs. (b) Differential expression of pathway genes during LCL-to-iPSC reprogramming.
In contrast to TGF-
Wnt/
Our reprogrammed iPSCs also showed evidence of activation of the S1P signaling pathway (Figure
Because the regulation of a large number of genes was affected by iPSC reprogramming (42.4% of the total expressed genes), we investigated whether the gene expression pattern specific to the donor’s genetic relationships and disease state was recovered in the process. We performed hierarchical clustering analysis and PCA using data on all 12,325 mRNAs detected as expressed in LCLs and their reprogrammed iPSCs. The LCLs failed to cluster consistently by the genetic relationships of their donors or by disease state (Figure
Clustering properties of LCLs and their reprogrammed iPSCs. (a) Hierarchical clustering analysis and PCA in LCLs based on all genes detected as expressed during LCL-to-iPSC reprogramming. (b) Hierarchical clustering analysis and PCA in generated iPSCs based on all genes detected as expressed during LCL-to-iPSC reprogramming. #The donors of LCL-1 and LCL-3 were diagnosed with sporadic Parkinson’s disease. The donors of LCL-2, LCL-4, LCL-5, and LCL-6 were healthy.
An efficient and reproducible LCL-to-iPSC reprogramming method is essential if existing LCL bioresources are to be used in iPSC based disease modeling. Here, we describe a MEF feeder-free protocol for efficient and reproducible reprogramming of LCLs into iPSCs using publicly available plasmids and commercially available media. In addition, our comprehensive analysis of the genome-wide miRNome and transcriptome of LCLs and their reprogrammed iPSCs documents the differentially expressed genes and miRNAs, and their functional consequences, during LCL-to-iPSC reprogramming, which were previously unknown.
The datasets supporting the conclusions of this paper have been deposited in the Gene Expression Omnibus (GEO) archive under accession number GSE74289.
The authors declare that they have no competing interests.
Satish Kumar performed all the cell culture and reprogramming experiments. Satish Kumar and Joanne E. Curran performed whole genome small RNA and RNA sequencing. Satish Kumar performed differential gene expression analysis, functional annotation, and pathway analysis. Satish Kumar and Joanne E. Curran drafted the paper. John Blangero and David C. Glahn provided input into the analysis and helped to improve the paper for important intellectual content. Satish Kumar, Joanne E. Curran, and John Blangero conceived the study and participated in its design and coordination of the work. All authors read and approved the final paper.
The authors are grateful to the participants of the San Antonio Family Heart Study for their generous participation and cooperation. Data collection for the San Antonio Family Heart Study was supported by National Institutes of Health (NIH) Grant R01 HL045522.