Fuchs endothelial corneal dystrophy (FECD) is a rare genetic disease of the cornea that can cause vision problems such as blurred vision, corneal opacification, and reduced visual acuity [
Corneal endothelial cells form a monolayer of hexagonal cells that lies at the back of the cornea. This layer is responsible for maintaining corneal stromal dehydration and transparency [
FECD is a progressive disease that can be clinically divided into several stages. During disease development, altered or lost endothelial cell function causes excess fluid to accumulate in the corneal stroma, a condition called corneal edema [
Medical management of FECD patients has traditionally relied on ointments or hypertonic saline [
To date, no pharmaceutical therapy for FECD has been approved by the US Food and Drug Administration. The drug candidates investigated in this study could provide new therapeutic options for managing FECD patients in the future. Given an understanding of how the disease develops, nonsurgical therapy targeting the cell signaling pathways that underlie disease progression may be a promising strategy.
Gene expression analysis using microarrays or RNA sequencing has revealed a distinct gene expression profile, with genes differentially expressed between FECD patients and controls. Several studies using gene expression analysis have examined the molecular mechanisms of FECD [
Raw RNA-seq data for the GEO dataset (GSE101872) were downloaded from the SRA database (
Principal component analysis (PCA) was performed using the PCA function from the sklearn Python module. Prior to PCA, the raw gene counts were normalized using the logCPM method, filtered to the 2,500 genes with the most variable expression, and finally transformed using the Z-score method. The gene expression signature was created by comparing gene expression levels between the control group and the experimental group using the limma R package [
Clustergrammer was used to produce the heatmap [
Gene fold changes were transformed using log2 and displayed on the
By analyzing the upregulated and downregulated gene sets using Enrichr, enrichment results were generated with GO Biological Process 2018 [
By analyzing the upregulated and downregulated gene sets using Enrichr, enrichment results were generated [
Transcription factors (TFs) are proteins involved in the transcriptional regulation of gene expression. A large number of associations between TFs and their transcriptional targets are found in databases such as ChEA and ENCODE. Enrichr uses these data to identify the transcription factors whose targets are overrepresented in the upregulated and downregulated genes found by comparing two groups of samples.
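The idea of overrepresentation can be illustrated with a one-sided Fisher's exact (hypergeometric) test, which asks whether a query gene set shares more genes with a TF's target set than expected by chance. The sketch below is a minimal illustration under stated assumptions, not Enrichr's own code: the function name, the gene identifiers, and the background size are hypothetical.

```python
from math import comb


def overrepresentation_p(query_genes, target_genes, background_size):
    """One-sided hypergeometric p-value for the overlap of two gene sets.

    query_genes     -- e.g. the upregulated genes from a differential analysis
    target_genes    -- e.g. the known targets of one transcription factor
    background_size -- number of genes that could have been drawn (assumed)
    Returns (overlap, p_value), where p_value = P(overlap >= observed).
    """
    query = set(query_genes)
    targets = set(target_genes)
    overlap = len(query & targets)
    n_query, n_targets = len(query), len(targets)

    total = comb(background_size, n_query)
    upper = min(n_query, n_targets)
    tail = sum(
        comb(n_targets, i) * comb(background_size - n_targets, n_query - i)
        for i in range(overlap, upper + 1)
    )
    return overlap, tail / total


# Hypothetical toy example: 10 background genes, 4 TF targets, 3 query genes.
tf_targets = {"G1", "G2", "G3", "G4"}
upregulated = {"G1", "G2", "G9"}
overlap, p_value = overrepresentation_p(upregulated, tf_targets, background_size=10)
print(overlap, p_value)
```

In practice the test is repeated for every TF in the library, and the resulting p-values are corrected for multiple testing before terms are ranked.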
L1000CDS2 is a web-based tool for querying gene expression signatures against signatures created from human cell lines treated with over 20,000 small molecules and drugs for the LINCS project. The L1000CDS2 analysis [
RNA-seq data from the corneal endothelia of Fuchs endothelial corneal dystrophy patients and controls revealed several differentially expressed pathways. The GEO dataset GSE101872 was loaded and analyzed with BioJupies. Expression data were quantified as gene-level counts using the ARCHS4 pipeline [
Sample metadata.
Sample_geo_accession | Sample title | Subject status | Tissue |
---|---|---|---|
GSM2717439 | 2011–020 (FECD with expansion) | Fuchs endothelial corneal dystrophy | Corneal endothelium |
GSM2717440 | 2011–024 (FECD with expansion) | Fuchs endothelial corneal dystrophy | Corneal endothelium |
GSM2717441 | 2011–038 (FECD with expansion) | Fuchs endothelial corneal dystrophy | Corneal endothelium |
GSM2717442 | 2011–041 (FECD with expansion) | Fuchs endothelial corneal dystrophy | Corneal endothelium |
GSM2717443 | 6004 (FECD with expansion) | Fuchs endothelial corneal dystrophy | Corneal endothelium |
GSM2717444 | Control 1 | Control | Corneal endothelium |
GSM2717445 | Control 2 | Control | Corneal endothelium |
The table displays the metadata associated with the samples in the RNA-seq dataset. Rows represent RNA-seq samples, and columns represent metadata categories.
BioJupies is a web-based Jupyter notebook platform (Python-based) that integrates several analysis components, such as Principal Component Analysis (PCA), clustering, volcano plots, and Gene Ontology enrichment. PCA is commonly used to explore the similarity of biological samples in RNA-seq datasets. To achieve this, gene expression values are transformed into Principal Components (PCs), a set of linearly uncorrelated features that represent the most relevant sources of variance in the data, and subsequently visualized using a 3-dimensional scatter plot. In Figure
Principal Component Analysis of RNA-seq data from FECD patients. The 3-dimensional figure displays a scatter plot of the first three Principal Components (PCs) of the data. Each point represents an RNA-seq sample. Samples are clustered in the three-dimensional space based on their similar gene expression profile.
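The preprocessing used for this PCA (logCPM normalization, selection of the 2,500 most variable genes, Z-score transformation, then projection onto the first three PCs) can be sketched as follows. This is a minimal illustration, not the BioJupies implementation: it uses NumPy's SVD as a stand-in for the sklearn PCA function, the log base and the pseudocount of 1 are assumptions, and the count matrix is simulated rather than taken from GSE101872.

```python
import numpy as np


def log_cpm(counts):
    """Convert raw counts (genes x samples) to log2(CPM + 1).

    The pseudocount and log base are assumptions for this sketch.
    """
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0, keepdims=True)
    return np.log2(counts / library_sizes * 1e6 + 1.0)


def top_variable_genes(expr, n_genes=2500):
    """Keep the n_genes rows with the largest variance across samples."""
    order = np.argsort(expr.var(axis=1))[::-1]
    return expr[order[:n_genes]]


def zscore_rows(expr):
    """Z-score each gene (row) across samples; constant rows become zeros."""
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    std[std == 0] = 1.0
    return (expr - mean) / std


def pca_scores(expr, n_components=3):
    """Project samples onto the leading principal components via SVD."""
    x = expr.T                                # samples x genes
    x = x - x.mean(axis=0, keepdims=True)     # center each gene
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return u[:, :n_components] * s[:n_components], explained[:n_components]


# Simulated counts: 5,000 genes x 7 samples (5 "FECD" columns, 2 "control").
rng = np.random.default_rng(0)
counts = rng.poisson(lam=50, size=(5000, 7))
counts[:200, :5] *= 4  # hypothetical genes elevated in the first five samples

expr = zscore_rows(top_variable_genes(log_cpm(counts)))
scores, explained = pca_scores(expr)
print(scores.shape, explained)
```

Each row of `scores` gives the coordinates of one sample in the three-dimensional scatter plot, and `explained` gives the fraction of variance captured by each PC.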
In addition to identifying clusters of samples, the clustered heatmap also makes it possible to identify the genes that contribute to the clustering. In Figure
Heatmap visualization of RNA-seq data from FECD patients. The figure shows a heatmap of gene expression for each sample in the RNA-seq dataset. Each row of the heatmap represents a gene, each column represents a sample, and each cell displays a normalized gene expression value. Blue indicates low expression and red indicates high expression.
Volcano plots can be used to quickly identify genes whose expression is significantly altered by a perturbation and to assess the global similarity of gene expression between control and FECD samples. Each point in the scatter plot represents a gene; the axes display significance versus fold change, as estimated by the differential expression analysis. From our analysis in the volcano plot (Figure
Volcano plot display of differentially expressed genes. The figure shows a scatter plot of the log2 fold change and statistical significance of each gene, as determined by the differential gene expression analysis. Each point in the plot represents a gene. Red points indicate upregulated genes, and blue points indicate downregulated genes.
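The coordinates and colors of such a plot can be derived directly from a differential expression table. The sketch below is a minimal illustration using only the Python standard library. The cutoffs (absolute log2 fold change of at least 1, p-value below 0.05), the function names, and the example genes and values are all hypothetical and are not taken from the study.

```python
import math

COLORS = {"up": "red", "down": "blue", "ns": "grey"}


def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as up, down, or not significant (cutoffs are assumed)."""
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "ns"


def volcano_points(results):
    """Turn (gene, log2_fc, p_value) rows into plot-ready points.

    x is the log2 fold change; y is -log10(p), so more significant
    genes appear higher on the plot.
    """
    points = []
    for gene, log2_fc, p_value in results:
        label = classify_gene(log2_fc, p_value)
        points.append({
            "gene": gene,
            "x": log2_fc,
            "y": -math.log10(p_value),
            "color": COLORS[label],
        })
    return points


# Hypothetical differential expression rows: (gene, log2 fold change, p-value).
example = [
    ("GENE_A", 2.5, 0.001),
    ("GENE_B", -1.8, 0.01),
    ("GENE_C", 0.3, 0.60),
    ("GENE_D", 3.0, 0.20),
]
points = volcano_points(example)
for point in points:
    print(point["gene"], point["x"], round(point["y"], 2), point["color"])
```

Note that a large fold change alone is not enough to color a point: `GENE_D` stays grey because its p-value does not pass the significance cutoff.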
Gene Ontology (GO) contains a large collection of experimentally validated and predicted associations between genes and biological terms. This information can be leveraged by Enrichr to identify the biological processes, molecular functions, and cellular components which are overrepresented in the upregulated and downregulated genes identified by comparing two groups of samples, control and FECD patients. In Figure
Gene Ontology (GO) enrichment analysis of genes within biological categories. The figure contains bar charts showing the results of the Gene Ontology enrichment analysis generated using Enrichr. For each term, the
Biological pathway databases such as KEGG, Reactome, and WikiPathways contain many associations between such pathways and genes. This information can be leveraged by Enrichr to identify the biological pathways which are overrepresented in the upregulated and downregulated genes identified by comparing two groups of samples. In Figure
Pathway enrichment analysis identifying significantly impacted pathways. The enrichment results are displayed as bar charts summarizing the enriched terms, generated using Enrichr. For each term, the
Transcription factor (TF) databases such as ChEA and ENCODE contain a large number of associations between TFs and their transcriptional targets. This information can be leveraged by Enrichr to identify the transcription factors whose targets are overrepresented in the upregulated and downregulated genes identified by comparing two groups of samples from controls and FECD (Supplementary Table
Protein kinases are enzymes that modify other proteins by chemically adding phosphate groups. Databases such as KEA contain a large number of associations between kinases and their substrates. This information can be leveraged by Enrichr to identify the protein kinases whose substrates are overrepresented in the upregulated and downregulated genes identified by comparing two groups of samples from controls and FECD patients (Supplementary Table
We explored the database L1000CDS2 to find drugs that can reverse disease phenotypes. L1000CDS2 is a web-based tool for querying gene expression signatures against signatures created from human cell lines treated with over 20,000 small molecules and drugs for the LINCS project. It is commonly used to identify small molecules which mimic or reverse the effects of a gene expression signature generated from a differential gene expression analysis. In Figure
L1000CDS2 identifies drug candidates that reverse the differential expression signature. The figure contains a bar chart showing the top small molecules found by the L1000CDS2 query. The left panel shows the small molecules that mimic the observed gene expression signature, while the right panel shows the small molecules that reverse it.
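The distinction between mimicking and reversing a signature can be made concrete with a simple gene-set overlap score. The sketch below is a simplified illustration of the concept, not the L1000CDS2 scoring code: the function name, the compound names, and all gene sets are hypothetical.

```python
def overlap_scores(disease_up, disease_down, drug_up, drug_down):
    """Score how strongly a drug signature mimics or reverses a disease signature.

    mimic   -- fraction of disease genes changed in the same direction by the drug
    reverse -- fraction of disease genes changed in the opposite direction
    """
    total = len(disease_up) + len(disease_down)
    if total == 0:
        raise ValueError("disease signature is empty")
    mimic = (len(disease_up & drug_up) + len(disease_down & drug_down)) / total
    reverse = (len(disease_up & drug_down) + len(disease_down & drug_up)) / total
    return mimic, reverse


# Hypothetical disease signature and two hypothetical compound signatures.
disease_up = {"A", "B", "C"}
disease_down = {"D", "E"}
compounds = {
    "compound_1": ({"D", "E"}, {"A", "B"}),  # (genes up, genes down)
    "compound_2": ({"A"}, {"E"}),
}

results = {
    name: overlap_scores(disease_up, disease_down, up, down)
    for name, (up, down) in compounds.items()
}

# Rank compounds by how strongly they reverse the disease signature.
ranked = sorted(results, key=lambda name: results[name][1], reverse=True)
print(ranked, results)
```

Here `compound_1` ranks first as a candidate reverser because it pushes four of the five disease genes in the opposite direction, whereas `compound_2` only mimics the disease signature.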
The cornea is generally considered an “immune privileged” site, able to tolerate foreign antigens without eliciting an inflammatory immune response [
From our RNA-seq analysis, we found that pathways related to glycolysis and the neuronal system were downregulated. The loss of glycolysis can be partially explained by a decrease in general metabolic activity. Corneal endothelial cells are responsible for maintaining corneal homeostasis by pumping water osmotically from the corneal stroma into the aqueous humor [
To find new interventions for FECD, several strategies can be used to reverse disease phenotypes. First, we can either inhibit immune response mediators such as Toll-like receptor 4 (TLR4) and interferon-gamma (IFN-γ) [
Raw RNA-seq data for GEO dataset GSE101872 were downloaded from the SRA database (
The authors declare that they have no conflicts of interest.
This work was supported in part by the Science, Technology, and Innovation Commission of Shenzhen Municipality (Grant no. GJHZ20180420180937076) and Sanming Project of Medicine in Shenzhen (Grant no. SZSM201812090).
Supplementary Table 1: differential expression table. The table contains the gene expression signature generated from a differential gene expression analysis. Every row of the table represents a gene; the columns display the estimated measures of differential expression.
Supplementary Table 2: transcription factor enrichment analysis results. The table displays the results of the transcription factor (TF) enrichment analysis generated using ENCODE libraries, indicating TFs whose experimentally validated targets are enriched.
Supplementary Table 3: kinase enrichment analysis results. The table displays the results of the protein kinase (PK) enrichment analysis generated using the ARCHS4 library, indicating PKs whose top coexpressed genes (according to the ARCHS4 dataset) are enriched.