In cellular physiology and signaling, reactive oxygen species (ROS) play one of the most critical roles. ROS overproduction leads to cellular oxidative stress. This may lead to an irrecoverable imbalance of redox (oxidation-reduction reaction) function that deregulates redox homeostasis, which itself could lead to several diseases including neurodegenerative disease, cardiovascular disease, and cancers. In this study, we focus on the redox effects related to vascular systems in mammals. To support research in this domain, we developed an online knowledge base, DES-RedoxVasc, which enables exploration of information contained in the biomedical scientific literature. The DES-RedoxVasc system analyzed 233399 documents consisting of PubMed abstracts and PubMed Central full-text articles related to different aspects of redox biology in vascular systems. It allows researchers to explore enriched concepts from 28 curated thematic dictionaries, as well as literature-derived potential associations of pairs of such enriched concepts, where associations themselves are statistically enriched. For example, the system allows exploration of associations of pathways, diseases, mutations, genes/proteins, miRNAs, long ncRNAs, toxins, drugs, biological processes, molecular functions, etc. that allow for insights about different aspects of redox effects and control of processes related to the vascular system. Moreover, we deliver case studies about some existing or possibly novel knowledge regarding redox of vascular biology demonstrating the usefulness of DES-RedoxVasc. DES-RedoxVasc is the first compiled knowledge base using text mining for the exploration of this topic.
Funding: King Abdullah University of Science and Technology (FCC/1/1976-24-01, BAS/1/1606-01-01); Ministarstvo Prosvete, Nauke i Tehnološkog Razvoja (173034, 173033)
1. Introduction
In cellular physiology and signaling, reactive oxygen species (ROS) are involved in various processes including cellular growth, gene expression, activation of signal transduction pathways, and induction of transcription factors in defense against infection [1–3]. In the vascular system, ROS play an important role in regulating endothelial function and vascular tone under physiological conditions [4]. However, ROS are also involved in pathophysiological processes such as inflammation, endothelial dysfunction, and vascular remodeling in cardiovascular diseases (CVD), including hypertension [5–8]. ROS are implicated in vascular pathophysiology, leading to atherosclerosis and arterial hypertension. Moreover, ROS-generating systems were found to facilitate diseases that promote vascular pathologies, such as hypercholesterolemia, diabetes mellitus, and obesity [9]. Within the cardiovascular system (CVS), ROS act as signaling molecules and facilitate cellular differentiation and growth, cell migration, inactivation of NO, protein phosphorylation, and extracellular matrix production and breakdown. However, many of these effects relate to pathological changes in the vasculature [1]. ROS are produced by endothelial cells (EC), vascular smooth muscle cells (VSMC), and adventitial cells and can be generated by various enzymes [10].
We are witnessing an enormous increase in the volume of published research material, which makes it infeasible for an individual researcher or a team of researchers to track all important developments even in a specific field. This is very prominent in the biomedical domain where, in addition to the great volume of published scientific reports, the information contained in these documents is itself highly complex. For example, the following query was used to retrieve all literature specifically focused on the problems related to redox effects on the cardiovascular system in mammalian organisms: “(human OR mouse OR rat OR mammal∗) AND (radical∗ OR peroxide∗ OR “reductive stress” OR ROS OR “reactive oxygen species” OR RNS OR “reactive nitrogen species” OR redox OR “reduction-oxidation reaction” OR oxidative OR nitrosative OR peroxide∗ OR superoxide∗ OR detoxifi∗ OR antioxid∗ OR “polyunsaturated fatty acids” OR “arachidonic acid” OR “linoleic acid” OR hydroperoxide∗ OR “hypochlorous acid” OR peroxynitrit∗ flavoprot∗ OR xanthine oxidase∗ OR “cytochromes P450” OR catalase∗ OR sulfiredoxin∗ OR peroxiredoxin∗) AND (“angina pectoris” OR anemia OR aneurysm∗ OR angio∗ OR arter∗ OR atrial OR atrioventricular OR aort∗ OR bradycardia OR blood OR brain OR circulati∗ OR clogging OR cardio∗ OR coronary OR edema OR heart OR ishemic OR hemo∗ OR hypertension OR leukemia OR leuko∗ OR macroangiopathy OR microangiopathy OR neovascularization OR occlusion OR pericardi∗ OR sepsis OR “sickle cell” OR tachycardia OR tachyarrhythmia OR thromb∗ OR vaso OR vein∗ OR ventricular OR vascular∗ OR vessel∗)”. For this query, the Web of Science (All Databases) of Clarivate Analytics (https://clarivate.com/) indexes 36063 and 169212 scientific articles published in 2017 and in the 2013-2017 period, respectively. This clearly highlights the challenges of analyzing information even in specialized domains.
Exploring such a voluminous information pool calls for ways to simplify the search for useful information. This problem is not new, and it has long been clear that automated systems are needed to support the analysis of information contained in published literature. The last three decades have seen numerous efforts in this direction, and the problem is now commonly addressed through text mining. Different aspects of text mining, together with complementary natural language processing (NLP) techniques, have been applied for the exploration of biomedical information from free text [11–23].
Different methods were used for obtaining information from free text [24–33], many based on heavy utilization of ontologies and ontology structures [28]. Also, there have been systematic efforts to combine text mining with other methods to enhance the capacity to extract useful information (for example, [30–32, 34]).
Text mining found applications in different biomedical domains [31, 35–48], for example, dealing with problems of cancers [42], disease biomarkers [47], sickle cell disease [49], tomato species [50], medicinal herbs [35], sodium channels [51], drug repurposing [37], protein analysis [40, 52], prioritization of cancer genes and pathways [41], hepatitis C virus [53], cancer risk assessment [48], associations of mutations and human diseases [54], or association of transcription factors [55].
Research in the utilization of text mining in the biomedical field has resulted in a number of applications that are accessible online, such as [56–79]. These demonstrate the increasing value of applying text mining to the biomedical field.
In this study, to support research in redox biology and its effects on CVS, we developed an online knowledge base (KB), DES-RedoxVasc (http://www.cbrc.kaust.edu.sa/des-rv), which enables exploration of information contained in biomedical scientific literature focused on redox control of vascular systems in mammals. We provide examples of DES-RedoxVasc use.
2. Exploration System
2.1. Server Architecture and Underlying Systems
DES-RedoxVasc is a publicly available visual, interactive, topic-specific literature exploration system. It was developed using an upgraded version of the DES system, which was originally developed by some of the coauthors of this report (VBB and AR) and has served as the underlying framework (in different versions) for several published topic-specific KBs [24, 49–55, 58, 62, 65, 68, 70, 72, 78, 80–83]. The KB is implemented and hosted on a CentOS-7 operating system. Results are served using Apache web server version 2.4.6. A local MongoDB (2.6.11) database stores the literature repository, which comprises open-access PubMed and PubMed Central articles, and the KB index and related tables are stored in a PostgreSQL (9.2.15) database. Apache Lucene was used to index the documents. Various programming languages/tools were used to develop the KB, including JavaScript, jQuery 3.0.0, C/C++ (gcc 4.8.5), Java (OpenJDK 1.8.0_91), Perl v5.16.3, and PHP 5.4.16. DES-RedoxVasc is functional across commonly used web browsers (Windows, Linux, and Mac OS platforms) and was specifically tested for Firefox, Safari, and Chrome. The DES workflow has been described earlier [54].
2.2. The Literature Corpus and Dictionaries Incorporated into DES-RedoxVasc
The MongoDB literature repository contains only documents that are tagged as open access, which means that they are freely amenable to text mining. Thus, to create the literature corpus to be analyzed, the local MongoDB repository, last updated on September 03, 2018, was queried for all topic-specific PubMed and PMC articles. The same query used to query Web of Science (All Databases) above was used to create the literature corpus. The literature index server is designed to match the query to the titles, abstracts, and full-text article when available through the PMC set. The query retrieved 233399 articles.
Also, 28 topic-relevant dictionaries were used in this KB, of which eight dictionaries were newly compiled (see Table 1). The remaining 20 dictionaries were previously used in other KBs developed using the DES framework and are also listed in Table 1.
Table 1: Dictionaries used in DES-RedoxVasc with data source references.

| Dictionary | Enriched unique terms in the KB | Status |
|---|---|---|
| Chemicals/compounds | | |
| Chemical Entities of Biological Interest (ChEBI) [84] | 17981 | |
| Toxins (T3DB) [85] | 2083 | |
| Lipids (lipids maps) [86, 87] | 2852 | |
| Amyloids (Human and Mouse); compiled in-house | 393 | Newly compiled |
| Functional annotation | | |
| Biological Process (GO) [88] | 5438 | |
| Cellular Component (GO) [88] | 1125 | |
| Molecular Function (GO) [88] | 1755 | |
| Pathways (KEGG [89], Reactome [90], UniPathway [91], and PANTHER [92]) | | |
| ICD9 Ontology (BioPortal): International Classification of Diseases, Version 9 - Clinical Modification [100] | 688 | |
| Drugs | | |
| Drugs (DrugBank) [101] | 3918 | |
| ATC Ontology (BioPortal): Anatomical Therapeutic Chemical Classification [102] | 1991 | Newly compiled |
| CSSO Ontology (BioPortal): Clinical Signs and Symptoms Ontology | 210 | Newly compiled |
| SIDER (Drug Indications and Side Effects) [103] | 3190 | |
| Human | | |
| Human Genes and Proteins (Entrez Gene) [104] | 21858 | |
| Human Transcription Factors [105] | 1505 | |
| Human Transcription Co-Factors (TcoF-DB) [105] | 384 | |
| Human microRNAs (HGNC and Entrez Gene) | 1811 | Updated |
| Human Long Non-Coding RNAs (HGNC) [106] | 460 | |
| Mutations (tmVar) [107] | 12514 | |
| Human Anatomy (in-house compiled) | 2538 | |
| OMIT Ontology (BioPortal): Ontology for MicroRNA Target [108] | 656 | Newly compiled |
All dictionary concepts (see Table 2 for definitions) are normalized where possible. Normalization ensures that a concept that can be referred to by different symbols, names, or synonyms is always associated with a single entity (using an internal identifier). It also ensures that concepts can be recognized through universal IDs such as NCBI Taxonomy ID, Entrez Gene ID, and UniProt ID, which are regarded as trusted sources. For example, dealing with genes and proteins is frequently problematic in text mining because gene/protein names, symbols, and aliases frequently denote more than one gene/protein. We combined the Entrez Gene (for genes) and UniProt (for proteins) nomenclatures, which provide the official names/symbols/aliases routinely used, and then applied the normalization in the DES system. The normalization of dictionary concepts improves the accuracy of the concepts’ enrichment estimates.
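The idea of normalization can be illustrated with a minimal sketch in which every synonym of a concept resolves to one internal identifier. The identifiers (`DES:G0001`, `DES:G0002`), the synonym lists, and the function names are hypothetical and serve only as an illustration; they are not the DES implementation.

```python
from typing import Dict, List, Optional

# Hypothetical internal identifiers and synonym lists (illustration only).
GENE_DICTIONARY: Dict[str, List[str]] = {
    "DES:G0001": ["PARK7", "DJ-1", "DJ1"],
    "DES:G0002": ["ELANE", "HNE", "neutrophil elastase"],
}


def build_synonym_index(dictionary: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every case-folded name to exactly one internal identifier."""
    index: Dict[str, str] = {}
    for concept_id, names in dictionary.items():
        for name in names:
            key = name.casefold()
            # A name pointing to two entities within one dictionary is rejected.
            if key in index and index[key] != concept_id:
                raise ValueError(f"ambiguous name within dictionary: {name!r}")
            index[key] = concept_id
    return index


def normalize(mention: str, index: Dict[str, str]) -> Optional[str]:
    """Return the internal identifier for a text mention, or None if unknown."""
    return index.get(mention.strip().casefold())


index = build_synonym_index(GENE_DICTIONARY)
same_entity = normalize("dj-1", index) == normalize("PARK7", index)
```

Counting mentions per identifier rather than per surface form is what allows frequencies of synonyms to be pooled before enrichment is estimated.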
Table 2: Vocabulary and interactive tools used in DES-RedoxVasc.

| Vocabulary and interactive tools | Definition |
|---|---|
| Concepts | Biological words or phrases (e.g., inflammation, oxidative stress, and hydrogen peroxide) found in this topic-specific literature, organized into thematic dictionaries, and used to mine the literature |
| Concept Pairs | Cooccurring “Enriched Concepts” (e.g., cell fate determination and TAL1; Wnt receptor and CELSR2; and BMP2K and coronary artery endothelial cell) that may or may not have a biological association/connection |
| FDR | “To be enriched, a concept or a pair has to have an FDR (false discovery rate) < 0.05 in the DES-RedoxVasc corpus. The FDR is obtained by correcting the enrichment P values for multiplicity testing based on the Benjamini-Hochberg procedure” |
| Literature | Provides the literature set used in the development of this KB |
| Network | A tool for the visualization of concept associations as a graph of interlinked nodes |
| Concept Co-occurrences | A list of concepts which cooccur in the literature with the concept in question. Concepts are regarded as cooccurring in the text if they are within a 200-character distance from each other (refer to rationale below). Only enriched pairs are shown in this list |
| Knowledge base | A store of information or data that is available to draw on |
| Dictionaries | A set of topic-specific vocabularies made up of words or phrases used for the purpose of text mining |
| kb_frequency | Frequency of a concept within the KB literature corpus |
| bkg_freq | Refers to background frequency: frequency of a concept within the whole PubMed/PMC literature corpus |
| Density | KB frequency divided by background frequency |
Some concepts are relevant to more than one dictionary, for example, enzymes are gene products, and it is expected that nomenclatures of these entity types would have a substantial intersection. The same goes for drugs and chemicals, drugs and antibiotics, gene functions and pathways, etc. It is worth noting that normalization is done at the dictionary level and not across dictionaries because (1) it is the semantically valid approach, as biological entities might be pertinent to, say, both chemicals and drugs, and should be viewed as such depending on the scope of the literature and the user’s interest and (2) these dictionaries are used in a modular fashion independently from each other; it is not redundant to keep a reference to the same entity in two or more dictionaries. For example, a user might be interested only in drugs, and not in the more general collection of chemicals, and as such chooses only drugs for the KB annotation; therefore, they should have access to all drugs that are also part of the chemical dictionary. This also applies when doing dictionary specific searches within the same KB. It is not however acceptable to have redundant concepts within the same dictionary.
The literature corpus and 28 dictionaries were used for concept document mapping. The concept document mapping results were then used to statistically determine enriched concepts and enriched pairs of concepts.
2.2.1. Enriched Concepts
In a KB, concepts may or may not be statistically enriched. A concept is enriched in the KB when its abundance in the KB corpus is greater than one would expect from the rest of the PubMed/PMC literature. The frequency of the concept across the entire literature indicates its expected frequency in any randomly selected sample from the literature, and a concept is enriched when its frequency in the KB is significantly higher than this expected frequency. Specifically, a concept has to have a corrected P value < 0.05 in the DES-RedoxVasc corpus when compared to the complete set of PubMed Central and PubMed articles in our local repository; in this manner, concepts most relevant to the KB are identified. The enrichment P values were corrected for multiplicity testing using the Benjamini-Hochberg procedure. Note that this corrected P value is also known as the false discovery rate (FDR).
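The correction step can be sketched as follows. The text does not state which statistical test yields the raw enrichment P values, so the sketch takes them as given; the function names and the toy P values are assumptions made for illustration only.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values never decrease with increasing rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def density(kb_frequency: int, bkg_freq: int) -> float:
    """Density as defined in Table 2: KB frequency over background frequency."""
    return kb_frequency / bkg_freq


# Toy raw P values for five hypothetical concepts (not taken from the KB).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(raw_p)
enriched = [q < 0.05 for q in fdr]
```

In this toy input, the third and fourth concepts have raw P values below 0.05 but are no longer called enriched after correction, which is the purpose of controlling the FDR over many dictionary concepts.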
2.2.2. Enriched Concept Pairs
Pairs of enriched concepts are considered enriched for association by considering the abundance of their cooccurrence as compared to the individual occurrence of concepts that form the pair. So, for example, if two concepts occur 100 times each and they cooccur 90 times, there is a high chance that they are associated, because they each occurred with the other concept 90% of the time. The situation is of course not typically symmetric, but the example is just for clarification. The resulting enriched pairs of concepts may or may not be directly associated; however, the more a pair is enriched this way, the higher the probability for the association between the two concepts.
Using cooccurrence as a proxy for semantic relatedness, or association, is a well-established, if not the dominant, approach to semantic analysis and association extraction and is by no means particular to DES. PMI (pointwise mutual information) and cosine distance from Word2Vec embeddings are some of the mainstream examples of such an approach. Establishing an association between two biomedical entities from the text in a biologically meaningful way (e.g., causality, inhibition, and coexpression) is, however, a much more challenging task and is the subject of much research pertinent to the more general question of NLU (natural language understanding). Focusing on one type of association, with certain simplifying assumptions, can render the task of targeted association extraction more amenable to computation, but this is not the purpose of our explorative system.
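To make the cooccurrence reasoning concrete, the sketch below computes the fraction of occurrences that are shared by a pair and the PMI of that pair, using the 100/100/90 example given above. The corpus size of 10,000 text windows is an assumed value chosen only for illustration, and PMI is shown as a mainstream example rather than as the statistic used by DES.

```python
import math
from typing import Tuple


def conditional_cooccurrence(n_a: int, n_b: int, n_ab: int) -> Tuple[float, float]:
    """Fraction of occurrences of A (and of B) that cooccur with the other concept."""
    return n_ab / n_a, n_ab / n_b


def pmi(n_a: int, n_b: int, n_ab: int, n_total: int) -> float:
    """Pointwise mutual information, log2( P(a, b) / (P(a) * P(b)) )."""
    if n_ab == 0:
        return float("-inf")
    return math.log2((n_ab * n_total) / (n_a * n_b))


# Two concepts occur 100 times each and cooccur 90 times.
frac_a, frac_b = conditional_cooccurrence(100, 100, 90)
# Assumed corpus size of 10,000 windows (hypothetical).
score = pmi(100, 100, 90, 10_000)
```

A strongly positive PMI indicates that the pair cooccurs far more often than independence would predict, although, as noted above, this alone does not establish a biologically meaningful association.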
The total number of statistically enriched concepts from all 28 dictionaries used is 101938. The number of enriched concepts per dictionary is provided in Table 1. The total number of statistically enriched pairs of enriched concepts is 5631393. The literature corpus, 28 dictionaries, enriched concepts, and enriched pairs of concepts were integrated to create DES-RedoxVasc. The resulting network of concept pairs was also embedded in a high-dimensional semantic space, thereby enabling the computation of semantic similarity between any two concepts within the KB.
2.2.3. Semantic Similarity
Semantic similarity is a metric that establishes the likeness or closeness of two concepts in terms of their potential meaning. Semantic similarity can be the result of semantic relatedness, such as synonymy, antonymy, and hypernymy. For example, tall and short are semantically similar even though they are antonyms because they both share the semantic dimension of “height.” Semantic similarity within DES is calculated as the cosine similarity between two concept embeddings (vector representations in a latent semantic space). These embeddings are obtained using a skip-gram Word2Vec model trained on the DES-RedoxVasc literature corpus with normalized concept annotation. Therefore, the underlying assumption for semantic similarity in DES is concept cooccurrence, but not necessarily direct cooccurrence.
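The cosine measure between two embeddings can be sketched as below. The three-dimensional vectors and the concept labels are made up for illustration; real Word2Vec embeddings have many more dimensions, and the training itself is not shown. Cosine distance, where needed, is one minus the similarity computed here.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy embeddings for three hypothetical concepts (not taken from the KB).
embeddings: Dict[str, List[float]] = {
    "concept_a": [1.0, 2.0, 0.0],
    "concept_b": [2.0, 4.0, 0.0],
    "concept_c": [0.0, 0.0, 3.0],
}

sim_ab = cosine_similarity(embeddings["concept_a"], embeddings["concept_b"])
sim_ac = cosine_similarity(embeddings["concept_a"], embeddings["concept_c"])
```

Because the measure depends only on direction, `concept_a` and `concept_b` are maximally similar despite having different magnitudes, whereas the orthogonal `concept_c` has a similarity of zero.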
3. DES-RedoxVasc Overview and Case Studies
DES-RedoxVasc allows literature related to oxidative control and the vascular system to be easily explored using terms and associations that are determined to be statistically enriched in topic-specific publications. Briefly, these enriched terms/concepts can be explored using the “Enriched Concepts” (Enriched Terms) link or via the “Enriched Pairs” (Enriched Term Pairs) link that provides enriched cooccurring concepts. Concepts are regarded as cooccurring if they appear in the text within a 200-character distance from each other. However, DES-RedoxVasc only reports those cooccurring concepts (pairs of concepts) whose pairing is statistically enriched, thereby increasing the probability that the reported associations could have “biological relevance.” The “biological relevance” itself is left to the user to check by exploring the actual related literature provided through the interface. So, if genes or proteins cooccur with a particular disease or process much more frequently than is statistically expected, then we assume that these genes or proteins are important to the disease pathology or process (also refer to Enriched Concept Pairs).
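The 200-character cooccurrence rule can be sketched as follows. How DES measures the distance is not specified in the text, so the sketch assumes that the distance is taken between the start offsets of two mentions and that mentions are found by case-insensitive whole-word matching; the function names and the example sentence are hypothetical.

```python
import re
from itertools import combinations
from typing import List, Sequence, Set, Tuple

WINDOW = 200  # character distance stated in the text


def find_mentions(text: str, concepts: Sequence[str]) -> List[Tuple[int, str]]:
    """Return (start offset, concept) for every case-insensitive whole-word match."""
    mentions = []
    for concept in concepts:
        pattern = re.compile(r"\b" + re.escape(concept) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            mentions.append((match.start(), concept))
    return sorted(mentions)


def cooccurring_pairs(
    text: str, concepts: Sequence[str], window: int = WINDOW
) -> Set[Tuple[str, ...]]:
    """Pairs of distinct concepts whose mentions start within `window` characters."""
    pairs: Set[Tuple[str, ...]] = set()
    for (pos_a, a), (pos_b, b) in combinations(find_mentions(text, concepts), 2):
        if a != b and abs(pos_a - pos_b) <= window:
            pairs.add(tuple(sorted((a, b))))
    return pairs


# Made-up text: the filler pushes SOD1 more than 200 characters away.
text = "Oxidative stress activates DJ-1. " + "x" * 300 + " SOD1 activity was reduced."
pairs = cooccurring_pairs(text, ["oxidative stress", "DJ-1", "SOD1"])
```

Only pairs found this way would then be passed on to the enrichment test, so proximity is a necessary but not a sufficient condition for a pair to be reported.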
Users can also use the “Column visibility” tab in these links to explore enriched terms using ranking options for the false discovery rate (FDR), density, kb_frequency, and bkg_freq. Also, concepts are color coded to indicate the dictionary from which the concepts are retrieved.
Moreover, each concept is linked to a clickable box through which the “Network” and “Term Co-occurrences” links can be examined. Detailed description is provided in [72]. There is also the “Literature” link that allows users to explore the literature in DES-RedoxVasc (PubMed abstracts and PMC full-text articles) and the “Network” link that allows users to explore and generate networks of enriched concept pairs. This version of DES also provides a new link named “Semantic Similarity.” Users are also provided with a “Software Manual” on the “Home” page of DES-RedoxVasc. Below, we provide several examples wherein a range of biomedical entities are used to develop insights into redox control in vascular systems.
3.1. Example 1: Finding the Relevant Concepts of Different Categories Using “Enriched Concepts” View
One simple but useful application of DES-RedoxVasc is to quickly find some of the most relevant concepts related to redox processes in the CVS. For this, one can choose the “Enriched concepts” view button (on the left side). The page will then show the list of the most characteristic concepts from all dictionaries as found by the system. To see the most enriched concepts from a specific dictionary, one can select the dictionary from the dropdown menu on the right side. As inspection of these most characteristic concepts will show, most of them are very clearly related to the topic that we study. In the following, we examine such singled-out genes/proteins and microRNAs in more detail.
Oxidants classified either as ROS [109, 110] or reactive nitrogen species (RNS) [109, 110] are generated through the cells’ normal metabolic processes as well as by exogenous factors such as atmospheric pollutants and irradiation. These oxidants play important physiological roles in cell maintenance and are considered not to harm the human body when oxidant and antioxidant levels are approximately in equilibrium [111]. However, when the levels of these oxidants exceed the levels of antioxidants, oxidative stress (OS) is triggered [112]. To counteract this state of oxidative stress, the cells increase antioxidant production to reestablish redox homeostasis [113, 114]. Conversely, excess levels of antioxidants lead to excess reducing equivalents of glutathione (GSH), NADPH, and NADH, which deplete ROS and trigger reductive stress (RS) [115]. This state of chronic reductive stress stimulates an increase in the production of oxidants, only to establish an oxidative stress state that is eventually driven back to the reductive stress state. Thus, excess antioxidant agents may also induce prooxidant effects [116].
These counter mechanisms describe the general processes that govern redox control. Moreover, the lack of redox control in the form of prolonged oxidative or reductive stresses has been linked to several disease states [117–119] including cardiovascular diseases.
Thus, we start exploring the efficacy of DES-RedoxVasc in retrieving established associations through the “Enriched Concepts” link (see Figure 1 and also see the “Published Examples” link for a more detailed description of how examples were generated).
Figure 1: Using DES-RedoxVasc to find potential connections between concepts. The purple circles, the pink circles, and the green circles mark the “CVDO Ontology (BioPortal) Cardiovascular Disease Ontology” dictionary, the “Human Genes and Proteins (Entrez Gene)” dictionary, and the “Human microRNAs” dictionary, respectively. Based on the cooccurrence frequency, the color of edges can go from black (strong association) to grey (weaker association). The number of documents that link the potentially associated nodes is displayed on each edge.
3.1.1. Gene/Protein Associations with “Oxidative Stress”
Figure 1 shows that the gene/protein nodes are connected with “Oxidative stress” by a large number of articles. To confirm that the associations of the gene/protein and microRNA nodes retrieved by DES-RedoxVasc are true associations, we checked the literature suggested by DES-RedoxVasc. Li et al. demonstrated that eNOS knockout mice exhibit premature cardiac aging and early mortality [120]. In line with this finding, Zanetti et al. used aortae of old and young rats to demonstrate that activated inducible nitric oxide synthase (iNOS), impaired SOD1 activity, and increased OS are associated with vascular aging. They also showed that caloric restriction blunts oxidative stress, reduces iNOS expression, and increases SOD1 activity [121]. They further reported that SIRT1 expression remains unchanged. However, it has been shown that treating human coronary arterial endothelial cells with resveratrol induced SIRT1 and upregulated eNOS in a SIRT1-dependent manner [122]. Also, OS induced by SOD1 deficiency triggers the accumulation of oxidatively modified CA2 in erythrocytes [123].
ROS are also produced in normal airway epithelial cells stimulated with human neutrophil elastase (also known as HNE or ELANE) [124]. It was also shown that Nrf2 binds to the antioxidant response element (ARE) of a large set of genes (including glutamate-cysteine ligase (GCL), NAD(P)H-quinone oxidoreductase 1 (NQO1), heme oxygenase-1 (HMOX1), which encodes HO-1, and thioredoxin reductase 1 (Txnrd1)) to alleviate oxidative stress [125]. HO-1 was shown to play a key role in oxidative stress-related pathologies such as CVDs and atherosclerosis [126]. OGG1 repairs DNA damage induced by OS, and an OGG1 (rs1052133) polymorphism has been associated with atherosclerosis [127] and CVD [128] risk.
All genes/proteins from Figure 1 had an association with “oxidative stress” except LPO. The reason is that LPO in the text was used to refer to lipid peroxide instead of “lactoperoxidase.” Moreover, although ELANE (one of whose synonyms is HNE) and CA2 were associated with “Oxidative stress,” in most of the articles that putatively linked these concepts to “Oxidative stress,” HNE refers to the peroxidation by-product 4-hydroxy-2-nonenal instead of the human neutrophil elastase gene or product, and CA2 refers to calcium. These examples illustrate a limitation of text mining caused by multiple meanings of the same symbol.
3.1.2. MicroRNA Associations with “Oxidative Stress”
On the other hand, if we look at nodes that are connected by a small number of articles such as the nodes for microRNAs in Figure 1, Step 3, we find “MIR23A” [129], “MIR34A” [130], “MIR155” [131], “MIR210” [132], and “MIR106B” [133] being associated in our KB with “oxidative stress” via “6,” “4,” “3,” “2,” and “1” articles, respectively.
The literature focused on “MIR23A” (miR-23a) revealed areas of research that may increase our insight into miR-23a-related redox control in various diseases. Dubois-Deruy et al. demonstrated that SOD2 is increased in the left ventricle after heart failure in rats, as are miRNAs targeting SOD2 (miR-222-3p, miR-23a-3p, and miR-21-5p) [129]. They further demonstrated that REVE-2 patients [134] with left ventricular remodeling after myocardial infarction exhibit high levels of these SOD2-targeting miRNAs. In line with this finding, it was demonstrated that inhibiting oxidative stress-induced miR-23a (MIR23A) reduces degeneration of retinal pigment epithelium (RPE) cells [135]. The same study further demonstrated that glutaminase (GLS) is a direct target of miR-23a and that oxidative stress in miR-23a-overexpressing RPE cells is alleviated by GLS expression. This is interesting as GLS converts glutamine to glutamate, the precursor needed for synthesis of the antioxidant glutathione (see Figure 2).
Figure 2: An overview of microRNA in redox control linked to maintaining a healthy state and contribution to redox dysfunction impacting different diseases.
Figure 2 depicts an overview of how redox control contributes to maintaining a healthy state and how redox dysfunction contributes to different disease states. When an increase in OS is coupled with the inhibition of the oxidative stress-induced microRNA, antioxidant synthesis is increased which reduces the oxidative stress back to the redox “homeostasis” state. Conversely, redox dysregulation in the form of increased expression of microRNAs inhibits antioxidant synthesis possibly leading to a disease state. This contradicts Lin et al. who, instead of inhibition, suggested the expression of miR-23a is required for the maintenance of healthy RPE cells [136].
3.2. Example 2: Hypotheses and Potentially New Insights Derived through the Use of DES-RedoxVasc
3.2.1. Hypothesis 1: Heart Failure May Occur in Response to Oxidative Stress
On the page “Enriched pairs,” “Oxidative stress response” in column 1 is linked to a number of miRNAs (see column 2 when the “Human miRNAs” dictionary is selected), among which there is “MIR4639” (hsa-miR-4639). We checked the FARNA database [137] for hsa-miR-4639 and found that this miRNA is expressed in the heart. Furthermore, FARNA suggests that hsa-miR-4639 is implicated in heart failure. On the other hand, Chen et al. demonstrated that increased levels of hsa-miR-4639 in plasma lead to downregulation of the DJ-1 protein activity in patients with Parkinson’s disease [138]. Moreover, they demonstrated that miR-4639-5p directly binds the DJ-1 transcript at its 3′UTR, which results in the downregulation of the DJ-1 protein activity. This is interesting, as oxidative stress activates DJ-1, and DJ-1 is shown to inhibit the alpha-synuclein aggregate formation that leads to Parkinson’s disease [139]. The relationship between miR-4639 and oxidative stress is via DJ-1, as the Nrf2-regulated antioxidant defense mechanism is impaired when levels of DJ-1 are decreased [140]. DJ-1 has also been shown to protect the heart against oxidative damage: Billia et al. demonstrated that DJ-1 (with synonym PARK7) protects murine hearts against oxidative damage [141]. DJ-1 was also shown to protect the heart from ischemia-reperfusion injury [142, 143]. Moreover, the work of Li et al. shows that miR-4639 is almost 3-fold overexpressed in chronic heart failure patients compared to the control group [144]. All this leads us to the following hypothesis (see Figure 3): “overexpression of miR-4639 in the heart downregulates DJ-1 that protects the heart from oxidative damage, which may be one of the causes leading to heart failure.”
Figure 3: An overview of how oxidative damage may lead to heart failure.
3.2.2. Hypothesis 2: Vascularization Redox Is Relevant to Alzheimer’s Disease
In search of novel insights, it is also useful to look at the concepts from different dictionaries that are associated with each other. For this analysis, we looked at all connections/associations found between concepts in DES-RedoxVasc. Figure 4 shows, in the form of a heatmap, the interconnectedness of the dictionaries with themselves and with the other dictionaries based on the cooccurring concept pairs. As shown in Figure 4, after normalization, the concepts from the ADO (Alzheimer’s disease ontology) dictionary have the most connections to concepts from other dictionaries. This might seem surprising, but within the field of Alzheimer’s disease research, vascularization is intensely researched as a mechanism for the disease development, with some researchers proposing that it is primarily a vascular disorder rather than a neurodegenerative disease [145]. However, since this link is based on the analysis of literature focused on redox effects on the CVS, this implicitly suggests that redox-related vascular disorders may be linked to Alzheimer’s disease. We take this observation based on Figure 4 cautiously, as the number of concepts included in different dictionaries varies, as does the coverage of a particular domain by these concepts. So, it could also be that the quality of the ontologies from which we derived some of our dictionaries is affecting the heatmap in Figure 4. In any case, it was interesting to observe potential support for the hypothesis of a link between vascularization and Alzheimer’s disease.
Figure 4: An illustration of dictionary connectivity based on concepts mapped to the analyzed corpus. For each pair of dictionaries, the displayed value is the logarithm of the number of enriched pairs divided by the product of the total numbers of enriched terms in the two dictionaries. The highest displayed value is -3.4, shown in bright yellow, and the lowest value is -11.5, shown in dark purple. White denotes a value of 0, which is used when no enriched pairs of concepts were found for that combination of dictionaries. Rows and columns are sorted according to the total sum of enriched pairs of the dictionary, with the ADO ontology having the highest normalized number of pairs.
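The normalization described in the Figure 4 caption can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name is hypothetical, the natural logarithm is an assumption (the caption does not state the base), and the counts in the usage note are invented for demonstration only.

```python
import math


def normalized_connectivity(n_pairs, n_terms_a, n_terms_b):
    """Log of enriched-pair count over the product of enriched-term counts.

    Returns 0.0 when no enriched pairs exist, mirroring the white cells
    described in the caption. The natural log is an assumption.
    """
    if n_terms_a <= 0 or n_terms_b <= 0:
        raise ValueError("both dictionaries need at least one enriched term")
    if n_pairs == 0:
        return 0.0
    return math.log(n_pairs / (n_terms_a * n_terms_b))
```

For example, `normalized_connectivity(50, 200, 1000)` evaluates the logarithm of 50/200000 for two hypothetical dictionaries. Dividing by the product of term counts corrects for dictionary size, which is why a small dictionary can still rank as highly connected.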
3.2.3. Hypothesis 3: ZFAS1 May Play a Role in the Fine-Tuning of the Oxidative Stress-Responsive miR-27B
In search of novel insights, we also looked at associations of concepts based on semantic similarity using the “Semantic Similarity” link (see Figure 5; see also the “Published Examples” link for a more detailed description of how the examples were generated). One of the semantic similarities (similarity>0.8) established by DES-RedoxVasc is between miR-27b and the long non-coding RNA ZFAS1. Xu et al. demonstrated that collagenase-induced intracerebral hemorrhage (ICH) in the rat brain reduces the expression of the oxidative stress-responsive miR-27b. They also showed that overexpression of miR-27b reduces the expression of Nrf2, SOD1, Hmox1, and Nqo1 and that miR-27b targets Nrf2 mRNA directly. They further demonstrated that miR-27b inhibition has the opposite effects, namely, activation of the Nrf2/ARE pathway and reduced OS, and that these effects are blocked by Nrf2 knockdown [146]. Thus, miR-27b is reduced to reestablish redox homeostasis. Dysfunction of this mechanism leads to vascular diseases: overexpression of miR-27b was shown to induce cardiac dysfunction and hypertrophy in mice [147]. Also, Signorelli et al. demonstrated that the levels of miR-27b, miR-130a, and miR-210 are increased in patients with peripheral artery disease compared to healthy controls [148]. However, miR-27b has not been linked to ZFAS1. Nevertheless, this link may be correct, as ZFAS1 is predicted to bind hsa-miR-27b-3p by the DIANA tool LncBase Predicted v.2 [149].
Figure 5: Using DES-RedoxVasc to find potential connections between concepts using semantic similarity.
Current research supports this hypothesis to a certain extent. Pan et al. reported overexpression of ZFAS1 in gastric cancer (GC) serum and tissue samples and demonstrated that ZFAS1 knockdown inhibits the proliferation and migration of GC cells by suppressing cell cycle progression and inducing apoptosis [150], while Chen et al. demonstrated that miR-27b is downregulated in GC and showed miR-27b to be a potential GC biomarker. Moreover, they showed that miR-27b functions as a tumor suppressor in GC by targeting VEGFC [151]. This suggests a possible inverse relationship between ZFAS1 and miR-27b. In addition, Shin et al. reported on the risk of ischemic stroke and coronary heart disease incidence in GC patients [152]. ZFAS1 was also determined to be a potential biomarker for coronary artery disease/acute myocardial infarction [153], and Lyu et al. showed ZFAS1 to be upregulated in rats with traumatic brain injury [154]. Taken together, miR-27b has been linked to OS and vascular disease, and ZFAS1 has been linked to vascular disease, but the possible role of ZFAS1 in the fine-tuning of miR-27b in these pathologies has not been explored.
4. Discussion and Concluding Remarks
DES-RedoxVasc allows for the exploration of numerous associations between different concepts as they are found in the analyzed literature. Over 5.6 million such associations have been identified by DES-RedoxVasc. These potential concept associations are based on the cooccurrence of concepts placed relatively close to each other in the text (up to a 200-character distance). Moreover, these associations are statistically enriched in the analyzed literature with FDR<0.05 and are made of concepts that are themselves statistically enriched in the same document set with FDR<0.05, compared to documents in the background. Users can evaluate whether an association is meaningful by inspecting the text from which it was derived. Another set of associations is between any of the individually enriched concepts and the statistically enriched concepts that are semantically similar to them. In total, over 10 billion such associations were found in the analyzed documents. Usually, when the similarity between concepts is high, i.e., >0.75, such associations appear mostly meaningful; restricting to this threshold reduces the number of concept pairs to an estimated 50 million.
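The proximity criterion above can be illustrated with a short sketch. The paper states only the 200-character limit, so the rest is assumed: the function names are hypothetical, terms are matched by simple case-insensitive search, and distance is measured as the gap between the end of one mention and the start of the other.

```python
import re
from itertools import product


def find_spans(text, term):
    """Return (start, end) offsets of every whole-word mention of term."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [(m.start(), m.end()) for m in pattern.finditer(text)]


def cooccurs(text, term_a, term_b, max_gap=200):
    """True if any mention of term_a lies within max_gap characters of term_b."""
    spans_a = find_spans(text, term_a)
    spans_b = find_spans(text, term_b)
    for (a_start, a_end), (b_start, b_end) in product(spans_a, spans_b):
        # Gap between the two mentions; overlapping mentions count as 0.
        gap = max(b_start - a_end, a_start - b_end, 0)
        if gap <= max_gap:
            return True
    return False
```

A pair detected this way is only a candidate association; in the system described here it must additionally pass the enrichment filter (FDR<0.05) before it is reported.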
Being primarily based on text mining, DES-RedoxVasc carries all the shortcomings of that approach. As we used dictionaries of terms related to different categories of concepts, the quality and completeness of these dictionaries affect the results. If a term that represents a synonym of a concept, or the concept itself, is not present in the dictionary, the system will not be able to identify it in the text. Also, some terms are “promiscuous”: they are very common and thus do not convey significant information. Promiscuous terms have very high connectivity in the knowledge graph. This is due to their high frequency, because the more frequent a term is, the greater the probability that it cooccurs with other concepts. Usually, promiscuous terms have broad semantic coverage, like “function” or “disease.” Term ambiguity can also result in term promiscuity, as with the use of HAND or PDF as gene symbols. Promiscuous terms might have thousands of edges, where every single edge might refer to thousands of cooccurrence hits within the annotation. Consequently, they inflate the index and the knowledge graph and therefore increase computational demands. More importantly, they affect the quality of the extracted information and of any inferences drawn from it, because they alter the topology of the knowledge graph: they act as high-centrality hubs, creating short paths between concepts that are not otherwise associated. For example, the term “disease” can potentially link most disease concepts, which are not necessarily linked to each other; the same holds for pathological mutations, pathological microorganisms, etc., which are all related to the concept of disease. Removing promiscuous terms restores the intended topology of the knowledge graph. Pair enrichment provides another corrective layer for cases where promiscuous or irrelevant concepts slipped through the dictionary cleaning phase.
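The hub effect of promiscuous terms can be illustrated with a minimal sketch on a toy set of concept pairs. The degree threshold and all function names are hypothetical; the dictionary cleaning procedure actually used for DES-RedoxVasc is not described at this level of detail.

```python
from collections import defaultdict


def term_degrees(pairs):
    """Map each term to the number of distinct terms it is paired with."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-pairs
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {term: len(linked) for term, linked in neighbours.items()}


def promiscuous_terms(pairs, max_degree):
    """Terms whose degree exceeds max_degree (a hypothetical cutoff)."""
    return {t for t, d in term_degrees(pairs).items() if d > max_degree}


def remove_terms(pairs, blocked):
    """Drop every pair that involves a blocked term."""
    return [(a, b) for a, b in pairs if a not in blocked and b not in blocked]
```

With the toy pairs `("disease", "hypertension")`, `("disease", "atherosclerosis")`, `("disease", "diabetes")`, and `("Nrf2", "oxidative stress")`, a cutoff of 2 flags only “disease”, and removing it leaves the three disease concepts unconnected, as they were before the hub linked them.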
Computationally, to understand the improvements gained by removing these terms, we refer to the concept of term frequency distribution and in particular to Zipf’s law, which states that a term’s frequency and its rank (within a list of the terms of a corpus ordered by descending frequency) obey a simple power law. The main consequence of this law is that a very small proportion of the top-frequency-ranked terms (usually promiscuous in a biological context) accounts for a substantial amount of the text (in our case, the annotation and the knowledge graph). In our latest dictionary cleaning process, the removal of 0.1% of such high-frequency terms reduced the annotation size by a third.
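The consequence of Zipf’s law mentioned above can be checked numerically with a small sketch. It assumes an idealized distribution in which the frequency of the term at rank r is proportional to 1/r^s; the function name, the vocabulary size, and the exponent s = 1 are illustrative assumptions and are not fitted to the DES-RedoxVasc annotation.

```python
def zipf_top_share(n_terms, top_fraction, exponent=1.0):
    """Share of all occurrences contributed by the top-ranked terms.

    Assumes an ideal Zipf distribution: frequency at rank r ~ 1 / r**exponent.
    """
    if n_terms < 1 or not 0 < top_fraction <= 1:
        raise ValueError("need n_terms >= 1 and 0 < top_fraction <= 1")
    weights = [1.0 / rank ** exponent for rank in range(1, n_terms + 1)]
    top_k = max(1, round(n_terms * top_fraction))  # at least one term
    return sum(weights[:top_k]) / sum(weights)
```

Under these assumptions, `zipf_top_share(100_000, 0.001)` shows that the top 0.1% of a 100,000-term vocabulary accounts for a large share of all occurrences. The exact share depends on the exponent and on vocabulary size, so this idealized value is not expected to match the one-third reduction reported above.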
An additional observation is that the Cardiovascular Disease Ontology (CVDO) does not seem to resonate well within the knowledge base: it has relatively few connections despite being conceptually of central importance. Compared to CVDO, the Heart Failure Ontology (HFO) is much better connected to the other ontologies that we used. This may be a consequence of the relative incompleteness of CVDO, which may need some improvement if it is to show its full usefulness in text mining tasks.
Despite these limitations, the examples provided here as “case studies” demonstrate that the KB can be useful and that its user-friendly interface allows users to easily navigate and explore the information it contains. The DES-RedoxVasc KB literature and dictionaries will be updated biannually, and the KB will be updated accordingly.
Abbreviations
CVS: Cardiovascular system
CVD: Cardiovascular disease
CVDO: Cardiovascular Disease Ontology
EC: Endothelial cells
FDR: False discovery rate
GSH: Glutathione
GCL: Glutamate cysteine ligase
HFO: Heart Failure Ontology
KB: Knowledgebase
lncRNA: Long non-coding RNA
miRNA: MicroRNA
NLP: Natural language processing
ncRNA: Non-coding RNA
Nrf2: Nuclear erythroid 2-related factor 2
OS: Oxidative stress
PMI: Pointwise mutual information
RNS: Reactive nitrogen species
ROS: Reactive oxygen species
RS: Reductive stress
VSMC: Vascular smooth muscle cells
Disclosure
This work is part of a collaboration between the Laboratory of Radiobiology and Molecular Genetics, Institute of Nuclear Sciences, Vinca, University of Belgrade, Belgrade, Serbia, and King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Saudi Arabia.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the content of this article.
Authors’ Contributions
ME, AS, JS, FT, ABR, AH, MU, CVN, AT, VPB, VBB, and ERI wrote the paper; ERI and VBB designed, supervised, and critically revised the paper. Magbubah Essack and Adil Salhi are co-first authors.
Acknowledgments
This work has been supported by grant no. 173033 (ERI) and no. 173034 (BSP) from the Ministry of Education, Science and Technological Development, Republic of Serbia. VBB has been supported by the King Abdullah University of Science and Technology (KAUST) Base Research Fund (BAS/1/1606-01-01), and ME has been supported by KAUST Office of Sponsored Research (OSR) Award no. FCC/1/1976-24-01.
GriendlingK. K.SorescuD.LassègueB.Ushio-FukaiM.Modulation of protein kinase activity and gene expression by reactive oxygen species and their role in vascular physiology and pathophysiology200020102175218310.1161/01.ATV.20.10.21752-s2.0-0033781454LanderH. M.An essential role for free radicals and derived species in signal transduction199711211812410.1096/fasebj.11.2.90399539039953DrogeW.Free radicals in the physiological control of cell function2002821479510.1152/physrev.00018.20012-s2.0-003608613011773609KalwaH.SartorettoJ. L.SartorettoS. M.MichelT.Angiotensin-II and MARCKS: a hydrogen peroxide- and RAC1-dependent signaling pathway in vascular endothelium201228734291472915810.1074/jbc.M112.3815172-s2.0-8486523723422773836Al GhoulehI.KhooN. K. H.KnausU. G.GriendlingK. K.TouyzR. M.ThannickalV. J.BarchowskyA.NauseefW. M.KelleyE. E.BauerP. M.Darley-UsmarV.ShivaS.Cifuentes-PaganoE.FreemanB. A.GladwinM. T.PaganoP. J.Oxidases and peroxidases in cardiovascular and lung disease: new concepts in reactive oxygen species signaling20115171271128810.1016/j.freeradbiomed.2011.06.0112-s2.0-8005226326421722728BirS. C.KolluruG. K.FangK.KevilC. G.Redox balance dynamically regulates vascular growth and remodeling201223774575710.1016/j.semcdb.2012.05.0032-s2.0-8486663441522634069TabetF.SchiffrinE. L.CalleraG. E.HeY.YaoG.ÖstmanA.KappertK.TonksN. K.TouyzR. M.Redox-sensitive signaling by angiotensin II involves oxidative inactivation and blunted phosphorylation of protein tyrosine phosphatase SHP-2 in vascular smooth muscle cells from SHR2008103214915810.1161/CIRCRESAHA.108.1786082-s2.0-4884910429818566342Ushio-FukaiM.AlexanderR. W.AkersM.GriendlingK. K.p38 Mitogen-activated protein kinase is a critical component of the redox-sensitive signaling pathways activated by angiotensin II Role in vascular smooth muscle cell hypertrophy199827324150221502910.1074/jbc.273.24.150222-s2.0-00325109779614110ZinkevichN. S.GuttermanD. 
D.ROS-induced ROS release in vascular biology: redox-redox signaling20113013H647H65310.1152/ajpheart.01271.20102-s2.0-8005230569521685266ParaviciniT.TouyzR.Redox signaling in hypertension200671224725810.1016/j.cardiores.2006.05.0012-s2.0-3374519390816765337Rebholz-SchuhmannD.OellrichA.HoehndorfR.Text-mining solutions for biomedical research: enabling integrative biology2012131282983910.1038/nrg33372-s2.0-8486950527823150036AndersonP. F.ShannonC.BickettS.DoucetteJ.HerringP.KepselA.LyonsT.McLachlanS.WuL.Systematic reviews and tech mining: a methodological comparison with case study20189454055010.1002/jrsm.13182-s2.0-85057961940KilicogluH.Biomedical text mining for research rigor and integrity: tasks, challenges, directions20181961400141410.1093/bib/bbx0572-s2.0-85057237672HuangC. C.LuZ.Community challenges in biomedical text mining over 10 years: success, failure and the future201617113214410.1093/bib/bbv0242-s2.0-84960109752JovanovicJ.BagheriE.Semantic annotation in biomedicine: the current landscape2017814410.1186/s13326-017-0153-x2-s2.0-8502985228928938912KrallingerM.RabalO.LourençoA.OyarzabalJ.ValenciaA.Information retrieval and text mining technologies for chemistry2017117127673776110.1021/acs.chemrev.6b008512-s2.0-8502200559828475312MishraR.BianJ.FiszmanM.WeirC. R.JonnalagaddaS.MostafaJ.del FiolG.Text summarization in the biomedical domain: a systematic review of recent research20145245746710.1016/j.jbi.2014.06.0092-s2.0-8491984637925016293Rodriguez-EstebanR.BundschusM.Text mining patents for biomedical knowledge2016216997100210.1016/j.drudis.2016.05.0022-s2.0-8496957965227179985ZengZ.ShiH.WuY.HongZ.Survey of natural language processing techniques in bioinformatics201520151067429610.1155/2015/6742962-s2.0-84945338275SafferJ. D.BurnettV. 
L.KumarV.TipneyH.Introduction to biomedical literature text mining: context and objectives20141159New York, NY, USAHumana Press17Methods in Molecular Biology (Methods and Protocols)10.1007/978-1-4939-0709-0_12-s2.0-84927537595FluckJ.Hofmann-ApitiusM.Text mining for systems biology201419214014410.1016/j.drudis.2013.09.0122-s2.0-84895930934CohenA. M.HershW. R.A survey of current work in biomedical text mining200561577110.1093/bib/6.1.572-s2.0-1724438038015826357ShatkayH.FeldmanR.Mining the biomedical literature in the genomic era: an overview200310682185510.1089/1066527033227561042-s2.0-074232191314980013Bin RaiesA.MansourH.IncittiR.BajicV. B.Combining position weight matrices and document-term matrix for efficient extraction of associations of methylated genes and diseases from free text2013810, article e7784810.1371/journal.pone.00778482-s2.0-84885703210HoehndorfR.SlaterL.SchofieldP. N.GkoutosG. V.Aber-OWL: a framework for ontology-based data access in biology20151612610.1186/s12859-015-0456-92-s2.0-84926386014Rodríguez-GarcíaM. A.HoehndorfR.Inferring ontology graph structures using OWL reasoning2018191710.1186/s12859-017-1999-82-s2.0-8505079475029304741HoehndorfR.DumontierM.GkoutosG. V.Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics201228162169217510.1093/bioinformatics/bts3502-s2.0-84865171141HoehndorfR.DumontierM.OellrichA.WimalaratneS.Rebholz-SchuhmannD.SchofieldP.GkoutosG. V.A common layer of interoperability for biomedical ontologies based on OWL EL20112771001100810.1093/bioinformatics/btr0582-s2.0-7995330240321343142AlshahraniM.KhanM. A.MaddouriO.KinjoA. R.Queralt-RosinachN.HoehndorfR.Neuro-symbolic representation learning on biological knowledge graphs201733172723273010.1093/bioinformatics/btx2752-s2.0-8504254487728449114RuchP.Text mining to support gene ontology curation and vice versa20171446698410.1007/978-1-4939-3743-1_62-s2.0-84994494266KwonO. S.KimJ.ChoiK. H.RyuY.ParkJ. 
E.Trends in deqi research: a text mining and network analysis20187323123710.1016/j.imr.2018.02.00730271711BadaM.Mapping of biomedical text to concepts of lexicons, terminologies, and ontologies20141159334510.1007/978-1-4939-0709-0_32-s2.0-84925413104ChungD.LawsonA.ZhengW. J.A statistical framework for biomedical literature mining201736223461347410.1002/sim.73842-s2.0-8502169770228675924TiffinN.KelsoJ. F.PowellA. R.PanH.BajicV. B.HideW. A.Integration of text- and data-mining using ontologies successfully selects disease gene candidates20053351544155210.1093/nar/gki2962-s2.0-1504434108215767279ParkS. H.HwangM. S.ParkH. J.ShinH. K.BaekJ. U.ChoiB. T.Herbal prescriptions and medicinal herbs for Parkinson-related rigidity in Korean medicine: identification of candidates using text mining201824773374010.1089/acm.2017.03872-s2.0-8505034413329583014XiaoF.LiC.SunJ.ZhangL.Knowledge domain and emerging trends in organic photovoltaic technology: a scientometric review based on CiteSpace analysis201756710.3389/fchem.2017.000672-s2.0-85033486774YangH. T.JuJ. H.WongY. T.ShmulevichI.ChiangJ. H.Literature-based discovery of new candidates for drug repurposing201718348849710.1093/bib/bbw0302-s2.0-8502023368827113728CarvalhoA. S.RodriguezM. S.MatthiesenR.Review and literature mining on proteostasis factors and cancer20161449718410.1007/978-1-4939-3756-1_22-s2.0-84987806969AbbeA.GrouinC.ZweigenbaumP.FalissardB.Text mining applications in psychiatry: a systematic literature review20162528610010.1002/mpr.14812-s2.0-8493762220426184780ShatkayH.BradyS.WongA.Text as data: using text-based features for proteins representation and for computational prediction of their characteristics201574546410.1016/j.ymeth.2014.10.0272-s2.0-8492387371925448299LuoY.RiedlingerG.SzolovitsP.Text mining in cancer gene and pathway prioritization201413s110.4137/CIN.S13874SpasićI.LivseyJ.KeaneJ. 
A.NenadićG.Text mining of cancer-related information: review of current status and future directions201483960562310.1016/j.ijmedinf.2014.06.0092-s2.0-8492588433625008281BravoÀ.CasesM.Queralt-RosinachN.SanzF.FurlongL. I.A knowledge-driven approach to extract disease-related biomarkers from the literature201420141125312810.1155/2014/2531282-s2.0-8490033470524839601TariL. B.PatelJ. H.KumarV.TipneyH.Systematic drug repurposing through text mining20141159New York, NY, USAHumana Press253267Methods in Molecular Biology (Methods and Protocols)10.1007/978-1-4939-0709-0_142-s2.0-84927558406VerspoorK. M.KumarV.TipneyH.Roles for text mining in protein function prediction20141159New York, NY, USAHumana Press95108Methods in Molecular Biology (Methods and Protocols)10.1007/978-1-4939-0709-0_62-s2.0-84927734457PiedraD.FerrerA.GeaJ.Text mining and medicine: usefulness in respiratory diseases201450311311910.1016/j.arbres.2013.04.00924507559IzarzugazaJ. M.KrallingerM.ValenciaA.Interpretation of the consequences of mutations in protein kinases: combined use of bioinformatics and text mining2012332310.3389/fphys.2012.003232-s2.0-84866490380KorhonenA.Ó SéaghdhaD.SilinsI.SunL.HögbergJ.SteniusU.Text mining for literature review and knowledge discovery in cancer risk assessment and research201274, article e3342710.1371/journal.pone.00334272-s2.0-8486557871222511921EssackM.RadovanovicA.BajicV. B.Information exploration system for sickle cell disease and repurposing of hydroxyfasudil201386, article e6519010.1371/journal.pone.00651902-s2.0-8487888418223762313SalhiA.NegrãoS.EssackM.MortonM. J. L.BougouffaS.RazaliR.RadovanovicA.MarchandB.KulmanovM.HoehndorfR.TesterM.BajicV. B.DES-TOMATO: a knowledge exploration system focused on tomato species201771596810.1038/s41598-017-05448-02-s2.0-8502546340628729549SagarS.KaurM.DaweA.SeshadriS.ChristoffelsA.SchaeferU.RadovanovicA.BajicV. 
B.DDESC: Dragon database for exploration of sodium channels in human20089162210.1186/1471-2164-9-6222-s2.0-60549106512ChowdharyR.ZhangJ.TanS. L.OsborneD. E.BajicV. B.LiuJ. S.PIMiner: a web tool for extraction of protein interactions from biomedical literature20137445046210.1504/IJDMB.2013.0542322-s2.0-84878775644KwofieS. K.RadovanovicA.SundararajanV. S.MaqungoM.ChristoffelsA.BajicV. B.Dragon exploratory system on hepatitis C virus (DESHCV)201111473473910.1016/j.meegid.2010.12.0062-s2.0-79956324084KordopatiV.SalhiA.RazaliR.RadovanovicA.TifrateneF.UludagM.LiY.BokhariA.AlSaieediA.Bin RaiesA.van NesteC.EssackM.BajicV. B.DES-mutation: system for exploring links of mutations and diseases201881, article 1335910.1038/s41598-018-31439-w2-s2.0-8505291703730190574PanH.ZuoL.ChoudharyV.ZhangZ.LeowS. H.ChongF. T.HuangY.OngV. W. S.MohantyB.TanS. L.KrishnanS. P. T.BajicV. B.Dragon TF Association Miner: a system for exploring transcription factor associations through text-mining200432Supplement 2W230W23410.1093/nar/gkh4842-s2.0-3242889751LiuY.LiangY.WishartD.PolySearch2: a significantly improved text-mining system for discovering associations between human diseases, genes, drugs, metabolites, toxins and more201543W1W535W54210.1093/nar/gkv3832-s2.0-8497986600225925572ChengD.KnoxC.YoungN.StothardP.DamarajuS.WishartD. S.PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites200836Supplement 2W399W40510.1093/nar/gkn2962-s2.0-4844908823218487273SalhiA.EssackM.AlamT.BajicV. P.MaL.RadovanovicA.MarchandB.SchmeierS.ZhangZ.BajicV. 
B.DES-ncRNA: a knowledgebase for exploring information about human micro and long noncoding RNAs based on literature-mining201714796397110.1080/15476286.2017.13122432-s2.0-8501975945228387604NevesM.LeserU.A survey on annotation tools for the biomedical literature201415232734010.1093/bib/bbs0842-s2.0-8489645006123255168KreinerK.HaynD.SchreierG.Twister: a tool for reducing screening time in systematic literature reviews20182555930306896PaynterR.BañezL. L.BerlinerE.ErinoffE.Lege-MatsuuraJ.PotterS.UhlS.2016Agency for Healthcare Research and QualityEssackM.RadovanovicA.SchaeferU.SchmeierS.SeshadriS. V.ChristoffelsA.KaurM.BajicV. B.DDEC: Dragon database of genes implicated in esophageal cancer20099121910.1186/1471-2407-9-2192-s2.0-67651180800BakerS.AliI.SilinsI.PyysaloS.GuoY.HögbergJ.SteniusU.KorhonenA.Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer201733243973398110.1093/bioinformatics/btx4542-s2.0-8504398088829036271HowardB. E.PhillipsJ.MillerK.TandonA.MavD.ShahM. R.HolmgrenS.PelchK. E.WalkerV.RooneyA. A.MacleodM.ShahR. R.ThayerK.SWIFT-Review: a text-mining workbench for systematic review2016518710.1186/s13643-016-0263-z2-s2.0-84969524607KwofieS. K.SchaeferU.SundararajanV. S.BajicV. B.ChristoffelsA.HCVpro: hepatitis C virus protein interaction database20111181971197710.1016/j.meegid.2011.09.0012-s2.0-82655162057FrenchL.LiuP.MaraisO.KoremanT.TsengL.LaiA.PavlidisP.Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application201591310.3389/fninf.2015.000132-s2.0-84935873662BachmanJ. A.GyoriB. M.SorgerP. K.FamPlex: a resource for entity recognition and relationship resolution of human protein families and complexes in biomedical text mining201819124810.1186/s12859-018-2211-52-s2.0-85049200091MaqungoM.KaurM.KwofieS. K.RadovanovicA.SchaeferU.SchmeierS.OpponE.ChristoffelsA.BajicV. 
B.DDPC: Dragon Database of Genes associated with Prostate Cancer201139D980D98510.1093/nar/gkq8492-s2.0-7865126465720880996YeZ.TaftiA. P.HeK. Y.WangK.HeM. M.SparkText: biomedical text mining on big data framework2016119, article e016272110.1371/journal.pone.01627212-s2.0-8499167140227685652SagarS.KaurM.RadovanovicA.BajicV. B.Dragon exploration system on marine sponge compounds interactions2013511110.1186/1758-2946-5-112-s2.0-8487672506823415072JácomeA. G.Fdez-RiverolaF.LourençoA.BIOMedical Search Engine Framework: lightweight and customized implementation of domain-specific biomedical search engines2016131637710.1016/j.cmpb.2016.03.0302-s2.0-8496362703927265049SalhiA.EssackM.RadovanovicA.MarchandB.BougouffaS.AntunesA.SimoesM. F.LafiF. F.MotwalliO. A.BokhariA.MalasT.AmoudiS. A.OthumG.AllamI.MinetaK.GaoX.HoehndorfR.ArcherJ. A. C.GojoboriT.BajicV. B.DESM: portal for microbial knowledge exploration systems201644D1D624D63310.1093/nar/gkv11472-s2.0-8497687496626546514KhareR.WeiC. H.MaoY.LeamanR.LuZ.tmBioC: improving interoperability of text-mining tools with BioC2014201410, article bau07310.1093/database/bau0732-s2.0-84924440168RajaK.SubramaniS.NatarajanJ.PPInterFinder—a mining tool for extracting causal relations on human proteins from literature20132013, article bas05210.1093/database/bas0522-s2.0-84879367043DohmenR. M.Cell lineage in molluscan development19922217510210.1002/jemt.10702201072-s2.0-00267197121617209LiuH.ChristiansenT.BaumgartnerW. A.VerspoorK.BioLemmatizer: a lemmatization tool for morphological processing of biomedical text201231310.1186/2041-1480-3-32-s2.0-84871639988RoederC.JonquetC.ShahN. H.BaumgartnerW. A.VerspoorK.HunterL.A UIMA wrapper for the NCBO annotator201026141800180110.1093/bioinformatics/btq2502-s2.0-7795449154520505005KaurM.RadovanovicA.EssackM.SchaeferU.MaqungoM.KiblerT.SchmeierS.ChristoffelsA.NarasimhanK.ChoolaniM.BajicV. 
B.Database for exploration of functional context of genes implicated in ovarian cancer200937D820D82310.1093/nar/gkn5932-s2.0-58149187887ChiangJ. H.YuH. C.HsuH. J.GIS: a biomedical text-mining system for gene information discovery200420112012110.1093/bioinformatics/btg3692-s2.0-034772410014693818RaiesA. B.MansourH.IncittiR.BajicV. B.DDMGD: the database of text-mined associations between genes methylated in diseases from different species201543D1D879D88610.1093/nar/gku11682-s2.0-8495985862725398897DaweA. S.RadovanovicA.KaurM.SagarS.SeshadriS. V.SchaeferU.KamauA. A.ChristoffelsA.BajicV. B.DESTAF: a database of text-mined associations for reproductive toxins potentially affecting human fertility20123319910510.1016/j.reprotox.2011.12.0072-s2.0-8485672207122198179BajicV. B.VeronikaM.VeladandiP. S.MekaA.HengM.-W.RajaramanK.PanH.SwarupS.Dragon Plant Biology Explorer. A text-mining tool for integrating associations between genetic and biochemical entities with genome annotation and biochemical terms lists200513841914192510.1104/pp.105.0608632-s2.0-2774448057716172098ChowdharyR.TanS. L.ZhangJ.KarnikS.BajicV. B.LiuJ. S.Context-specific protein network miner – an online system for exploring context-specific protein interaction networks from the literature201274, article e3448010.1371/journal.pone.00344802-s2.0-8485948776622493694HastingsJ.de MatosP.DekkerA.EnnisM.HarshaB.KaleN.MuthukrishnanV.OwenG.TurnerS.WilliamsM.SteinbeckC.The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013201341D1D456D46310.1093/nar/gks11462-s2.0-8487656035823180789WishartD.ArndtD.PonA.SajedT.GuoA. C.DjoumbouY.KnoxC.WilsonM.LiangY.GrantJ.LiuY.GoldansazS. A.RappaportS. M.T3DB: the toxic exposome database201543D1D928D93410.1093/nar/gku10042-s2.0-84946074829CotterD.MaerA.GudaC.SaundersB.SubramaniamS.LMPD: LIPID MAPS proteome database20063490001D507D51010.1093/nar/gkj12216381922SudM.FahyE.CotterD.BrownA.DennisE. A.GlassC. K.MerrillA. H.MurphyR. C.RaetzC. R. 
H.RussellD. W.SubramaniamS.LMSD: LIPID MAPS structure database200735D527D53210.1093/nar/gkl8382-s2.0-3384605819817098933The Gene Ontology ConsortiumGene ontology consortium: going forward201543D1D1049D105610.1093/nar/gku11792-s2.0-8494673565425428369OgataH.GotoS.SatoK.FujibuchiW.BonoH.KanehisaM.KEGG: Kyoto Encyclopedia of Genes and Genomes1999271293410.1093/nar/27.1.292-s2.0-0032919364FabregatA.SidiropoulosK.GarapatiP.GillespieM.HausmannK.HawR.JassalB.JupeS.KorningerF.McKayS.MatthewsL.MayB.MilacicM.RothfelsK.ShamovskyV.WebberM.WeiserJ.WilliamsM.WuG.SteinL.HermjakobH.D'EustachioP.The Reactome pathway Knowledgebase201644D1D481D48710.1093/nar/gkv13512-s2.0-8497687011326656494MorgatA.CoissacE.CoudertE.AxelsenK. B.KellerG.BairochA.BridgeA.BougueleretL.XenariosI.ViariA.UniPathway: a resource for the exploration and annotation of metabolic pathways201240D1D761D76910.1093/nar/gkr10232-s2.0-8485982156322102589MiH.Lazareva-UlitskyB.LooR.KejariwalA.VandergriffJ.RabkinS.GuoN.MuruganujanA.DoremieuxO.CampbellM. J.KitanoH.ThomasP. D.The PANTHER database of protein families, subfamilies, functions and pathways200533D284D28810.1093/nar/gki0782-s2.0-1344431208315608197KibbeW. A.ArzeC.FelixV.MitrakaE.BoltonE.FuG.MungallC. J.BinderJ. X.MaloneJ.VasantD.ParkinsonH.SchrimlL. M.Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data201543D1D1071D107810.1093/nar/gku10112-s2.0-8494110372525348409MalhotraA.YounesiE.GündelM.MüllerB.HenekaM. T.Hofmann-ApitiusM.ADO: a disease ontology representing the domain knowledge specific to Alzheimer’s disease201410223824610.1016/j.jalz.2013.02.0092-s2.0-8489756475423830913El-SappaghS.KwakD.AliF.KwakK.-S.DMTO: a realistic ontology for standard diabetes mellitus treatment201891810.1186/s13326-018-0176-y2-s2.0-8504142095229409535WangL.BrayB. E.ShiJ.del FiolG.HaugP. 