Text Mining of the Classical Medical Literature for Medicines That Show Potential in Diabetic Nephropathy

Objectives. To apply modern text-mining methods to identify candidate herbs and formulae for the treatment of diabetic nephropathy. Methods. The method we developed includes three steps: (1) identification of candidate ancient terms; (2) systematic search and assessment of medical records written in classical Chinese; (3) preliminary evaluation of the effect and safety of candidates. Results. The ancient terms Xia Xiao, Shen Xiao, and Xiao Shen were determined to be the most likely to correspond to diabetic nephropathy and were used in text mining. A total of 80 Chinese formulae for treating conditions congruent with diabetic nephropathy, recorded in medical books from the Tang Dynasty to the Qing Dynasty, were collected. Sao si tang (also called Reeling Silk Decoction) was chosen to illustrate the process of preliminary evaluation of the candidates. It showed promising potential for development as a new agent for the treatment of diabetic nephropathy. However, further investigation of its safety in patients with renal insufficiency is still needed. Conclusions. The methods developed in this study offer a targeted approach to identifying traditional herbs and/or formulae as candidates for further investigation in the search for new drugs for modern diseases. However, more effort is still required to improve our techniques, especially with regard to compound formulae.


Introduction
Natural products used in traditional medicine have historically been invaluable for drug development [1,2]. Successful examples of the transformation of traditional medicines into modern drugs include quinine [3], huperzine [4], aspirin [5], and artemisinin [6,7]. However, the path from traditional medicine to pharmaceutical product is fraught with challenges. The first step is "discovery" from traditional medicine [8]. Traditional Chinese medicine, which has been "clinically" tested for thousands of years, is a rich source of therapeutic leads for drug discovery. These ancient remedies were handed down from generation to generation and recorded in the classical literature. Today, the classical medical books are a precious part of China's cultural heritage and an important source for drug discovery from traditional medicine. As researchers in Western countries have focused on translational medicine to develop more effective clinical strategies from laboratory results, scholars in China have begun to search for potentially effective natural products based on these historical records of medical experience [8][9][10].
Evidence-Based Complementary and Alternative Medicine
However, as the years passed, diseases and their names changed, leading to a disassociation between the traditional and modern medical terminologies. Given the voluminous content of the traditional Chinese medical literature, conducting searches to identify potential drug candidates is challenging. Additionally, the effects of classical formulae for the treatment of modern diseases still need to be assessed. All of these aspects present obstacles to the effective and efficient use of the classical literature resources for therapeutic product discovery. Consequently, modern approaches that can mine these classical medical records of traditional Chinese medicine need to be developed. Over the last five years, through the International Research Network for Traditional and Complementary Medicine (IRN-TCM), we have developed and refined methods for text mining of the traditional Chinese medicine classical literature to identify candidate herbs and herbal combinations that show potential for further research [11][12][13].
Diabetic nephropathy is the most common cause of end-stage renal disease around the world and is characterized by rapid progression and a poor prognosis [14]. Even with the standard therapy of angiotensin-converting enzyme (ACE) inhibitors or angiotensin II receptor blockers (ARB), combined with glucose, lipid, and blood pressure control [15], the outcome for patients with diabetic nephropathy remains poor [16]. There is a need for new therapies to improve the outcomes of diabetic nephropathy treatment. In China, after thousands of years of traditional medical practice, a great deal of valuable experience has accumulated regarding diabetic nephropathy. Therefore, this study aimed to apply modern text-mining methods to identify candidate herbs and formulae for the treatment of diabetic nephropathy.
The project involved three parts: (1) identification of classical terms that could refer to diabetic nephropathy; (2) text mining of the classical Chinese medical literature; and (3) preliminary evaluation of the effect and safety of candidates on diabetic nephropathy and the selection of candidates for further drug discovery efforts.

Methods
To identify all the classical terms that could have referred to diabetic nephropathy, literature searches were conducted. Articles that focused on original research related to classical medical terms and on the experience of venerable TCM doctors were retrieved from the Chinese databases CNKI, VIP, Wan Fang, CBM, and TCM online. Medical textbooks for undergraduate and postgraduate teaching issued by the state and medical monographs on diabetic nephropathy were also collected through the library of Guangzhou University of Chinese Medicine.
Two authors extracted the classical terms related to diabetic nephropathy that were mentioned in these sources and calculated the frequency of mention for each term. To obtain expert opinion on which terms corresponded most closely to diabetic nephropathy, a questionnaire was designed and distributed to traditional medicine hospitals around China. Heads of the nephrology departments in these hospitals who had more than 10 years of clinical experience in classical medical Chinese were consulted.
The questionnaire included all the classical terms, classical medical records describing their clinical manifestations, and the clinical features of diabetic nephropathy according to the diagnostic criteria of modern medicine. Experts were asked to gauge the degree of consistency between each classical term and the modern conception of diabetic nephropathy by comparing the descriptions of their clinical manifestations. The frequency with which each classical term was mentioned in research articles, empirical articles, textbooks, and medical monographs was attached as a reference.
The degree of consistency was divided into five categories: completely consistent (5 points), mostly consistent (4 points), partly consistent (3 points), seldom consistent (2 points), and completely inconsistent (1 point). Experts had to tick only one category for each classical term. The total score of each classical term was calculated by adding the points that the experts ticked. The full score was 5 points multiplied by the number of returned questionnaires. The scoring rate of each classical term was its total score divided by the full score and then multiplied by 100%.
Classical terms with a scoring rate of more than 50 percent were regarded as identified by expert consultation and were taken forward for further verification. Their corresponding modern diseases were retrieved from the textbooks and monographs of Chinese Internal Medicine, monographs of kidney disease of Chinese Medicine, and dictionaries of Chinese Medicine via the library of Guangzhou University of Chinese Medicine. The frequency with which each modern disease was mentioned was counted.
Classical terms whose corresponding modern diseases were not limited to diabetic nephropathy, or which involved many organs rather than mainly the kidney, were excluded. Classical terms whose corresponding modern diseases referred to kidney damage occurring in diabetes mellitus were included and used in searching the ancient literature.
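The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own tooling: the function names and the example ratings are hypothetical, and the scoring rate is taken to be the total score as a percentage of the full score, which is the reading consistent with the 50 percent threshold.

```python
def scoring_rate(points, max_points=5):
    """Scoring rate (%) of one classical term.

    points: one rating (1-5) per returned questionnaire.
    Assumption: rate = total score / full score * 100,
    where full score = max_points * number of questionnaires.
    """
    if not points:
        raise ValueError("at least one returned questionnaire is required")
    if any(p not in range(1, max_points + 1) for p in points):
        raise ValueError("each rating must be an integer from 1 to 5")
    total_score = sum(points)
    full_score = max_points * len(points)
    return 100.0 * total_score / full_score


def identified_terms(ratings, threshold=50.0):
    """Return the terms whose scoring rate is strictly above the threshold."""
    return [term for term, pts in ratings.items()
            if scoring_rate(pts) > threshold]


# Hypothetical ratings from five experts for two made-up terms.
example = {"term_a": [5, 4, 3, 4, 5], "term_b": [1, 2, 2, 1, 3]}
print(identified_terms(example))
```

With 35 returned questionnaires the full score per term is 175 points, so a term needs a total of more than 87.5 points to pass the threshold.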
"Encyclopedia of Traditional Chinese Medicine" (CD-ROM version 4.0, published by Hunan Electronic and Audio-Visual Publishing House in 2006), which includes 1009 different Chinese medical books written before the emergence of the People's Republic of China (1949 AD) [13], was selected as the text mining resource.
Information about the treatments for the included classical terms was extracted, including the titles and completion dates of the books, all records related to therapies for disorders congruent with diabetic nephropathy, and the formulae used for treating these disorders. Ancient formulae targeting disorders incongruent with diabetic nephropathy, as judged independently by two authors, were excluded. Discrepancies were resolved by a third author, who made the final decision. The frequency of citation of each included formula was calculated. Formulae with a higher recorded frequency were selected as candidates for further work in drug discovery for diabetic nephropathy.
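The screening and counting steps in this paragraph can be illustrated with a short Python sketch. It is an assumption-laden outline, not the procedure actually implemented in the study: the function names and the sample records are hypothetical, and each extracted record is assumed to count once towards a formula's citation frequency.

```python
from collections import Counter


def is_excluded(reviewer_a, reviewer_b, third_author=None):
    """Decide whether a formula is excluded (True) or kept (False).

    reviewer_a, reviewer_b: independent votes (True = exclude).
    third_author: final decision, needed only when the two votes differ.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    if third_author is None:
        raise ValueError("reviewers disagree; a third author's decision is required")
    return third_author


def rank_formulae(records):
    """Rank formulae by citation frequency, most frequent first.

    records: iterable of (book_title, formula_name) pairs for included
    formulae. Assumption: every record counts as one citation.
    """
    counts = Counter(name for _book, name in records)
    return counts.most_common()


# Hypothetical records; the book and formula names are placeholders.
sample = [("Book A", "Formula 1"), ("Book B", "Formula 2"), ("Book C", "Formula 1")]
print(rank_formulae(sample))
```

Counting per book rather than per record would only require deduplicating the pairs before counting.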
A preliminary evaluation of the effect of candidates on diabetic nephropathy was conducted by searching the databases PubMed (January 1966 to June 2012), EMBASE (January 1985 to June 2012), the Cochrane Library, and ClinicalTrials.gov to locate studies on the clinical application and experimental research on candidate formulae and their components.
A total of 31 classical terms associated with diabetic nephropathy were collected for expert consultation (Table 1). Frequencies of each classical term mentioned in research articles, empirical articles, medical monographs, and textbooks were attached as a reference (Table 1).

Classical Terms
Thirty-five questionnaires were returned from 4 municipalities, 17 provinces, and 3 autonomous regions in China. These did not include Shandong province, Hainan province, Gansu province, Hunan province, Qinghai province, Tibet autonomous region, and the Xinjiang Uygur autonomous region. The full score of each classical term was 175 points (5 points multiplied by 35 returned questionnaires). The scoring rates of Shui Zhong (水肿), Shen Xiao (肾消), Niao Zhuo (尿浊), Guan Ge (关格), Xu Lao (虚劳), Xia Xiao (下消), Xiao Ke (消渴), and Xiao Shen (消肾) were more than 50 percent (Table 2). Experts who gave a classical term at least 3 points were considered to endorse the consistency between that term and diabetic nephropathy, and their provinces are listed in Table 2.
To further verify the consistency between diabetic nephropathy and the classical terms with a scoring rate of more than 50 percent, 35 textbooks of Chinese Internal Medicine, 86 monographs of Chinese Internal Medicine, 57 monographs of kidney disease of Chinese Medicine, and 12 dictionaries of Chinese Medicine were retrieved via the library of Guangzhou University of Chinese Medicine. The correspondence between these ancient terms and diabetic nephropathy was overlapping (Table 3).
The modern diseases corresponding to Shui Zhong (水肿) include renal edema, cardiac edema, nutritional edema, endocrine edema, hepatic edema, and edema of unknown cause. Besides diabetic nephropathy, renal edema also refers to acute or chronic glomerulonephritis, nephrotic syndrome, other secondary glomerular diseases (such as lupus nephritis), and chronic renal failure. Xu Lao (虚劳) is considered a chronic, consumptive disease involving multiple systems and organs, especially the decline or failure of organ function. Guan Ge (关格) is regarded as chronic renal failure, acute renal failure, the uremic stage, ileus, and esophageal carcinoma. Niao Zhuo (尿浊) refers to chyluria, phosphaturia, filariasis, urinary system infection, urinary system cancer, tuberculosis, and so on. Xiao Ke (消渴) mainly refers to diabetes mellitus (Table 3).
Shen Xiao (肾消), Xia Xiao (下消), and Xiao Shen (消肾) were not regarded as independent diseases in textbooks and monographs of Chinese Internal Medicine or in monographs of kidney disease of Chinese Medicine. They were mentioned under Xiao Ke (消渴) when kidney damage occurs (Table 3).
The following three extracts are examples of descriptions consistent with DN [17]. In relation to Xiao Shen (消肾) the Bei Ji Qian Jin Yao Fang, written by Sun Si-miao during the Tang Dynasty (652 AD), provides the following description: "Patients with symptoms such as fever due to deficiency, thirst but not drinking more water, frequent urination, turbid urine ...
The relationship between the three ancient terms was described in 8 dictionaries of Chinese Medicine. Xia Xiao (下消) refers to Shen Xiao (肾消) and Xiao Shen (消肾).

Discovery from the Classical Medical Literature Text Mining
This study searched ancient records of Xia Xiao (下消), Shen Xiao (肾消), and Xiao Shen (消肾) via the "Encyclopedia of Traditional Chinese Medicine." Ancient records that were judged to correspond to the symptoms of priapism were not included for formulae extraction.
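The search step can be pictured with the following Python sketch. It is only a rough stand-in for the process described here: the record structure, the function name, and the sample texts are hypothetical, and a simple keyword match on Qiang Zhong (强中, priapism) is used to flag records for manual review, whereas in the study the exclusion of priapism-related records rested on the authors' reading of each record.

```python
SEARCH_TERMS = ("下消", "肾消", "消肾")  # Xia Xiao, Shen Xiao, Xiao Shen
REVIEW_TERMS = ("强中",)                 # Qiang Zhong (priapism)


def screen_records(records):
    """Split matching records into (included, needs_review).

    records: iterable of dicts with at least a "text" key.
    A record matches if its text contains any search term. Matching
    records that also mention a review term are not dropped outright;
    they are set aside for a human decision.
    """
    included, needs_review = [], []
    for record in records:
        text = record["text"]
        matched = [term for term in SEARCH_TERMS if term in text]
        if not matched:
            continue
        annotated = {**record, "matched_terms": matched}
        if any(term in text for term in REVIEW_TERMS):
            needs_review.append(annotated)
        else:
            included.append(annotated)
    return included, needs_review


# Placeholder records; the texts are not quotations from any classical book.
corpus = [
    {"book": "Book A", "text": "placeholder text mentioning 肾消"},
    {"book": "Book B", "text": "placeholder text mentioning 肾消 and 强中"},
    {"book": "Book C", "text": "placeholder text with no relevant term"},
]
included, needs_review = screen_records(corpus)
print([r["book"] for r in included], [r["book"] for r in needs_review])
```

Flagging rather than deleting keeps the final inclusion decision with the reviewers, which matches the manual assessment used in this study more closely than an automatic filter would.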

Preliminary Evaluation of the Effect and Safety of Candidates on Diabetic Nephropathy
After identification of the candidate formulae, preliminary evaluation of their effect on diabetic nephropathy was undertaken. This began with the simple, high-frequency formulae. Among the 18 formulae, "Sao si tang (缫丝汤) (also called Reeling Silk Decoction)" ranked fifth and was the simplest, since it contained only one ingredient: silkworm and/or silk cocoon.
The earliest record of its use was in Yi Xue Zheng Zhuan, written by Yu Tuan during the Ming Dynasty (1515 AD). In reference to the inherited formula Reeling Silk Decoction, he wrote that "it has an excellent effect on Shen Xiao with the symptoms of turbid urine, polydipsia and excessive appetite but the person loses weight ... the effect of the hot water used in reeling silk (i.e. Reeling Silk Decoction) is best. If this is not available, it can be replaced by a decoction of silkworm cocoon or silk floss." (from Yi Xue Zheng Zhuan [17]).
Based on this report and the subsequent repeated citation of this remedy by other authors, we conducted a literature search of modern studies on the silkworm, its related products, and its active ingredients for treating diabetic nephropathy, in order to investigate whether this simple formula has the potential to be developed into a new agent for diabetic nephropathy.
No studies of Reeling Silk Decoction were located, but there have been numerous studies involving the silkworm, its related products, and its active ingredients. A total of 202 articles describing the active ingredients of the silkworm and its products for diabetic nephropathy were retrieved in a search of the modern literature (Figure 1).
According to modern studies, the silkworm and its products are rich in various active substances such as alkaloids, flavonoids, and silk protein hydrolysates.
1-Deoxynojirimycin (DNJ) is a major component of the alkaloids in silkworm [18]. A clinical study in Japan [19] showed that the N-hydroxyethyl derivative of 1-DNJ (miglitol) decreased the urinary albumin excretion rate in Japanese patients with type 2 diabetes. One possible mechanism is related to improved insulin resistance [20]. It was reported to be safe for patients with stage 3 diabetic nephropathy [21]. However, it is not recommended for patients with renal insufficiency (serum creatinine >2 mg/dL) because it is excreted primarily via the kidney [22].
Among the flavonoids, which have been purified and identified from the sericin layer of silkworm cocoons [23], quercetin was reported to have renal protective effects. It suppressed glomerular mesangial cell hypertrophy, proliferation, and extracellular matrix accumulation, all of which occur in glomerular sclerosis [24]. Proposed mechanisms of action include inhibition of transforming growth factor-β1 (TGF-β1) expression [25] and amelioration of oxidative stress [26], which have been shown to be final common mediators of renal injury in diabetes [27]. Additionally, quercetin was reported to reduce nuclear factor-κB (NF-κB) expression, which may be involved in the pathogenesis of proteinuria in diabetic nephropathy [28,29].
Additionally, the concentrations of 1-DNJ and the activities of quercetin in silkworm are higher than in mulberry leaves, which are the only food source of silkworm, because of the biotransformation in the silkworm body [30][31][32][33][34][35].
Therefore, components of Reeling Silk Decoction have demonstrated promising potential for development as new agents for the treatment of diabetic nephropathy. However, its safety for patients with renal insufficiency should be evaluated in further investigations.

Discussion
The methods used in classical traditional Chinese medicine, which have been "clinically" tested for thousands of years, continue to play an indispensable role in the treatment of chronic diseases in Asian countries. Traditional Chinese medicine has also become an important source of drug discovery for Western scholars and pharmacologists. However, barriers such as the disassociation between the traditional and modern medical terminologies, and the voluminous content of traditional Chinese medical literature, have slowed the pace of drug discovery using the resources of the classical literature. The use of modern ...
The usual method for identifying ancient terms corresponding to a modern disease is based mainly on narrative reviews of the classical literature. However, the results of this study indicated that the correspondence between ancient terms and modern diseases was overlapping rather than one-to-one. This phenomenon also appeared in age-related dementia and memory impairment [12]. Narrative review alone was therefore not sufficient to identify the classical terms for a modern disease, and a two-way confirmation of terminology correspondence was applied in our study. First, expert consultation was used to identify the ancient terms related to diabetic nephropathy. Then the modern diseases corresponding to each term identified by expert opinion were retrieved from textbooks and monographs of Chinese Internal Medicine, monographs of kidney disease of Chinese Medicine, and dictionaries of Chinese Medicine.
Among these identified ancient terms, Shui Zhong (水肿) was named after the symptom of visible edema, which can be caused by disorders of many systems. Besides diabetic nephropathy, chronic or acute glomerulonephritis, nephrotic syndrome, and other secondary glomerular diseases may also result in renal edema, which is usually characterized by facial or lower limb swelling due to water-sodium retention or hypoproteinemia. Chronic renal failure was one of the modern diseases corresponding to Guan Ge (关格) and Xu Lao (虚劳). It is the serious end stage of all progressive chronic kidney diseases, not only of diabetic nephropathy. Xiao Ke (消渴) was regarded as diabetes mellitus and referred more to diabetes without kidney damage. The modern diseases corresponding to Niao Zhuo (尿浊) were more often chyluria, tuberculosis, urinary system infection, and cancer than diabetic nephropathy. It was therefore difficult to determine whether the ancient literature describing these classical terms referred to diabetic nephropathy. Further research that specifically identifies treatments related to diabetic nephropathy in the ancient records of each of these terms is warranted. Xia Xiao (下消), Shen Xiao (肾消), and Xiao Shen (消肾), which meant kidney damage occurring in diabetes, were considered to correspond more closely to diabetic nephropathy and were used in text mining of the ancient literature. Because Shen Xiao (肾消) also referred to Qiang Zhong (强中), which means priapism in modern terms, formulae targeting Qiang Zhong (强中) were excluded during formula extraction.
The two-way confirmation of terminology correspondence showed the overlap between ancient terms and modern diseases more clearly. It was helpful for evaluating the consistency between diabetic nephropathy and the classical texts that described these ancient terms during text mining. However, it would be more convincing if expert consultation were also included in the retrieval of modern diseases, as was done in the identification of the classical terms for diabetic nephropathy.
The systematic search of the full texts of medical books first required the identification of a suitable collection. Our previous work located fourteen collections of traditional Chinese medical literature that could be used as resources for systematic searches [36]. The most accessible of the large full-text collections is the Zhong Hua Yi Dian CD ("Encyclopedia of Traditional Chinese Medicine"), which allows electronic searches. The Zhong Hua Yi Dian CD was therefore used in our study.
Since reports about the nephrotoxicity of Chinese Medicine appeared in 1994, and a condition named "Chinese herbs nephropathy" [37] received attention, the effect and safety of Chinese Medicine in patients with chronic kidney disease have been constantly questioned. A preliminary evaluation of the effect and safety of a formula is therefore an essential step in the drug discovery process. In this study, the preliminary evaluation took the form of a review of the modern literature. This provided much useful data with implications for further clinical investigations and for pharmacological and pharmacodynamic experiments. For example, the review indicated that the active ingredients of silkworm, such as 1-deoxynojirimycin (DNJ) and quercetin, may have a renoprotective function, but this still needs clinical verification in large samples and in-depth research on the molecular mechanisms. We also learnt that the safety of silkworm in diabetic nephropathy patients with renal insufficiency has to be evaluated in further investigations because of the renal excretion of 1-DNJ.
We chose Reeling Silk Decoction, which contains only a single agent, as an example in this study, since most researchers pay more attention to individual agents than to compound formulae. This is because a single agent is simple and its effect on modern disease is easier to elucidate using current technology. However, formulae consisting of only a single agent are not typical of the prescriptions used in ancient China. In fact, the compound formula containing multiple agents with different roles in treating disease is the essence and characteristic feature of traditional Chinese medicine [9]. In our study, a total of 80 classical formulae for treating conditions congruent with diabetic nephropathy were collected. Most of these are multiherb formulae comprising two or more herbs. If researchers focus only on single agents, they are likely to lose much useful information. However, challenges such as the unpredictable pharmacokinetic properties of multiple components and the potential risks of agent-agent interactions within a formula add to the difficulty of undertaking a preliminary evaluation of a formula's effect and safety. More effort is still needed to improve our modern techniques for the preliminary evaluation of the effect and safety of candidates.

Conclusions
This convergence of the results of text mining of the classical literature and searches of modern biomedical databases illustrates the value of this text-based approach to the selection of candidates for drug discovery endeavours. The use of modern technology for text mining the classical literature of traditional Chinese medicine shows potential and could be an important step towards a brighter future for drug discovery. The methods developed in this study offer a targeted approach to identifying traditional herbs and/or formulae as candidates for further investigation in the search for new drugs for modern diseases. However, more effort is still required to improve our techniques, especially with regard to compound formulae.