Development and Validation of the Quality-of-Life Assessment System for Lung Cancer Based on Traditional Chinese Medicine

Traditional Chinese Medicine (TCM) has many unique features. The quality-of-life (QoL) instrument for lung cancer based on Traditional Chinese Medicine (QLASTCM-Lu) was the first self-reported instrument specifically developed to assess the quality of life from the perspective of TCM. Structured group methods and theory in developmental rating scale were employed to establish a general and a specific module, respectively. Quantitative and qualitative data from 240 lung cancer patients were collected to assess the psychometric properties. The three identified scales of the QLASTCM-Lu (correspondence between man and universe, unity of the body and spirit, and lung cancer specific module) and the total score demonstrated excellent psychometric properties. Test-retest reliability of all domains ranged from 0.93 to 0.96, and internal consistency α ranged from 0.86 to 0.93. Correlation and factor analysis demonstrated good construct validity. Significant differences in the QLASTCM-Lu scales and total score were found among groups differing in TCM syndrome, supporting the clinical sensitivity of the QLASTCM-Lu. Statistically significant changes were found for each scale and the total score. Responsiveness of the QLASTCM-Lu was greater than that of the QLQ-LC43. The QLASTCM-Lu is a psychometrically sound and clinically sensitive measure of quality of life for lung cancer patients, which can be applied to both TCM therapy and Western medicine therapy.


Introduction
Lung cancer has been the most common cancer in the world for several decades, and in 2008 there were an estimated 1.61 million new cases, representing 12.7% of all new cancers. The majority of the cases now occur in developing countries such as China (55%) [1]. With the development of new technology, the treatment of lung cancer has been greatly improved. However, the prognosis is not optimistic, and lung cancer continues to be the most common cause of cancer death [2]. How to improve quality of life has become the focus of lung cancer research. Clinical researchers are choosing measures of QoL as primary and secondary outcomes in clinical trials. Some quality-of-life questionnaires have been developed based on Western medical theories, such as the EORTC QLQ-LC43 (European Organization for Research and Treatment of Cancer Quality of Life Questionnaire, the QLQ-LC13 in conjunction with the QLQ-C30) [3][4][5][6], the Functional Assessment of Cancer Therapy-Lung Cancer (FACT-L) [7], the Lung Cancer Symptom Scale (LCSS) [8], and the Daily Diary Card (DDC) [9].
TCM has held an important position in health care in rural areas of China and is also valued in urban and well-developed areas because of its thousands of years of heritage. There is an increasing need to measure QoL from the TCM perspective. Although TCM is based on a clear rationale and a well-established theoretical framework, it is also based on a different philosophical premise [10]. In Chinese medicine, the metaphoric views of the human body based on observations of nature are fully articulated in the theory of yin-yang [11]. In Chinese, yin and yang literally mean the dark and bright sides of an object. Chinese philosophy uses yin-yang to represent a wider range of opposite properties. In general, anything that is moving, ascending, bright, progressing, and hyperactive belongs to yang. The characteristics of stillness, descending, darkness, degeneration, and hypoactivity belong to yin. Yin and yang are in conflict but at the same time mutually dependent. TCM emphasizes a balance and coordination of yin and yang. Within TCM philosophy, cancer results from a disturbance of yin-yang balance, and is a group of syndromes in which there is disharmony in the body-spirit-environment network [10]. The influences of environmental factors such as climatic condition and geographical locality have to be considered.
Although some scales have been developed based on Western medical theories, there is currently no validated, comprehensive, disease-specific measure to quantify QoL among lung cancer patients from a TCM perspective. A measurement tool to accurately and reliably quantify lung cancer patients' quality of life from a TCM perspective would be useful for both clinical and research purposes. To address this gap, we developed a novel, disease-specific health status instrument, the quality-of-life instrument for lung cancer based on Traditional Chinese Medicine (QLASTCM-Lu), to measure patients' perceptions of their symptoms, functional impairment, treatment concerns, and satisfaction with their health or clinical care, and we evaluated its psychometric properties in a large sample.

Materials and Methods
The QLASTCM is composed of two basic elements: (1) a core questionnaire (QLASTCM-GM), comparing quality of life across various disease states and providing insight into improvements in general health, and (2) an additional disease-specific questionnaire (QLASTCM-Lu), designed to focus on domains, characteristics, and complaints most relevant to lung cancer.

Establishment of the General Module.
Based on a literature review of the QoL of lung cancer patients and TCM, the QLASTCM-GM should include the following two domains: "correspondence between man and universe" (天人相应, which means relevant adaptation of the human body to the natural environment) and "unity of the body and spirit" (形神一体, which means the body is an organic whole). An overview of the theoretical construct of the QLASTCM-GM is shown in Figure 1.
The QLASTCM-GM was developed based on interviews with Chinese cancer patients and TCM professionals to generate 278 potential candidate questions and create a conceptual framework for how clinical manifestations of lung cancer impact the lives of patients. Because there were no estimates of means or standard deviations for the new questionnaire, sample size was defined by the collective experience of the authors and no formal power calculations could be performed. To establish the validity, reliability, and responsiveness of the QLASTCM-GM, 625 cancer patients were enrolled in a pilot test. To assess experts' perceptions of the importance of each potential item, the item pool was also administered to 50 clinical experts with a rating questionnaire that provided 5 responses, ranging from "not important" (1) to "extremely important" (5). An open-ended response item was also included, so experts could add issues that were not included on the original list. As shown in Table 1, 34 questions remained after distilling (T14, T25-27, T29-31, and T42 were deleted).

Establishment of the Specific Module.
The development of the specific module was similar to that of the general module. 30 items about a few important categories such as physical symptoms, side effects, and psychology and social factors were proposed. After interviewing, importance analysis, and several rounds of expert group discussions, 13 items were selected and coded as F1 to F13. After that, 309 lung cancer patients were enrolled in the pilot test. The variation coefficient, correlation coefficient, cluster analysis, and the Cronbach coefficient method were used to screen each potential item. F10 was deleted.
To establish the validity, reliability, and responsiveness of the final questionnaire, we enrolled 240 lung cancer patients in the formal survey. After a brief introduction and explanation, questionnaires were distributed to patients and collected when completed. To assess test-retest reliability, some randomly selected patients received a retest within the first 2-3 days. To observe responsiveness and changes in quality of life, each patient received four longitudinal measures (pretreatment, 1 week, 3 months, and 6 months). As a collateral measure, the Chinese version of the QLQ-LC43 [5] was also distributed to patients at the same time.
The scoring method of the QLASTCM-Lu was similar to that of the QLQ-LC43, consisting of five-point Likert responses ranging from the most severe symptoms to no symptoms. The positively stated items were directly scored from 1 to 5. The negatively stated items were scored in reverse. Scores in each domain were obtained by adding the within-domain item scores, and a total score was obtained by adding scale scores. 12 items in the general module (T18, T19, T28, T32, T33, T34, T35, T36, T37, T38, T39, and T40) were directly scored, and the other 34 items were scored in reverse. The raw scores of 1 to 5 were transformed to a 0 to 100 scale, where a score of 0 indicated the most severe symptoms and a score of 100 indicated no limitation. Higher scores on the QLASTCM-Lu instrument indicated better quality-of-life status.
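The scoring rules above can be illustrated with a minimal sketch. The text does not give the transformation formula, so the linear rescaling of the domain sum used here is an assumption, and the function names are hypothetical; only the list of directly scored items is taken from the text.

```python
# Items scored directly (1 to 5), as listed in the text; all others are reversed.
DIRECT_ITEMS = {"T18", "T19", "T28", "T32", "T33", "T34",
                "T35", "T36", "T37", "T38", "T39", "T40"}


def item_score(item, response):
    """Return the 1-5 item score, reversing negatively stated items."""
    if not 1 <= response <= 5:
        raise ValueError(f"response for {item} must be between 1 and 5")
    return response if item in DIRECT_ITEMS else 6 - response


def domain_score(responses):
    """Score one domain from a dict mapping item code -> raw response.

    Returns (raw_sum, standardized), where standardized is on a 0-100 scale.
    Assumption: the 0-100 score is a linear rescaling of the raw sum, so that
    all items at 1 give 0 and all items at 5 give 100.
    """
    scores = [item_score(item, resp) for item, resp in responses.items()]
    n_items = len(scores)
    raw_sum = sum(scores)
    standardized = (raw_sum - n_items) / (4 * n_items) * 100
    return raw_sum, standardized
```

For example, a patient who answers 5 on the directly scored item T18 and 1 on the reverse-scored item T1 obtains the maximum score on both items under this sketch.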

Patient Characteristics.
A total of 240 lung cancer patients (156 males) were enrolled, with a mean age of 60.3 ± 10.2 years (ranged from 32 to 87). 125 patients had junior middle school education, 70 had senior middle school education, 31 had community college education, and 14 had a four-year college education.

Content Validity.
We conducted a series of analyses to generate an initial version of the instrument which was then used in the validation, reliability, and responsiveness testing reported in this paper. We began with a content specification and item generation phase, followed by a process of item reduction and refinement in which the instrument was reviewed by all participating personnel. The item pool was deemed to reflect the opinion of the World Health Organization regarding the connotation of quality of life [12]. The screening of items was also strictly programmed to achieve good content validity. Table 2 shows correlations between each item and its designated scale in bold type as well as correlations between each item and the other scales in normal type (unity of the body and spirit: T1-13, T15-24; correspondence between man and universe: T28, T32-41; lung cancer specific module: F1-9, F11-13). Correlations between each item and its designated scale were all significant at P < 0.001. The item-to-scale correlation of each item was high for the designated scale and weak for any other scale. T41, which showed a high correlation with the "unity of the body and spirit" domain, is an exception. This suggests that T41 was included in the wrong domain or that its wording could lead to misunderstanding. Finally, for all items, the item-to-total score correlation was higher than all item-to-scale correlations except the designated scale. This suggests a distinct separation of scales.

Convergent and Divergent Validity.
We examined the convergent and divergent validity of the QLASTCM-Lu by estimating its association with a well-established questionnaire (QLQ-LC43). This was done by computing Pearson's correlation coefficients. We hypothesized that the QLASTCM-Lu domain that assesses "unity of the body and spirit" would correlate strongly with the QLQ-LC43 physical functioning and emotional functioning domains, but less strongly with the social functioning domain. As expected, results in Table 3 did show that the QLASTCM-Lu "unity of the body and spirit" domain had higher correlations with the QLQ-LC43 physical functioning (r = 0.81) and emotional functioning (r = 0.80) domains as compared with the social functioning (r = 0.63) domain. Table 3 further confirms the direction of the correlation as expected. These results indicated that the convergent and divergent validity was high. Conversely, the QLASTCM-Lu "correspondence between man and universe" domain had lower correlations with all QLQ-LC43 domains. This provided evidence that the "correspondence between man and universe" domain was a unique domain which reflected traditional Chinese culture.

Clinical Validity.
It is well known that clinical features affect quality of life. To establish the QLASTCM-Lu's sensitivity to TCM syndromes, as per clinicians' assessments, we compared the mean scores according to professionals' clinical categorization of patients into six basic TCM syndromes ("syndrome of phlegm dampness due to spleen deficiency," "yin deficiency syndrome toxic heat," "syndrome of deficiency of both qi and yin," "type of qi-stagnancy and blood stasis," "deficiency of lung-spleen qi," and others). As seen in Table 4, there was no statistically significant difference before treatment but a statistically significant difference after 3 months of treatment. This indicates that the clinical validity of the QLASTCM-Lu was good enough to reflect the differences between TCM syndromes.

Reliability.
Internal consistency was examined using reliability coefficients (Cronbach α coefficients), which were calculated from the data of the first measure. The intraclass correlation coefficient (ICC) was used to assess test-retest reliability and was calculated from the data of 48 patients who completed the second test on the following day (Table 5).
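As an illustration of the two reliability coefficients named above, the following sketch computes Cronbach's α from an item-score matrix and an ICC from test and retest scores. The text does not state which form of the ICC was used, so the one-way random-effects, single-measure form shown here is an assumption; the function names are hypothetical.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: array of shape (n_patients, k_items) holding item scores.
    """
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    sum_item_var = x.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)        # variance of the scale sum
    return k / (k - 1) * (1 - sum_item_var / total_var)


def icc_oneway(scores):
    """One-way random-effects, single-measure ICC (an assumed form).

    scores: array of shape (n_patients, k_occasions), for example one column
    for the first test and one for the retest.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    patient_means = x.mean(axis=1)
    # Between-patient and within-patient mean squares.
    ms_between = k * ((patient_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((x - patient_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

When the retest reproduces the first test exactly, the within-patient mean square is zero and the ICC equals 1.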

Responsiveness.
Standardized Response Means (SRM) were used to assess clinically meaningful changes (Table 6). 240 patients were randomly selected for a retest to evaluate the responsiveness after three months of treatment. The SRM was calculated for each domain, and the statistical significance of the change was tested using the paired t-test. The difference between baseline and three months was statistically significant in most domains (all except the general module). However, the change measured by the QLQ-LC43 was not statistically significant in all domains, which suggests that the QLASTCM-Lu was more sensitive than the QLQ-LC43. There was no statistically significant difference in the QLASTCM-Lu general module, which might be attributed to the reversal of "unity of the body and spirit" and "correspondence between man and universe."

Discussion
This paper describes the development and validation of the QLASTCM-Lu, which consists of 46 items and three scales: unity of the body and spirit (23 items), correspondence between man and universe (11 items), and lung cancer specific module (12 items). The QLASTCM-Lu was created partly in response to feedback from TCM professionals who felt that existing measures were not suitable for Chinese patients using TCM cancer therapy. Quality of life is a subjective concept that is often evaluated through personal feelings or one's own evaluation. It is well known that culture contributes to quality of life. Any quality-of-life measure will apply only to a defined community. Chinese culture is different from Western culture. The basis of TCM diagnosis includes palpitation, upset, choler, lethargy, night sweating, and xerostomia, which are not included in Western medicine. The focus of TCM is on the patient rather than the disease and fundamentally aims to promote health while enhancing quality of life with therapeutic strategies for treatment of specific diseases in a holistic fashion [13].
The validity of the QLASTCM-Lu was evaluated by content validity, construct validity, and criterion-related validity. The usual methods to evaluate construct validity are the correlation coefficient, factor analysis, and the Structural Equation Model (SEM) [14]. As the number of items was not sufficient, SEM was not used in this study [15]. Content validity was evaluated using the Delphi method. Based on results from correlation analysis and exploratory factor analysis, the construct validity was good. When no clear gold standard exists for quantifying a property, the most assuring method to establish the validity of the QLASTCM-Lu is convergent validity, in which the new measure is shown to be correlated with other measures that are believed to quantify the same concept. Such correlations are considered to be high when the correlation coefficient is ≥0.4. Conversely, divergent validity is demonstrated when domain items that are thought to measure different concepts have low correlations (r < 0.4). Correlation between the same or similar domains of the two questionnaires was higher than that between different domains. The convergent validity and divergent validity were satisfied in this study.
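The correlation benchmark described above can be sketched as follows. The function names are hypothetical, and comparing the absolute value of r with 0.4 is an assumption made here so that scales scored in opposite directions are handled; the text states the thresholds only as r ≥ 0.4 and r < 0.4.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two domain-score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def correlation_level(r, threshold=0.4):
    """Label a correlation as 'high' or 'low' against the 0.4 benchmark.

    Assumption: the magnitude |r| is compared with the threshold.
    """
    return "high" if abs(r) >= threshold else "low"
```

Under this benchmark, a high correlation between domains believed to measure the same concept supports convergent validity, and a low correlation between domains believed to measure different concepts supports divergent validity.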
Known group validity assesses whether the QLASTCM-Lu discriminates between clinically different groups. QLASTCM-Lu total scores were evaluated and compared between patients grouped according to physicians' clinical evaluation. Physicians' clinical assessment of their patients' syndrome was explicitly collected on the case report forms. Patients with a particular TCM syndrome were compared. The results showed good clinical validity of the QLASTCM-Lu.
Internal consistency, or reliability, examines the consistency of items within a scale and quantifies the degree to which each item is measuring aspects of the same underlying domain [16]. In this analysis the internal consistency of the QLASTCM-Lu and its subscales was examined using the Cronbach α coefficient, for which a value of 0.90 or higher is excellent and 0.80 or higher is sufficient [17]. The test-retest reliability, which measures how stable responses are over time, was evaluated by the ICC. It was calculated from the data of 48 patients who completed the second test on the following day, a period in which patients' quality-of-life status would be unlikely to change. The internal consistency reliability and test-retest reliability of the QLASTCM-Lu are both higher than 0.8. It was concluded that the QLASTCM-Lu possesses good reliability and stability.
The responsiveness of an instrument refers to its ability to detect clinically meaningful changes in a patient's quality-of-life status over time. We selected SRM as a measure of change in instruments' scores within each group and calculated it for all questionnaires. SRM is defined as the mean score change divided by the standard deviation of that score change. As a benchmark for the interpretation of SRM, Cohen describes an effect size of 0.20 as small, 0.50 as medium, and 0.80 as large [18]. A series of paired t-tests were conducted to compare changes in scores for all questionnaires. The QLASTCM-Lu was considered to have good responsiveness.
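The SRM definition above, and its relation to the paired t-test, can be sketched as follows. The function names are hypothetical, and binning values between Cohen's benchmark points (for example, treating 0.20 to 0.49 as small) is an assumption made for illustration. For n paired observations, the paired t statistic equals the SRM multiplied by the square root of n.

```python
import math
import statistics


def srm(baseline, followup):
    """Standardized response mean: mean change / SD of the change."""
    if len(baseline) != len(followup):
        raise ValueError("baseline and followup must have the same length")
    changes = [after - before for before, after in zip(baseline, followup)]
    return statistics.mean(changes) / statistics.stdev(changes)


def paired_t_statistic(baseline, followup):
    """Paired t statistic, which equals SRM * sqrt(n)."""
    return srm(baseline, followup) * math.sqrt(len(baseline))


def cohen_label(effect):
    """Label an effect size using Cohen's benchmarks (binning is assumed)."""
    size = abs(effect)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"
```

Because the t statistic grows with the sample size while the SRM does not, a change can be statistically significant in a large sample even when its SRM is small, so the two quantities answer different questions.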
The QLASTCM-Lu shares some common characteristics with the QLQ-LC43. For example, specific module and general modules were developed at the same time. Items in both measures were rated using a Likert scoring system. However, there were some distinguishing characteristics in the QLASTCM-Lu. From the view of the structure, the QLQ-LC43 general module (QLQ-C30) was constituted by nine scales and six single items [4]. There are a large number of single items on the reaction symptoms in the QLQ-C30. The QLASTCM-Lu general module was constituted by 34 items from only 2 scales. Moreover, there are only 13 items in the QLQ-LC43 specific module (QLQ-LC13), and it is analyzed in 10 dimensions. Only one dimension needs to be analyzed in the QLASTCM-Lu specific module. In other words, scoring for the QLASTCM-Lu is easier. Specifically, items in the QLASTCM-Lu capture significant Chinese cultural characteristics, such as "I feel sore and weak in my waist and knees" (T7), "I am happy with the surrounding natural environment" (T28), "I am happy with the climate of residence" (T32), and "I am afraid of wind" (F7). These items embody "correspondence between man and universe," which is specially focused in Chinese culture, and are not included in other questionnaires.
Although this study successfully developed a new quality-of-life measure and subsequently validated this instrument, there are some limitations in this study. One limitation is that the responsiveness of the QLASTCM-Lu is not high enough. It might be that the scale was too large (there were 23 items in "unity of the body and spirit") and that this counteracts some minor facets under the main domain. Minor components should be subdivided under main domains. Another limitation is that our survey excluded very ill patients; therefore, the results may not be applicable to these special groups. There was also a relatively high proportion of some TCM syndromes in our sample; for instance, 2 out of 6 syndromes (i.e., "syndrome of phlegm dampness due to spleen deficiency" and "syndrome of deficiency of both qi and yin") were noted in 68% of the patients, while the other 4 syndromes were noted in only 32% of the patients. Fortunately, the bias is expected to be consistent for all subjects and should not affect our conclusions.
Future research is being planned concerning the interpretation and application of the QLASTCM-Lu in different samples. Our goals are to establish separate norms for different TCM syndromes in order to determine clinically meaningful changes in QLASTCM-Lu scores.

Conclusions
On the basis of the above findings, the three identified scales of the QLASTCM-Lu and the total QLASTCM-Lu score demonstrated excellent psychometric properties. We recommend the use of the QLASTCM-Lu for the following reasons. The development of the QLASTCM-Lu has significant Chinese cultural characteristics. The concrete items in the QLASTCM-Lu are popular and straightforward, and it can be applied to both TCM therapy and Western medicine therapy. In our opinion, the QLASTCM-Lu would make a useful addition to the assessment protocol of clinical trials for lung cancer.