Objectifying Specific and Nonspecific Effects of Acupuncture: A Double-Blinded Randomised Trial in Osteoarthritis of the Knee

Introduction. Acupuncture was recently shown to be effective in the treatment of knee osteoarthritis. However, controversy persists as to whether the observed effects are specific to acupuncture or merely nonspecific consequences of needling. Therefore, the objective of this study was to determine the efficacy of different acupuncture treatment modalities. Materials and Methods. We compared three different forms of acupuncture in a prospective randomised trial with a novel double-blinded study design. One hundred and sixteen patients aged from 35 to 82 with osteoarthritis of the knee were enrolled in three study centres. Interventions were individualised classical acupuncture, modern semistandardised acupuncture, and non-specific needling. Blinded outcome assessment comprised knee flexibility and changes in pain according to the WOMAC score. Results and Discussion. Improvement in knee flexibility was significantly greater after classical Chinese acupuncture (10.3 degrees; 95% CI 8.9 to 11.7) than after modern acupuncture (4.7 degrees; 95% CI 3.6 to 5.8). All methods achieved pain relief, with a patient response rate of 48 percent for non-specific needling, 64 percent for modern acupuncture, and 73 percent for classical acupuncture. Conclusion. This trial establishes a novel study design enabling double blinding in acupuncture studies. The data suggest a specific effect of acupuncture on knee mobility and both non-specific and specific effects of needling on pain relief.


Introduction
Knee osteoarthritis is a major cause of morbidity, disability, and health care utilisation, particularly in elderly patients [1]. The primary clinical manifestations are pain and joint stiffness [2]. Therapy recommendations aim to improve physical function and to relieve symptoms [3]. Unfortunately, pharmacological approaches often yield limited effects and also carry the burden of potentially serious side effects [4]. Hence, many patients try complementary medicine treatments [5][6][7]. Amongst the nonpharmacological approaches, the use of acupuncture has increased consistently during the past few decades [7].
Recent randomised controlled trials have produced rather contradictory results with respect to acupuncture's effects. Some trials have suggested a potential benefit of acupuncture beyond that of sham or minimal acupuncture [8][9][10], whereas other studies have reached the opposite conclusion [11]. These inconsistent results have generated much discussion in the scientific community as to whether the effects in acupuncture were caused by mere skin penetration or by the stimulation of specific points [12][13][14][15][16].
Evidence-Based Complementary and Alternative Medicine
In an attempt to clarify this issue, we found corresponding inconsistencies in the study designs themselves: the sham or minimal acupuncture procedures used as controls in the aforementioned trials differed systematically from the actual acupuncture groups regarding number, size and length of needles, and intensity and duration of the doctor-patient encounter. Moreover, the trials failed to achieve complete blinding [8][9][10][11][12]. Any attempt to clarify the issue of efficacy in acupuncture requires a more controlled study design.
The controversy over acupuncture extends to the issue of the most effective method of acupuncture [17]. Some practitioners favour modern acupuncture, treating patients according to a semistandardised set of disease-specific points. Other practitioners adhere to an individualised classical acupuncture, which derives acupuncture points from an assessment of disease modalities and a physical examination, including Chinese tongue and pulse diagnosis and the localisation of paraesthetic pressure points [18,19].
To elucidate these open questions, we conducted a repeated-measures, double-blinded, placebo-controlled, multicentre trial in patients with chronic osteoarthritis of the knee. The study compared the effects of three modalities of acupuncture (sham, semistandardised modern, and individualised classical) on two outcome parameters: joint mobility and pain [20,21].

Materials and Methods

Patient Population.
Patients aged 35 years or older were recruited by newspaper advertisements and from the outpatient clinics of the three participating centres. Potential participants were first screened by telephone interview, followed by a clinical examination to confirm that they met the diagnostic criteria of the American College of Rheumatology and had a severity grade of II or III according to the radiological Kellgren classification. Patients with congenital or traumatic deformations of the knee, malignant disease, autoimmune disorders, surgery or arthroscopy during the past 12 months, medication with steroids, physical therapy, or acupuncture within the last four weeks, as well as intake of opioids during the study period, were excluded from the study. Patients were allowed to continue their regular medication, including NSAIDs or COX-2 inhibitors, while participating in the study, but changes in medication and dosage were not allowed. The local ethics committee approved the protocol. All patients provided written informed consent.

Intervention, Randomisation, and Double Blinding.
Patients were informed that the study aimed to identify the most effective of three acupuncture techniques, including one sham technique. Participants were allocated in random order to (a) the needling of non-specific points (sham), (b) a semistandardised selection of disease-specific acupuncture points as used in recent studies (modern acupuncture), and (c) an individualised selection of acupuncture points determined by the diagnosis according to traditional Chinese medicine (classical acupuncture). Each patient received all three forms of acupuncture (a, b, and c) in a random order. Sessions were spaced seven days apart, resulting in one treatment per week and one treatment per form of acupuncture (a, b, and c). Prior to every acupuncture session, a fully qualified and experienced physician and acupuncturist established the Chinese medical diagnosis as defined by the Heidelberg Model of Chinese medicine [22]. Using three differently coloured pens, with the colours assigned at random, the first physician marked points for classical, modern, and sham acupuncture. Thereafter, the first physician informed the study-coordinating centre about the colour allocation. The study-coordinating centre compared these colour codings to the sequence of treatment modalities according to a computer-generated randomisation table and informed a second physician about the colour of the points to be needled. In all study centres, this second physician was a novice to acupuncture in order to minimise possible biases arising from the observation of points. This apprentice practitioner was instructed to maintain a standardised method of needle insertion and needle stimulation throughout all three sessions. After acupuncture, the patients dressed again in light garments to cover any potential marks from needling. Thereafter, the patient returned to the first physician, who was unaware of the acupuncture method used, for assessment of pain and knee flexibility.
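As an illustration only, the colour-coded hand-over between the first physician, the study-coordinating centre, and the second physician can be sketched as follows. The pen colours, the seed, and all function names are hypothetical and are not part of the study protocol; the sketch assumes that colours are re-drawn before every session.

```python
import random

MODALITIES = ("sham", "modern", "classical")
COLOURS = ("red", "green", "blue")  # hypothetical pen colours


def mark_points(rng):
    """First physician: assign a pen colour to each modality at random."""
    colours = list(COLOURS)
    rng.shuffle(colours)
    return dict(zip(MODALITIES, colours))


def colour_to_needle(colour_key, sequence, session):
    """Coordinating centre: look up the modality scheduled for this session
    and return the colour that the second physician should needle."""
    modality = sequence[session]
    return colour_key[modality]


rng = random.Random(2004)  # fixed seed so the sketch is reproducible
# Stands in for one patient's row of the computer-generated randomisation table.
sequence = rng.sample(MODALITIES, k=3)

for session in range(3):
    colour_key = mark_points(rng)  # assumption: re-drawn before every session
    colour = colour_to_needle(colour_key, sequence, session)
    # The second physician is told only the colour; the first physician
    # never learns which entry of `sequence` applied to this session.
    print(f"session {session + 1}: needle the {colour} points")
```

In this arrangement only the coordinating centre ever holds both the colour key and the treatment sequence, which is what keeps the marking physician and the needling physician blinded.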

Acupuncture Technique.
Acupuncture was performed using 0.22 × 40 mm copper needles. Only one knee was treated in the study. Ear and hand acupuncture was not allowed. During all sessions, the number of needles, the type of needles, the depth of insertion, and the intensity of stimulation were kept identical. In each session, ten ± two points were allowed to be stimulated. The needles were rotated immediately after insertion and again after 15 minutes. Needles were then withdrawn after 30 minutes. Communication with the patient during the acupuncture procedure was limited to an explanation of the procedure. The only systematic difference across the treatment modalities was the location of needling points. Modern acupuncture adhered to previously recommended methods for selection of points for knee pain (ST36, ST34, EX32 twice, SP9, SP10, SP6, GB34, LI4) [11,23]. In addition, up to three further points were admissible (e.g., ashi, LI3, ST40). Non-specific needling used the points described in Table 1. The points for the classical acupuncture were determined individually for each patient according to the classical Chinese diagnosis, which assessed the modality of symptoms, complaints associated with certain movements, tissue tenderness along the postulated acupuncture channels, tongue diagnosis, and pulse quality. In contrast to the modern acupuncture treatment, the classical acupuncture resulted in a larger variation of needling points between patients, with a certain overlap with the points selected in modern acupuncture. (Data are not shown; statistics on the selected points are available from the authors.)

Outcome Measures.
Reasoning that pain-related restrictions in knee flexibility are a more direct external measure of pain than subjective self-reported measures, we a priori defined knee flexibility as the primary outcome measure and the WOMAC scale as the secondary outcome parameter [24,25]. Knee flexion was measured in standardised fashion by using a universal goniometer, aligned with the greater trochanter, through the lateral joint line to the lateral malleolus. The first physician bent the knee to the point at which pain limited further flexion. Knee flexibility was measured before acupuncture, immediately thereafter, and after 7 days (for sessions two and three, the latter coincided with the baseline measurement prior to the next treatment). The abbreviated WOMAC pain score was determined prior to acupuncture and immediately thereafter, as well as three and seven days after treatment. Change scores for either outcome were calculated from the difference between pre- and post-acupuncture measurements, with a positive change score indicating improvement. For dichotomous outcomes, a treatment success was defined as an improvement of the knee flexibility by 10 degrees or more or a reduction of the WOMAC pain score by 50 percent or more, respectively.
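Because a gain in flexion and a drop in pain score both count as improvement, the direction of subtraction has to differ between the two outcomes if a positive change score is to indicate improvement. A minimal sketch of this convention and of the two success criteria is given below; the function names and the example values are hypothetical, and the handling of a zero baseline pain score is an assumption not addressed in the text.

```python
def flexion_change(pre_deg, post_deg):
    """Gain in pain-limited knee flexion in degrees; positive = improvement."""
    return post_deg - pre_deg


def pain_change(pre_score, post_score):
    """Drop in WOMAC pain score; positive = improvement."""
    return pre_score - post_score


def flexion_success(pre_deg, post_deg, threshold_deg=10.0):
    """Success: flexion improved by 10 degrees or more."""
    return flexion_change(pre_deg, post_deg) >= threshold_deg


def pain_success(pre_score, post_score, min_fraction=0.5):
    """Success: pain score reduced by 50 percent or more of baseline."""
    if pre_score <= 0:
        # Assumption: a patient without baseline pain cannot be a responder.
        return False
    return pain_change(pre_score, post_score) / pre_score >= min_fraction


# Hypothetical measurements for one session
print(flexion_change(110.0, 121.0), flexion_success(110.0, 121.0))
print(pain_change(8.0, 3.0), pain_success(8.0, 3.0))
```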

Statistical Analysis.
Knee flexibility as the primary outcome parameter served to determine the sample size. An improvement by 10 degrees was regarded as potentially clinically relevant, and a difference of 5 degrees was viewed as a marginal difference. Based on a pilot study, we estimated a required total of 100 patients to obtain a power of 90% at a type I error of less than 5% in order to demonstrate a difference between methods in knee flexibility change scores of 5 degrees (StatMate 2, GraphPad Software Inc., San Diego, CA, USA). We aimed to recruit 125 patients to allow for dropout and noncompliance. Knee flexibility was shown to be a reliable and valid parameter in several studies [26][27][28][29].
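The logic of such a sample size estimate can be sketched with the usual normal approximation for a within-patient (paired) comparison. This is not the StatMate calculation itself: the standard deviation of the within-patient differences from the pilot study is not reported, so the value used below is a placeholder, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.90):
    """Patients needed to detect a mean within-patient difference `delta`
    with a two-sided test, using the normal approximation
    n = ((z_{1-alpha/2} + z_{power}) * sd_diff / delta) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)


# sd_diff is a placeholder: the pilot-study SD is not given in the text.
n = paired_sample_size(delta=5.0, sd_diff=15.0)
print(n)
```

The required number grows with the square of the ratio of standard deviation to detectable difference, which is why halving the detectable difference from 10 to 5 degrees roughly quadruples the sample size.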
Fisher's exact test or the Kruskal-Wallis test was employed to compare baseline characteristics of the three groups resulting from the first randomisation. The main analysis comprised a two-factor analysis of variance (treatment modality and time) with repeated measures. Least-squares means and 95% confidence intervals for knee flexion and WOMAC scores were estimated while taking into account the covariates of gender, premedication (yes or no), disease severity (Kellgren II versus Kellgren III), and number of needles applied. Within-subject contrasts were adjusted using the Greenhouse-Geisser correction.
Repeated measures analysis of variance does not readily provide for explicit modelling of possible carry-over effects. We expected that the effect size of the intervention in weeks 2 or 3 might depend on the treatment of the preceding week. Therefore, we employed multilevel hierarchical random-intercept and random-slope modelling of the change scores in knee flexion. In these models, we nested the six change scores (immediately after the treatment and 7 days after the treatment for all three modalities) within patients. The order of treatment and the preceding treatment were entered as dummy variables. All possible interactions with the treatment modality were systematically explored, with non-specific needling and the first treatment as the respective reference categories. Particular attention was given to the modelling of carry-over effects from classical acupuncture to modern acupuncture and vice versa. In a final step, we explored random intercepts/random slopes of the fixed effects model, as long as the −2 log-likelihood value significantly improved [30,31]. Blinding was maintained during the statistical analysis.
All analyses were performed on an intention-to-treat basis. Analyses of variance were conducted using SPSS version 12 (SPSS Inc., Chicago, IL, USA), and multilevel modelling was performed using MLwiN (Version 2.02, Multilevel Models Project, Institute of Education, London, UK).
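To make the dummy coding concrete, the following sketch builds the six rows that one patient would contribute to such a multilevel data set, with non-specific needling and the first treatment as reference categories. The column names and the function are hypothetical; the actual variable coding used in MLwiN is not described in the text.

```python
def design_rows(patient_id, sequence):
    """Return one row per change score: 3 sessions x 2 time points."""
    rows = []
    for period, treatment in enumerate(sequence, start=1):
        # Treatment given in the preceding week (None for the first session)
        preceding = sequence[period - 2] if period > 1 else None
        for timepoint in ("immediate", "day7"):
            rows.append({
                "patient": patient_id,
                "timepoint": timepoint,
                # Treatment dummies; reference = non-specific needling
                "is_modern": int(treatment == "modern"),
                "is_classical": int(treatment == "classical"),
                # Order dummies; reference = first treatment
                "period_2": int(period == 2),
                "period_3": int(period == 3),
                # Preceding-treatment dummies
                "prev_modern": int(preceding == "modern"),
                "prev_classical": int(preceding == "classical"),
                # Carry-over interactions between classical and modern
                "classical_after_modern": int(
                    treatment == "classical" and preceding == "modern"),
                "modern_after_classical": int(
                    treatment == "modern" and preceding == "classical"),
            })
    return rows


for row in design_rows(1, ("classical", "modern", "sham")):
    print(row)
```

Rows sharing the same `patient` value form the level-2 unit, so a random intercept or a random slope for a treatment dummy varies across patients while the fixed coefficients of the dummies are shared.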

Results and Discussion
One hundred and sixteen patients (mean age 62.4 years, range = 40-83, 33% males) with chronic osteoarthritis of the knee completed the study between April 2004 and May 2005. Figure 1 displays the patient recruitment, allocation, losses to follow-up, and exclusions. Randomisation resulted in a similar distribution of gender, premedication, and disease severity across the allocation for the first treatment modality (Table 2).
Knee flexibility improved by 10 degrees or more immediately after the acupuncture procedure in 75 of 116 classical acupuncture sessions, giving rise to a number needed to treat (NNT) of 1.5 (95% confidence interval 1.4 to 1.8); this compared to 41 of 116 modern acupuncture sessions (NNT = 2.9, 95% CI 2.2 to 3.8) and to 6 of 116 non-specific needling sessions (NNT = 19, 95% CI 9.2 to 53, ). Classical acupuncture resulted in a significantly larger improvement immediately after the treatment (Figure 2, mean change = 10.3 degrees, 95% CI 8.9 to 12) compared to modern acupuncture (4.7 degrees, 95% CI 3.6 to 5.8), while no effect was observed for non-specific needling (0.34 degrees, 95% CI -0.61 to 1.3; ; , 358; ). Adjusting for the Kellgren classification revealed that the difference between classical acupuncture and modern acupuncture was even larger in patients with more severe illness ( ).
The analysis of the change scores employing multilevel modelling revealed significant carry-over effects from the first to the second and from the second to the third treatment. When the first treatment consisted of classical acupuncture (estimated mean change = 9.1 degrees, 95% CI 6.2 to 13), the effects from modern acupuncture (mean change = 0.7 degrees, 95% CI -1.3 to 2.7) were negligible. However, when the first treatment consisted of modern acupuncture (mean change = 5.5 degrees, 95% CI 3.1 to 7.9), subsequent classical acupuncture resulted in a further flexibility gain (mean change = 4.3 degrees, 95% CI 2.0 to 6.6). The small differences from the values reported in the preceding paragraph arise from the adjustment for carry-over effects.
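The session counts suggest that the NNT values are reciprocals of the proportion of sessions reaching the 10-degree threshold within each modality (for example, 116/75 is approximately 1.5). This reading is inferred from the numbers and is not stated explicitly. Under that assumption, the calculation can be sketched as below; the Wilson score interval is used as a stand-in because the interval method is not reported, so the bounds are close to, but not necessarily identical with, the published ones. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def nnt_from_sessions(successes, n, level=0.95):
    """NNT as the reciprocal of the response proportion (assumed reading).
    Returns (point estimate, lower bound, upper bound).
    Requires successes > 0."""
    p_low, p_high = wilson_interval(successes, n, level)
    # A higher response proportion means a lower NNT, so the bounds swap.
    return n / successes, 1 / p_high, 1 / p_low


for label, k in (("classical", 75), ("modern", 41), ("non-specific", 6)):
    point, low, high = nnt_from_sessions(k, 116)
    print(f"{label}: NNT {point:.2f} ({low:.2f} to {high:.2f})")
```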
Table 1: Points used for non-specific needling.
(i) A point between the gallbladder and stomach conduit at the posterior edge of the fibula 2 cun above the malleolus lateralis
(ii) A point 2 cun and 6 cun, respectively, above the malleolus medialis on the tibial surface (intracutaneous needling without contact to the periosteum, with the needles pointing to the knee)
(iii) A point in the middle of the thigh on a line between the patella and the anterior iliac spine
(iv) A point at the top of the contracted biceps muscle
To equalise the number of needles employed between the different needling modalities, the following additional points were permitted:
(i) A point 3 cun above and medial to the cleft of the knee joint between the spleen conduit and the renal conduit

The multilevel model also suggests that the substantial variation between patients in the effects of classical acupuncture is relatively independent of the variation in the effect of modern acupuncture; in other words, the extent of improvement after classical acupuncture is not correlated with the extent of improvement after modern acupuncture ( for the covariance in the random part of the model).
In contrast to the differences in efficacy for knee mobility, all three treatment forms resulted in some immediate improvement of pain scores (Figure 3). Classical acupuncture showed a significantly larger improvement immediately after treatment than non-specific needling did (post-hoc contrast, , , ). The pain-relieving effect of any needling rapidly declined. At the 7-day follow-up visit, pain scores were similar across the three methods (Figure 3).

Strengths and Weaknesses.
The strength of the present study is its use of a novel study design for acupuncture which establishes blinding of both patients and the treating physicians. This design overcame major shortcomings of previous studies which failed to achieve adequate blinding and in which sham treatment usually differed substantially from acupuncture. The results of the present study offer an answer to the basic question of whether the effects in acupuncture are specific or caused by mere skin penetration. In our study, the needle location remained the only difference between the three treatment modalities, approximating for the first time the principles of randomised and double-blinded, controlled trials in acupuncture studies.
In this study, 116 patients with osteoarthritis of the knee received three treatments in a random order: acupuncture according to an individualised diagnosis of Chinese medicine (classical acupuncture), a semistandardised modern version of acupuncture usually employed in acupuncture trials (modern acupuncture), and non-specific needling. The main findings were an improvement in knee flexibility immediately after classical acupuncture (10.3 degrees) that was twice as large as that after modern acupuncture (4.7 degrees), and no change after non-specific needling (0.3 degrees). The largest improvements in pain were also seen immediately after classical acupuncture (a WOMAC score reduction by 50% or more in 85 of 116 patients); however, non-specific needling also achieved considerable effects (score reduction by 50% in 56 of 116 patients), approaching two-thirds of the maximum effect observed after classical acupuncture. Therefore, the present data suggest substantial non-specific effects in subjective pain relief. In contrast to subjective pain relief, however, improvements in knee flexibility as an objective outcome measure were only seen after the needling of specifically selected points and not after non-specific sham needling. To our understanding, this is the first study to prove specific effects of acupuncture and the first to exclude bias caused by differences in the control arms. With respect to pain relief, the present study corroborates earlier findings. The measure of effect observed for the sham acupuncture as well as for the semistandardised modern acupuncture was similar to that previously observed in multicentre trials. Pain relief of comparable effect can also be achieved by other methods such as transcutaneous electrical nerve stimulation, supporting the notion that neurogenic pain contributes to the symptoms in patients with degenerative changes in joints [32,33]. However, the non-specific effects of acupuncture may exceed those of mere placebo effects [34], for reasons as yet unexplained.

Figure 2: Knee flexion before and after acupuncture. The figure compares the maximum possible knee movement until further flexion was blocked by pain for classical acupuncture, semistandardised modern acupuncture, and non-specific needling. Flexion was assessed immediately prior to treatment, directly thereafter, and at a recall visit after 7 days. Data display the means adjusted for Kellgren classification, prior intake of medication, and patient gender. Error bars indicate the standard error of the mean. Knee flexion is displayed in degrees according to the neutral-zero method.

Figure 3: WOMAC pain scores before and after acupuncture. The figure compares the WOMAC pain scores for classical acupuncture, semistandardised modern acupuncture, and non-specific needling. Pain was assessed immediately prior to acupuncture, directly thereafter, by self-administered questionnaire at home at 3 days after acupuncture, and at a recall visit after 7 days. Data display the means adjusted for Kellgren classification, prior intake of medication, and patient gender. Error bars indicate the standard error of the mean.

Interestingly aer seven days, no relevant difference in pain scale was reported, although we found the signi�cant changes in knee motility to be persistent among the three treatment groups. is gain in function (knee �exibility) may be considered an indirect measure of pain relief as pain is the main limiting factor for knee motility.
Moreover, we observed a rapid improvement of knee �exibility immediately aer classical acupuncture, which was twice the effect observed aer modern acupuncture and absent aer non-speci�c needling. Elucidating the physiological mechanisms [35][36][37][38] underlying this method-speci�c difference in effect was beyond the scope of the present study. Experimental data, however, offer some possible explanations: while the immediate effects on pain and knee �exibility exclude structural changes in the affected joints as the underlying mechanism of acupuncture in this experimental setting, they do, however, indicate an underlying neural mechanism [36]. It remains speculative as to whether this re�ex-like effect involves functional changes within higher regions of the central nervous system or whether regional effects on musculoskeletal dynamics and connective tissue structures may be the dominant mechanism. e observed immediate effects, however, make a primarily systemic or humoral effect rather unlikely. As the systematic search for acupuncture points with altered perception is an integral part of history taking and work-up for the Chinese diagnosis, it is conceivable that the individualised diagnostic approach may enhance the chance to effectively identify needling points with the potential for reducing functional limitations. e present study suggests that the methodology of arriving at acupuncture points may matter. In the present study, classical acupuncture outperformed modern acupuncture. Future acupuncture studies should, therefore, consider potential differences arising from the modality of acupuncture techniques in the study design.

Limitations.
Several caveats of the present investigation require consideration. Firstly, we studied each acupuncture technique only once in each patient, and treatments were usually one week apart. Thus, we are unable to infer the long-term or cumulative effects of repeated applications; the study should, therefore, be considered a proof-of-concept study. The available data from the present study corroborate a rapid decline, particularly of the non-specific pain relief effect, within one week. Secondly, the present data suggest that effects on knee mobility are somewhat retained. However, the imperfect retest reliability of repeated knee-flexion measures after one week suggests viewing this result with caution and encourages repetition in other studies. Thirdly, crossover designs are prone to carry-over effects. We cannot rule out residual carry-over effects beyond those explicitly modelled within the multilevel statistical method. Finally, while the data support the notion that the choice of needling points matters, the relevant aspects of the Chinese diagnosis still remain to be elucidated. This, however, cannot be addressed in this work.

Conclusions
In summary, our double-blinded and randomised crossover study provides a novel study design for assessing efficacy in acupuncture and establishes a framework for addressing the question as to whether the specific choice of acupuncture points matters. The study was conducted in osteoarthritis of the knee, and the outcome measures were self-reported pain relief and knee mobility. As to the first, non-specific needling achieved about two-thirds of the subjective pain relief achieved after classical acupuncture, suggesting considerable non-specific effects. With respect to knee mobility, individualised classical acupuncture achieved twice the effect of semistandardised modern acupuncture. No change, however, was observed after non-specific needling. This suggests a considerable specific effect of acupuncture on objective knee flexibility, an effect that appears to be method-specific as well.
In the scientific discussion about the efficacy of acupuncture, our data suggest that it has both specific and non-specific effects, and the selection of acupuncture points for treatment does appear to be relevant.

Abbreviations
NSAID: nonsteroidal anti-inflammatory drug
NNT: number needed to treat