Analysis and Thoughts about the Negative Results of International Clinical Trials on Acupuncture

An increasing number of randomized controlled trials (RCTs) have demonstrated the clinical benefits of acupuncture; however, some trials have reported negative results or placebo effects. This paper presents an in-depth analysis of 33 RCTs retrieved from the 2011 SCI database. The quality of the reports was judged according to Jadad scores, and the “Necessary Information Included in Reporting Interventions in Clinical Trials of Acupuncture (STRICTA 2010)” was taken as the standard for analyzing the rationality of the therapeutic principle. The difference between the methodology (Jadad) scores of the two types of research reports was not statistically significant (P > 0.05). The studies with negative results or placebo effects showed the following deficiencies with respect to intervention details: (1) incompletely rational acupoint selection; (2) inconsistent ability of acupuncturists; (3) neglect of the needling response; (4) acupuncture treatment frequency too low in most studies; and (5) irrational setting of placebo control. Thus, the primary basis for the negative results or placebo effects of international clinical trials on acupuncture is not the quality of the methodology, but noncompliance with the essential requirements proposed by acupuncture theory in terms of clinical manipulation details.


Introduction
As an integral part of the Chinese medical and health care system, acupuncture therapy is widely applied in clinical practice, effective, economical, and safe, and is therefore generally accepted by Chinese people. Since the sixth century AD, acupuncture has successively spread to various countries of the world, making considerable contributions to relieving people from disease worldwide. Along with the development of evidence-based medicine, international clinical trials on acupuncture have been increasing in number and have raised greater controversy over whether or not acupuncture is effective. While the majority of international clinical trial reports on acupuncture have demonstrated that acupuncture therapy is indeed effective, some research has shown that acupuncture therapy benefits patients but is equivalent to the placebo effect [1], and some people consider acupuncture therapy to be ineffective [2]. Currently, it is widely believed that such results are a product of the higher-quality methodology of international randomized controlled trials (RCTs) of acupuncture. The purpose of the current study was to determine the basis for the negative results or placebo effects in published acupuncture RCTs from the perspective of methodology and interventions, after comprehensively reading and analyzing the published acupuncture RCTs retrieved from the 2011 SCIE database, with the exception of research conducted in China.

Table 1: Jadad score evaluation standard.

| Item | Description | Score |
| --- | --- | --- |
| Randomization | Not randomized or inappropriate method of randomization | 0 |
| Randomization | The study was described as randomized | 1 |
| Randomization | The method of randomization was described and it was appropriate | 2 |
| Double blinding | Not blind or inappropriate method of blinding | 0 |
| Double blinding | The study was described as double blind | 1 |
| Double blinding | The method of double blinding was described and it was appropriate | 2 |
| Withdrawals and dropouts | Not describing the follow-up | 0 |
| Withdrawals and dropouts | A description of withdrawals and dropouts | 1 |

Figure 1: Flow diagram of study selection. Records screened (n = 601); records excluded (n = 507); full-text articles assessed for eligibility (n = 94).

Inclusion Criteria.
[...] (2) patients underwent the trial regardless of age, gender, ethnicity, or course of disease; and (3) intervention of the observation or control group was based on the theory of meridians and collaterals and the patients were treated by acupuncture, acupressure, and/or moxibustion.

Exclusion Criteria.
The exclusion criteria were as follows: (1) nonrandomized trials; (2) nonclinical trials; (3) the intervention did not conform to the objective of the research; (4) duplicated articles; and (5) the first author was from China or the trial was conducted in China.

Data Collection and Analysis.
Evaluation was performed independently by two authors (Yang Hao and Wan-ning Liu). Relevant full articles were sorted and cross-examined. Any discrepancies were discussed or further evaluated by a third author (Wei-hong Liu). Methodology was evaluated based on the Jadad score [3]. The specific evaluation standard is shown in Table 1.

Articles Included.
Of the 867 articles retrieved, 33 studies [2, ...] met the inclusion criteria for the current analysis (Figure 1, Table 2).

Jadad Score of the Trials.
According to the research results, the 33 reports were classified into two types (positive results and negative results or placebo effects). The 33 reports were read and the key points were extracted. According to the Jadad score, the lowest methodology quality was scored 0, while the highest methodology quality was scored 5. The clinical trial was considered low in quality if the score was ≤2 and was considered high in quality if the score was ≥3.
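The scoring rule described above can be sketched in a few lines of code. This is only an illustration, assuming the standard Jadad component ranges (randomization 0-2, double blinding 0-2, withdrawals and dropouts 0-1); the function names `jadad_total` and `quality_label` are hypothetical and do not come from the original study.

```python
def jadad_total(randomization: int, blinding: int, withdrawals: int) -> int:
    """Sum the three Jadad components into a total score between 0 and 5.

    Assumed ranges: randomization 0-2, blinding 0-2, withdrawals 0-1.
    """
    if not 0 <= randomization <= 2:
        raise ValueError("randomization must be 0, 1, or 2")
    if not 0 <= blinding <= 2:
        raise ValueError("blinding must be 0, 1, or 2")
    if not 0 <= withdrawals <= 1:
        raise ValueError("withdrawals must be 0 or 1")
    return randomization + blinding + withdrawals


def quality_label(score: int) -> str:
    """Apply the threshold from the text: <=2 is low quality, >=3 is high."""
    if not 0 <= score <= 5:
        raise ValueError("a Jadad score lies between 0 and 5")
    return "high" if score >= 3 else "low"


if __name__ == "__main__":
    # Hypothetical trial: appropriate randomization, described only as
    # double blind, withdrawals reported.
    score = jadad_total(randomization=2, blinding=1, withdrawals=1)
    print(score, quality_label(score))
```

Keeping the component checks separate from the threshold makes it easy to see which part of a report lowers its total score.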
The Jadad scores of the research report methodologies are shown in Table 3, and the Jadad score comparison of the acupuncture RCT methodologies is shown in Table 4. Using SPSS 13.0, the data in Table 4 were subjected to a χ² test; the difference between the two groups was not statistically significant (P = 1.0). The methodologic quality of the clinical trial reports on acupuncture with positive results is similar to that of the clinical trial reports with negative results or placebo effects, which indicates that the difference in quality of the methodology is not the primary reason for the different clinical research results of acupuncture.

An additional seven reports [36][37][38][39][40][41][42] with negative results or placebo effects were analyzed together with the 10 reports with negative results or placebo effects.
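For readers who want to see how a comparison of high-quality and low-quality report counts between two groups might be computed, the following is a minimal sketch of a Pearson χ² test on a 2 × 2 table. It is not the SPSS procedure used in the study: the function name `chi_square_2x2` is hypothetical, no continuity correction is applied, and the counts in the usage example are placeholders, since the contents of Table 4 are not reproduced here.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Placeholder counts (NOT the study's Table 4):
    # rows = positive results / negative results or placebo effects,
    # columns = high quality (Jadad >= 3) / low quality (Jadad <= 2).
    chi2, p = chi_square_2x2(12, 11, 5, 5)
    print(f"chi2 = {chi2:.4f}, P = {p:.3f}")
```

With small expected cell counts, statistical packages often report a continuity-corrected or exact test instead, so a value computed this way need not match the P value reported above.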

Analysis of Acupuncture RCTs Intervention Details with
According to the "Necessary Information Included in Reporting Interventions in Clinical Trials of Acupuncture (STRICTA 2010)," the authors designed an intervention table for the RCTs and compared the intervention details of the RCTs with positive results, negative results, and placebo effects. The detailed information of the reports is shown in Tables 5 and 6.
It is known that the clinical treatment process of acupuncture involves not only the operation of acupuncture-moxibustion therapy, but also the rational selection of therapeutic principles and methods, the rational application of acupoints, manipulation, the correct setting of the therapy, and so on. It is apparent from the analysis of the items displayed in Tables 5 and 6 that the intervention process in the studies with negative results and placebo effects was replete with defects.

Interventions of Some Trials Are Improper.
Selecting a proper therapeutic method is the key to assuring a curative effect; however, the authors found that some interventions in the 17 reports were improper. For example, in item 001, moxibustion was adopted to treat constipation; in item 009, which involved the treatment of postmenopausal women suffering knee joint pain, women without medical knowledge performed acupressure by themselves at home; and in item 005, when treating the nausea and vomiting associated with labor and delivery, only a wrist band that slightly stimulates PC 6 was used. Although these measures have some therapeutic function, none of them is the most appropriate choice. For example, constipation is most often treated clinically with acupuncture, and for excess syndrome or heat syndrome constipation it is improper to use moxibustion. Similarly, it is doubtful that a wrist band which stimulates PC 6 can achieve a therapeutic effect comparable to acupuncture. In the studies with positive results, the applied methods were effective interventions, such as filiform needles, electroacupuncture, blunt needles, or auricular acupuncture. Clearly, an effective intervention is an important factor in the results of a trial.

Acupoint Selection in Some Studies Is Not Completely Rational.
Each acupoint has its own therapeutic effect. Rich experience in acupuncture has been accumulated and handed down over thousands of years in China. For example, the most effective acupoints for constipation treatment are ST25, ST36, ST28, ST29, and TE6, while the researcher for item 001 selected ST23 and ST27. Clinically, ST23 is often involved in the treatment of gastric diseases or mental diseases, such as vexation and manic-depressive psychosis, while ST27 is mainly used in the treatment of hypogastrium distention and fullness, difficult urination, hernia, spermatorrhoea, premature ejaculation, and other male diseases. The selection of improper acupoints results in a low clinical curative effect. In item 003, the researcher selected only LI4 to treat infantile colic. LI4 belongs to the large intestine meridian and plays a role in treating intestinal disease, but it is clinically known for its effect on head/face diseases, sweating disorders, and gynecologic diseases, including menstrual disorders, vaginal discharge, and parturition. If GV12 and ST36 were added to the prescription, the effect would be significantly improved. Thus, the researcher may have selected a single acupoint to reduce confounding factors in the intervention as much as possible, without considering that in practice acupuncture therapy relies on the combination of compatible acupoints to enhance the curative effect. The acupoints selected in the studies with positive results were correct and most effective according to clinical experience. The comparison indicates that the rational selection of acupoints is directly related to the validity of the trial.

Needling Response Is Neglected in Most Studies.
The famous ancient Chinese acupuncture work Biaoyoufu says, "Quick needling response results in quick action, otherwise the late needling response causes treatment failure," which means that the patients' meridian-qi circulation should be considered when needling in order to achieve the needling response. Of 17 reports with negative results or placebo effects, 65% (11/17) did not mention whether or not the needling response was achieved. This is a problem that should be addressed by the research design and execution staff. Acupuncture and moxibustion physicians in China know from experience that immediately after needle insertion they must observe the patient's response, carefully feel the sensation beneath the needle tip, and repeatedly manipulate the needle so that the patient feels tolerable sensations of soreness, numbness, swelling, heaviness, and pain. Meanwhile, the operator also feels heaviness and tightening beneath the needle tip, which is called the needling response. If such a feeling is generated, a good curative effect can be realized, whereas if it is not, the effect is slow or not apparent. As an intervention in the observation group, the absence of a needling response suggests no real stimulation, or ineffective stimulation, of the acupoint. In this way, the curative effect will be reduced greatly and a negative result may be more likely to occur. It is noteworthy that 65% (15/23) of the studies with positive results did not mention whether or not the needling response was achieved. As a complicated intervention, the effectiveness of acupuncture is influenced by multiple factors.

Table 3: Jadad scores of the research reports.

| Study | Randomization | Blinding | Withdrawals and dropouts | Jadad score |
| --- | --- | --- | --- | --- |
| Kashefi et al. [2] | Randomized | Single blinded | 10 | 3 |
| Penza et al. [5] | Randomized | Patient and examiner blinded | Not mentioned | 2 |
| Landgren et al. [6] | Not mentioned | Nurse and parents blinded | 5 | 2 |
| Smith et al. [7] | Computer generated randomization schedule | Statistician blinded | 2 | 4 |
| El-Deeb and Ahmady [8] | Not mentioned | Double blinded | Not mentioned | 3 |
| Scheewe et al. [9] | Not mentioned | Not mentioned | 27% | 2 |
| Pastore et al. [10] | Block randomization | Double blinded | 14 | 4 |
| Hamid et al. [11] | Not mentioned | Not mentioned | 35 | 1 |
| Paterson et al. [12] | Simple randomization | Statistician blinded | 3,1 | 4 |
| Rogha et al. [13] | Not mentioned | Not mentioned | 9 | 1 |
| Kim et al. [14] | Block randomization | Open | 3 | 3 |
| Ecevit et al. [15] | Not mentioned | Not mentioned | Not mentioned | 0 |
| Do et al. [16] | Computer generated randomization | Not mentioned | 1 | 2 |
| Johnston et al. [17] | Block randomization | Open | 1 | 3 |
| Cox et al. [18] | Not mentioned | Not mentioned | 1 | 0 |
| Matsubara et al. [19] | Not mentioned | Not mentioned | Not mentioned | 0 |
| Gribel et al. [20] | Block randomization | Open | 0 | 2 |
| Shiflett and Schwartz [21] | Randomized | Patient and assessor blinded | 19 | 4 |
| Sunay et al. [22] | Simple randomization | Single blinded | 0 | 3 |
| Allais et al. [23] | Simple randomization | Single blinded | 1 | 2 |
| Whitehurst et al. [24] | Not mentioned | Researcher blinded | 49 | 0 |
| Sinha et al. [25] | Simple randomization | Double blinded | 11 | 4 |
| Pfab et al. [26] | Block randomization | Researcher blinded | 0 | 3 |
| Smith et al. [27] | Block randomization | Patient and assessor blinded | 2 | 3 |
| Enblom et al. [28] | Not mentioned | Assessor and nurse blinded | 32 | 2 |
| Liodden et al. [29] | Block randomization | Double blinded | 32 | 4 |
| Sunay et al. [30] | Not mentioned | Single blinded | 2 | 2 |
| Lomuscio et al. [31] | Not mentioned | Patient, assessor, and statistician blinded | 0 | 3 |
| Modlock et al. [32] | Block randomization | Patient, assessor, and statistician blinded | 19 | 4 |
| di Cesare et al. [33] | Block randomization | Assessor blinded | 2 | 3 |
| Zick et al. [34] | Computer generated randomization | Patient blinded | 8 | 3 |
| Lev-Ari et al. [35] | Not mentioned | Patient and assessor blinded | 14 | 2 |

The Requirements of the Acupuncturist Are Neglected in Many Studies.
As shown in Table 5, the acupuncturists involved in the clinical trials had inconsistent qualifications. The proportion of excellent acupuncturists in the studies with negative results or placebo effects was 65% (11/17). Some of the acupuncturists worked part-time and were actually nurse practitioners (003), some met only the minimum requirements (008), some were midwives without acupuncturist qualifications (006), and in some studies the patients were asked to treat themselves at home (004, 009, and 012). With low-level acupuncturists and such simple treatments, it is difficult to fully realize the curative effect of acupuncture. The proportion of excellent acupuncturists in the studies with positive results was 74% (17/23), which was significantly higher than in the studies with negative results and placebo effects. Acupuncture is a therapeutic method that has high technical skill requirements. He [43] concluded that the proficiency and level of clinical acupuncture skill constitute decisive factors of a clinical curative effect, as well as the advantages of famous veteran physicians of traditional Chinese medicine. Inexperienced or unqualified acupuncturists undoubtedly lower the effectiveness and safety of acupuncture treatment, especially when patients are asked to treat themselves.

The Acupuncture Treatment Frequency Is Too Low in Most Studies.
Among the 17 studies listed in Table 5, eight had a treatment frequency of 1-2 times/week (002, 003, 005, 007, 008, 011, 013, and 016), accounting for 47% of the studies; 53% of the studies had a treatment frequency ≥3 times/week (001, 014, and 017). Among the studies with positive results, eight had a treatment frequency of 1-2 times/week, accounting for 35%; 65% of studies with positive results had a treatment frequency ≥3 times/week. Indeed, the studies with positive results had a significantly higher treatment frequency. According to Cai and Ma [44], the influence of acupuncture at BL23 on urinary function peaks 1 hour after the acupuncture is administered and then slowly declines and returns to the original level, with the effect lasting 2-6 hours. This finding is consistent with the metabolic principles of the human body. The curative effect of acupuncture is determined by how long the acupuncture effect remains in the human body and by the accumulation of multiple therapeutic effects. Therefore, the best treatment frequency of acupuncture is 1-2 times per day. If treatment is given once every 2 days or at longer intervals, it takes more time to accumulate the acupuncture effect, leading to a slower onset of effect. Moreover, different diseases require different treatment frequencies; for chronic diseases and permanent symptoms, the treatment frequency should be higher, and for chronic neuralgia (001), irritable bowel syndrome (011), and smoking cessation (017), it is evident that a good effect is difficult to realize if the frequency is one time per week. In addition to all the factors above, the diseases to be treated should be selected on the basis of research demonstrating a clinical curative effect. For example, for smoking cessation, a worldwide problem which is difficult to eradicate, the effect is weak if acupuncture is applied at a frequency of one time per week.
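The percentages quoted for the low-frequency studies follow directly from the counts given in the text. The short sketch below re-derives them using only those counts (8 of 17 and 8 of 23); the helper name `percent` is hypothetical.

```python
def percent(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest integer."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


# Studies treated 1-2 times/week, as counted in the text.
low_freq_negative = percent(8, 17)  # negative results or placebo effects
low_freq_positive = percent(8, 23)  # positive results

# The remaining share in each group is the complement.
print(low_freq_negative, 100 - low_freq_negative)
print(low_freq_positive, 100 - low_freq_positive)
```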

Reflections on Placebo Acupuncture Settings.
The negative results or placebo effects in the 17 studies were obtained in comparison with other therapeutic measures. The suitability of the control settings is therefore also worthy of further analysis. The authors analyzed the control settings (Table 7) in these research reports and divided the control methods into the following three types: (1) no penetration into the skin (the Park sham needle) or heat insulation acupuncture; (2) slight penetration into the skin; and (3) stimulation of the nonacupoint parts. These three types are analyzed one by one as follows.

No Penetration into the Skin as a Control.
Park sham acupuncture instruments, a control that does not penetrate the skin, were used in items 006, 013, 014, and 015. The instrument incorporates a round, blunt needle head which retracts into the needle handle and does not penetrate the skin when the needle touches the skin. The outside of the needle is fixed by double-sided adhesive tape and fitted with a small tube so that the patient cannot see whether the needle has penetrated. Park et al. [45] reported that the needle head would inevitably stimulate the skin and have a distinct effect on the skin, which will result in a physiologic effect. The Park sham acupuncture changes the method and tools of stimulation; thus the control method can also generate some therapeutic effect, but researchers treat it as a control measure that generates no effect or only a placebo effect. Therefore, when the observation group shows the same curative effect as the control group, the measure of the observation group is considered to be invalid or to have a placebo effect only. Because the measure of the control group has some therapeutic effect, the result of the observation group is a false negative. The Park sham acupuncture method is similar to a pressing method. Acupressure is referred to as the "indicator" in acupuncture theory and is used exclusively for infants, people afraid of acupuncture, nervous patients, or when no needle is available; acupressure is also a simple method with a treatment effect.

Slight Penetration into the Skin as a Control.
The 016 and 017 studies used the shallow stimulation method as the control; however, in clinical acupuncture and moxibustion, shallow acupuncture is itself an effective therapeutic method. The Miraculous Pivot records that light stimulation stimulates only the skin, while semipenetration involves the skin but not the muscle. The A-B Classic of Acu-moxibustion clearly describes that 14 acupoints can be penetrated by 1 fen (approximately 3.3 mm) and 20 acupoints can be penetrated by 2 fen (approximately 6.6 mm) [46]. In another study [47], 42 patients with wrist myofascial pain were randomly assigned to a deep acupuncture group and a shallow acupuncture group with the same acupoints, and the acupuncture depth was 1.5 cm for the deep acupuncture group compared to 2 mm for the shallow acupuncture group.

[...] [48] with formal names and main functions. Owing to all types of unfixed ashi points, it is easy to avoid the meridians, but hard to avoid the acupoints when designing the nonacupoint and nonmeridian controls. The parts avoiding the familiar meridians and acupoints are simply defined as the nonmeridian and nonacupoint parts [49]. Furthermore, the area of an acupoint has not yet been measured, and the distance between a meridian or acupoint and the nonmeridian and nonacupoint parts has not been determined. Therefore, a control using nonmeridian and nonacupoint parts is highly likely to use a "point" with a therapeutic effect as the control, and the result of the observation group has a high possibility of being a false negative.
In summary, all three of the above-mentioned control methods have a therapeutic effect; however, the researchers regarded the effect in the control group as purely a placebo effect, and when the therapeutic effect of the observation group was similar to that of the control group, they concluded, incorrectly, that the observation group therapy had no effect or was equal to the placebo. Another reason for researchers to design the placebo control in this way is possibly related to blinding. In view of the particularity of acupuncture, it is impossible to identify a placebo therapy that meets the blinding requirement and is similar to acupuncture. Many experts [50][51][52] have written articles discussing the methods of setting the control group in acupuncture RCTs; Liu [53], however, suggests using modern medical methods as the standard control and comparing acupuncture with the most effective and most advanced method in mainstream medicine, so as to directly discover the advantages or disadvantages of acupuncture and give full attention to the medical development of acupuncture.

Discussion
By analyzing acupuncture RCTs in the SCI database, we found that the methodologic quality of research with positive results does not differ from that of research with negative results or placebo effects. Methodologic quality is therefore not the primary reason for the difference in research results; however, each study with negative results or placebo effects has shortcomings on the intervention side, such as incompletely rational acupoint selection, inconsistent ability of acupuncturists, neglect of the needling response, low frequency of acupuncture treatment, and irrational setting of the placebo control. These shortcomings directly weaken the positive results in the observation group, and the setting of the placebo acupuncture control runs contrary to the theory of acupuncture. The placebo acupuncture method has certain therapeutic effects rather than purely a placebo effect, thereby causing false-negative results in the observation group. The analysis also showed that no consensus has been reached on sham acupuncture (placebo acupuncture) or the placebo control method in current acupuncture RCTs. The Society of Acupuncture and Moxibustion has gradually found that clinical trials under ideal conditions are not suitable for acupuncture and moxibustion. By seeking clinical research methods suited to real-world practice, pragmatic clinical research may be able to overcome the limits of the placebo acupuncture control and reveal the advantages of acupuncture therapy.
We can see that current clinical research on acupuncture and moxibustion still has many methodologic problems and is not mature in terms of theory and practice. It is necessary to establish a clinical research method for acupuncture and moxibustion that meets the requirements of acupoint theory, practice features, and clinical trials, so that the clinical trial results for acupuncture and moxibustion are scientific, comply with medical ethics, fully reflect the treatment advantages of acupuncture, and promote acupuncture into mainstream medicine.
The limitations of this research are as follows: (1) the research reports were drawn from a limited period, so selection bias is inevitable; (2) negative articles are commonly less likely to be published than positive articles, which biases the research conclusion; and (3) the Jadad scale was used to evaluate the methodologic quality of the articles. The greatest strength of the scale is that it directly evaluates validated trial features related to bias in the evaluation of treatment effect, which is simple and clear; however, the Jadad score becomes too general and arbitrary if most of the reports do not state whether or not they are randomized or double blind.