Alendronate versus Raloxifene for Postmenopausal Women: A Meta-Analysis of Seven Head-to-Head Randomized Controlled Trials

Purpose. The aim of this study was to directly compare the efficacy and safety of alendronate (Aln) and raloxifene (Rlx) in postmenopausal women. Methods/Principal Findings. Electronic databases were searched for relevant articles that met our predefined inclusion criteria. Seven randomized controlled trials (RCTs) involving 4054 women were identified and included. Although Aln was more effective than Rlx in increasing bone mineral density (BMD), no statistically significant differences were observed in the risk of either vertebral fractures (P = 0.45) or nonvertebral fractures (P = 0.87) up to two years of followup. Aln reduced the risk of vasomotor events (P = 0.006) but increased the risk of diarrhea compared to Rlx (P = 0.01). Our subgroup analysis further indicated that the difference between Aln and Rlx in fracture risk was not materially altered by the administration pattern or by age. The weekly strategy of Aln would further reduce upper gastrointestinal (GI) disorders and might yield a greater bone mass increment at the lumbar spine compared to daily treatment. Conclusion. There was no evidence of a difference in fracture risk reduction between Aln and Rlx. In addition, age did not obviously influence their relative antifracture efficacy. For Aln, the weekly strategy would further reduce upper GI disorders and yield a greater bone mass increment compared to daily treatment. During clinical decision making, patients' adherence and the side effects associated with both drugs should also be taken into account.


Introduction
Osteoporotic fracture is a worldwide concern in the current aging society. It is estimated that 180,000 people sustain osteoporosis-related fractures in England and Wales each year. Postmenopausal women with bone loss are considered to be at high risk of fractures, which greatly impair their quality of life and contribute to mortality [1]. Appropriate and timely management to prevent osteoporotic fracture is extremely important. At present, antiresorptive agents are still the major treatments. Besides the novel agent denosumab, a human monoclonal antibody against receptor activator of NF-κB ligand (RANKL) that potently suppresses osteoclastic bone resorption, alendronate (Aln), the most widely prescribed bisphosphonate, and raloxifene (Rlx), the only Food and Drug Administration approved selective estrogen receptor modulator (SERM), are the best-evidenced antiresorptive agents for prevention and treatment of postmenopausal osteoporosis [2,3].
For deciding the therapeutic strategy, it is imperative to have an estimate of the difference in fracture risk reduction between Aln and Rlx [4,5]. Although both therapies have established efficacy from randomized controlled trials (RCTs), Rlx was suggested to be less effective than Aln, mainly in preventing nonvertebral fractures [3,6-8], and was therefore either not recommended as a first-line treatment option for this population or considered as an alternative for younger women with lower nonvertebral fracture risk [3,9].
However, so far the efficacy inferiority of Rlx under Aln for postmenopausal women, especially the most relevant

Methods

Literature Search.
Electronic databases (PubMed, Medline, EMBASE, Clinical Trial Registry, the Cochrane Database of Systematic Reviews, and the Cochrane Central Register of Controlled Trials) were searched without limits by two independent investigators (Lin and Ying); the searches were last updated in October 2013. The search used terms and Boolean operators as follows: "(alendronate OR bisphosphonate) AND (raloxifene OR selective estrogen receptor modulators) AND postmenopausal women AND (osteoporosis OR fracture)." Reference lists of all the selected articles were hand-searched for any additional trials.

Identification of Eligible Studies.
Trials were eligible if (a) the target population consisted of postmenopausal women with low bone mass, (b) the interventions included at least both Aln and Rlx therapies, (c) the outcomes comprised at least one of the following assessments: fracture incidence, BMD, or safety profile, and (d) the trials were randomized controlled trials (RCTs). Trials were excluded if (a) patients had a prior history of metastatic bone disease, (b) they were phase-I or observational studies, case reports, or reviews, or (c) they were reanalyses of the same RCTs. Disagreements were resolved through discussion.

Assessment of Study Quality. Two reviewers (Lin and Ying) independently assessed the study validity with the Cochrane Collaboration's tool for assessing the risk of bias, which addresses six specific domains, including sequence generation, allocation concealment, blinding, incomplete outcome data, and selective outcome reporting. Whether the included trials were similar at baseline, adopted similar cointerventions, and applied intention-to-treat (ITT) analysis was also evaluated. Agreement between reviewers was evaluated by means of the kappa (κ) test, and disagreements were resolved by discussion [22].
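
To illustrate the agreement statistic mentioned above, the following minimal sketch computes a linearly weighted Cohen's kappa for two raters. The ratings and the 0/1/2 coding (low, moderate, high) are hypothetical and are not the reviewers' actual assessments; the linear weighting scheme is likewise an assumption, since the weights used in the study are not stated.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    Ratings are integers in 0..n_categories-1. The linear disagreement
    weights |i - j| / (n_categories - 1) are an illustrative assumption.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint rating counts
    margin_a = Counter(rater_a)                # marginal counts, rater A
    margin_b = Counter(rater_b)                # marginal counts, rater B

    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j) / (n_categories - 1)
            obs_disagreement += weight * observed[(i, j)] / n
            exp_disagreement += weight * margin_a[i] * margin_b[j] / n**2
    if exp_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs_disagreement / exp_disagreement


# Hypothetical quality ratings for seven trials (0 = low, 1 = moderate, 2 = high).
reviewer_1 = [2, 2, 1, 2, 1, 2, 1]
reviewer_2 = [2, 2, 1, 2, 2, 2, 1]
print(round(weighted_kappa(reviewer_1, reviewer_2, 3), 2))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; a confidence interval would require an additional variance estimate that is not shown here.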

Data Abstraction, Conversion, and Analysis.
For each eligible trial, two of us (Lin and Ying) independently extracted the relevant data and checked the accuracy. In particular, we abstracted study design, sample size, demographic data (age, body mass index, and baseline BMD), intervention protocol, duration of the trial, loss to followup, trial outcomes (fracture incidence, BMD, and incidence of adverse events), and industrial funding. We contacted the first or the corresponding author of each eligible trial to verify the accuracy of the data abstraction as well as our methodological assessment.
The overall incidences of vertebral or nonvertebral fractures (hip, upper leg, lower leg, pelvis, humerus, wrist/forearm, clavicle/rib, and other) in the two groups were our primary outcome. We also evaluated the BMD percentage changes from baseline at the lumbar spine (LS), femoral neck (FN), and total hip (TH) in both groups. BMD was measured by dual-energy X-ray absorptiometry (DXA). The safety profile comprised the reported discontinuations due to adverse events (AEs), AEs probably related to Aln (upper gastrointestinal (GI) disorders and diarrhea), and AEs probably related to Rlx (vasomotor events and venous thrombosis).
We took fracture risk reduction, LS BMD, and risk of upper GI disorders at the end of followup as our main meta-analyses, because these outcomes were reported by enough trials to permit subgroup analysis.
We preferentially used the ITT data from the trials whenever possible. If the data were not reported in the original article, we extracted them from the accompanying graphs. To maximize data availability, we used percentage change data for BMD and serum lipid outcomes. If percentage change data were unavailable for a BMD outcome, we imputed the percentage change as (endpoint value − baseline value) / baseline value × 100. For missing standard errors (SEs) of BMD data, the maximum SEs extracted from Muscoso et al. [18] were conservatively used for all BMD percentage change SEs. A sensitivity analysis omitting trials with imputed SEs was performed to assess the variation in the overall effect [22].
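
The imputation rules in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions: the function names, the example BMD values, and the conservative SE value are hypothetical and are not data from the included trials.

```python
def percent_change(baseline, endpoint):
    """Percentage change from baseline: (endpoint - baseline) / baseline * 100."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (endpoint - baseline) / baseline * 100.0


def fill_missing_se(reported_ses, conservative_se):
    """Replace missing SEs (None) with one conservative value.

    The paper takes the maximum SE reported by a single reference trial;
    here that value is simply passed in as `conservative_se`.
    Returns the filled list and a parallel list of 'was imputed' flags,
    so that imputed trials can be dropped in a sensitivity analysis.
    """
    filled = [conservative_se if se is None else se for se in reported_ses]
    imputed = [se is None for se in reported_ses]
    return filled, imputed


# Hypothetical lumbar spine BMD values in g/cm^2.
print(percent_change(0.800, 0.824))  # about 3% increase from baseline

# Hypothetical SEs of percentage change, two of them unreported.
ses, flags = fill_missing_se([0.30, None, 0.25, None], conservative_se=0.45)
print(ses, flags)
```

Keeping the imputation flags alongside the values makes it straightforward to rerun the pooled analysis without the imputed trials.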
The fracture incidence and the safety profile outcomes were presented as risk ratios (RR) with 95% confidence intervals (CI) and combined using the Mantel-Haenszel method. BMD data were pooled with the inverse variance method and presented as weighted mean differences (WMD) with 95% CI. We assessed statistical heterogeneity using a chi-squared (χ²) test with significance set at 0.1. We also calculated the inconsistency index (I²) to describe the percentage of the variability in effect estimates due to heterogeneity, and considered a value greater than 50% as substantial heterogeneity. A fixed effects model was applied if there was no statistical heterogeneity among the studies; otherwise, we used the random effects model. If substantial heterogeneity across studies (I² > 50%) was detected in the main meta-analyses, we performed a post hoc sensitivity analysis omitting the outlier studies to determine the sources of heterogeneity [22]. Outliers were defined as studies whose confidence interval for the estimated effect size did not overlap well with that of the pooled overall effect size [23]. The subgroup analyses in the main meta-analyses were performed by baseline characteristics of the studies: pattern of treatment in the Aln groups (daily or weekly), mean age of participants (>65 or ≤65), methodological quality, sample size (≥400 or <400), and industrial funding. BMI of participants and dose of agents could not be analyzed in subgroup analysis because of the difficulty of determining cutoff values. To determine the influence of outlier studies, the pooled analysis and the subgroup analyses were repeated after omitting the two detected outliers in the main analyses that showed statistical heterogeneity. Results of subgroup analyses were presented only if each subgroup comprised at least two trials.
To comprehensively identify clinically relevant modifiers, metaregression with covariates (age, BMI of participants, and pattern of Aln administration) was carried out in the fracture analysis (vertebral fractures were not analyzed, as only 3 trials were included) and the GI disorder analysis.
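
As a rough illustration of the Mantel-Haenszel pooling named above, the sketch below computes the fixed-effect pooled risk ratio from 2×2 counts. The trial counts are hypothetical, the function name is an assumption, and only the point estimate is shown; the confidence interval used in practice needs an additional variance formula that is omitted here.

```python
def mantel_haenszel_rr(studies):
    """Fixed-effect Mantel-Haenszel pooled risk ratio.

    `studies` is a list of tuples
    (events_treatment, n_treatment, events_control, n_control).
    RR_MH = sum(a_i * n_ctl_i / N_i) / sum(c_i * n_trt_i / N_i),
    where N_i is the total number of participants in trial i.
    """
    numerator = 0.0
    denominator = 0.0
    for events_trt, n_trt, events_ctl, n_ctl in studies:
        total = n_trt + n_ctl
        numerator += events_trt * n_ctl / total
        denominator += events_ctl * n_trt / total
    if denominator == 0:
        raise ValueError("pooled RR is undefined with no control-group events")
    return numerator / denominator


# Hypothetical counts: (Aln events, Aln n, Rlx events, Rlx n) for three trials.
trials = [(6, 220, 7, 225), (3, 110, 2, 108), (12, 700, 14, 710)]
print(round(mantel_haenszel_rr(trials), 3))
```

Because each trial contributes in proportion to its size rather than through an inverse-variance weight, this estimator remains stable when individual trials report few events, which is typical of fracture and adverse-event outcomes.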
To evaluate the publication bias, we used Begg's test and Egger's test with trials from fracture outcomes analysis, including 6 trials in total fractures, 3 trials in vertebral fractures, and 4 trials in nonvertebral fractures [22].
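
For orientation, Egger's test regresses each trial's standardized effect (effect divided by its SE) on its precision (1/SE) and asks whether the intercept differs from zero. The sketch below computes that intercept and its standard error with ordinary least squares; the log risk ratios and SEs are hypothetical, and the p-value (from a t distribution with n − 2 degrees of freedom) is deliberately not computed here.

```python
import math


def egger_intercept(effects, standard_errors):
    """Intercept of Egger's regression and its standard error.

    effects: per-trial effect estimates (for example log risk ratios).
    standard_errors: their standard errors.
    Requires at least three trials so that n - 2 residual df is positive.
    """
    if len(effects) != len(standard_errors) or len(effects) < 3:
        raise ValueError("need at least three trials with matching SEs")
    x = [1.0 / se for se in standard_errors]                 # precision
    y = [e / se for e, se in zip(effects, standard_errors)]  # standardized effect
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all trials have the same precision")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + x_mean**2 / sxx))
    return intercept, se_intercept


# Hypothetical log risk ratios and standard errors for four trials.
log_rr = [-0.10, 0.05, -0.30, 0.20]
se = [0.15, 0.25, 0.40, 0.55]
print(egger_intercept(log_rr, se))
```

An intercept close to zero is compatible with a symmetric funnel plot, while a large intercept relative to its standard error suggests small-study effects.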
Meta-analysis was conducted using Review Manager 5.1 software. Metaregression analysis, Begg's test, and Egger's test were performed through STATA 11.0 (Stata Corp, College Station, TX, USA). The criteria of the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) were used to evaluate the quality of evidence by each outcome [24].

Study Quality.
The methodological quality was evaluated independently by two reviewers (Lin and Ying) with the Cochrane Collaboration's tool for assessing the risk of bias and is shown in Table 2 [22]. Four trials [6, 7, 9, 10] explicitly described adequate randomization, concealment of allocation, proper blinding, and intention-to-treat analysis and were judged to be at low risk of bias [16,17,19,20], while the other three trials, with inexplicit randomization and inadequate blinding, were considered to be at moderate risk of bias [15,18,21]. The weighted kappa for the agreement on trial quality between reviewers was 0.84 (95% CI: 0.75-0.93).

Notes to Table 2:
a The trials could get a "Yes" if their randomization schedules were explicitly described.
b Only the trials which mentioned that they concealed the process of patient assignment could get a "Yes."
c The trials were considered "Double Blinded" if a placebo was adequately adopted to blind both patients and investigators.
d In all the included studies, patients in both groups took calcium and vitamin D supplementation equally.
e A loss to followup rate of less than 20% was considered acceptable.
f ITT: intention to treat. An explicit description of the loss to followup was provided in all the included trials, but only those which mentioned ITT analysis of the missing data could get a "Yes."
g A frequency of positive responses >5 means "High"; 4 or 5 means "Moderate"; ≤3 means "Low."

Effect of
Figure 2). Our meta-analysis indicated moderate quality evidence of equivalent efficacies between the two medications in fracture prevention (Table 3).
All of the included studies reported BMD data measured by DXA at one or more skeletal sites. Both Aln and Rlx increased BMD significantly at the LS, FN, and TH after 6, 12, and 24 months relative to baseline. Aln achieved a greater bone mass increment than Rlx (Table 4), and the differences widened as treatment continued. The evidence quality was moderate (Table 3).
Both Aln and Rlx were well tolerated, and no fatal AEs related to treatment were reported. The risks of discontinuation due to AEs, upper GI disorders, venous thrombosis, and vasodilatation were similar in the two groups (Aln versus Rlx: discontinuation due to AEs: 5 studies, RR: 1.03 (0.77 to 1.36), P = 0.85, I² = 0%; upper GI disorders: 6 studies, RR: 1.10 (0.77 to 1.58), P = 0.60, I² = 52%; venous thrombosis: 3 studies, RR: 0.52 (0.10 to 2.86), P = 0.45, I² = 0%; vasodilatation: 3 studies, RR: 0.74 (0.54 to 1.01), P = 0.06, I² = 0%; Figure 3). The quality of evidence for the differences in these AE risks was moderate to high, with the exception of the finding that Aln carried a greater risk of upper GI disorders than Rlx, which was supported by low quality evidence (the quality of evidence became high when the outlier study [20] was excluded) (Table 3).

Heterogeneity and Outlier.
Among the main meta-analyses, substantial heterogeneity was detected in the outcomes of LS BMD (at 12 months: P < 0.01, I² = 95%) and upper GI disorders (P = 0.60, I² = 52%). In the LS BMD comparison (at 12 months), Iwamoto's study was identified as an outlier [15]. After omitting this study, heterogeneity across studies was no longer significant (P = 0.19, I² = 35%), and the estimated effect size (WMD) in LS BMD was only reduced from 2.92 (95% CI: 2.23 to 3.62) to 2.37 (2.17 to 2.58). Sambrook's study turned out to be an outlier in the GI disorder analysis [20]. After excluding this study, heterogeneity in GI disorders was reduced to a minimum (P = 0.83, I² = 0%), while the difference between Aln and Rlx in the risk of GI disorders became statistically significant (Aln versus Rlx: RR: 1.30 (1.04, 1.63), P = 0.02). Since the two outlier studies were identified, the subgroup analyses were repeated after their exclusion for LS BMD at 12 months and GI disorders, respectively.
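
The effect of omitting an outlier on heterogeneity can be illustrated with a small leave-one-out sketch. It pools mean differences with fixed-effect inverse-variance weights and reports Cochran's Q and I² each time one trial is dropped. The mean differences and SEs are hypothetical, chosen so that one trial is clearly discordant; they are not the trial data behind the results above.

```python
def pool_inverse_variance(mean_diffs, standard_errors):
    """Fixed-effect inverse-variance pooled mean difference, Cochran's Q, and I2 (%)."""
    weights = [1.0 / se**2 for se in standard_errors]
    pooled = sum(w * d for w, d in zip(weights, mean_diffs)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, mean_diffs))
    df = len(mean_diffs) - 1
    # I2 = (Q - df) / Q, truncated at zero; defined as 0 when Q is 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


def leave_one_out(mean_diffs, standard_errors):
    """Re-pool the data with each trial omitted in turn."""
    results = []
    for k in range(len(mean_diffs)):
        diffs = mean_diffs[:k] + mean_diffs[k + 1:]
        ses = standard_errors[:k] + standard_errors[k + 1:]
        results.append(pool_inverse_variance(diffs, ses))
    return results


# Hypothetical percentage-point differences in BMD change; the last trial is discordant.
diffs = [2.0, 2.2, 2.4, 5.0]
ses = [0.2, 0.2, 0.2, 0.2]
print(pool_inverse_variance(diffs, ses))
for omitted, (pooled, q, i2) in enumerate(leave_one_out(diffs, ses)):
    print(omitted, round(pooled, 2), round(q, 1), round(i2, 1))
```

In this toy example I² falls to zero only when the discordant trial is omitted, which is the pattern used to attribute heterogeneity to a single study. A random effects model, as used in the paper when heterogeneity is present, would require an additional between-study variance estimate that is not shown.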

Sensitivity Analysis and Subgroup Analysis.
The overall results of the main meta-analyses were not significantly altered by omitting trials with imputed SEs. Our subgroup analysis suggested that the pattern of administration in the Aln groups, participants' age, methodological quality, sample size, and industrial funding of the included studies were not associated with the overall effect size of the difference in fracture reduction. The outlier studies did not alter the results of the subgroup analysis of the incidence of GI disorders [15,20].

Meta-Regression Analysis.
Women's age, BMI, and pattern of Aln administration had no obvious impact on the results of the fracture (total and nonvertebral fractures) analysis in our metaregression. Although the trend was not significant, the excess of upper GI disorders with Aln over Rlx tended to widen as the proportion of women on daily Aln administration or the participants' age increased.

Publication Bias.
We found no evidence of publication bias in either vertebral fractures (3 trials) or nonvertebral fractures (4 trials), according to both Begg's test and Egger's test [28]. Although the Begg's test funnel plot indicated a potential absence of small studies favoring the Rlx groups in total fractures (6 trials), a trim and fill analysis suggested that there were probably 2 missing small trials and that the effect size (RR) would be closer to 1 if they were included (Supplementary file 2).

Discussion
Our meta-analysis suggested no superiority of Aln over Rlx in reducing the risk of either vertebral or nonvertebral fractures within a followup of 12-24 months. Aln was more effective than Rlx in increasing BMD. Aln reduced the risk of vasomotor events by 57% but increased the risk of diarrhea by 133% compared to Rlx. Our subgroup analysis further indicated that the difference between Aln and Rlx in fracture reduction was not materially altered by administration pattern, age, methodological quality, sample size, or industrial funding. The weekly strategy of Aln would further reduce upper GI disorders and might yield a greater bone mass increment compared to daily treatment.

Strength and Evidence Quality.
Our meta-analysis was the first to exclusively comprise head-to-head RCTs, target postmenopausal women, and comprehensively evaluate fracture risk, BMD, and adverse effects. Previous systematic reviews and network meta-analyses had compared the two agents only indirectly, as part of multiple-agent comparisons [2,6,8]. Because they were based on data for each individual agent compared with placebo, however, their results had poor consistency and considerable bias owing to the variation in the baseline characteristics of participants and in the administration pattern of drugs among the trials [10,11]. The validity of our findings was further strengthened by strictly following the "Cochrane Handbook for Systematic Reviews of Interventions 5.0.2" [22]. In particular, we developed clear inclusion and exclusion criteria, thoroughly assessed the methodological quality of the included studies, and carried out a quantitative analysis. The identification of outlier studies and the sensitivity analysis served to locate the sources of heterogeneity in the present analysis, with the purpose of verifying the results. We also performed subgroup analyses to comprehensively evaluate the multiple factors potentially influencing the comparative effect. Finally, we used the GRADE system to rigorously assess the quality of the evidence on which recommendations for both agents could be based [24]. Generally, our GRADE analysis showed evidence of moderate to high quality for most endpoints, which was higher than in the previous pooled analyses [6,12] (Table 3).

Limitations.
(a) The combined sample size in the current meta-analysis was still limited. However, a large-scale comparative RCT powered to detect a significant difference in fracture prevention between the two agents would be infeasible and unnecessary, as the required sample size would be prohibitively large (given that the fracture risk with Aln and Rlx in the present analysis was 2.71% and 2.96%, respectively, over 100,000 patients would be needed to confirm the theoretical difference of 0.25%) [29]. In fact, a well-conducted meta-analysis can economically and adequately reflect the results of a large-scale RCT [30]. The present analysis, involving all available comparative RCTs with moderate to high quality evidence on the two therapies for postmenopausal women, may provide important information for health care providers to supplement the clinical trial evidence. (b) Heterogeneity was detected in the outcomes of LS BMD and GI disorders. The outliers were identified as Iwamoto et al. [15] in LS BMD and Sambrook et al. [20] in GI disorders. (c) Three trials were considered to be at moderate risk of bias [15,18,21]. One of them also had a certain degree of loss to followup [21]. However, our subgroup analysis suggested that the conclusions were not influenced overall by trial quality. (d) Four studies [16,17,19,20] were sponsored by pharmaceutical companies related to either agent. Although the bias of selective reporting should be considered, industrial funding was found not to alter the overall results.
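
The sample size argument in limitation (a) can be checked with a standard two-proportion calculation. The sketch below is only an approximation under assumptions that the paper does not state: a two-sided significance level of 0.05, 80% power, equal allocation, and the usual normal-approximation formula.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect p1 versus p2.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2.
    alpha and power are assumptions for illustration only.
    """
    if p1 == p2:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Fracture risks reported in the present analysis: 2.71% (Aln) and 2.96% (Rlx).
per_arm = n_per_group(0.0271, 0.0296)
print(per_arm, 2 * per_arm)
```

Under these assumptions the total required enrollment comes out well above 100,000 participants, which is consistent with the statement in limitation (a); different choices of alpha or power would change the exact figure.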

Interpretation and Clinical Implications.
Aln has been well proved to be more potent than Rlx in inhibiting bone resorption [31]. After ingestion, Aln binds tightly to trabecular surfaces where osteoclasts attach and then disrupts their function [32]. As for Rlx, it binds to the raloxifene-estrogen receptor and activates a specific sequence of DNA known as the Raloxifene Responding Element. The subsequent increased expression of specific cell proteins, which act as estrogen agonists, results in osteoclast suppression [33]. Briefly, the greater efficacy of Aln over Rlx in BMD increment is probably due to their different pathways of antiremodeling effect [28,34,35]. However, the discrepancy between the statistically significant difference in bone mass increment and the similar efficacy in vertebral and nonvertebral fracture prevention in the current analysis ought to be considered cautiously. It should be borne in mind that BMD decline only partially accounts for osteoporotic fracture. The literature indicates that the increase in BMD accounted for only 4% of the reduction in vertebral fracture with Rlx compared with 17% with Aln [36][37][38][39]. Even though Rlx achieved a lower bone mass increment, its adequate prevention of vertebral fracture has been well established in the MORE study. In addition, Aln allowed a fair accumulation of microdamage in the vertebra, although this would be offset by its increase in bone volume [40], while the positive effect of Rlx on biomechanical properties might adequately compensate for its inferior bone mass increment, which ultimately bridges the gap in vertebral fracture prevention between the two agents.
Currently, Rlx is infrequently prescribed for women at high risk of nonvertebral fractures [2-4, 6, 8]. In the MORE study, Rlx 60 mg/day did not significantly decrease nonvertebral fractures (RR: 0.91 (0.77, 1.07)) compared with placebo [4]. A recent network meta-analysis performed by Murad et al. also demonstrated that Aln, but not Rlx, achieved a significant reduction in nonvertebral fractures compared to placebo (Aln: odds ratio (OR): 0.78 (0.66, 0.92); Rlx: 0.90 (0.76, 1.03)) [6]. However, the inferiority of Rlx to Aln in nonvertebral fractures is still highly inconclusive, as a definitive difference has not been found in RCTs or systematic reviews. A recent database study of over 100,000 postmenopausal women, using the inverse probability of treatment weights (IPTW) method for adjustment, highlighted that patients treated with either Rlx or Aln had similar rates of nonvertebral fracture after 8 years of adherent treatment [14]. Our pooled data from head-to-head RCTs also call into question the difference in nonvertebral fracture risk reduction between the two agents.
Patients' adherence to drugs, which is highly influenced by tolerance, substantially affects the benefits of the drugs [41][42][43]. Therefore, the potential risk of side effects should be thoroughly considered during decision making. Generally, our review suggested that both drugs were well tolerated, with no fatal AEs reported. In particular, Aln increased the incidence of diarrhea while decreasing vasomotor events compared to Rlx; these events did not require extra medication and seldom caused discontinuation [15]. Increased phlebothrombosis was the main concern for Rlx [44,45]. However, in our meta-analysis, only 4 venous thrombosis events were found (1/990 with Aln, 3/975 with Rlx), which is rare. Nevertheless, we agree that Rlx should be contraindicated for postmenopausal women who are at high risk of deep vein thrombosis [46]. It was previously demonstrated that postmenopausal women had a greater propensity to adhere to Rlx and higher satisfaction with drug administration compared with Aln, mainly because of more GI disorders associated with Aln [47,48]. In our current analysis, however, upper GI events and discontinuations due to AEs were balanced between the two agents. Nevertheless, a greater risk of upper GI disorders with Aln than with Rlx was observed when we restricted the analysis to subgroups with daily administration of Aln or subgroups with a mean age over 65. The results were confirmed by our metaregression analysis. This implies that daily Aln, rather than weekly Aln, increased the frequency of GI irritation. Besides, older women had more difficulty in taking Aln properly, which contributed to more GI symptoms [49,50]. These results provide some guidance for improving compliance with Aln. Although no cases were reported in the included studies because of the short-term followup, the long-term risk of atypical fractures and jaw necrosis with Aln treatment should be kept under careful surveillance.

Conclusions
Although moderate-to-high-quality evidence supported that Aln was more effective than Rlx in increasing bone mass, moderate-quality evidence suggested no difference in the prevention of either vertebral or nonvertebral fractures within a followup of 12-24 months. For Aln, the weekly strategy would further reduce upper GI disorders and might yield a greater bone mass increment compared to daily treatment. In addition, the more frequent diarrhea episodes but fewer vasomotor events with Aln should also be considered during decision making in order to enhance patient compliance. Which agent, Aln or Rlx, should be preferred for postmenopausal women remains a patient-oriented matter.

Disclosure
No sponsors participated in the design and performance of the study, in the collection, analysis, and interpretation of the data, or in the drafting, review, revision, and approval of