The Contribution of Four Immunogenetic Markers for Predicting Persistent Activity in Patients with Recent-Onset Rheumatoid Arthritis or Undifferentiated Arthritis

We assessed the contribution of four baseline markers—HLA-DRB1 shared epitope (SE), −308 tumor necrosis factor α gene promoter polymorphism, rheumatoid factor, and anticitrullinated peptide antibodies—for predicting persistent activity (DAS28 score ≥2.6) after one year of followup in a cohort of 201 patients with recent-onset rheumatoid arthritis (RA) or undifferentiated arthritis (UA) aged 16 years or older who had a 4-week to 12-month history of swelling of at least two joints. Patients had not been previously treated with corticosteroids or disease-modifying antirheumatic drugs (DMARD). In the best logistic regression model, only two variables were retained: SE positivity and number of DMARD administered (area under the curve = 76.4%; 95% CI: 69.2%, 84.4%; P < 0.001). The best linear regression model also included these two variables, explaining only 22.5% of the variability of DAS28 score. In this study, given an equal number of DMARD administered, the probability of persistent activity in patients with recent-onset RA or UA was significantly influenced by SE presence.


Introduction
The high variability of disease activity among patients newly presenting with rheumatoid arthritis (RA) or undifferentiated arthritis (UA) makes it necessary to know which patients will develop persistent disease, regardless of diagnosis, so that they can be treated more aggressively from the outset and to avoid inappropriate treatment of patients more prone to remission.
Several methodological issues must be considered when studying predictors of persistent activity in patients with recent-onset RA. First, when the disease is in its early stages, patients seldom fulfill the 1987 American Rheumatism Association (ARA) revised criteria for RA [1]. Patients who do not fulfill criteria for definite RA at first presentation might be classified as having definite RA at a subsequent time point, but many cases remain unclassifiable (UA) [2][3][4][5]. There is an important proportion of newly presenting patients who do not satisfy these criteria, but for whom there is a compelling reason to treat with disease-modifying antirheumatic drugs (DMARD), or who on followup develop persistent disease even if there is no change in their classification status. Recently, new classification criteria for RA have been developed in an attempt to increase sensitivity in recent-onset cases [6]. Whether the fulfillment of ARA criteria is useful to predict activity is unknown [7].
Second, since treatments are not randomly assigned in nonexperimental studies, disease activity may be influenced by the type of treatment patients receive. Patients with more severe disease are more likely to be treated more aggressively. This confounding effect can be controlled for by using multivariate regression models [8].

ISRN Rheumatology
Third, factors selected by different authors as potentially predictive of a poor outcome are very heterogeneous. The combined role of genetic and immunologic factors in the development of severe RA has been the subject of recent investigations. Recent data support the hypothesis that the presence of HLA-DRB1 shared epitope (SE) alleles can trigger immune reactions such as the production of anticyclic citrullinated peptide antibodies (anti-CCP) [9]. RA patients showing these antibodies in the early stages of the disease could develop more severe disease than those who lack them [10]. Rheumatoid factor (RF) positivity seems to be related to active disease, but no definite conclusions have been reached regarding its value as a predictor of disease activity in RA [11]. Tumor necrosis factor alpha (TNFα) plays a pivotal role in regulating the inflammatory response in RA. However, there are few reports on the role of the G-to-A polymorphism at position −308 of the TNFα gene promoter (−308 TNFα) as an independent marker of disease activity in recent-onset RA [12,13]; no association has been seen in RF-positive patients in particular [14]. Additional cohort studies including −308 TNFα among the predictor variables are needed. Although −308 TNFα [12][13][14], SE alleles [15][16][17][18][19][20][21][22], RF [23][24][25][26][27][28][29], and anti-CCP [30][31][32][33][34][35][36][37][38] have all been studied as potential predictors of persistent activity in cohort studies of recent-onset RA, so far no study has investigated the combined effect of this particular set of factors. The combination of several markers could increase the capacity to predict persistent disease in patients with recent-onset RA [39], and the identification of markers associated with a poor outcome would facilitate the development of new drug targets [40].
Finally, since there is no consensus definition of disease activity in recent-onset RA, the use of different definitions may generate substantial variation among studies [41]. As no "gold standard" exists, a disease activity score based on a reduced joint count (DAS28), [42] or other disease activity indexes [43] can be used. A DAS28 ≥ 2.6 is considered indicative of active disease, while a DAS28 < 2.6 corresponds to fulfillment of the preliminary ARA criteria for clinical remission in RA [44].
In this study, multivariate logistic and linear regression was used to find a model based on immunogenetic markers that predicts persistent activity in patients with recent-onset RA or UA. The study is based on a recent-onset inflammatory polyarthritis (IP) register established in Seville, Spain, in January 2002 to investigate various diagnostic, prognostic, and therapeutic issues [45][46][47].

Materials and Methods
We studied a prospective cohort of 201 consecutive patients with recent-onset RA or UA (disease duration ≤1 year) who were referred to our recent-onset IP unit from January 2002 through December 2006. Patients were referred from primary health care centers, emergency services, and outpatient rheumatology clinics of the Virgen del Rocío University Hospital Health District in Seville, Spain (population 774 619 according to the 2002 census). Details of the case ascertainment and follow-up procedure have been previously described [45].

Subjects.
To be included in the recent-onset IP register, patients referred to the unit had to reside in the hospital health district catchment area, be at least 16 years old, and have at least two swollen joints lasting for a minimum of 4 weeks and a maximum of 12 months.
The 1987 ARA criteria for RA [1] and international classification criteria for other rheumatic diseases [48] were used at baseline and in all follow-up assessments and cumulatively applied. Patients were classified as having RA if they fulfilled at least four of the seven 1987 ARA criteria for RA; those who did not fulfill at least four of these seven criteria and did not fulfill the classification or diagnostic criteria of any other particular rheumatic disease were classified as having UA. Cases classified as RA during any visit (at 0, 1, 3, 6, 9, and 12 months) and cases still classified as having UA at the end of followup were included in this study; patients with alternative diagnoses were excluded. Although the new ACR/EULAR classification criteria for RA were published after our statistical analysis was completed, we calculated the proportion of patients who fulfilled them for informative purposes [6].
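The classification rule above can be sketched in a few lines of Python. This is only an illustration: the criterion labels, the `classify` function, and the reading of "cumulatively applied" as a running union of criteria met across visits are assumptions, not the register's actual procedure.

```python
# Abbreviated labels for the seven 1987 ARA criteria (labels are illustrative).
ARA_1987 = frozenset({
    "morning_stiffness", "arthritis_3_plus_areas", "hand_joint_arthritis",
    "symmetric_arthritis", "rheumatoid_nodules", "serum_rf",
    "radiographic_changes",
})


def classify(visits, other_diagnosis=False):
    """Classify a patient as 'RA', 'UA', or 'excluded'.

    visits: chronological list of sets, each holding the criteria met at
    one visit (e.g. months 0, 1, 3, 6, 9, 12).
    Assumption: a criterion met at any visit keeps counting afterwards.
    """
    if other_diagnosis:
        return "excluded"  # alternative rheumatic diagnosis
    met = set()
    for criteria in visits:
        met |= set(criteria) & ARA_1987
        if len(met) >= 4:  # at least four of the seven criteria
            return "RA"
    return "UA"


if __name__ == "__main__":
    history = [{"morning_stiffness", "serum_rf"},
               {"symmetric_arthritis", "hand_joint_arthritis"}]
    print(classify(history))
```

In this sketch a patient who meets two criteria at baseline and two different ones at a later visit is classified as RA from that later visit onward.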
From January 2002 through December 2006, 998 patients were referred to the recent-onset IP unit. Of these, 469 (47.0%) fulfilled the criteria for inclusion in the register, but 33 (7.0%) were lost to follow-up. This left a total of 436 registered patients, of whom 201 (46.1%) had completed the first year of followup by the time of this analysis. All patients were of Spanish descent. At baseline, no patient had previously received corticosteroids or DMARD. Blood samples for laboratory tests were collected and frozen before treatment was begun.
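The percentages in this paragraph follow directly from the reported counts. The short check below recomputes them; the variable names and the `pct` helper are illustrative only.

```python
# Counts reported in the text.
referred = 998
eligible = 469
lost_to_followup = 33
completed_one_year = 201

registered = eligible - lost_to_followup  # patients remaining in the register


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the text."""
    return round(100 * part / whole, 1)


flow = {
    "eligible_of_referred": pct(eligible, referred),
    "lost_of_eligible": pct(lost_to_followup, eligible),
    "registered": registered,
    "completed_of_registered": pct(completed_one_year, registered),
}

if __name__ == "__main__":
    for step, value in flow.items():
        print(step, value)
```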
Samples were genotyped for −308 TNFα using a TaqMan 5′ allelic discrimination assay (Custom TaqMan SNP Genotyping Assays method, Applied Biosystems, Foster City, Calif, USA). Allele-specific probes were labeled with VIC and FAM fluorescent dyes. Polymerase chain reaction (PCR) was carried out in a total reaction volume of 8 μL with the following amplification protocol: denaturation at 95 °C for 10 min, followed by 40 cycles of denaturation at 93 °C for 15 sec and annealing and extension at 60 °C for 1 min. After PCR, the genotype of each sample was automatically attributed using the SDS 1.3 software for allelic discrimination. The frequencies of −308 TNFα genotypes in a healthy control group from our district catchment area were 80% for GG, 17% for GA, and 3% for AA.

Immunologic Markers.
Anti-CCP antibodies were tested by second-generation ELISA (QUANTA Lite CCP IgG ELISA, INOVA Diagnostic Inc., San Diego, Calif, USA; positive: >20 IU/mL), and RF by nephelometry on a BN II instrument (Dade Behring, Marburg, Germany) using the N Latex RF method (Dade Behring) [46,47]; levels >50 IU/mL were considered positive, using the optimal cutoff value reported by other authors [49]. In a healthy control group from our district catchment area, the RF level at the 95th percentile was 15 IU/mL, and the highest anti-CCP level was 10 IU/mL.

Disease Activity Measurements.
DAS28 (range 0-10) was recorded for all patients after 12 months. A DAS28 < 2.6 was considered indicative of no disease activity or remission, and a DAS28 ≥ 2.6 was considered indicative of active disease [44].
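The outcome definition can be sketched as follows. The text does not say which DAS28 variant was computed, so this sketch assumes the commonly used four-variable formula based on the erythrocyte sedimentation rate (ESR); the function names are illustrative.

```python
import math

REMISSION_CUTOFF = 2.6  # DAS28 >= 2.6 is treated as active disease


def das28_esr(tender28, swollen28, esr_mm_h, global_health_mm):
    """Four-variable DAS28 using ESR (assumed variant, not stated in the text).

    tender28, swollen28: tender and swollen joint counts (0-28)
    esr_mm_h: erythrocyte sedimentation rate in mm/h (must be > 0)
    global_health_mm: patient global assessment on a 0-100 mm scale
    """
    return (0.56 * math.sqrt(tender28)
            + 0.28 * math.sqrt(swollen28)
            + 0.70 * math.log(esr_mm_h)
            + 0.014 * global_health_mm)


def has_persistent_activity(das28):
    """Dichotomous outcome used in the logistic models."""
    return das28 >= REMISSION_CUTOFF


if __name__ == "__main__":
    score = das28_esr(tender28=4, swollen28=4, esr_mm_h=20, global_health_mm=50)
    print(round(score, 2), has_persistent_activity(score))
```

The same 12-month score serves both analyses: the raw value is the dependent variable of the linear model, and the dichotomized value is the dependent variable of the logistic model.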

Statistical Methods.
The dependent variable was the DAS28 obtained at 12 months. The independent variables were the SE status, anti-CCP and RF (either status or levels), and −308 TNFα genotype (GG or GA/AA; as there were few GA and AA cases, these two categories were collapsed) obtained at baseline. As the probability of persistent activity may be influenced by the treatment the patients received, this confounding factor was entered as an additional independent variable. The treatment given throughout the 12 months of followup was corticosteroids, categorized dichotomously (yes/no), and/or DMARD, categorized either dichotomously (yes/no) or as the number of drugs given (from 0 to 3). Multinomial regression models were also used to adjust the possible differences between disease classification (RA or UA) throughout followup.
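A minimal sketch of how the baseline predictors and the treatment covariate described above might be coded for regression. The field names and the `encode_predictors` function are hypothetical; only the coding rules (GA and AA collapsed, SE −/− as reference, DMARD counted from 0 to 3) come from the text.

```python
def encode_predictors(se_alleles, tnf_genotype, anti_ccp_positive,
                      rf_positive, corticosteroids, n_dmard):
    """Return one patient's covariates as 0/1 indicators and counts.

    se_alleles: number of SE alleles (0, 1, or 2); 0 (-/-) is the reference
    tnf_genotype: "GG", "GA", or "AA"; GA and AA are collapsed
    n_dmard: number of DMARD given during followup (0 to 3)
    """
    if se_alleles not in (0, 1, 2):
        raise ValueError("se_alleles must be 0, 1, or 2")
    if tnf_genotype not in ("GG", "GA", "AA"):
        raise ValueError("unknown -308 TNF-alpha genotype")
    if n_dmard not in (0, 1, 2, 3):
        raise ValueError("n_dmard must be between 0 and 3")
    return {
        "se_heterozygous": int(se_alleles == 1),
        "se_homozygous": int(se_alleles == 2),
        "tnf_a_carrier": int(tnf_genotype in ("GA", "AA")),
        "anti_ccp_positive": int(anti_ccp_positive),
        "rf_positive": int(rf_positive),
        "corticosteroids": int(corticosteroids),
        "dmard_any": int(n_dmard > 0),
        "n_dmard": n_dmard,
    }


if __name__ == "__main__":
    print(encode_predictors(1, "GA", True, False, False, 2))
```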
All data were recorded in an Access 2000 database and then exported to the Statistical Package for the Social Sciences (SPSS) v. 15.0 for statistical analysis.
For an alpha level of 0.05, an anticipated "medium" effect size of 0.15 (according to Cohen's convention for multiple regression) and an assumed 10% rate of attrition, the minimum sample size required to reach a statistical power of 0.80 in a multiple regression model with eight predictor variables would be 108.
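For readers less familiar with Cohen's convention, the effect size $f^2$ used here is tied to the proportion of variance explained by the regression model. The relation below is the standard definition; the numerical conversion is added only as an illustration and is not taken from the original analysis.

```latex
f^2 = \frac{R^2}{1 - R^2}
\quad\Longrightarrow\quad
R^2 = \frac{f^2}{1 + f^2} = \frac{0.15}{1.15} \approx 0.13
```

In other words, a "medium" effect of 0.15 corresponds to a model that explains about 13% of the variance of the dependent variable.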
We calculated absolute frequencies and percentages for qualitative variables, and means and standard deviations for quantitative variables. Variables predictive of disease activity at one year were identified by univariate and multivariate logistic and linear regression models. For univariate analyses we used Student's t-test, the χ² test, or Fisher's exact test, as appropriate. Normality and homoscedasticity tests (Kolmogorov-Smirnov and Levene tests, resp.) were undertaken for parametric tests. For multivariate analysis, Wald's statistic (logistic regression) or Student's t-test (linear regression) was used for stepwise exclusion of variables weakly associated with the dependent variable, as indicated by a P value ≥ 0.15. Since the SE variable is polytomous, it was analyzed by creating dummy variables with the first category (−/−) used as the reference. Full and reduced models were compared with the G statistic (logistic regression) or partial multiple F-test (linear regression). The linearity of continuous variables was checked by the Box-Tidwell test. Potential interactions among the variables in the model were studied. Variables with a P value > 0.05 were analyzed as potential confounders, and they were considered as such whenever their coefficients changed by >20%. Multicollinearity among independent variables was assessed by the variance inflation factor, independence by the Durbin-Watson test, normality by the Shapiro-Wilk test, and homoscedasticity of the residuals by a scatter plot of the residuals against the estimated values. Outliers were identified by means of Cook's distance. In the logistic regression, goodness of fit was assessed with the Hosmer-Lemeshow goodness-of-fit analysis, and discrimination was reported as the area under the receiver operating characteristic (ROC) curve. In the linear analysis, goodness of fit was assessed with the corrected determination coefficient (R²).
All tests were two-tailed, and the significance level was set at 0.05.
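The discrimination and accuracy measures reported for the logistic model can be illustrated with a small, dependency-free sketch. The functions below are not the SPSS procedures used in the study; they are assumed helper names showing how sensitivity, specificity, and the area under the ROC curve are obtained from predicted probabilities and observed outcomes.

```python
def sensitivity_specificity(labels, probs, cutoff):
    """Sensitivity and specificity when probs >= cutoff is called 'active'."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, probs):
    """Area under the ROC curve as the probability that a randomly chosen
    active patient scores higher than a randomly chosen patient in
    remission (ties count one half)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not study data: 1 = DAS28 >= 2.6 at one year.
    y = [1, 1, 0, 0]
    p_hat = [0.9, 0.4, 0.6, 0.1]
    print(sensitivity_specificity(y, p_hat, cutoff=0.5), auc(y, p_hat))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why the reported AUC is compared against 50%.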

Results
The characteristics of the study population are shown in Table 1. In univariate analyses, qualitative variables significantly associated with a DAS28 ≥ 2.6 at one year were positive SE (P < 0.001), fulfillment of the 1987 ARA criteria (P = 0.002), and treatment with DMARD (P = 0.003). As for quantitative variables, only anti-CCP levels (P = 0.030) and the number of DMARD (P < 0.001) were significantly associated with a DAS28 ≥ 2.6 (Table 2).

Table 3 shows the results of univariate and multivariate logistic regression for DAS28 at one year. In univariate regression analyses, only positive SE (P < 0.001) and the number of DMARD given during followup (P < 0.001) were associated with a DAS28 ≥ 2.6 [...] P < 0.001), but not by any other variable. That means that for any two patients administered the same number of DMARD, the odds of persistent activity at 1 year are almost 5 times greater in a patient with SE than in a patient without SE. For a cutoff value of 0.05, the model had a sensitivity of 81.9% and a specificity of 56.1%, with an AUC of 76.4% (95% CI: 68.9%, 83.8%), that is, significantly higher than 50% (P < 0.001), indicating that the model showed fair discriminatory power. The model also had fair accuracy (i.e., it correctly predicted 74.6% of the cases).

Table 4 shows the results of univariate and multivariate linear regression for DAS28 at one year. In univariate regression analyses, anti-CCP status (P = 0.003), RF status (P = 0.004), SE heterozygosity (P < 0.001), SE homozygosity (P = 0.017), and the number of DMARD (P < 0.001) were associated with higher DAS28 at one year. In the multivariate linear regression analysis, a higher DAS28 was significantly predicted by SE heterozygosity (β coefficient: 0.67 [95% CI: 0.32, 1.01]; P < 0.001), SE homozygosity (β coefficient: 0.73 [95% CI: 0.11, 1.35]; P = 0.021), and the number of DMARD (β coefficient: 0.63 [95% CI: 0.43, 0.82]; P < 0.001), but not by any other variable (partial F test = 0.115; P = 0.995; df = 6, 195).
That means that for any two patients administered the same number of DMARD, the expected DAS28 score at 1 year is 0.73 points higher in a patient homozygous for SE than in a patient without SE. This model explained only 22.5% of the variability of the dependent variable (R² = 0.225).
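The interpretation of the linear coefficients can be made concrete with a short sketch that uses only the β values reported above. The model intercept is not given in the text, so the sketch computes differences between two patients rather than absolute predicted scores; the function names are illustrative.

```python
# Coefficients reported for the multivariate linear model.
BETA = {"se_heterozygous": 0.67, "se_homozygous": 0.73, "n_dmard": 0.63}


def covariate_part(se_alleles, n_dmard):
    """Linear predictor without the intercept (which is not reported)."""
    se_term = {0: 0.0,
               1: BETA["se_heterozygous"],
               2: BETA["se_homozygous"]}[se_alleles]
    return se_term + BETA["n_dmard"] * n_dmard


def expected_difference(patient_a, patient_b):
    """Expected DAS28 difference (a minus b) at one year.

    Each patient is a tuple (se_alleles, n_dmard).
    """
    return covariate_part(*patient_a) - covariate_part(*patient_b)


if __name__ == "__main__":
    # SE homozygous versus SE negative, both given one DMARD.
    print(round(expected_difference((2, 1), (0, 1)), 2))
    # SE heterozygous on two DMARD versus SE negative on one DMARD.
    print(round(expected_difference((1, 2), (0, 1)), 2))
```

Because the model explains less than a quarter of the variance, such differences describe average tendencies and not individual predictions.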
In these models, no significant interactions among variables were noted, and no variable was a confounder. All criteria for the use of multivariate linear regression were fulfilled: independence, normality and linearity of the independent variables, absence of multicollinearity among them, and homoscedasticity of the residuals. No patient showed a Cook's distance >1.

Discussion
Several cohort studies of populations similar to ours have investigated the value of different combinations of variables, including HLA-DRB1 SE alleles, −308 TNFα, RF, and anti-CCP, for predicting disease activity among patients with recent-onset RA. These studies differed methodologically in terms of referral and recruitment procedures, inclusion criteria, disease duration, variables assessed at presentation, followup until assessment of outcome, and disease activity scoring methods. Our study is the first to investigate this particular set of four immunogenetic markers using multivariate regression. Moreover, the potentially confounding effects of the classification criteria (RA versus UA) and the type of treatment given were controlled for by including these variables in the regression analyses.
Some studies have found a significant association between SE alleles and disease activity in recent-onset RA [15,17,19], and some have not [16,18,20–22]. Several have not used multivariate statistical methods [15,17,21]. Our results show that persistent activity at one year, assessed with the DAS28, is significantly influenced by the presence of SE in patients with recent-onset RA or UA. This finding is consistent across univariate and multivariate logistic and linear analyses (Tables 2, 3, and 4). However, since RA is a multigenic inflammatory disorder, it is likely that other factors are involved in its outcome. The possibility that the −308 TNFα polymorphism may have prognostic implications is currently being debated. In a seropositive RA inception cohort, no statistically significant differences were seen in DAS between patients with GA or AA genotypes and those with the GG genotype [14]. Other studies that, like ours, were not confined to seropositive RA patients have also suggested that −308 TNFα is not a genetic marker for disease activity in recent-onset RA [12,13]. In this study, the GA/AA genotypes were not retained in any model, either logistic or linear, and were not significant even in the univariate analyses (Tables 2, 3, and 4).
Of the 201 patients analyzed, only 42.3% were positive for RF. This low percentage resembles the values found in other studies [9,18,28]. Besides the fact that our patients had recent-onset RA or UA rather than long-term RA, another possible explanation for the low frequency of RF positivity may be that, as recommended by some to predict outcome [49], we used a high cutoff value for RF (>50 IU/mL, instead of >40 IU/mL, >20 IU/mL, or even >10 IU/mL in other studies). Had we used a cutoff value of ≥40 IU/mL, the frequency of RF positivity would have been 58.7%, instead of 42.3%. Several studies have reported that RF is a good predictor of disease activity [23,24,26–31]. However, in our univariate analyses RF, treated either as a qualitative or a quantitative variable, was not significantly associated with DAS28 (Table 2). Additionally, in the multivariate analyses, RF was not a prognostic factor for disease activity (Tables 3 and 4). Similar results have been found in other cohorts of recent-onset RA patients, both in Spain [22] and elsewhere [21,25,30,32].
In this community-based cohort, only 88 (43.8%) of the 201 patients with recent-onset RA or UA were positive for anti-CCP at baseline. A low frequency of positivity at presentation has been recorded in other recent-onset RA cohorts [30,32,34–36], and it may be indicative of early-stage disease. The usefulness of anti-CCP for predicting disease activity in patients with recent-onset RA has been evaluated in several cohort studies. Some have suggested it is a marker for active disease, as measured with either the SJC [30,33–36,38] or the DAS28 [21,30,36], but others have not confirmed an association [22,31,32,37]. Only a few of these studies have used multivariate statistical methods [22,30,32,38]. Predictive value may depend on whether anti-CCP status or titers are considered. In our univariate analyses, patients who were positive for anti-CCP at presentation did not have more disease activity at 1 year than patients who were negative (Table 2). When quantitative values were used, anti-CCP antibodies were significantly associated with DAS28 (Table 2). However, this marker was not a predictor of this outcome in regression models (Tables 3 and 4). Similar results have been found in other studies in which multivariate analyses have been performed [22,32,38].
The number of patients who fulfilled ≥4 ARA criteria for RA increased with length of followup. Thus, it is advisable to use a cumulative approach to the classification of disease. In the community-based Norfolk Arthritis Register (NOAR), the percentage of patients classified as having RA using the above criteria increased from 38% at baseline to 66% at 5 years [7]. In our cohort of 463 patients with recent-onset IP, 108 (23.3%) fulfilled ≥4 ARA criteria for RA when first seen, and 142 (30.7%) at 1 year. The number of patients fulfilling the new RA classification criteria [6] increased from 145 (72.1%) at baseline to 154 (76.6%) after 1 year. The 1987 ARA classification criteria for RA, derived from patients with long-standing established RA, were not designed to identify patients with recent-onset disease, and the current management of RA is intended to prevent patients reaching a stage when they satisfy these criteria. In this cohort, we included patients fulfilling ≥4 of the 7 ARA criteria for RA and UA patients, since, regardless of diagnosis, DMARD therapy was used as an indicator of the physician's opinion that the patient was at risk of developing persistent disease in 94.5% of patients. The value of these criteria to predict active disease in patients with recent-onset disease has been questioned [7]. In this cohort, the fulfillment of ARA criteria for RA was not predictive for disease activity at 1 year (Tables 3 and 4).
In our study, based on routine care, the treatment given over the 12 months of followup was included in the univariate and multivariate analyses and was significantly and negatively related to disease activity in every analysis (Tables 2, 3, and 4). Since treatment was not a confounder in multivariate analyses and DMARD have limited efficacy, this could indicate that, at least in a subgroup of patients, persistent disease activity might be related not to insufficient treatment with DMARD but to a failure to respond to conventional DMARD. A post hoc analysis of data from the BeSt study has shown that patients who failed to respond to methotrexate were unlikely to respond to other conventional DMARD [50], and a recent study from the community-based NOAR has identified SE positivity as the strongest predictor of methotrexate monotherapy inefficacy in patients with early inflammatory polyarthritis [51]. The ability of DMARD to prevent radiological damage has also been questioned [52]. In a previous study we found that erosive damage at 1 year in patients with recent-onset RA is significantly influenced by SE homozygosity and the presence of baseline erosions, but not by RF status, anti-CCP status, −308 TNFα genotype, or treatment with conventional DMARD [53].
In conclusion, for patients with recent-onset RA or UA treated with the same number of DMARD, the probability of persistent activity is significantly influenced by SE presence. Positive RF and anti-CCP at baseline, as well as the presence of the AA or GA genotypes of −308 TNFα or the fulfillment of criteria for RA, as opposed to UA classification, were not good predictors of disease activity.

Funding
Immunogenetic tests were partially funded by grant FIS-07/0061 from the Fondo de Investigaciones Sanitarias (Spain).