Serum Peptidomic Profile as a Novel Biomarker for Rheumatoid Arthritis

Over the last decades, there has been an increasing need to discover new diagnostic biomarkers for rheumatoid arthritis (RA), beyond the current serologic biomarkers, that can assist early diagnosis and the assessment of response to treatment. The purpose of this study was to analyze the serum peptidomic profile in patients with RA by using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS). The study included 35 patients with RA, 35 patients with primary osteoarthritis (OA) as the disease control (DC), and 35 healthy controls (HC). All participants were subjected to serum peptidomic profile analysis using magnetic bead (MB) separation followed by MALDI-TOF-MS. The analysis showed 113 peaks that discriminated RA from OA and 101 peaks that discriminated RA from HC. Moreover, 95 peaks discriminated OA from HC; 38 were significant (p < 0.05) and 57 were nonsignificant. The genetic algorithm (GA) model showed the best sensitivity and specificity in the three trials (RA versus HC, OA versus HC, and RA versus OA). The present data suggest that the peptidomic pattern is of value for differentiating individuals with RA from those with OA and from healthy controls. We conclude that MALDI-TOF-MS combined with MB is an effective technique to identify novel serum protein biomarkers related to RA.


Introduction
Rheumatoid arthritis (RA) is the most common form of inflammatory arthritis that results in the destruction of articular cartilage and bone erosion [1]. Immunological studies revealed the presence of specific serological markers such as anti-cyclic citrullinated peptide antibody (ACPA) and rheumatoid factor (RF) that aid in the diagnosis of RA [2]. Indeed, the diagnosis is usually easy in established stages of the disease when the lesions are clinically and radiologically apparent. However, in early phases, when the introduction of appropriate therapy would be most useful, the diagnosis is usually challenging especially in early seronegative RA patients [3]. The 2010 American College of Rheumatology/European League Against Rheumatism (ACR/EULAR) classification criteria [4] for RA have been focused on features at earlier stages of disease rather than defining the disease by its late-stage manifestations. This will reinforce attention on the urgent need for earlier diagnosis to prevent, or at least decrease, the occurrence of the undesirable sequelae of RA.
Therefore, there has been an increasing need to discover new diagnostic biomarkers that can help early diagnosis and assist in evaluating disease activity, severity, and treatment response [5]. Over the last decade, the identification and quantification of novel biomarkers have become an area of growing interest in the clinical management of RA. Proteomics is the study of protein expression, structure, function, modifications, and interactions, as well as how these proteins change in different environments and conditions [6]. The development of proteomic technologies has led to advances in the analysis of synovial fluid, blood, and urine samples collected from RA patients. Furthermore, proteomics may help to identify newer autoantibodies and novel inflammatory acute-phase proteins beyond C-reactive protein (CRP) and serum amyloid A [7].
Different techniques have been used in proteomics. Among these, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is emerging as a promising technology for the analysis of complex protein mixtures, and the protein mixtures of certain disorders have been characterized with it [8]. Because it gives a broad picture of protein composition, structure, and activity, it aids in the discovery of new biomarkers and in achieving a better comprehension of disease pathogenesis. Different types of biological samples collected from RA patients could be used, including synovial tissue/fluid, blood, and urine [9]. Despite the wide range of available sample types, serum represents a rich medium for the discovery of disease-specific biomarkers and helps identify new therapeutic targets.
In this context, several attempts have been made and some interesting proteins have been identified [10][11][12]. Yet, the integration of potential biomarkers resulting from proteomic analysis in RA is not fully established, and further studies are needed to confirm the efficacy of these approaches. The purpose of this study was to use MALDI-TOF-MS to identify differentially expressed disease-related peptides in the serum of patients with RA.

Materials and Methods
This study was approved by the local ethics committee of the institution (no. 20/181), and written informed consent was obtained from each subject before the study. A total of 105 serum samples were included in this study, among which, there were 35 from patients with RA, 35 from patients with primary (idiopathic) OA as the disease control (DC), and 35 from healthy volunteers as the healthy control (HC) with matched age and sex. All RA patients were diagnosed according to the 2010 ACR/EULAR classification criteria for RA [4] and selected to be biologic disease-modifying antirheumatic drugs (bDMARDs) naïve. Osteoarthritis patients were selected with primary (idiopathic) OA of knee joints without any underlying cause of secondary OA and were classified by a five-grade scale according to the Kellgren and Lawrence (KL) radiographic classification scheme [13].
All patients were subjected to history taking, musculoskeletal examination, laboratory workup, as well as plain radiographic tests for hand and knee joints. In addition to the test of rheumatoid factor (RF) and anti-cyclic citrullinated peptide antibody (ACPA), calculation of the disease activity was performed for the RA group by utilization of the disease activity score-28 (ESR version) (DAS-28) [14].
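As an illustration of the disease activity calculation mentioned above, the following is a minimal sketch of the standard published DAS28-ESR formula. The function and argument names are illustrative assumptions and are not taken from the study; the inputs are the 28-joint tender and swollen counts, the ESR in mm/h, and the patient global health score on a 0-100 mm visual analogue scale.

```python
import math


def das28_esr(tender28, swollen28, esr_mm_per_h, global_health_mm):
    """Return the DAS28 (ESR version) score.

    tender28, swollen28: tender and swollen joint counts (0-28).
    esr_mm_per_h: erythrocyte sedimentation rate in mm/h (must be > 0).
    global_health_mm: patient global health on a 0-100 mm VAS.
    """
    if esr_mm_per_h <= 0:
        raise ValueError("ESR must be positive to take its natural log")
    return (
        0.56 * math.sqrt(tender28)
        + 0.28 * math.sqrt(swollen28)
        + 0.70 * math.log(esr_mm_per_h)
        + 0.014 * global_health_mm
    )
```

The joint counts enter through a square root and the ESR through a natural logarithm, so the score grows more slowly at high values of either input.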

Serum Sample Collection, Storage, and Preparation.
Peripheral venous blood samples were obtained from the participants in the morning and drawn into tubes that were placed in an ice pack for transport. Each sample was centrifuged in a cooling centrifuge at 5°C for 15 minutes at 1800 × g. The serum was then separated, aliquoted, and stored immediately at -80°C until further analysis.

MALDI-TOF-MS Analysis.
The MALDI matrix α-cyano-4-hydroxycinnamic acid (CHCA) was chosen for the peptide profiling experiment on polished steel targets. One μl of the sample was applied to a target spot and left to dry at room temperature; then, 1 μl of the matrix was applied on the spot. The matrix consisted of CHCA (3 mg/ml) in 50% acetonitrile/2% trifluoroacetic acid (TFA) and was prepared by mixing 1.2 mg CHCA, 200 μl acetonitrile, 160 μl deionized water, and 40 μl of 10% TFA.
The mixture was then left to dry at room temperature. Spectra were acquired in the positive linear mode over the 1-10 kDa range on a MALDI-TOF/TOF UltrafleXtreme mass spectrometer from Bruker Daltonics (Bremen, Germany). For optimum performance, the ClinPro standard (CPS) was used as a standard sample. Using the FlexControl™ software, only peaks with a signal/noise (S/N) ratio above 3 were chosen from the spectra generated.
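The peak selection rule described above (S/N above 3 within the 1-10 kDa window) can be sketched as follows. This is only an illustration under stated assumptions, not the FlexControl implementation: the `Peak` record, the per-peak noise estimate, and the treatment of 1-10 kDa as m/z 1000-10000 (singly charged ions) are all hypothetical choices made for the example.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    mz: float         # mass-to-charge ratio
    intensity: float  # peak height, arbitrary units
    noise: float      # local noise estimate, same units (assumed available)


def select_peaks(peaks, min_snr=3.0, mz_min=1000.0, mz_max=10000.0):
    """Keep peaks inside the m/z window whose S/N is strictly above min_snr."""
    kept = []
    for p in peaks:
        if p.noise <= 0:
            continue  # S/N is undefined without a positive noise estimate
        snr = p.intensity / p.noise
        if snr > min_snr and mz_min <= p.mz <= mz_max:
            kept.append(p)
    return kept


# Hypothetical intensities and noise levels, for illustration only.
example = [
    Peak(2953.29, 40.0, 10.0),    # S/N 4.0, kept
    Peak(7767.82, 25.0, 10.0),    # S/N 2.5, dropped
    Peak(4054.75, 30.0, 10.0),    # S/N exactly 3.0, dropped (not above 3)
    Peak(12000.0, 100.0, 10.0),   # outside the mass window, dropped
]
selected = select_peaks(example)
```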

Expression Profile Analysis and Statistical Analysis.
The ClinPro Tools software 3.0 (Bruker Daltonik, Germany) was used for processing and data analysis. The mean spectrum obtained from each subject data set was used for the statistical analysis. A difference with a p value < 0.05 was considered statistically significant. Class prediction models were built by applying 3 different machine-learning algorithms: Supervised Neural Network (SNN), genetic algorithm (GA), and Quick Classifier (QC). Cross-validation was implemented to determine the accuracy of the class prediction [16].

International Journal of Rheumatology

Results
The characteristics of the participants are shown in Table 1, and the disease parameters of the RA group are shown in Table 2.

MALDI-TOF-MS Spectrum Analysis and Model Generation.
By using the spectral data from the three groups, three different classification models for the three groups were generated using GA, SNN, and QC algorithms. The GA model showed the best sensitivity and specificity in the three trials (RA versus OA, RA versus HC, and OA versus HC).
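The accuracy of class prediction models such as these is typically estimated by cross-validation. The sketch below illustrates k-fold cross-validation under stated assumptions: a simple nearest-centroid classifier stands in for the GA, SNN, and QC algorithms of ClinPro Tools, which are not reproduced here, and all function names are hypothetical. Each sample is a list of peak intensities and each label a group name.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(X, y, k=5, seed=0):
    """Return the fraction of held-out samples classified correctly."""
    correct = 0
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        # Fit one centroid per class using the training samples only.
        centroids = {
            label: centroid([X[i] for i in train if y[i] == label])
            for label in {y[i] for i in train}
        }
        correct += sum(predict(centroids, X[i]) == y[i] for i in fold)
    return correct / len(X)
```

Every sample is held out exactly once, so the returned accuracy is computed only on spectra the model did not see during fitting.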

RA versus Disease Control (OA).
Among the peaks ranging from 1 to 10 kDa, 113 protein peaks discriminated sera of patients with RA from those of the disease control (OA group). Of these, 73 peaks differed significantly between the two groups (p < 0.05) and 40 peaks were nonsignificant. Compared with the disease control group, 57 peaks were upregulated and 53 peaks were downregulated in the RA group, whereas 3 peaks were equally expressed in both groups. All 113 protein peaks were entered into the ClinPro Tools software 3.0 to generate an optimal decision classification tree in the testing set. The genetic algorithm (GA) gave the most representative classification tree, which comprised five integrated peaks: peak 75 (m/z 7767.82), peak 40 (m/z 2953.29), peak 41 (m/z 2991.59), peak 48 (m/z 4054.75), and peak 68 (m/z 6434.51). All of them were significant and were selected as the best biomarkers of RA in the classification tree (Table 3).
The integrated peaks showed that all peaks were downregulated except peak 40 which was upregulated. External validation was performed for the three trials. The external validation of RA versus OA showed 97.8% sensitivity and 97.9% specificity.
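The labelling of a peak as upregulated, downregulated, or equally expressed can be illustrated with a comparison of group mean intensities. This is a sketch under an assumption: the exact rule used by ClinPro Tools is not stated in the text, so the mean comparison and the `tolerance` parameter below are hypothetical.

```python
from statistics import mean


def regulation(case_intensities, control_intensities, tolerance=0.0):
    """Label a peak by comparing mean intensity in cases versus controls."""
    diff = mean(case_intensities) - mean(control_intensities)
    if diff > tolerance:
        return "up"
    if diff < -tolerance:
        return "down"
    return "equal"


def tally(peaks):
    """Count labels over an iterable of (case, control) intensity lists."""
    counts = {"up": 0, "down": 0, "equal": 0}
    for case, control in peaks:
        counts[regulation(case, control)] += 1
    return counts
```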
The pseudogel view in ClinPro Tools was applied; as shown in Figure 1, each peak is represented by a vertical line. The difference in intensity between the lines of the two groups being compared represents the differential peak expression in each of the three trials (RA versus OA, RA versus HC, and OA versus HC, respectively).

RA versus Healthy Control.
The results revealed 101 peaks that discriminated RA from the control group; 53 were significant (p < 0.05), and 48 were nonsignificant. Sixty-two peaks were upregulated, and 39 peaks were downregulated. The GA model showed the most representative classification tree, which comprised 5 integrated peaks that can discriminate RA from HC (Table 3). All the integrated peaks were overexpressed except peak 62, which was downregulated (Figure 2). The sensitivity and specificity in the data of the training set were 97.5% and 95.3%, respectively.

OA versus Healthy Control.
Ninety-five peaks discriminated OA from HC; 38 were significant (p < 0.05), and 57 were nonsignificant. The GA model comprised 5 integrated peaks (Table 3). All the integrated peaks were upregulated except peaks 18 and 23, which were downregulated (Figure 3). The sensitivity and specificity in the data of the training set were 96.6% and 99.7%, respectively.
Results of the validation test for the training set data showed that the detected protein peaks could differentiate RA samples from those of the disease control and healthy control with sensitivity and specificity of 96.66% and 100.0%, respectively. These peaks yielded the highest diagnostic value, with an area under the curve (AUC) of 0.988, compared with other laboratory RA tests such as ACPA (AUC = 0.875) and RF (AUC = 0.720) (Table 4, Figure 4).
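For clarity, sensitivity and specificity as reported here follow the usual definitions: the fraction of true RA samples classified as RA, and the fraction of control samples classified as control. A minimal sketch is given below; the label strings and function name are illustrative assumptions, and the example labels in the usage are hypothetical, not study data.

```python
def sensitivity_specificity(y_true, y_pred, positive="RA"):
    """Return (sensitivity, specificity) for a binary classification.

    Any label other than `positive` is treated as a control.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one control sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 4 RA samples (one missed) and 6 controls (all correct).
truth = ["RA"] * 4 + ["control"] * 6
preds = ["RA", "RA", "RA", "control"] + ["control"] * 6
sens, spec = sensitivity_specificity(truth, preds)
```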

Discussion
In the management of RA, interrupting the inflammatory cascade before it is fully established is the most effective approach. It is therefore evident that therapeutic intervention will have a greater effect on the outcome if started early, ideally before the occurrence of articular damage [17]. Accordingly, the availability of new biomarkers will be useful in the diagnosis of early preradiographic disease and is possibly the most promising way to improve RA management [18,19].
Analysis of proteomic/peptidomic profile is one of the most promising methods for the identification of proteins and peptides connected with rheumatic diseases [20]. However, MALDI-TOF-MS cannot enable complete quantification of proteins. In this context, identifying the protein species depicted by the peaks on the spectra would provide further evidence that they are indeed biologically significant disease-related molecules [21].
In the current study, MALDI-TOF-MS combined with MB-HIC C8 was applied to identify serum protein profiles and to establish a serological classification tree model for RA patients in comparison with OA and HC. The low molecular weight proteins were isolated by magnetic beads, and a considerable number of up- and downregulated proteins were recognized in the RA group compared with the OA and HC groups. In our study, 113 peaks discriminated the RA group from the OA group, with 5 integrated peaks (GA model); all integrated peaks were significant.
On performing the external validation for the three trials, RA-related protein peaks yielded a sensitivity of 97.8% and specificity of 97.9% when compared with the OA group and a sensitivity of 97.5% and specificity of 95.3% when compared with the HC group. On the other hand, the external validation of the OA group against the HC group revealed 96.6% sensitivity and 99.7% specificity. These findings support the ability of MALDI-TOF-MS coupled with magnetic beads to discover differentially expressed serum protein biomarkers in RA and OA patients, and this may reflect the differences in the pathogenesis between RA and OA.
In a previous study [22], differential proteomic and peptidomic analysis of plasma and synovial fluid was performed for patients with RA, OA, and reactive arthritis by applying mass spectrometric structure characterization of gel-separated proteins. Fibrin degradation products were detected in both groups. On the other hand, Calgranulin B (MRP14) and serum amyloid A have been exclusively identified in synovial fluid samples derived from RA and have not been observed in synovial fluids or plasma from OA patients.
In another study [23], exosomes were isolated from serum samples obtained from 43 subjects: 12 with active RA, 11 with inactive RA, 10 with OA, and 10 healthy donors. Two hundred and four (204) protein spots were detected by 2D-DIGE; among them, the protein spot identified as Toll-like receptor 3 (TLR3) showed approximately 6-fold higher intensity in the active RA group than in other groups. This may reflect the pathophysiology of active RA.
In the current study, in a comparison between RA and HC, 101 peaks were identified to be related to RA; among them, five peaks were integrated. This classification model could differentiate patients with RA from HC with a sensitivity of 97.5% and specificity of 95.3%. The peak patterns were closely similar among the samples of the respective subjects in the repeated trials, indicating good reproducibility of the analysis.
In a similar study [24] involving the same range of protein detection (1-10 kDa), proteins that differed between RA patients and the control group were recognized by using MALDI-TOF-MS combined with WCX-MB-HIC C8. The decision tree model included four protein peaks at m/z 4966.89, 5065.3, 5636.97, and 7766.87. Among the four peaks, m/z 4966.89 was overexpressed in RA cases, which might help in the diagnosis of RA by the detection of such protein peaks. In contrast, the three other protein peaks were downregulated. It is worth mentioning that a common peak of similar mass (m/z 7767.8 in our work and 7766.8 in that study) was integrated and downregulated in both studies.
Additionally, Zhang et al. [25] studied proteomes in serum samples from 60 RA patients and 36 healthy controls using MALDI-TOF-MS, and a total of 33 peaks were identified to be related to RA, of which 5 peaks were found to be significant for RA diagnosis by pattern recognition software. The blind testing data indicated a sensitivity of 86.7% and a specificity of 90.0% for diagnosing RA. They reported that the decision model tree, based on the five candidate biomarkers, could provide a powerful and reliable diagnostic method for RA with high sensitivity and specificity.
An earlier study using the same technical approach identified protein biomarkers of early RA patients. Four peaks with m/z of 8133.85, 5844.60, 13541.3, and 14029.0 were recognized; the first three peaks were upregulated, while the last one was downregulated in RA compared to the control [26]. In another study, three peaks with m/z of 2490, 5910.07, and 6436.73 were identified in patients with RA, of which the first one was overexpressed and the last 2 were underexpressed in RA compared to controls. Moreover, the peaks with m/z of 1014.92 and 1061.38 were significantly overexpressed in the early RA group compared to those with established RA. This might provide an additional advantage of this technique that helps in differentiating early from established RA [27].
Yan et al. [8] used the same technique in RA and identified proteins with m/z of 3939, 5906, 8146, and 8569. These four proteins were again assessed for diagnostic accuracies and demonstrated a sensitivity of 100.0% and a specificity of 96.0% for differentiating RA patients from healthy controls. Their results had shown high levels of sensitivity and specificity, 100.0% and 81.2%, respectively. This finding is in line with our results, as peak number 53, with m/z 5906.3, was detected in our RA group and was reported by Yan et al. as well.
In the present study, 95 peaks were identified when comparing OA with HC. These peaks discriminated OA patients from the control group, with 5 integrated peaks by the GA model. All were upregulated except peaks 18 and 23, which were downregulated. We suggest that these differentially expressed protein peaks came from the production of fragments of extracellular matrix proteins as a result of structural damage in subchondral bone and articular cartilage degradation in OA. These results add to the value of peptidomic analysis as a potential technology to discover serum biomarkers of cartilage degradation and will aid in further understanding of the underlying mechanisms of OA.
Takinami et al. [28] conducted a study on a total of 69 plasma samples (25 OA patients with radiographic progression, 33 nonprogressive OA patients, and 11 healthy donors). Three biomarkers significantly differentiated between progressor and nonprogressor OA. Moreover, they used MALDI-TOF-MS combined with surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), and subsequent analyses indicated that these peaks corresponded to apolipoprotein C-I and C-III and an N-terminal truncated form of transthyretin, respectively. They reported that these 3 peaks are expected to be prognostic biomarkers for knee OA.
In another study [29], sera from moderate and severe OA patients were compared with those of healthy controls. Serum protein levels were analyzed using isobaric tags for relative and absolute quantitation (iTRAQ) and MALDI-TOF-MS. More than 300 different proteins were isolated from sera of OA patients, and more than 250 were quantified by calculation of their iTRAQ ratios. Three sets of proteins were significantly changed in OA samples compared with controls. These included some complement components, lipoproteins, von Willebrand factor, tetranectin, and lumican.
In our study, the Receiver Operating Characteristic (ROC) curve was used to test the accuracy of the training set data in diagnosing RA. Comparing the detected protein peaks of RA patients (35 samples) with those of overall control subjects (70 samples) showed that the significant integrated peaks could differentiate RA from control subjects (OA and HC) with sensitivity and specificity of 96.66% and 100.0%, respectively, with an AUC of 0.988. That was higher than the diagnostic performance of the anti-citrullinated peptide antibody (ACPA) assay, with a sensitivity of 83.43% and specificity of 86.67% (AUC = 0.875), and of the rheumatoid factor (RF) assay, with a sensitivity of 80.9% and specificity of 46.67% (AUC = 0.720), in diagnosing RA. Thus, the good performance indicated that the detected protein peaks could be potential diagnostic biomarkers for RA and could effectively distinguish RA patients from individuals with osteoarthritis or healthy people.
Although MALDI-TOF-MS looks very promising, it is still in its infancy. Because proteomics expresses the overall picture of the intracellular protein structure, it is capable of identifying noninvasive diagnostic biomarkers, giving the chance for further molecular research, and improving the understanding of RA pathogenesis. However, MALDI-TOF-MS analysis requires skilled personnel, which increases costs; therefore, its broad use, especially in developing countries, is limited.
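The AUC used in this comparison can be read as the probability that a randomly chosen RA sample receives a higher classifier score than a randomly chosen control sample, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the idea of a single numeric score per sample are assumptions made for illustration, and no study data are used.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equivalent to the normalized Mann-Whitney U statistic:
    each (positive, negative) pair scores 1 if the positive sample
    ranks higher, 0.5 on a tie, and 0 otherwise.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

An AUC of 1.0 means every RA sample outscores every control, while 0.5 corresponds to a test with no discriminating ability.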

Conclusion
MALDI-TOF-MS combined with MB-HIC C8 is a potentially effective technology to identify novel serum protein biomarkers related to RA. The present data suggested that the peptidomic pattern is of value for differentiating individuals with RA from OA and healthy controls.

Data Availability
The data used to support the findings of this study are available from the first author/corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.