Cross-Calibration of GE Healthcare Lunar Prodigy and iDXA Dual-Energy X-Ray Densitometers for Bone Mineral Measurements

In long-term prospective studies, dual-energy X-ray absorptiometry (DXA) devices inevitably need to be replaced. It is therefore essential to assess whether systematic differences exist between measurements made with the new and the old device. A group of female volunteers (21–72 years) underwent anteroposterior lumbar spine L2–L4 (n = 72), proximal femur (n = 72), and total body (n = 62) measurements with the Prodigy and the iDXA scanners at the same visit. The bone mineral density (BMD) measurements with these two scanners showed a high linear association at all tested sites (r = 0.962–0.995; p < 0.0001). The average iDXA BMD values were 1.5%, 0.5%, and 0.9% higher than those of Prodigy for lumbar spine (L2–L4) (p < 0.0001), femoral neck (p = 0.048), and total hip (p < 0.0001), respectively. Total body BMD values measured with the iDXA were 1.3% lower (p < 0.0001) than those measured with the Prodigy. For total body, lumbar spine, and femoral neck, the BMD differences between the two devices were independent of subject height and weight. Linear correction equations were developed to ensure comparability of BMD measurements obtained with both DXA scanners. Importantly, use of equations from previous studies would have increased the discrepancy between these particular DXA scanners, especially at the hip and spine.


Introduction
Reliable follow-up of bone mineral density (BMD) by dual-energy X-ray absorptiometry (DXA) scans is essential both in clinical practice and in medical research. However, aging or defective DXA technology may compromise the reliability of subsequent DXA measurements and thus require a change of machinery. The International Society for Clinical Densitometry (ISCD) recommends an in vivo cross-calibration procedure if the old DXA system is replaced with a different DXA model, regardless of whether it is from the same or a different manufacturer [1]. Cross-calibration is important because systematic differences between the instruments may even exceed the annual biological BMD changes [1]. Some cross-calibration studies have suggested that inclusion of anthropometrical measurements may improve agreement between the DXA densitometers [2,3]. Unfortunately, discrepancies exist among the previous studies investigating the agreement of BMD measurements between GE Healthcare Lunar Prodigy and iDXA devices [4-8].
The Kuopio Osteoporosis Risk Factor and Prevention (OSTPRE) Study in Eastern Finland, started in 1989 [9], also evaluates long-term BMD changes in a Caucasian female population born between 1932 and 1941 using DXA densitometry repeated at 5-year intervals. Its 25th-year measurements are currently ongoing. Until now, four different DXA scanners from the same manufacturer have been used.
In the present study, we (1) evaluate the agreement of bone mineral measurements between GE Healthcare Lunar (Madison, WI, USA) Prodigy and iDXA narrow-angle fan beam densitometers; (2) compare cross-calibration results acquired from human and phantom data; (3) calculate potential correction coefficients for Prodigy results to match with those of iDXA; and (4) evaluate the effect of body anthropometry parameters on the agreement of BMD measurements between the instruments.

Materials and Methods
Study subjects (n = 72, aged 21-72 years) were recruited from volunteers at the University of Eastern Finland. Exclusion criteria included pregnancy and bilateral hip prostheses. Total body DXA could be conducted only for a smaller group of subjects (n = 62) because of technical limitations, that is, a scan table in Prodigy that was too narrow for one severely obese (126 kg) subject and metal in the body of two subjects, and because some subjects were reluctant to undergo total body measurements.
The Prodigy scanner, equipped with a narrow fan beam at an angle of 4.5° and orientated parallel to the long axis of the body, applies constant peak X-ray energy at 76 kV and a current of 3 mA. Further, a Samarium K-edge filter produces energies at 38 and 70 keV [5]. The Prodigy system employs 16 Direct-Digital high-definition (HD) detectors, made of energy-sensitive cadmium zinc telluride (CZT) and 5 cm long, which allow rapid photon counting. The iDXA uses a peak X-ray energy of 100 kV as well as an array of sixty-four Direct-Digital CZT-HD detectors, which eliminate dead space between the detectors, thereby creating a high-resolution image and improving the precision of the scan [5]. The improved image resolution of iDXA comes at the cost of a slightly higher effective radiation dose, as compared to Prodigy. In both devices, however, the typical radiation dose for a subject is less than 10 μSv.
Left proximal femur (total hip, femoral neck, shaft, Ward's triangle, and trochanter) and anteroposterior (AP) lumbar spine (L2-L4, L2-L3, and L3-L4) of 72 women were scanned by using both GE Healthcare Lunar Prodigy (software version 11.4) and iDXA (software version 14.0) narrow-angle fan beam densitometers. In addition, 62 women also completed total body BMD (g/cm²), bone mineral content (BMC, g), and bone area (cm²) measurements during the same visit between June and September of 2012. In order to minimize potential operator bias, all scans on both devices were performed in the same room by two experienced nurses. Subjects were carefully repositioned between the scans to minimize errors that could be related to changes in the measurement geometry [10]. For analysis, the automatic edge detection was always used; however, all BMD analyses were thoroughly checked for errors and manually corrected if needed. The GE Healthcare Lunar algorithm that automatically finds the area of the lowest BMD in proximal femur, that is, Ward's triangle, was used in our study. Subject weight and height were measured during each visit. All study subjects provided informed consent and the research protocol was approved by the ethics committee of Kuopio University Hospital (KUH). During the study period, quality control scans on a spine phantom (L2-L4) were performed according to the guidelines of the manufacturer.

Statistics
To reveal the association and agreement between the measurements of the two densitometers, the data were analyzed by using linear regression analysis, Deming regression, the paired t-test, Pearson's correlation analysis, and Bland and Altman analysis [11]. If the assumption of normal distribution was violated, the nonparametric Wilcoxon signed-rank test was used to analyze the differences between devices (e.g., Ward's triangle BMD, BMC, and area). For the scatter plots of Prodigy BMD versus iDXA BMD, the statistical significance of the intercept of each regression line was tested. If the intercept was not different from zero, the regression analysis was repeated with the regression line forced through the origin [1]. The effect of body height, weight, and body mass index (BMI, kg/m²) on the cross-calibration was studied using stepwise multivariate linear regression analysis. Height and BMI, as well as weight and BMI, were not included in the same model because of multicollinearity, as judged from variance inflation factors and tolerance statistics. The accuracy of corrections obtained with a regression line was expressed as the standard error of the estimate (SEE). The Bland and Altman method was used to evaluate the bias in results between the devices [11].
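As an illustration of the Bland and Altman analysis mentioned above, the following minimal sketch computes the mean difference (bias) and the limits of agreement as mean ± 1.96 × SD of the paired differences. The function name and the five paired BMD values are hypothetical and are not study data; the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def bland_altman(new_device, old_device):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits of agreement are bias +/- 1.96 * SD of the paired differences
    (sample SD, an assumption of this sketch).
    """
    diffs = [n - o for n, o in zip(new_device, old_device)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired BMD values (g/cm^2), for illustration only.
idxa_bmd = [1.10, 0.95, 1.22, 1.05, 0.88]
prodigy_bmd = [1.08, 0.94, 1.20, 1.04, 0.87]

bias, lower, upper = bland_altman(idxa_bmd, prodigy_bmd)
print(f"bias = {bias:.3f} g/cm^2, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

A positive bias in this sketch means that the first device reads higher on average than the second.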
The Hologic anthropometric lumbar spine phantom, the European Spine Phantom (ESP), and the GE Healthcare Lunar aluminum spine phantom [12,13] were each scanned 10 times during a period of one week to calculate the short-term precision error (coefficient of variation, CV% = (SD/mean) × 100%) of the instruments [14]. Statistical analyses were performed with the R Statistical Software (Foundation for Statistical Computing, Vienna, Austria) and with the Statistical Package for Social Sciences (IBM SPSS Statistics for Windows, IBM Corp., Armonk, NY, USA, version 19.0). A p value below 0.05 was considered to be statistically significant.
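The precision formula CV% = (SD/mean) × 100% can be sketched as follows. The function name and the three repeated phantom readings are hypothetical; whether the sample or the population SD was used is not stated here, so the sample SD is assumed.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: (SD / mean) * 100.

    Uses the sample standard deviation (an assumption of this sketch).
    """
    return stdev(values) / mean(values) * 100.0


# Hypothetical repeated phantom BMD readings (g/cm^2), for illustration only.
phantom_scans = [0.99, 1.00, 1.01]
cv = cv_percent(phantom_scans)
print(f"CV% = {cv:.2f}")
```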

Results
In vitro scans of the three phantoms indicated that, compared to Prodigy, iDXA measured 0.4% and 1.0% higher BMD values with the Lunar and ESP phantoms, respectively, whereas it measured 0.4% lower BMD values with the Hologic phantom (Table 1). Short-term precision (repeated phantom measurements) of BMD values in vitro was slightly lower (i.e., CV% higher) when measured with Prodigy (0.34-0.50%) than with iDXA (0.13-0.42%).
Age and anthropometrical variables were recorded for the subjects in the cross-calibration sample (Table 2). BMD scatter plots (Prodigy versus iDXA) showed a close linear relationship over the entire range of BMD values for the spinal, femoral neck, total hip, and total body scans (r = 0.962-0.995, p < 0.0001) (Figure 1). In vivo BMD values of iDXA were, as compared to Prodigy, 1.5% (0.017 g/cm²) higher at the lumbar spine (L2-L4), 0.5% (0.005 g/cm²) higher at the femoral neck, and 0.9% (0.009 g/cm²) higher at the total hip (Table 3). In contrast, total body BMD values by iDXA were 1.3% (0.016 g/cm²) lower than those by Prodigy (Table 4). The differences between the iDXA and Prodigy regional total body BMD values ranged from −11.6% (arms) to 26.1% (ribs) (Table 4). In particular, for total body BMD values, the difference was strongly dependent on the mean BMD: at high BMD values iDXA showed higher values than Prodigy, whereas at low BMD values the opposite was found (Figure 2). After the correction equations were applied, the difference in BMD values between the Prodigy and iDXA devices was negligible. The variables included in the final multiple regression models were selected on the basis of statistical significance, R² (coefficient of determination), and SEE values (Appendices A and B). Height, weight, and BMI were not included in the final models (p > 0.05) for the femoral neck, total hip, lumbar spine, and total body BMD regions of interest (ROIs).
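The way a linear correction equation removes the mean difference between two devices can be sketched as follows. This is a minimal illustration under stated assumptions, not the correction equations of this study: the paired values are hypothetical, Prodigy is taken as the dependent and iDXA as the independent variable, and an ordinary least-squares line with intercept is fitted. The helper names are introduced here for illustration only.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def standard_error_of_estimate(x, y, intercept, slope):
    """SEE = sqrt(sum of squared residuals / (n - 2))."""
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return math.sqrt(sum(r * r for r in residuals) / (len(x) - 2))


# Hypothetical paired BMD values (g/cm^2), for illustration only.
idxa = [0.85, 0.98, 1.07, 1.15, 1.24, 1.31]
prodigy = [0.84, 0.96, 1.06, 1.13, 1.23, 1.28]

intercept, slope = fit_line(idxa, prodigy)
# Convert each iDXA reading to the Prodigy scale with the fitted equation.
predicted = [intercept + slope * v for v in idxa]
mean_residual = sum(p - q for p, q in zip(predicted, prodigy)) / len(prodigy)
see = standard_error_of_estimate(idxa, prodigy, intercept, slope)
print(f"Prodigy = {intercept:.4f} + {slope:.4f} * iDXA, SEE = {see:.4f}")
```

With an intercept in the model, the mean residual after correction is zero by construction, so the remaining scatter is summarized by the SEE rather than by the mean difference.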

Discussion
The present study, based on a sample of 62-72 women, indicated systematic differences between the GE Healthcare Lunar Prodigy and iDXA DXA devices. BMD scatter plots (Prodigy versus iDXA) showed a close linear relationship over the entire range of BMD values. However, in all ROIs the regression slope was significantly different from unity, demonstrating the need for cross-calibration. Compared to those obtained with Prodigy, the BMD values measured using iDXA were 1.5% higher at the lumbar spine (L2-L4), 0.5% higher at the femoral neck, and 0.9% higher at the total hip, whereas total body BMD values were 1.3% lower.
Three phantoms were measured with both instruments and a variable BMD disagreement up to 1.4% (from −0.4% to 1.0%) between the instruments was registered. In comparison of in vitro and in vivo results the disagreement between ESP phantom measurements and in vivo spinal BMD and femoral neck values was ±0.5%. The discrepancy between Hologic phantom results and in vivo measurements ranged from 0.9% (femoral neck) up to 1.9% (spine). Furthermore, the closest agreement between the DXA devices was observed in the Lunar Phantom (0.4%) and the in vivo femoral neck BMD values (i.e., 0.5%), but not at spine with a disagreement of 0.9%. Calibration using ESP phantom may agree more closely with the in vivo cross-calibration results, as compared to use of Lunar aluminum phantom with straight edges [12]. Calibrations between different densitometers, as based on [...] [15]. However, a cross-calibration between two DXA modalities based on only phantom measurements can also be inaccurate, especially at hip and total body ROIs [15].
[Figure caption] The total body BMD difference was strongly dependent on mean BMD: at high BMD values iDXA showed significantly higher values than Prodigy, whereas at low BMD values the opposite was found. The hip and spine BMD difference was less dependent on the mean BMD values.
Table 4: Mean (SD) values and linear correlation coefficients (r) of the in vivo dual-energy X-ray absorptiometry (DXA) measurements with Prodigy and iDXA (n = 62). Bland and Altman analysis results with relative mean differences (%) as well as limits of agreement [mean ± (1.96 × SD)]. Simple linear regression analysis of Prodigy (dependent) versus iDXA (independent) BMD data with standard error (SE) and standard errors of estimates (SEE). Regional total body BMD values differed considerably between Prodigy and iDXA. BMD discrepancy was smaller but significant at total body region of interest. After linear regression correction equations were applied the differences were negligible.
According to ISCD, phantom-based cross-calibration is adequate after a hardware change or after replacing the DXA system with the same model from the same manufacturer. In contrast, in vivo cross-calibration is necessary if the old DXA system is replaced with a different model from the same or a different manufacturer [1]. Cross-calibration between the devices is essential as the mean systematic differences between instruments may exceed the annual biological BMD changes [1]. Typically, differences below 1% are encountered between similar or different devices from the same manufacturer [2,16,17]. In the present study, iDXA BMD values at the lumbar spine (L2-L4), femoral neck, and total hip were higher than those measured with Prodigy (1.5%, 0.5%, and 0.9%, respectively). In contrast, previous studies have reported that iDXA BMD values were lower than Prodigy BMD values at the lumbar spine (ranging from −0.25% to −1.2%), femoral neck (ranging from −0.7% to −2.0%), and total hip (ranging from −0.1% to −0.2%) ROIs [4,6,7]. As an exception, Choi et al. measured iDXA BMD values to be 0.3% higher than Prodigy BMD values at the total hip ROI [4]. Thus, compared to previous results [4,6,7], the discrepancy could be 2.7%, 2.5%, and 1.1% at the spine [4], femoral neck [4], and total hip [7], respectively. Importantly, the linear correction equations between iDXA and Prodigy derived in these previous cross-calibration studies [4,6,7], if implemented in the present study, would have increased the disagreement between our two DXA devices, especially at the femoral neck, total hip, and lumbar spine ROIs. Furthermore, no significant correlations were observed between the Prodigy-iDXA BMD differences and the mean BMD values of the two devices in femoral neck, total hip, and lumbar spine ROIs [4]. According to ISCD, correction equations are needed if the difference between two densitometers exceeds 1% [1].
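The discrepancies of 2.7%, 2.5%, and 1.1% quoted above follow from subtracting the most negative previously reported difference from the present difference at each site. The short check below reproduces that arithmetic; the dictionary names are introduced here for illustration, and pairing each site with the lower end of its reported range is this sketch's reading of the text.

```python
# iDXA minus Prodigy BMD difference in percent (present study).
present = {"lumbar spine": 1.5, "femoral neck": 0.5, "total hip": 0.9}

# Most negative difference reported in the previous studies for each site.
previous_extreme = {"lumbar spine": -1.2, "femoral neck": -2.0, "total hip": -0.2}

# Discrepancy between the present and previous results, in percentage points.
discrepancy = {
    roi: round(present[roi] - previous_extreme[roi], 1) for roi in present
}
print(discrepancy)
```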
In the present study, the femoral neck BMD values differed by 0.5% between Prodigy and iDXA, indicating no true need for cross-calibration. However, after applying correction coefficients the difference between these two densitometers was negligible. A 0.005 g/cm² difference of femoral neck BMD values between the Prodigy and iDXA devices found in the present study corresponds to a typical mean femoral neck bone loss during a one-year period [18].
In the present study, the average total body BMD values acquired with iDXA were 1.3% lower than those acquired with Prodigy and are thus in accordance with previous studies (differences ranging from −1.4% to −3.5%) [5,8]. Still, the maximum discrepancy between the previous [5] and present study was 2.2%. The total body BMD values, especially at high bone density levels, were considerably higher with iDXA than with Prodigy [8]. Furthermore, significantly larger differences occurred in regional BMD values between the two devices. For example, mean BMD values of iDXA in the arms ROI and in the legs ROI were 11.6% and 6.4% lower, respectively, as compared to Prodigy. The differences between the devices in regional BMD values in the arms (−13.9% [8] versus −11.6% in the present study), legs (−4.5% versus −6.4%), pelvis (−11.9% versus −11.0%), and ribs (26.1% versus 26.1%) ROIs were similar to those found earlier. In contrast, when compared to Prodigy, iDXA overestimated total body spine BMD by 4.2% in the present study, whereas it underestimated it by 3.3% in the previous study [8].
We also calculated SEE values to analyze the accuracy of our linear predictions. The present SEE values were similar to those presented earlier at the femoral neck, at the lumbar spine [5,6], at the total body [7], and at the total hip [6]. Total hip SEE values were even slightly lower than those reported previously [5]. Importantly, the larger the SEE, the more difficult it is to detect true BMD changes after a scanner change. Limits of agreement for BMD values are comparable in most ROIs, or slightly wider in some ROIs, compared to previous reports [4,7].
There are obvious reasons for some differences in the results of the present and previous studies [4-7]. As BMD values and anthropometrical variables may differ between Caucasian and Asian people [19-21], the correction equations obtained from a study of Asian people [4] or multiethnic subjects [5] may not be applicable to the present study population. However, it is not possible to determine whether this BMD discrepancy [4] is mainly due to variation in the output of individual DXA instruments or to some extent due to ethnicity. Indeed, differences of up to 5% may exist between the same types of devices from the same manufacturer in a worst-case scenario [22]. Further, both BMD values and hip geometry parameters differ significantly by gender [23]. Previous cross-calibration studies between Prodigy and iDXA have included both genders [4-7]. Gender affects cross-calibration at the hip [4,6] and at the spine [24]. Earlier, separate correction equations between Prodigy and iDXA were derived for male and female subjects in a single study [5]. As only women are included in the OSTPRE study [9], correction equations derived from women only may be more appropriate in the context of the OSTPRE study. However, ISCD makes no remark on whether gender should be taken into account in cross-calibration studies [1]. Different software versions may also affect the BMD calibration level and produce systematic differences or errors [25,26]. In our research strategy we are conservative towards changes in software versions; however, they are inevitable in longitudinal studies such as OSTPRE. To evaluate any drift or change in the measurement accuracy during the study period, quality control scans on a spine phantom (L2-L4) were performed according to the guidelines of the manufacturer. Thus, the present results apply only to Caucasian women as well as to the two particular DXA devices and software programs examined.
Furthermore, all OSTPRE study subjects have been measured with this same set of DXA devices by the same trained staff, thus reducing the DXA device uncertainty compared to that in the multicenter studies with different DXA devices from the same manufacturer or even from different manufacturers [27]. Importantly, the present study population exceeds the ISCD recommended 50 individuals for a cross-calibration study [1].
Deviation of BMD results between two devices suggests significant variations in the DXA technology and possibly, for example, in edge detection algorithms. Owing to technical developments, the image resolution of iDXA is superior to that of Prodigy [5]. Body composition measurements also differ significantly between iDXA and Prodigy [5,28,29] and this disagreement is affected by gender [5]. Inclusion of anthropometrical measurements may improve the agreement of BMD measurements between DXA densitometers [2,3]. Indeed, inclusion of the femur thickness and percentage femur fat was earlier found to improve the agreement between iDXA and Prodigy in femoral neck and total hip BMD, respectively [7]. Spinal BMD agreement was slightly but significantly affected by subject height, but not by BMI or weight [4]. In the present study, inclusion of subject weight, height, or BMI did not improve the agreement of BMD measurements between the devices at the femoral neck, total hip, lumbar spine, or total body ROIs.
In the present study, both linear and Deming regression were used to analyze the data. Both methods yielded similar results (data not shown). Ordinary linear regression has been criticized because it assumes no random error in the independent variable and may therefore underestimate the slope of the true linear relationship [13]. However, the original input data seem to have more influence on the reliability of the regression results than the particular regression procedure applied [30]. Deming regression relates the true values of the two variables (i.e., values free of experimental error); that is, it takes into account the random error of both the dependent and independent variables, which is the typical situation when the same subjects are measured with two instruments. However, the correction equations are to be applied in other iDXA studies in which different subjects are measured. In that case only experimental iDXA values are available, which leads to an obvious contradiction. To enable easier comparison and continuity with our earlier and future DXA studies, the results based on linear regression are presented in the tables.
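The difference between the two regression approaches can be sketched with the standard closed-form Deming estimator. This is a minimal illustration, not the analysis code of this study: the paired values are hypothetical, the function names are introduced here, and the ratio of error variances (delta, errors in y over errors in x) is assumed to be 1, meaning that both devices are taken to be equally imprecise.

```python
import math


def _moments(x, y):
    """Return means and sample (co)variances of paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    return mean_x, mean_y, sxx, syy, sxy


def ols_fit(x, y):
    """Ordinary least squares: random error assumed in y only."""
    mean_x, mean_y, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def deming_fit(x, y, delta=1.0):
    """Deming regression: random error allowed in both x and y.

    delta is the assumed ratio of the error variance in y to that in x.
    """
    mean_x, mean_y, sxx, syy, sxy = _moments(x, y)
    term = syy - delta * sxx
    slope = (term + math.sqrt(term ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return mean_y - slope * mean_x, slope


# Hypothetical paired BMD values (g/cm^2), for illustration only.
idxa = [0.85, 0.98, 1.07, 1.15, 1.24, 1.31]
prodigy = [0.84, 0.96, 1.06, 1.13, 1.23, 1.28]

ols_intercept, ols_slope = ols_fit(idxa, prodigy)
dem_intercept, dem_slope = deming_fit(idxa, prodigy)
print(f"OLS slope = {ols_slope:.4f}, Deming slope = {dem_slope:.4f}")
```

For positively correlated data the Deming slope is never smaller than the ordinary least-squares slope, which reflects the attenuation of the latter when the independent variable carries measurement error. With highly correlated measurements the two slopes are close to each other.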
In conclusion, iDXA measured higher BMD values than Prodigy at the spine (1.5%), femoral neck (0.5%), and total hip (0.9%) ROIs, whereas total body BMD values were lower (−1.3%). At these ROIs the differences in BMD values between the two devices were found to be independent of the anthropometrical parameters. The differences in total body BMD values were dependent on the mean BMD. Differences between the two devices were negligible when in vivo correction coefficients were applied. The current results apply to the two particular DXA devices and software programs used in the study. Importantly, the present results differed significantly from the results of previously published cross-calibration studies between iDXA and Prodigy.