Metabolic Syndrome Derived from Principal Component Analysis and Incident Cardiovascular Events: The Multi-Ethnic Study of Atherosclerosis (MESA) and Health, Aging, and Body Composition (Health ABC)

Background. The NCEP metabolic syndrome (MetS) is a combination of dichotomized, interrelated risk factors derived from predominantly Caucasian populations. We propose a continuous MetS score based on principal component analysis (PCA) of the same risk factors in a multiethnic cohort and compare its prediction of incident CVD events with that of the NCEP MetS definition. Additionally, we replicated these analyses in the Health, Aging, and Body Composition (Health ABC) study cohort.
Methods and Results. We performed PCA of the MetS elements (waist circumference, HDL, TG, fasting blood glucose, SBP, and DBP) in 2610 Caucasian Americans, 801 Chinese Americans, 1875 African Americans, and 1494 Hispanic Americans in the Multi-Ethnic Study of Atherosclerosis (MESA) cohort. We selected the first principal component as a continuous MetS score (MetS-PC). Cox proportional hazards models were used to examine the association between MetS-PC and CVD events over 5.5 years of follow-up (n = 377), adjusting for age, gender, race, smoking, and LDL-C, overall and by ethnicity. To facilitate comparison of MetS-PC with the binary NCEP definition, a MetS-PC cut point was chosen to yield the same prevalence of MetS as the NCEP definition (37%) in the MESA cohort. Hazard ratios (HRs) for CVD events were estimated using the NCEP and MetS-PC-derived binary definitions. In Cox proportional hazards models, the HR (95% CI) for CVD events per 1-SD (standard deviation) of MetS-PC was 1.71 (1.54–1.90) (P < 0.0001) overall after adjusting for potential confounders, and for each ethnicity, HRs were: Caucasian, 1.64 (1.39–1.94); Chinese, 1.39 (1.06–1.83); African American, 1.67 (1.37–2.02); and Hispanic, 2.10 (1.66–2.65). Finally, when binary definitions were compared, the HR for CVD events was 2.34 (1.91–2.87) for MetS-PC versus 1.79 (1.46–2.20) for NCEP MetS. In the Health ABC cohort, in a fully adjusted model, MetS-PC per 1-SD (Health ABC) remained associated with CVD events (HR = 1.21, 95%CI 1.12–1.32) overall, and for each ethnicity: Caucasian (HR = 1.24, 95%CI 1.12–1.39) and African American (HR = 1.16, 95%CI 1.01–1.32). Finally, when using a binary definition of MetS-PC (cut point 0.505) designed to match the NCEP definition in terms of prevalence in the Health ABC cohort (35%), the fully adjusted HR for CVD events was 1.39 (95%CI 1.17–1.64) compared with 1.46 (95%CI 1.23–1.72) using the NCEP definition.
Conclusion. MetS-PC is a continuous measure of metabolic syndrome and was a significant predictor of CVD events overall and in individual ethnicities. Additionally, a binary MetS-PC definition was better than the NCEP MetS definition in predicting incident CVD events in the MESA cohort, but this superiority was not evident in the Health ABC cohort.


Introduction
Metabolic syndrome (MetS) is a constellation of interrelated cardiovascular risk factors that increase the risk of developing both atherosclerotic cardiovascular disease and type 2 diabetes mellitus [1]. It was recognized as a syndrome in 1988 [2] because it reflects the joint action of specific cardiovascular risk factors whose underlying pathophysiology is thought to be related to insulin resistance. Since then, various expert committees have tried to refine the definition of MetS [1,3-6]. Some have concluded that MetS in its current form is imprecisely defined, loses critical information through dichotomization of continuous variables [7,8], and has uncertain value as a cardiovascular disease marker, and they have called for further research in this field. Thus, one school of thought regarding the true nature of MetS asserts that MetS is a representation of the true underlying pathophysiologic pathways of dysmetabolic components [2]. Another viewpoint is that MetS is a purely operational, arbitrary, and convenient scoring methodology that summarizes risk [7,8]. On this view, it is deficient in that it omits smoking and LDL-C, is therefore not as efficient as the Framingham risk score, and does not add new elements [7,8]. The intention of this paper is to highlight a methodology to improve upon the current NCEP definition of MetS, in the sense of improving CVD risk prediction using the same components as are included in the MetS as currently defined [1].
The commonly used definition of MetS from the National Cholesterol Education Program/Adult Treatment Panel (NCEP ATP III) [1,9] is a score based on the presence or absence of five dichotomized risk factors. When a subject exceeds the cut point for three or more of these factors, the syndrome is deemed to be present [1,3]. However, it is accepted that the risk for CVD from each element of the MetS is continuous, and falling above or below a given threshold does not correspond to the presence or absence of CVD risk. Studies show that dichotomization of the elements of MetS based on ad hoc cut points may lead to misclassification of risk in individuals, despite minimal changes in the actual elements or cardiovascular risk [10,11]. Further, the summation of components into a unitary diagnosis assumes that each dichotomized risk factor carries the same risk, yet some factors included in the NCEP MetS definition are more strongly predictive of CVD than others [8]. Additionally, the NCEP definition of MetS does not account for racial differences and may not accurately predict cardiovascular risk in non-Caucasian populations [12-16].
Principal component analysis is an analysis strategy designed to summarize multidimensional correlated data [17,18]. It can provide a continuous MetS score, unlike factor analysis [17,18], which aims to determine the underlying structure of the syndrome by identifying latent variables. The unrotated first principal component is a linear combination of the individual variables that captures the maximum variance in the data among all possible linear combinations. Since MetS is characterized by concomitant derangements in multiple factors, the first principal component can be used to indicate the extent to which any individual's metabolic risk factors are consistent with having the MetS. A few studies have applied PCA to derive a continuous MetS score and examined how this score relates to incident diabetes and cardiovascular disease (CVD) [19-23]. This study is an extension and validation of such findings in a multiethnic cohort.
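The variance-maximizing property described above can be written compactly. The notation below is ours and is introduced only for illustration: z is the vector of p standardized risk-factor measurements for one individual, R is their correlation matrix, and w_1 is the weight vector of the first principal component. It assumes the variables have been scaled to unit variance.

```latex
% z   : vector of the p standardized risk-factor measurements (assumed unit variance)
% R   : p x p correlation matrix of those measurements
% w_1 : weights of the unrotated first principal component
\[
  w_1 \;=\; \arg\max_{\lVert w \rVert = 1} \operatorname{Var}\!\left(w^{\top} z\right)
      \;=\; \arg\max_{\lVert w \rVert = 1} w^{\top} R\, w ,
  \qquad R\, w_1 = \lambda_1 w_1 ,
\]
\[
  \text{MetS-PC}_i \;=\; w_1^{\top} z_i \;=\; \sum_{k=1}^{p} w_{1k}\, z_{ik},
  \qquad
  \frac{\operatorname{Var}(\text{MetS-PC})}{\sum_{k=1}^{p} \operatorname{Var}(z_k)}
  \;=\; \frac{\lambda_1}{p}.
\]
```

Here \lambda_1 is the largest eigenvalue of R, so the share of total variance captured by the score is \lambda_1 / p.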

Study Populations and Data Collection.
The Multi-Ethnic Study of Atherosclerosis (MESA) design has been previously described [24]. Briefly, MESA is a prospective cohort study that began in July 2000 to investigate the prevalence, correlates, and progression of subclinical CVD. It included 6814 men and women aged 45-84 years recruited from 6 US communities (Baltimore, MD; Chicago, IL; Forsyth County, NC; Los Angeles County, CA; northern Manhattan, NY; St. Paul, MN). MESA cohort participants were 38% Caucasian (n = 2622), 28% African American (n = 1893), 22% Hispanic (n = 1496), and 12% Chinese (n = 803). Individuals with a history of physician-diagnosed myocardial infarction, angina, heart failure, stroke, or transient ischemic attack, or who had undergone an invasive procedure for CVD (coronary artery bypass graft, angioplasty, valve replacement, pacemaker placement, or other vascular surgeries), were excluded from the study at baseline (2000-2002). This study was approved by the Institutional Review Boards of each study site, and written informed consent was obtained from all participants.
The Health, Aging, and Body Composition (Health ABC) study is a longitudinal, prospective study investigating the associations among body composition, weight-related health conditions, and incident functional limitations in older adults [25]. The Health ABC study cohort consists of 3,075 well-functioning black and white men and women aged 70-79 at baseline (1997-1998). After excluding participants with missing data and prevalent cardiovascular disease at baseline, 2,159 participants were followed for 8.6 years for incident CVD events.

Laboratory and Anthropometric Measurements.
Participants completed standardized medical history questionnaires. Fasting blood glucose and lipids were analyzed at a central laboratory. Serum glucose was measured by Vitros analyzer (Johnson & Johnson Clinical Diagnostics). Impaired fasting glucose was defined as a fasting blood glucose ≥100 mg/dL and <126 mg/dL. Diabetes was defined by self-reported history of adult onset diabetes, fasting glucose ≥126 mg/dL, or use of insulin or oral glucose-lowering medications. Plasma lipids (HDL cholesterol and triglycerides) were measured using a standardized kit (Roche Diagnostics). LDL cholesterol was calculated with the Friedewald equation [26]. Resting seated blood pressure was measured three times using an automated oscillometric sphygmomanometer (Dinamap PRO 100; Critikon, Tampa, FL). The average of the last two measurements was used in analysis. Elevated blood pressure was defined by use of blood pressure medication or systolic blood pressure ≥130 mm Hg or diastolic blood pressure ≥85 mm Hg. Waist circumference was measured at the umbilicus to the nearest 0.1 cm using a steel measuring tape with standard 4-oz tension.
The National Cholesterol Education Program/Adult Treatment Panel (NCEP ATP III) [1,9] definition was used to classify participants as having MetS in the MESA cohort. Three of the following five components are required for diagnosis: (1) waist circumference ≥102 cm in men or ≥88 cm in women; (2) blood pressure ≥130 mm Hg systolic or ≥85 mm Hg diastolic, or use of medications for hypertension; (3) fasting blood glucose ≥100 mg/dL or treatment for impaired fasting glucose; (4) triglycerides ≥150 mg/dL or specific treatment; (5) HDL-C ≤40 mg/dL in men or ≤50 mg/dL in women.
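The three-of-five rule above can be sketched as a short function. This is a minimal illustration that uses the cut points exactly as listed in this section; the `Participant` record and its field names are hypothetical and are not MESA variable names.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Participant:
    # Hypothetical record layout for illustration only.
    male: bool
    waist_cm: float
    sbp_mmhg: float
    dbp_mmhg: float
    on_bp_medication: bool
    fasting_glucose_mgdl: float
    on_glucose_treatment: bool
    triglycerides_mgdl: float
    on_triglyceride_treatment: bool
    hdl_mgdl: float


def ncep_component_flags(p: Participant) -> List[bool]:
    """Return the five dichotomized components, using the cut points listed above."""
    return [
        p.waist_cm >= (102 if p.male else 88),
        p.sbp_mmhg >= 130 or p.dbp_mmhg >= 85 or p.on_bp_medication,
        p.fasting_glucose_mgdl >= 100 or p.on_glucose_treatment,
        p.triglycerides_mgdl >= 150 or p.on_triglyceride_treatment,
        p.hdl_mgdl <= (40 if p.male else 50),
    ]


def has_ncep_mets(p: Participant) -> bool:
    """MetS is deemed present when three or more components are flagged."""
    return sum(ncep_component_flags(p)) >= 3


# A borderline (made-up) participant: waist, glucose, and triglycerides are flagged.
borderline = Participant(
    male=True, waist_cm=103.0, sbp_mmhg=128.0, dbp_mmhg=80.0,
    on_bp_medication=False, fasting_glucose_mgdl=100.0,
    on_glucose_treatment=False, triglycerides_mgdl=160.0,
    on_triglyceride_treatment=False, hdl_mgdl=45.0,
)
# The same participant with fasting glucose lower by 1 mg/dL.
improved = replace(borderline, fasting_glucose_mgdl=99.0)
```

In this made-up example, `borderline` meets three components and `improved` meets two, so a 1 mg/dL change in glucose flips the binary classification. This is the kind of threshold effect that motivates a continuous score.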

Cardiovascular Events.
A detailed description of events and the process of adjudication can be found at the MESA website (http://www.mesa-nhlbi.org/). Briefly, participants were contacted every 9-12 months to inquire about hospital admissions, cardiovascular diagnoses, and deaths. Hospital records were abstracted for possible CVD events and were sent for review and classification by an independent adjudication committee. For the purposes of this study, a CVD event was defined as incident myocardial infarction, resuscitated cardiac arrest, definite angina, probable angina if followed by revascularization, stroke, stroke death, coronary heart disease (CHD) death, other atherosclerotic death, and other CVD deaths as defined by the MESA protocol.

Statistical Analysis.
An ANOVA test was performed to compare the mean values of the components of the MetS across the four ethnic groups. A chi-square test was performed to compare the proportion of individuals exceeding the NCEP cut points for each component of the MetS across the four ethnic groups. A principal component analysis was performed using studentized residuals of log fasting glucose, log triglycerides, log HDL cholesterol, waist circumference, systolic blood pressure, and diastolic blood pressure after adjusting for age and gender. A correlation matrix was generated to measure the correlation between the individual elements of the metabolic syndrome and the first principal component. Principal component analysis was used for the extraction of the initial factors. It transforms the original variables into a new set of uncorrelated factors (principal components) that account for the maximum proportion of the variance in the data, with each component being a linear combination of the original observed variables. The first principal component is the linear combination of variables that accounts for the largest proportion of variance in the data, the second component is the combination that accounts for the next largest proportion, and so on. Only components with eigenvalues (the sum of the squared factor loadings, representing the variance attributable to each principal component) >1.0 were considered significant. A Kaplan-Meier survival analysis of CVD events was performed across quartiles of MetS-PC, and a log rank test for trend was performed overall and in each ethnic group. After excluding subjects with missing data, Cox proportional hazards regression was used to determine the association between MetS-PC and 5.5-year incident CVD events in the MESA cohort after adjusting for potential confounders including age, gender, race, smoking, and LDL-C in nested models.
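The extraction of the first principal component described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the JMP procedure used in the study. It assumes the inputs are already the age- and gender-adjusted residuals (one list per variable), works on their correlation matrix, and uses power iteration in place of a full eigendecomposition. All function names are ours.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def standardize(columns: Matrix) -> Matrix:
    """Centre each variable (one list per variable) and scale to unit sample SD."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(z: Matrix) -> Matrix:
    """Correlation matrix of already-standardized variables."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def mat_vec(m: Matrix, v: List[float]) -> List[float]:
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def first_component(corr: Matrix, iterations: int = 500) -> Tuple[float, List[float]]:
    """Power iteration: largest eigenvalue and its unit-length weight vector."""
    p = len(corr)
    # Asymmetric start vector; the method fails only if the start is exactly
    # orthogonal to the leading eigenvector.
    w = [float(k + 1) for k in range(p)]
    for _ in range(iterations):
        v = mat_vec(corr, w)
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    # The sign of an eigenvector is arbitrary; as a convention (our assumption),
    # orient it so that the first variable has a positive weight.
    if w[0] < 0:
        w = [-x for x in w]
    eigenvalue = sum(wi * vi for wi, vi in zip(w, mat_vec(corr, w)))
    return eigenvalue, w


def component_scores(z: Matrix, w: List[float]) -> List[float]:
    """One score per individual: the weighted sum of standardized variables."""
    n = len(z[0])
    return [sum(wk * zk[i] for wk, zk in zip(w, z)) for i in range(n)]


def loadings(eigenvalue: float, w: List[float]) -> List[float]:
    """Correlation of each variable with the component: sqrt(eigenvalue) * weight."""
    return [math.sqrt(eigenvalue) * wk for wk in w]
```

With these helpers, the squared loadings sum to the eigenvalue, which matches the parenthetical definition of an eigenvalue given above, and the eigenvalue divided by the number of variables is the proportion of variance explained by the component.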
In a separate sensitivity analysis, participants with diabetes were excluded, the first principal component was recalculated, and the association between MetS-PC and CVD events was re-examined. Similar sensitivity analyses were performed after excluding participants on anti-hypertensive medications and anti-lipid medications. The patterns of association observed between MetS-PC and CVD events in these sensitivity analyses were qualitatively similar to the results observed when using the first principal component generated from the entire cohort (sensitivity analysis data not shown), and thus the results for the entire cohort are shown.
In a separate analysis, MetS-PC was recalculated using the original six variables plus high-sensitivity C-reactive protein (hsCRP). This new MetS-PC with seven components was included in separate Cox proportional hazards models with potential confounders to predict incident CVD events.
To facilitate comparison of MetS-PC with the dichotomous NCEP definition, a MetS-PC cut point (0.475) was chosen to yield the same prevalence of metabolic syndrome as the NCEP definition (37%) in the MESA cohort. Separate Cox proportional hazards models including NCEP MetS and a binary MetS-PC to predict cardiovascular events were developed to compare the relative strength of the association of the two MetS definitions, and their model chi-square values were compared.
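One way to choose a prevalence-matched cut point of this kind is to take the score at the appropriate rank of the sorted distribution. The sketch below is our illustration, not the authors' procedure. It assumes that individuals with a score at or above the cut point are classified as having MetS, and the function names are hypothetical.

```python
from typing import List


def prevalence_matched_cutpoint(scores: List[float], target_prevalence: float) -> float:
    """Return the observed score such that roughly target_prevalence of
    individuals fall at or above it (ties can push the share slightly higher)."""
    if not 0.0 < target_prevalence < 1.0:
        raise ValueError("target_prevalence must lie strictly between 0 and 1")
    ranked = sorted(scores, reverse=True)
    n_positive = max(1, round(target_prevalence * len(ranked)))
    return ranked[n_positive - 1]


def classify(scores: List[float], cutpoint: float) -> List[bool]:
    """Binary definition: True when the continuous score is at or above the cut point."""
    return [s >= cutpoint for s in scores]
```

Calling `prevalence_matched_cutpoint(scores, 0.37)` on a cohort's MetS-PC values would give a threshold comparable in role to the 0.475 reported above. The exact value depends on the cohort's score distribution.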
Similar analyses were performed using the Health ABC cohort to replicate and validate the findings from the MESA cohort. All statistical analyses were performed using JMP Version 8 (SAS Institute Inc., Cary, North Carolina).
In the MESA cohort, MetS-PC was significantly associated with CVD events, with a univariate hazard ratio of 1.56 (95%CI 1.42-1.72) (Table 3). In a full model adjusted for age, gender, race, smoking, and LDL-C, MetS-PC per 1-SD difference remained significantly associated with CVD events (HR = 1.71, 95%CI 1.54-1.90). Additionally, in similar analyses stratified by ethnicity, the hazard ratio remained significant in each ethnicity separately in both unadjusted and adjusted models (Table 3).
The prediction of CVD events using MetS-PC with seven components including hsCRP was not significantly different from that of the more parsimonious MetS-PC definition with the original six variables (data not shown).
Finally, we performed an independent validation of the analysis using the Health ABC cohort. In a full model adjusted for age, gender, race, smoking, and LDL-C, MetS-PC per 1-SD (Health ABC) remained significantly associated with CVD events (HR = 1.21, 95%CI 1.12-1.32) overall, and in each ethnicity: Caucasian (HR = 1.24, 95%CI 1.12-1.39) and African American (HR = 1.16, 95%CI 1.01-1.32).
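The per-1-SD hazard ratios reported above can be read through the usual Cox model. The block below is a worked illustration in our own notation (z is MetS-PC in SD units and c is the vector of covariates). It assumes the reported interval is a Wald interval that is symmetric on the log scale, and it uses the fully adjusted MESA estimate as the example.

```latex
% Cox proportional hazards model with MetS-PC entered per 1-SD
\[
  h(t \mid z, c) \;=\; h_0(t)\, \exp\!\left(\beta z + \gamma^{\top} c\right),
  \qquad
  \mathrm{HR}_{\text{per 1-SD}} \;=\; e^{\beta}.
\]
% Worked example: HR = 1.71, 95% CI 1.54-1.90
\[
  \hat\beta = \ln 1.71 \approx 0.536,
  \qquad
  \widehat{\mathrm{SE}}(\hat\beta) \approx \frac{\ln 1.90 - \ln 1.54}{2 \times 1.96} \approx 0.054 .
\]
% Under the model's log-linear assumption, a difference of k SD multiplies the hazard by HR^k
\[
  \mathrm{HR}_{\text{per 2-SD}} = 1.71^{2} \approx 2.92 .
\]
```

The last line holds only to the extent that the log-hazard is linear in MetS-PC, which the model assumes and the data may not fully satisfy.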

Discussion
In this ethnically diverse population of 6,780 individuals aged 45-84, the first principal component derived from elements of the MetS explains 33% of the variance of the metabolic measures. This continuous metabolic syndrome score (MetS-PC) is a significant predictor of 5.5-year incident clinical cardiovascular events in the total population and in each of the four major race/ethnicity subgroups separately. Additionally, a binary definition of the MetS based on this continuous measure was a better predictor of clinical cardiovascular events than the NCEP definition. It should be acknowledged that MetS is not an absolute risk indicator, because it does not contain many of the factors that determine absolute risk, for example, age, sex, cigarette smoking, and low-density lipoprotein cholesterol levels [6]. Risk predictors such as the Framingham risk score are much superior if risk prediction is the goal; however, the primary goal of this analysis was not risk prediction per se, but rather improving upon the current NCEP definition of MetS, using the same components that make up the MetS and comparing the two definitions using incident CVD events as a criterion measure.
The current binary definitions of the metabolic syndrome, including the NCEP definition, were developed by expert committees as a clinically useful means of identifying high-risk individuals [1,3-6]. However, dichotomization of the continuous variables leads to loss of information [7,8]. A minor improvement in one component could result in an individual no longer being classified as having MetS, despite no meaningful change in actual cardiovascular risk [11]. Additionally, the NCEP MetS may not predict risk of CVD events in all ethnicities equally, as the data for constructing the NCEP definition were derived from predominantly Caucasian populations [12-16]. The evaluation of the MetS as a continuous score derived from a multiethnic population is potentially a more informative and generalizable approach to defining the syndrome and determining its clinical correlates. Additionally, a continuous measure may also improve the ability to identify lifestyle, environmental, molecular, and genetic etiologic factors that are specific for the MetS, and in future, these etiological factors could be incorporated in the definition of the continuous measure, providing a stronger CVD predictive value [7].
Principal component analysis is a mathematical technique that transforms a number of correlated variables into a reduced number of uncorrelated variables called principal components, of which the first principal component captures the maximal variance [17,18,27]. Since the MetS is a metabolic condition characterized by the co-occurrence of multiple metabolic abnormalities, it follows that the first principal component of the measures for these traits would be an efficient way to quantify the presence of the syndrome. In our data, MetS-PC correlated well with all the components of the MetS (correlation coefficients 0.44-0.61, Table 2), consistent with a definition of a syndrome.
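As a rough consistency check on the figures quoted in this discussion, the correlations between each variable and the first component (the loadings) determine the share of variance it explains. The arithmetic below is our illustration. It assumes an unrotated PCA on p = 6 standardized variables and treats the reported correlations of 0.44-0.61 as absolute values.

```latex
% Loadings \ell_k = \operatorname{corr}(z_k, \text{MetS-PC}); their squares sum to the first eigenvalue
\[
  \lambda_1 \;=\; \sum_{k=1}^{p} \ell_k^{2},
  \qquad
  \text{proportion of variance explained} \;=\; \frac{\lambda_1}{p}.
\]
% With p = 6 and 33% of variance explained:
\[
  \lambda_1 \approx 0.33 \times 6 \approx 1.98 .
\]
% Bounds implied by loadings between 0.44 and 0.61:
\[
  6 \times 0.44^{2} \approx 1.16 \;\le\; \lambda_1 \;\le\; 6 \times 0.61^{2} \approx 2.23 .
\]
```

The eigenvalue implied by the 33% figure falls inside the range implied by the reported correlations, and it exceeds the 1.0 threshold used to retain components.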
The metabolic syndrome is a known risk factor for cardiovascular disease, and MetS-PC was strongly associated with clinical cardiovascular events in this multiethnic cohort. In fully adjusted models, MetS-PC was significantly associated with increased risk of incident CVD in this cohort, with a hazard ratio of 1.71 per 1-SD and a more than twofold increase in risk (HR = 2.34) for the binary definition. We chose to show the analysis in the entire cohort, including participants with diabetes, as the analysis after excluding them was qualitatively similar to the results obtained with the entire cohort. In analyses stratified by ethnicity, MetS-PC was a significant predictor of CVD across the four ethnic groups, with similar point estimates for the hazard ratios. Additionally, it has been suggested that, for better risk prediction, the definition of metabolic syndrome should be broadened to incorporate other elements such as high-sensitivity C-reactive protein (hsCRP) [7,28]. We performed an analysis in which we added hsCRP to the MetS-PC, but the addition did not lead to any substantial improvement in risk prediction of incident CVD events. Our findings parallel similar results in which the addition of hsCRP to MetS did not improve prediction of incident CVD events [29] or atherogenesis [30].
Further, we compared the MetS-PC definition of MetS with the NCEP definition in predicting CVD events in the MESA cohort. To facilitate comparison of MetS-PC with the dichotomous NCEP definition, a MetS-PC cut point (0.475) was chosen to yield the same prevalence of MetS as the NCEP definition (37%) in the MESA cohort. The strength of association with incident cardiovascular events was compared using the NCEP and the derived binary MetS-PC definitions in a fully adjusted model. The binary MetS-PC has a stronger association with incident cardiovascular events than the NCEP MetS, as the point estimate of the hazard ratio for the binary MetS-PC definition is not included in the confidence interval of the NCEP hazard ratio. Additionally, the chi-square values were much higher when using the binary MetS-PC definition compared to the NCEP MetS definition. However, when a similar strategy was applied as an independent validation in the Health ABC cohort, the association of MetS-PC with CVD events remained, but the superiority of the MetS-PC definition over the NCEP MetS definition was no longer evident. This could be the result of inherent differences between the two cohorts, such as differences in age. The mean age of the MESA cohort is 62 years versus 74 for the Health ABC cohort. Additionally, the overall mean age- and gender-adjusted systolic and diastolic blood pressures are much lower in the MESA cohort than in the Health ABC cohort. This is reflected in the loadings for the first principal component. Both the systolic and diastolic blood pressures are well represented in the first principal component in the MESA cohort (loading factors: systolic BP = 0.61, diastolic BP = 0.56). In contrast, the systolic and diastolic BP were less heavily represented in the PC analysis for the Health ABC cohort (loading factors: systolic BP = 0.13, diastolic BP = 0.05), perhaps secondary to the greater prevalence of essential hypertension uncorrelated with the other metabolic derangements of the MetS. In a previous Health ABC study evaluating CVD risk in older adults with MetS without past history of coronary heart disease and heart failure, the proportions with MI (6.1% versus 4.8%, P = 0.18) and HF hospital stay (5.6% versus 4.3%, P = 0.17), although higher among those with MetS than among those without MetS, did not differ significantly [31]. This finding perhaps explains the lower hazards for CVD events in the Health ABC cohort compared to the MESA cohort findings.
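The comparison criterion used above (whether one definition's point estimate falls outside the other definition's confidence interval) can be checked directly against the hazard ratios reported in this paper. The helper below is a hypothetical illustration. It is an informal check, not a formal test of the difference between two hazard ratios.

```python
from typing import Tuple


def outside_interval(point_estimate: float, interval: Tuple[float, float]) -> bool:
    """True when the point estimate lies outside the given confidence interval."""
    lower, upper = interval
    return point_estimate < lower or point_estimate > upper


# Binary MetS-PC hazard ratio versus the NCEP 95% CI, as reported in this paper.
mesa_outside = outside_interval(2.34, (1.46, 2.20))        # MESA cohort
health_abc_outside = outside_interval(1.39, (1.23, 1.72))  # Health ABC cohort
```

The MESA estimate lies above the NCEP interval, whereas the Health ABC estimate lies within it, which mirrors the conclusion that the advantage of the binary MetS-PC definition was seen only in MESA.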
A few studies have applied PCA to identify factors and their relationship to incident diabetes and cardiovascular disease (CVD) using the elements of the MetS, measures of obesity, and insulin resistance [19-22]. Hillier et al. [22] found increased odds of cardiovascular events in a middle-aged French cohort of 5,024 participants (men, 1.7 [1.4, 2.1]; women, 1.7 [1.0, 2.7]), which is similar to our finding of 1.71 (1.54, 1.90). Lempiainen et al. [19] applied factor analysis to 1069 subjects aged 65 to 74 years from Finland followed for a period of 7 years and found similar hazards of coronary events. Similarly, Pyörälä et al. [23] applied factor analysis to 970 healthy men aged 34 to 64 years in the Helsinki Policemen Study and found that the insulin resistance factor was associated with a hazard ratio for coronary heart disease of 1.28 (95% CI 1.10-1.50) during 22 years of follow-up. The current study builds on these earlier efforts by applying similar methods to a larger and much more ethnically diverse cohort. The strength of the association between the continuous MetS-PC score and CVD events, its utility for predicting CVD events in multiple ethnicities, and its improved performance relative to the NCEP definition in the MESA cohort suggest that this approach may be a superior strategy for defining the metabolic syndrome.
The strengths of this study are the inclusion of four different ethnicities from six different recruitment sites in the United States, stringent quality control procedures, and an average 5.5-year follow-up for identification of incident cardiovascular events. Additionally, we performed a replication of the strategy using an independent cohort and found results to be qualitatively similar. Among its limitations, the exclusion of individuals with known cardiovascular disease calls for caution in generalizing results to the total population. It should also be emphasized that acculturation to diet and environment plays a role in the development of cardiovascular disease, and therefore, our findings may not apply to the same ethnicities in other parts of the world. Additionally, the sample size for Chinese Americans was small, and the event rate was low, making the estimates of risk less certain in the Chinese American participants compared with other groups. Further, this continuous score will need to be validated in larger cohorts before finding general applicability.

Conclusion
In this multi-ethnic population, the first principal component derived from the elements of the metabolic syndrome represents a continuous metabolic syndrome score which significantly predicts clinical cardiovascular events overall and across all ethnicities. Additionally, when a binary score derived from the first principal component was compared to the NCEP definition of MetS in the MESA cohort, the binary MetS-PC score was a better predictor of incident cardiovascular events than the NCEP definition of the metabolic syndrome. However, in the Health ABC cohort, although results were qualitatively similar, the MetS-PC definition was not found to be superior to the NCEP definition of MetS. This strategy will need to be replicated in other large multi-ethnic cohorts and, if also predictive of incident diabetes, it could be considered as an alternative to the conventional metabolic syndrome definition. More research will be required to clarify if and how such a score could be incorporated into clinical practice.