Serum Lipidomics Profiling to Identify Biomarkers for Non-Small Cell Lung Cancer

Non-small cell lung cancer (NSCLC) is the leading cause of cancer death worldwide, ranking first in both incidence and mortality. To broaden our understanding of the lipid metabolic alterations in NSCLC and to identify potential biomarkers for early diagnosis, we performed nontargeted lipidomics analysis of serum from 66 patients with early-stage NSCLC, 40 patients with lung benign disease (LBD), and 40 healthy controls (HC) using Ultrahigh Performance Liquid Chromatography-Quadrupole Time-of-Flight Mass Spectrometry (UHPLC-Q-TOF/MS). The identified biomarker candidates, phosphatidylcholines (PCs) and phosphatidylethanolamines (PEs), were further externally validated by targeted lipidomic analysis in a cohort of 30 early-stage NSCLC, 30 LBD, and 30 HC. We observed a significantly altered lipid metabolic profile in early-stage NSCLC and identified panels of PCs and PEs that distinguish NSCLC patients from HC. The levels of PCs and PEs were dysregulated in glycerophospholipid metabolism, which was the top altered pathway in early-stage NSCLC. Receiver operating characteristic (ROC) curve analysis revealed that panels of PCs and PEs performed well in differentiating early-stage NSCLC from HC. The levels of PE(16:0/16:1), PE(16:0/18:3), PE(16:0/18:2), PE(18:0/16:0), PE(17:0/18:2), PE(18:0/17:1), PE(17:0/18:1), PE(20:5/16:0), PE(18:0/18:1), PE(18:1/20:4), PE(18:0/20:3), PC(15:0/18:1), PC(16:1/20:5), and PC(18:0/20:1) were significantly higher in early-stage NSCLC than in HC (p<0.05). Overall, our study highlights the power of comprehensive lipidomic approaches for identifying biomarkers and underlying mechanisms in NSCLC.


Introduction
Lung cancer is the leading cause of cancer death worldwide, ranking first in both incidence and mortality [1]. Non-small cell lung cancer (NSCLC), which accounts for 80% of all lung cancer cases, comprises adenocarcinoma (ADC), squamous cell carcinoma (SqCC), and large cell carcinoma. Despite the great progress made against NSCLC in recent years, its five-year survival rate is approximately 15% [2,3]. Currently, clinical diagnosis of NSCLC depends mainly on chest X-rays and computed tomography, but these techniques have low sensitivity and specificity. Biopsy is unsuitable for frequent tumor detection because of its invasiveness [4][5][6]. In addition, the common tumor biomarkers used in NSCLC, such as carcinoembryonic antigen (CEA) and cytokeratin 19 fragment (CYFRA21-1), have poor diagnostic value and are not suitable for early detection of NSCLC [7][8][9]. Therefore, novel biomarkers for the early diagnosis of NSCLC are needed.
Metabolomics can adapt nonconventional technology to tumor biomarker research and has been used in pharmacological analysis and disease diagnosis [10]. As an important branch of metabolomics, lipidomics is a systems-level study of lipids that aims at comprehensive analysis of all lipids in a biological system [11][12][13]. Lipids are fundamental components of biological membranes as well as metabolites of organisms, and they play a critical role in cellular energy storage, structure, and signaling [14][15][16]. Lipid imbalance is closely associated with numerous lifestyle-related human diseases, such as atherosclerosis [17], obesity [18], diabetes [19], Alzheimer's disease [20], and cancer [21]. Lipidomics has been accepted as a research tool in lipid biochemistry [22], clinical biomarker discovery [21], disease diagnosis [23], and the study of disease pathology [24]. Lipidomics can not only provide insights into the specific functions of lipid species in health and disease but also identify potential biomarkers for establishing preventive or therapeutic programs for human diseases. Applying lipidomics to NSCLC biomarker discovery offers an opportunity to gain novel insights into the biochemical mechanisms of NSCLC [25]. Phospholipid and sphingolipid profiles have been reported to change in NSCLC, which may have important biological implications and significant potential for biomarker development [26][27][28]. To date, however, few studies have clarified the changes in lipid profiles among early-stage NSCLC, lung benign disease, and healthy controls, and potential lipid biomarkers for early diagnosis have not been found. HPLC-MS has been widely used in lipidomics because it provides accurate qualitative and quantitative analysis.
In this study, we used UHPLC-Q-TOF/MS to profile, identify, characterize, and quantify lipid compounds because of its high scanning speed, resolution, and sensitivity.
To broaden our understanding of the metabolic alterations, especially the lipid metabolic alterations in NSCLC and to identify potential biomarkers for early diagnosis, untargeted lipidomics evaluation was performed in sera from 66 early-stage NSCLC (35 ADC and 31 SqCC), 40 LBD, and 40 healthy controls (HC). In the subsequent pathway analysis, glycerophospholipid (GPL) pathway emerged at the top of these significantly altered metabolic pathways. The identified biomarker candidates of phosphatidylcholines (PCs) and phosphatidylethanolamines (PEs) were further externally validated in a cohort including 30 early-stage NSCLC, 30 LBD, and 30 HC by a targeted lipidomic analysis.

Patients and Sample Collection.
Serum samples were collected from NSCLC patients, LBD patients, and HC at Huzhou Central Hospital from January 2015 to July 2016. The patients were selected according to the following criteria: (1) all patients were diagnosed and confirmed by pathology; (2) patients with NSCLC were at the early stages (Stages I, II) according to the clinical staging method; (3) patients had no other diseases that might affect lipid metabolism, such as hyperlipidemia, diabetes, and other cancers; and (4) none of the patients received preoperative adjuvant chemotherapy or radiotherapy. LBD was defined as benign nodules, epithelioid granuloma, hamartoma, and inflammatory lesions. Serum samples from HC were collected from healthy volunteers with no history of carcinoma. Histopathology results for all cancer patients were confirmed by surgical resection of the tumors, while clinicohistopathological characteristics and tumor stages were assessed based on biopsy results. No preoperative chemotherapy or radiotherapy was administered to the cancer patients included in this study.
All samples were collected in accordance with ethical guidelines, and written informed consent was obtained. All patients were approached based on approved ethical guidelines, and patients who agreed to participate were required to sign consent forms before being included in the study. The study was approved by the Research Ethics Committee of Huzhou Central Hospital (No. 20150801). We also confirm that all methods were performed in accordance with the relevant guidelines and regulations.
Before the collection of serum samples, patients and healthy volunteers fasted for at least 12 hours. Briefly, for serum isolation, blood was collected into BD Vacutainer tubes ("increased silica act clot activator, silicone-coated interior") and centrifuged at 700 g for 10 min at 4 °C within 2 hours of venipuncture. The supernatant was removed and centrifuged a second time in the same way. The resultant serum was transferred into a clean tube and stored at -80 °C until use.

Sample Preparation.
To perform the serum lipid analysis, 100 μL of sample was added to 480 μL of extraction liquid (MTBE:methanol = 5:1, v/v) and vortexed for 30 s. The mixtures were allowed to stand for 20 min and then centrifuged at 3000 rpm for 15 min. A 400 μL aliquot of the supernatant (MTBE extract) was transferred to a clean vial and dried in a vacuum concentrator. Dried samples were reconstituted with 100 μL of dichloromethane/methanol (1:1, v/v).

Chromatography and Mass Spectrometry.
Lipid profiling was performed on a UHPLC system (1290 series, Agilent Technologies, USA) coupled with a quadrupole time-of-flight mass spectrometer (Triple TOF 6600, AB SCIEX, USA). A Phenomenex Kinetex C18 100 Å column (1.7 μm, 2.1 × 100 mm) (Phenomenex, USA) was used to separate the lipid extracts. The column was maintained at 25 °C. The mobile phase consisted of A (10 mmol/L ammonium formate, ACN:H2O = 6:4) and B (10 mmol/L ammonium formate, IPA:H2O = 9:1), and the gradient started at 60% A and 40% B. Gradient conditions were as follows: 0-12 min, linear gradient from 40% to 100% B; 12-13.5 min, 100% B. The flow rate was 300 μL/min. The injected sample volume was 1 μL. Data acquisition and processing were performed with the acquisition software Analyst TF (version 1.7.1, AB SCIEX, USA), which acquires high-resolution MS and tandem-MS data simultaneously by TOF MS full scan and information-dependent acquisition (IDA) in both ESI(+) and ESI(−) modes. The source parameters were set as follows: GAS1: 60 psi; GAS2: 60 psi; CUR: 30 psi; TEM: 250 °C; ISVF: 5500 V in positive mode and -4500 V in negative mode; DP: 100 V; CE: 10 eV. MS raw data files were converted into the mzXML format using MSconverter and processed with the R package XCMS (version 1.41.0). Preprocessing generated a data matrix consisting of the retention time (RT), mass-to-charge ratio (m/z), and peak intensity. The R package CAMERA was used for peak annotation after XCMS data processing [29]. Lipids were identified by matching the acquired MS/MS data against MS/MS data in an in-house database. The cutoff for the match score was set to 0.8, and minfrac was set to 0.5. All m/z errors were less than 30 ppm and all RT errors were less than 60 s. The data were normalised and their distribution was evaluated with MetaboAnalystR.
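The identification tolerances stated above (m/z error below 30 ppm, RT error below 60 s) can be illustrated with a minimal sketch. The function names and the example values are hypothetical and are not taken from the authors' pipeline; only the two thresholds come from the text.

```python
def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def within_tolerance(observed_mz: float, reference_mz: float,
                     observed_rt_s: float, reference_rt_s: float,
                     max_ppm: float = 30.0, max_rt_s: float = 60.0) -> bool:
    """True if both the m/z and the RT deviations are inside the stated limits."""
    mz_ok = abs(ppm_error(observed_mz, reference_mz)) < max_ppm
    rt_ok = abs(observed_rt_s - reference_rt_s) < max_rt_s
    return mz_ok and rt_ok


# Illustrative feature and library entry (made-up numbers, not study data).
print(within_tolerance(760.5900, 760.5851, 412.0, 430.0))
```

In this sketch a feature is accepted only when both criteria hold, which is one plausible reading of the stated limits; the match-score cutoff of 0.8 would be applied as a separate filter.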

Statistical Analyses.
Data are presented as mean ± SD. SIMCA-P 14.1 (Umetrics, Umeå, Sweden) was used for multivariate analysis, including principal component analysis (PCA) with mean-centered (ctr) scaling and orthogonal partial least squares discriminant analysis (OPLS-DA) with unit variance (uv) scaling. PCA was first used to reduce the dimensionality of the multidimensional dataset while giving a comprehensive view of its clustering trend. OPLS-DA was then used to characterize global lipid changes among NSCLC patients, LBD patients, and HC, and the corresponding variable importance in the projection (VIP) values were calculated from the OPLS-DA model. Sevenfold cross-validation of the OPLS-DA model was used to estimate its robustness and predictive ability. Potential metabolic biomarkers were selected with a VIP value greater than 1 and a Student's t-test p value less than 0.05. In addition, the differentially abundant metabolites were cross-referenced to pathways by searching commercial databases, including KEGG (http://www.genome.jp/kegg/) and MetaboAnalyst (http://www.metaboanalyst.ca/).
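The two scaling modes named above differ only in whether each variable is divided by its standard deviation after centering. A minimal sketch of both follows, assuming the sample standard deviation (n - 1 denominator); this is an illustration of the general technique, not SIMCA-P's internal implementation, and the function names are hypothetical.

```python
from statistics import mean, stdev


def mean_center(values):
    """Mean-centered (ctr) scaling: subtract the variable's mean."""
    m = mean(values)
    return [v - m for v in values]


def unit_variance_scale(values):
    """Unit variance (uv) scaling: center, then divide by the sample SD.

    Assumes the variable is not constant (SD > 0).
    """
    m = mean(values)
    s = stdev(values)  # sample SD, n - 1 denominator (an assumption here)
    return [(v - m) / s for v in values]


# Illustrative intensities for one lipid feature across three samples.
intensities = [2.0, 4.0, 9.0]
print(mean_center(intensities))
print(unit_variance_scale(intensities))
```

After uv scaling every variable has unit variance, so abundant and scarce lipids contribute on the same footing to the OPLS-DA model, whereas ctr scaling preserves the original variance structure for PCA.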

Selection of Metabolites for Targeted Lipidomics.
Many factors were considered in selecting appropriate lipid metabolites for targeted lipidomics. Because metabolomics is the study of metabolic profiles in living systems, the affected metabolic pathways containing affected metabolites were the principal criterion for selecting the biomarkers. In addition, the similarity values reflecting the accuracy of compound identification and the number of differentially abundant metabolites detected in each test sample were important reference factors.
... (1:1, v/v) and subjected to UHPLC-MS/MS analysis. A 6 μL aliquot of each sample was taken and pooled to form quality control (QC) samples.

Data Processing.
The data were processed by an absolute quantitative lipidomics method [30]. MS raw data files were converted to the mzXML format using MSconverter and processed with the R package XCMS (version 1.41.0). Preprocessing generated a data matrix consisting of the retention time (RT), mass-to-charge ratio (m/z), and peak intensity. Lipids were identified by matching the acquired MS/MS data against MS/MS data in an in-house database. The cutoff for the match score was set to 0.8, and minfrac was set to 0.5. All m/z errors were less than 30 ppm and all RT errors were less than 60 s. Metabolic features detected in fewer than 50% of the QC samples were discarded [31]. The absolute concentration (ng/ml) of each PC and PE was calculated from the peak area of the PC or PE identified in the sample and the peak area of the corresponding internal standard, PC(15:0/18:1) or PE(15:0/18:1), in that sample.
Statistical software version 19.0 (SPSS Inc., Armonk, NY, USA) was used for statistical analyses. Data are presented as mean ± SD. Differences in the levels of PC and PE among the three groups were evaluated by one-way analysis of variance (ANOVA) with Fisher's least significant difference (LSD) test. ROC curve analysis was used to calculate the area under the ROC curve (AUC), sensitivities, and specificities. Differences were considered statistically significant when p values were less than 0.05 and the fold change was larger than 1.5.

... of lipid profiles is provided in Figure 1. The PCA score plots obtained for the NSCLC, LBD, and HC groups are shown in Figure 2. PCA revealed a clear separation between NSCLC patients and HC (Figure 2(a)). The parameters of the OPLS-DA score plots (Figure 3) are shown in Table 2. As shown in Figure 3(a), the OPLS-DA score plot revealed a clear separation between NSCLC patients and HC, with good fitting and predictive performances (
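The QC-based feature filter and the internal-standard quantification described in the data processing text can be sketched as follows. This assumes single-point calibration with a response factor of 1 and a known spiked internal-standard concentration; neither detail is stated in the source, and the function names and numbers are hypothetical.

```python
def keep_feature(qc_intensities, min_fraction=0.5):
    """Keep a feature only if it is detected in at least min_fraction of QC runs.

    A missing (None) or zero intensity counts as "not detected".
    """
    detected = sum(1 for x in qc_intensities if x is not None and x > 0)
    return detected / len(qc_intensities) >= min_fraction


def concentration_ng_per_ml(analyte_area, standard_area, standard_conc_ng_per_ml):
    """Single-point internal-standard quantification (assumed response factor 1).

    concentration = (analyte peak area / internal-standard peak area)
                    * spiked internal-standard concentration
    """
    if standard_area <= 0:
        raise ValueError("internal-standard peak area must be positive")
    return analyte_area / standard_area * standard_conc_ng_per_ml


# Illustrative values only: a PE peak quantified against its class standard.
print(keep_feature([1500.0, None, 0.0, 1720.0]))
print(concentration_ng_per_ml(2000.0, 1000.0, 50.0))
```

Each PC would be quantified against the PC internal standard and each PE against the PE internal standard, so that class-specific ionization differences partly cancel in the area ratio.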

Discovery and Identification of Potential Lipid Biomarkers.
Lipid metabolite features with a variable importance in projection (VIP) value > 1.0, fold change (FC) > 1.5, and P value < 0.05 were regarded as potential differential lipid metabolites. As summarized in Tables 3-5, 60 specific lipid metabolites distinguished NSCLC from HC, 8 distinguished NSCLC from LBD, and 44 distinguished LBD from HC. PCs and PEs were significantly upregulated in the serum of early-stage NSCLC patients compared with HC and LBD, a finding that required further external validation by targeted lipidomic analysis. The pathways matched in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database included glycerophospholipid metabolism, glycosylphosphatidylinositol (GPI)-anchor biosynthesis, linoleic acid metabolism, alpha-linolenic acid metabolism, and glycerolipid metabolism (Figure 4). Table 6 lists the detailed results of the pathway analysis. The glycerophospholipid (GPL) pathway emerged at the top of these significantly altered lipid metabolic pathways.
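The three selection thresholds above combine into a simple filter, sketched below. The `Feature` record and its example values are hypothetical, and treating a 1.5-fold decrease the same as a 1.5-fold increase is an assumption of this sketch, since the text states only FC > 1.5.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    vip: float          # variable importance in projection from OPLS-DA
    fold_change: float  # ratio of group means (must be positive)
    p_value: float      # from Student's t-test


def is_candidate(f: Feature, vip_min: float = 1.0,
                 fc_min: float = 1.5, p_max: float = 0.05) -> bool:
    """Apply the VIP, fold-change, and p-value thresholds together."""
    # Assumption: a decrease of the same magnitude (FC < 1/1.5) also qualifies.
    magnitude = max(f.fold_change, 1.0 / f.fold_change)
    return f.vip > vip_min and magnitude > fc_min and f.p_value < p_max


# Made-up features for illustration; these are not results from the study.
features = [
    Feature("lipid_A", vip=1.8, fold_change=2.1, p_value=0.003),
    Feature("lipid_B", vip=0.9, fold_change=2.1, p_value=0.003),
    Feature("lipid_C", vip=1.2, fold_change=1.2, p_value=0.010),
]
print([f.name for f in features if is_candidate(f)])
```

Requiring all three criteria at once keeps only features that matter to the multivariate model, differ by a meaningful magnitude, and pass the univariate test.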

Targeted Metabolomics Analysis.
We analyzed the changes in the concentrations of 85 PCs and 53 PEs in the early-stage NSCLC, LBD, and HC groups. The levels of PCs and PEs were compared among the three groups using ANOVA with the LSD test. Fold changes in the mean concentrations of PCs and PEs were also calculated among the groups. As shown in Table 7 ... To estimate the diagnostic value of the targeted PCs and PEs, ROC analysis was further performed. The sensitivity, specificity, and area under the curve (AUC) of each lipid metabolite and of the combination of PCs and PEs are presented in Table 7. No single PC or PE had good diagnostic performance in distinguishing NSCLC from LBD or HC. However, as shown in Figure 7 ...
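For readers unfamiliar with the ROC quantities reported here, the following minimal sketch computes an AUC with the rank-based (Mann-Whitney) formulation, plus sensitivity and specificity at a single cutoff. It is a generic illustration, not the authors' statistical-software procedure; the scores are made up, and how a lipid panel is combined into one score per subject is not specified in the text.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case scores above a random control.

    Ties between a case and a control count as half a win.
    """
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Call a subject positive when the score is at or above the cutoff."""
    true_pos = sum(1 for c in case_scores if c >= cutoff)
    true_neg = sum(1 for h in control_scores if h < cutoff)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Illustrative scores for three cases and three controls (not study data).
cases, controls = [3.0, 4.0, 5.0], [1.0, 2.0, 3.0]
print(roc_auc(cases, controls))
print(sensitivity_specificity(cases, controls, cutoff=3.0))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which single lipids and panels are compared.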

Discussion
NSCLC is the most frequently diagnosed cancer and has high mortality, partly because of late diagnosis and poor prognosis. Many of the commonly used serum tumor biomarkers are limited to late-stage disease and have low sensitivity and specificity [32,33]. Currently, there are a handful of validated small-molecule biomarkers for NSCLC that can be used to avoid the need for tumor biopsies when classifying NSCLC. However, a new, highly accurate diagnostic technique for NSCLC, particularly one that distinguishes early cancer from benign lesions, is still needed in clinical practice.
Lipids are hydrophobic or amphipathic small molecules that originate entirely or in part by carbanion-based condensations of thioesters and/or by carbocation-based condensations of isoprene units [34]. Many studies have reported that dyslipidemia, a major component of metabolic syndrome, plays an important role in the carcinogenesis of various cancers, including breast cancer [35], prostate cancer [36], and ovarian cancer [37]. For NSCLC, lipidomics has been well documented to show potential for cancer diagnosis [27,38,39]. In our study, we identified PCs and PEs whose serum concentrations differed significantly among HC, early-stage NSCLC, and LBD patients. GPL metabolism was the top altered pathway in the NSCLC samples. The serum concentrations of some PCs and PEs increased in the NSCLC patients, while the others decreased. These results might be caused by the regulatory mechanisms of cellular metabolism. Phospholipids, one of the major components of cell membranes, participate in various biological functions, and their levels are altered in various human cancers [40,41]. PCs are the most abundant bilayer-forming phospholipids in eukaryotic membranes and can contribute to proliferative growth in cancer cells [42,43]. Abnormal PC metabolism has been reported in cancer cells. Increased PC levels have been reported in lung, colorectal, gastric, pancreatic, and other cancers and might thus be interpreted as a requirement for the high rate of cancer cell proliferation [44]. Additionally, increased levels of PCs may be correlated with the overexpression of choline kinase in various cancers [45]. In our study, the levels of PC ... function, endocytosis, autophagy, stress responses, apoptosis, and aging. PE is also a target of potent anticancer natural products [46]. In our study, the levels of PEs in early-stage NSCLC patients were significantly increased compared with LBD and HC.
Consistent with our finding, Fahrmann et al. also found that PEs tended to be elevated in serum from lung cancer patients compared with those with benign nodules [47]. Aberrant PE metabolism has also been detected in other cancers, such as hepatocellular carcinoma, colorectal cancer, and breast tumor [48]. Huang et al. previously showed that A549 lung adenocarcinoma cells increase secretion of PE binding protein (PEBP), which is overexpressed in lung cancer and has been shown to modulate the development, invasion, and metastatic potential of tumors [49]. Thus, we speculate that elevated PEs may, in part, act as agonists of PEBP-mediated signal transduction. Like PC, PE was found to increase consistently in tumors. In our study, no single PC or PE had good diagnostic performance in distinguishing NSCLC from LBD or HC. Panels of PCs and PEs performed well in differentiating NSCLC patients, LBD patients, and HC, a result that should be further validated in a larger sample.

Conclusions
We observed a significantly altered lipid metabolic profile in early-stage NSCLC using UHPLC-Q-TOF/MS-based nontargeted lipidomic analysis and identified panels of PCs and PEs to distinguish NSCLC, LBD patients, and HC. The identified PCs and PEs were further externally validated by a targeted lipidomic analysis. ROC analysis revealed that a panel of 14 PCs and PEs exhibited good performance in differentiating HC and early-stage NSCLC patients. A panel of 10 PCs and PEs exhibited good performance in differentiating HC and LBD patients. A panel of 2 PCs and PEs exhibited good performance in differentiating early-stage NSCLC and LBD patients. Our study has thus highlighted the power of using comprehensive lipidomic approaches to identify biomarkers and underlying mechanisms in NSCLC.

Data Availability
The data used to support the findings of this study are included within the article and its supplementary information files.

Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.