Immune Infiltration Analysis with the CIBERSORT Method in Lung Cancer

Background. Immune infiltration of lung cancer (LC) is closely related to clinical outcome. However, previous studies have not elucidated the diversity of the functionally distinct cell types that make up the immune response. Methods. In the present study, using a deconvolution algorithm (CIBERSORT) and clinically annotated expression profiles, we comprehensively analyzed the tumor-infiltrating immune cells (TIICs) present in 502 LC samples and 49 normal samples. The fractions of 22 immune cell subsets were estimated to assess the relationship between each cell type and survival and response to chemotherapy. Results. Immune infiltration profiles differed markedly between paired tumor and adjacent tissues, and this variation may reflect differences between individuals. Among the cell subsets studied, tumors lacking resting dendritic cells or with fewer follicular helper T (Tfh) cells were associated with poor prognosis. Correlation analysis between LC stage and the 22 immune cell subpopulations showed that the abundance of 14 immune cell types was significantly associated with tumor stage. High levels of resting dendritic cells and Tfh cells predicted a better prognosis, and univariate analyses showed that these two TIICs were significantly associated with patient prognosis. Conclusions. These data suggest that there may be subtle differences in the cellular composition of the immune infiltrate in LC and that these differences may be important determinants of prognosis and response to therapy.


Introduction
Tumor-infiltrating immune cells (TIICs) are closely associated with prognosis and with the identification of immunotherapy targets in lung cancer (LC) [1]. LC is the leading cause of cancer-related death worldwide, causing 1.6 million deaths each year [2]. Metastasis is responsible for the high death rate in LC. Although treatment options for LC have improved significantly, patients with LC still have a poor prognosis and limited treatment options [3,4]. Moreover, LC is often diagnosed at an advanced stage, when only palliative treatment is feasible. Genomic alterations in carcinoma have been studied in depth to distinguish patient subtypes with different prognoses [5]. Accumulating evidence indicates that the abnormal phenotypes of carcinoma are determined by complex interactions among a variety of cell types in the tumor microenvironment (TME), especially TIICs [6,7].
As an immune-sensitive cancer, LC is infiltrated by a heterogeneous population of TIICs, such as T cells, dendritic cells (DCs), macrophages, neutrophils, and mast cells, and the type, density, and location of TIICs in LC have significant prognostic value [8]. Previous studies relied mainly on immunohistochemistry and flow cytometry to profile TIIC subtypes. However, flow cytometry requires precise and careful sample processing, and immunophenotyping cannot identify enough immune populations [9,10]. Newman et al. developed a bioinformatic tool, the CIBERSORT algorithm, which estimates the abundance of member cell types in a mixed cell population from gene expression data [11,12]. CIBERSORT can accurately estimate the fractions of 22 closely related immune cell types using only signature genes from bulk tumor samples, and it enables immune cell profiling by deconvolution of gene expression microarray data [13]. In the present study, we applied CIBERSORT to assess the relative proportions of immune cells in 49 adjacent samples and 502 LC samples.
In this study, we used CIBERSORT to evaluate the 22 TIIC subtypes in LC and to elucidate their specific associations with molecular subgroup, survival, and response to chemotherapy. The study examines the association between TIIC heterogeneity and disease progression in LC.

Materials and Methods
2.1. Data Acquisition. We obtained gene expression profiles of LC (n = 502) and healthy specimens (n = 49) from TCGA, together with clinical features of the corresponding patients with LC, including medication history, histologic grade, pathologic stage, and survival information. For cases whose prognostic data were not linked to their expression profiles, we searched the supplementary materials for the missing information. RNA sequencing data were normalized using the voom method (mean-variance modelling at the observation level), which transforms count data into values more similar to those obtained from microarrays. For the clinical data, we excluded LC patients with missing information on age, gender, TNM stage, local invasion, survival time, or disease-free survival, leaving 463 LC patients with complete information. We manually collated the expression data and the corresponding clinical data. Our study followed the guidelines for profiling TIICs with CIBERSORT [9].

Analysis of TIICs.
CIBERSORT is a gene expression-based algorithm that uses a set of barcode gene expression values (a "signature matrix" of 547 genes) to estimate immune cell composition from bulk tumor specimens [14]. To quantify the fractions of 22 immune cell types in LC samples, the normalized gene expression data sets were uploaded to the CIBERSORT web portal (http://cibersort.stanford.edu/), with the number of permutations set to 1,000. For every specimen, the 22 TIIC types were quantified together with the CIBERSORT metrics: the CIBERSORT p value, Pearson's correlation coefficient, and the root mean squared error (RMSE) [15]. The CIBERSORT p value reflects the statistical significance of the deconvolution result across all cell subsets and provides a measure of confidence in the output; it was used to exclude deconvolutions with poor goodness of fit. Healthy specimens (n = 49) and LC specimens (n = 520) meeting the requirement of a CIBERSORT p ≤ 0.05 were selected. When p < 0.05, the immune cell fractions inferred by CIBERSORT were regarded as reliable.
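The goodness-of-fit metrics reported for each specimen can be illustrated with a small Python sketch. It is not the CIBERSORT implementation: it takes a fraction estimate as given and only shows how a Pearson correlation and an RMSE between the observed mixture and the mixture reconstructed from a signature matrix could be computed, and how specimens might then be filtered at p ≤ 0.05. The toy 4-gene, 2-cell-type matrix, the function names, and the `p_value` field are assumptions made for illustration.

```python
import math

def reconstruct(signature, fractions):
    """Predicted mixture: signature (genes x cell types) times cell fractions."""
    return [sum(s * f for s, f in zip(row, fractions)) for row in signature]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root mean squared error between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def keep_confident(samples, alpha=0.05):
    """Keep only specimens whose deconvolution p value is <= alpha."""
    return [s for s in samples if s["p_value"] <= alpha]

# Toy data (hypothetical): 4 signature genes, 2 cell types.
signature = [[8.0, 0.0], [0.0, 4.0], [4.0, 4.0], [2.0, 6.0]]
fractions = [0.25, 0.75]              # assumed estimate; sums to 1
observed = [2.0, 3.0, 4.0, 5.0]       # bulk expression of the same 4 genes

predicted = reconstruct(signature, fractions)
fit_r = pearson(observed, predicted)
fit_rmse = rmse(observed, predicted)

# Hypothetical per-specimen results, filtered at p <= 0.05.
samples = [
    {"id": "a", "p_value": 0.01},
    {"id": "b", "p_value": 0.20},
    {"id": "c", "p_value": 0.05},
]
kept_ids = [s["id"] for s in keep_confident(samples)]
```

A high correlation and a low RMSE indicate that the inferred fractions reproduce the observed bulk profile well; specimens failing the p value threshold would be dropped before downstream analysis.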

Statistics.
Statistical analyses were performed with R software 3.5.2 and IBM SPSS Statistics 20.0. LASSO Cox regression, implemented in the glmnet package in R, was used to select the optimal prognostic model among immune cell subtypes [16]. Hierarchical clustering of immune cell fractions was applied to reveal differences in immune cell infiltration among specimens. The levels of the 22 TIIC subpopulations were expressed as values between 0 and 1. The R packages "Corrplot," "Pheatmap," and "Vioplot" were used to visualize variation in TIIC composition among groups. The Wilcoxon test was used to assess the association of TIICs with tumor grade and molecular subgroup. The log-rank test and Kaplan-Meier (K-M) curves were applied to assess the relationship between TIICs and survival. Multivariate analysis was then used to identify independent predictors. The AUC and cut-off value were obtained from the ROC curve. The "Limma" package was used to identify differentially expressed genes (DEGs), with filters set at |log2 FC| > 1.3219 (a fold change of more than 2.5) and FDR < 0.05. The association between inferred TIIC subset levels and survival was examined by Cox regression. Analyses were performed for patients with and without LC to elucidate the basic differences in LC and to account for known violations of the Cox proportional hazards assumption. Variation in TIIC levels may be an inherent characteristic that characterizes differences between individuals. Finally, the proportions of immune cells in tissues from 463 LC patients and in 49 adjacent samples displayed distinct group-bias clustering and individual differences.
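To make the survival comparison concrete, the following is a minimal Python sketch of a Kaplan-Meier estimate and a two-group log-rank test. The analyses in this study were run in R and SPSS, so this is an illustration of the technique under stated assumptions rather than the code that was used; the function names and the toy follow-up times are hypothetical. An event flag of 1 marks a death and 0 marks a censored observation.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p value) with 1 d.f."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)   # at risk in group B
        n = n_a + n_b
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        d = d_a + d_b
        obs_minus_exp += d_a - d * n_a / n        # observed minus expected in A
        if n > 1:                                 # hypergeometric variance term
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Toy example (hypothetical): group "low" dies early, group "high" dies late.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
chi2_diff, p_diff = log_rank([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
chi2_same, p_same = log_rank([1, 2, 3, 4], [1, 0, 1, 1], [1, 2, 3, 4], [1, 0, 1, 1])
```

In practice, patients would be split into high and low groups for a given TIIC fraction (for example at the ROC-derived cut-off mentioned above) and the two groups passed to `log_rank`.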

Composition Difference of Immune Cells in LC Samples and Adjacent Samples. After manual selection, we enrolled 463 tumor tissues and 49 adjacent tissues as the training and validation cohorts, respectively. We first normalized the gene expression data with the "Limma" package and then assessed differences in immune infiltration across the 22 immune cell subtypes with the CIBERSORT algorithm, defining the sum of the 22 immune cell fractions in every specimen as 1. Figure 1 depicts the fractions of all 22 immune cell subtypes in each sample. Hierarchical clustering revealed that TIICs such as resting NK cells, monocytes, and plasma cells showed distinct distribution differences between LC samples and adjacent samples (Figure 2).

Correlation Degree of 22 Immune Cell Subgroups in Each Sample. Notably, the fractions of immune cells differed markedly between LC specimens and adjacent samples. Activated CD4 memory T cells and CD8 T cells showed a marked positive association, whereas average linkage clustering showed a clear negative association between CD8 T cells and M0 macrophages (Figure 3).
[...] days' survival time and manually organized the expression profiles of every specimen and the relevant clinical data. The total sample was randomly divided into an experiment group and a validation group at a ratio of 7 : 3. Univariate analyses were used to evaluate immune cell infiltration and the corresponding survival time. Table 1 and Figure S1 show the survival analysis results for the 22 immune cell subpopulations. Figure 5 shows that high levels of resting dendritic cells (p = 0.045) and follicular helper T cells (p = 0.021) had good predictive value for prognosis. The univariate analysis shows that these two TIICs are significantly related to patient prognosis and are of great significance for postoperative immunotherapy of lung cancer.
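The random 7 : 3 division into experiment and validation groups could be reproduced along the following lines. This is a sketch only: the text does not state which tool or random seed was used, so the seed, the function name, and the sample identifiers below are assumptions.

```python
import random

def split_samples(sample_ids, train_fraction=0.7, seed=0):
    """Shuffle sample IDs reproducibly and split them into two groups."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)      # fixed seed so the split is repeatable
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

# Hypothetical identifiers standing in for patient barcodes.
sample_ids = ["S%03d" % i for i in range(10)]
experiment_group, validation_group = split_samples(sample_ids)
```

Fixing the seed keeps the two groups stable across reruns, which matters when the validation group is used to check a model fitted on the experiment group.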

Discussion
In this study, we report an extensive evaluation of LC TIICs in 502 tumor samples and 49 adjacent tissues. The CIBERSORT analytical tool offers a clear advantage for estimating the fractions of 22 TIIC subpopulations in bulk tumor specimens. Insight into TIICs may help to explain the initiation and development of LC. Moreover, genes that are uniquely expressed in LC samples could be valuable predictors for diagnosis and prognosis, but little research has highlighted the differential distribution of immune cells between different tissue components.
The complex and unique communities in which cells live are called microenvironments. The microenvironment has many characteristics that affect cell growth, cell behavior, and communication with neighboring cells [17][18][19]. Different types of tumor cells interact with different types of immune cells, which can either support or attack tumors [20,21]. Hierarchical clustering revealed that TIICs such as resting NK cells, monocytes, and plasma cells showed distinct distribution differences between LC samples and adjacent samples. The violin plot indicated that an obvious difference existed in the [...]. Activated CD4 memory T cells and CD8 T cells showed a marked positive association, whereas average linkage clustering showed a clear negative association between CD8 T cells and M0 macrophages. CD4+ T cells are vital immune cells in the human immune system. CD4 is primarily expressed on helper T (Th) cells; it binds to the nonpolymorphic region of MHC class II molecules and participates in antigen recognition by the T cell antigen receptor (TCR) [22,23]. With regard to signal transduction, studies have found that in tumor immunity CD4+ T cells can activate CD8+ T cells through a variety of mechanisms so that they differentiate into cytotoxic T lymphocytes (CTLs), while maintaining and strengthening the antitumor response of CTLs [24]. In recent years, studies have found that macrophages account for 50% of the total weight of tumors. These tumor-associated macrophages not only prevent T cells from eliminating tumor cells but also secrete growth factors that promote tumor cell growth and tumor angiogenesis, leading to the spread of cancer cells [25][26][27][28].
Univariate analyses were used to evaluate immune cell infiltration and the corresponding survival time. High levels of resting dendritic cells (p = 0.045) and follicular helper T (Tfh) cells (p = 0.021) predicted a better prognosis, and univariate analyses showed that these two TIICs were significantly associated with patient prognosis. The dendritic cell (DC) is an important antigen-presenting cell (APC), which [...] [29,30]. It is the main APC that activates naive T cells in the body. Tfh cells are a recently described subset of CD4+ helper T cells. A growing number of studies have shown that Tfh cells and their cytokines are vital in tumors and autoimmune diseases [31][32][33].
In conclusion, our study revealed distinct immune phenotypes for molecular LC subclasses. We therefore suggest that differences in TIIC fractions may be an inherent characteristic that reflects differences between individuals. These findings improve the understanding of immune responses in LC and may play an important role in the design of effective immunotherapeutic strategies.

Data Availability
The original data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare no competing interests.