Flow Cytometric Immunophenotyping of Mature Lymphatic Neoplasias Using Knowledge Guided Cluster Analysis

Flow cytometry is widely used for the immunological characterization of hematopoietic malignancies. Discrimination of normal and malignant cellular immunophenotypes is the most critical step in data analysis, especially if multi-color analysis is performed on highly heterogeneous cell suspensions. We therefore investigated whether adaptive, simultaneous multiparameter gating allows automated, operator-independent analysis of data obtained from the immunophenotyping of blood or bone marrow samples with regard to the presence of non-Hodgkin lymphoma cells. Physiological and malignant cells were identified by predefining population boundaries based on the expected location of each population in two-dimensional dot plots. The prospective application of these predefined region boundaries to 52 blood and bone marrow samples enabled identification of lymphoma cells with regard to their presence and immunophenotype, based on the correlation of markers as defined in multiple tubes. Our data confirm that highly standardized data analysis methods can reduce the variability of analysis and support the expert in establishing a rapid classification of the sample.


Introduction
* Correspondence to: Prof. Dr Gerd Schmitz, Institute for Clinical Chemistry and Laboratory Medicine, University of Regensburg, D-93042 Regensburg, Germany. Tel.: +49 941 944 6201; Fax: +49 941 944 6202; E-mail: gerd.schmitz@klinik.uni-regensburg.de.
As a fast, objective and widely available method, flow cytometry has become an important tool for the immunological lineage assignment and maturational analysis of hematopoietic malignancies [1][2][3]. Owing to the rapid improvement in the sensitivity and specificity of the method and the availability of an increasing number of directly conjugated antibodies and of fluorochromes for multi-color analysis, distinct lymphoma cell immunophenotypes could be identified based on complex antigen coexpression patterns. Consequently, immunological data have been introduced into current classification schemes and consensus protocols [4][5][6][7]. Flow cytometric identification of physiological as well as malignant cell populations mainly depends on the recognition of specific antigen patterns in combination with light scatter signals [8]. The sensitivity and specificity of the method increase as more information is obtained simultaneously at the single-cell level [9,10]. Concomitant with this increase in data from multi-color staining techniques, methods of data analysis become crucial for adequate data interpretation, timely reporting and inter-laboratory comparability of results. Identification of malignant cells by gating procedures is the most critical step in flow cytometric data analysis, especially if multi-color analysis is performed on highly heterogeneous cell suspensions such as bone marrow. Quality assessment schemes have revealed that standardization of data analysis, beyond the standardization of staining and acquisition techniques, further improves agreement in immunological diagnosis [11,12].
Besides the visual recognition of pathological cell populations based solely on their scatter characteristics, backgating with a pan-leukocyte antigen (e.g., CD45) is currently perhaps the most common method for the identification of lymphocytes and further characterization of their immunophenotype [13]. The combination of scatter and fluorescence signals, such as CD45 fluorescence versus side scatter, has been shown to be superior to traditional scatter gating [14,15] and to allow the identification of physiological bone marrow cell populations as well as leukemic blasts, with the resources necessary to perform reproducible automated gating [16][17][18]. However, for the analysis of lymphoma cells, gating on scatter signals and CD45 fluorescence may exclude relevant populations with an abnormal scatter signal, such as plasma cells or hairy cells, which are often outside the signal range in linear analysis, or populations with missing or low CD45 expression (plasma cells) [6,19,20]. Therefore, a further evaluation of the sample for relevant abnormal phenotypes is still necessary to ensure that all relevant cells are included in the gate. As the introduction of CD45 fluorescence for lymphocyte gating markedly improved gating accuracy, defining cell populations by a complex pattern of markers, such as several fluorescence and scatter signals, should result in even more accurate gating of distinct cell populations.
Therefore, we evaluated adaptive, simultaneous multiparameter gating for lymphocyte immunophenotyping, in order to develop this approach for the analysis of lymphocytes in the blood or bone marrow of patients suspected of having non-Hodgkin lymphoma. The simultaneous analysis of multiple scatter and fluorescence parameters at the single-cell level is achieved by predefining population boundaries according to the expected location of normal and pathological cell populations in multiple two-parameter dot plots. Furthermore, the simultaneous analysis of normal B-cells, T-cells and myeloid cells improves the overall discrimination of abnormal cells and allows the description of concomitant secondary alterations of non-malignant bystander cell populations.

Patients
Diagnosis and classification of lymphomas in patients presenting with suspected B-CLL, hairy cell leukemia, plasmocytoma, lymphoplasmacytoid lymphoma/immunocytoma or follicular center lymphoma, or seen during follow-up of one of these disorders, were based on peripheral blood lymphocyte morphology and histology of bone marrow and/or lymph nodes. The final classification of the type of non-Hodgkin lymphoma was performed according to the REAL classification [4]. The histologic reviewers were blinded to the flow cytometric results.

Three color fluorescence immunophenotyping
Blood or bone marrow samples were analyzed by flow cytometry according to a standardized protocol. Samples of whole blood or bone marrow were collected using preservative-free heparin as anticoagulant and prepared for flow cytometric analysis within 6 hours. The same combination of antibodies, directly conjugated to fluorescein isothiocyanate (FITC), R-phycoerythrin (R-PE), peridinin chlorophyll (PerCP) or tandem conjugates of R-PE and indocarbocyanine (Cy5), was used for surface staining of membrane antigens to determine the immunophenotype (Table 1). Surface staining of antigens was performed by incubating 100 µl of unseparated bone marrow or blood with saturating concentrations of antibodies (4 °C, 15 min), followed by lysis of erythrocytes using the erythrocyte lysing solution from Becton Dickinson (FACSlyse) and a washing step with 3 ml of Dulbecco's phosphate buffered saline without Ca2+ or Mg2+ (PBS, Biochrom, Berlin, Germany). In the case of biotin-labeled antibodies, cells were further incubated for 10 min with R-PE/Cy5 conjugated streptavidin, followed by two washing steps. Staining for surface expression of light chains was performed after washing cells from the whole blood sample three times prior to further incubation steps, in order to remove nonspecifically bound and plasma immunoglobulins. For the analysis of the intracellular expression of immunoglobulin light chains, 50 µl of sample was fixed with Fix&Perm reagent A (An der Grub, Vienna, Austria) for 5 min at room temperature. This was followed by a washing step, further incubation with saturating concentrations of CD38, anti-kappa and anti-lambda antibodies for 15 min in the presence of the permeabilizing reagent B, and a final washing step with PBS. The final pellet was resuspended in PBS and analyzed using a FACScan flow cytometer with LysisII or CellQuest software (Becton Dickinson, San Jose, CA, USA).
The cellular light scatter signals and three fluorescence signals of at least 20,000 nucleated cells per tube were analyzed in list mode at 1024-channel resolution using forward scatter as the trigger parameter. Compensation was adjusted using FITC and R-PE coated microbeads (Becton Dickinson) and peripheral blood lymphocytes stained with CD3 (PerCP), CD4 (PE) and CD8 (FITC) as a biological control.

Data analysis
Data analysis was performed using the software "Attractors" from Becton Dickinson. In a pilot study, which included healthy controls (n = 14) and patients with known infiltration of the bone marrow by B-CLL (n = 5), hairy cell leukemia (n = 3) or plasmocytoma (n = 3), population boundaries ("attractors") were defined for the simultaneous gating procedure, based on the location of single populations of cells in a set of two-dimensional dot plots. Five parameters (FSC, SSC, FL1, FL2, FL3) were used to create 10 two-dimensional dot plots for each single tube. As predefined antibody combinations were used (Table 1), it was possible to define one set of attractors for each tube of antibody combinations, resulting in the identification of normal cells as well as lymphoma cells based on their characteristic phenotype.
Subsequently, 52 samples were prospectively analyzed using the predefined set of attractors. The single-tube-level protocols were combined into a batch protocol for the analysis of a set of data files in a given sequence. Data were written into a spreadsheet and transferred into a final report sheet after calculation of cell populations and correlation of markers from different tubes.
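The number of 10 dot plots per tube follows from taking every unordered pair of the five measured parameters. A minimal Python sketch of this enumeration is given here for illustration only; the function name is hypothetical and is not part of the Attractors software.

```python
from itertools import combinations

# The five list mode parameters named in the text.
PARAMETERS = ["FSC", "SSC", "FL1", "FL2", "FL3"]


def dot_plot_axes(parameters):
    """Return every unordered pair of parameters, one pair per two-dimensional dot plot."""
    return list(combinations(parameters, 2))


plots = dot_plot_axes(PARAMETERS)
print(len(plots))  # 10
for x_axis, y_axis in plots:
    print(f"{x_axis} vs {y_axis}")
```

With five parameters there are 5 x 4 / 2 = 10 such pairs, which matches the number of dot plots stated above.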

Analysis of cell populations
Abnormal cells may vary strongly in their antigen coexpression patterns as well as in their scatter characteristics. Classification might therefore be improved by first identifying normal cell populations, in order to facilitate the discrimination of aberrant phenotypes. For that reason, attractors for known and identifiable cell populations, such as granulocytes, monocytes, and T- and B-lymphocytes, were defined within each tube. Specific antigens or antigen patterns were used in combination with scatter signals to identify distinct cell populations: granulocytes were identified by scatter signals and expression of CD45+, HLA-DR−, [...] with low expression of CD45 and negativity for CD14 and HLA-DR. Red blood cell precursors were characterized as CD45 negative in combination with lymphocyte-like scatter signals (Fig. 1).
Table 1. Antibody combinations used for the characterization of lymphocyte populations, selected according to antigen coexpression patterns during normal maturation of cells. PE-conjugated antibodies were mainly selected for antigens of low expression density.
The identification of physiological myeloid and lymphatic cell populations by attractors defined upon characteristic scatter and fluorescence signals made it possible to use multiple cell populations known to express a distinct antigen pattern as positive or negative internal staining controls, and as internal checks for quality control of measurement and data analysis by the knowledge guided cluster analysis. Following the automated classification, single cell populations were displayed in different colors for better visual discrimination, and all events that could not be assigned to predefined attractor regions were displayed as black dots, supporting the recognition of unclassified events (Fig. 1). The accuracy of the staining procedure as well as of the data analysis procedure could further be demonstrated by comparing the percentage of B-cell populations analysed independently by different lineage-specific antibodies in different tubes (Fig. 2).
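The cross-tube plausibility check described above can be sketched as follows. This is a minimal illustration only: the marker names, the percentages and the tolerance of 5 percentage points are assumptions chosen for the example and are not taken from the study.

```python
def b_cell_spread(percent_by_marker):
    """Return the difference between the highest and lowest B-cell percentage."""
    values = list(percent_by_marker.values())
    if not values:
        raise ValueError("at least one tube result is required")
    return max(values) - min(values)


def tubes_agree(percent_by_marker, tolerance=5.0):
    """True if all tubes report B-cell percentages within the (hypothetical) tolerance."""
    return b_cell_spread(percent_by_marker) <= tolerance


# Hypothetical B-cell percentages from three tubes stained with different
# B-lineage antibodies (values are illustrative, not study data).
example = {"CD19": 12.4, "CD20": 11.8, "CD22": 12.9}
print(tubes_agree(example))
```

A large spread between tubes would point to a staining or gating problem in one of them rather than to a property of the sample.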

Identification of clonal B-cells
Automated data analysis should allow the automated flagging of pathological or suspect results. Evaluation of the immunophenotyping results from the pilot [...] Plasmocytoma could be identified in all cases by further analysis of those samples containing more than 0.3% plasma cells (% of nucleated cells) and detection of clonality, whereas only one sample from healthy donors contained more than 0.3% plasma cells.
Table 2. Immunological characterization of B-cell derived lymphoproliferative malignancies (according to Rothe [...]

Diagnostic accuracy of automated lymphoma immunophenotyping
Regarding the identification of malignant cells according to their distinctive patterns of antigen expression, several immunophenotypes were predefined by attractors according to the literature on antigen expression of lymphatic neoplasias (Table 2) as well as typical samples analysed during the pilot study. Attractors defining the location of lymphocyte populations with a diagnostic antigen coexpression pattern (e.g., CD19+/CD5 strong), together with attractors fixed regarding their upper and lower limits of fluorescence intensity, enabled classification of lymphatic cells by antigen coexpression as well as antigen density (high/low) (Fig. 3).
Summarizing the classification results for all tubes within a predefined spreadsheet for the final calculation of percent antigen-positive lymphocytes allowed the characterization of the immunophenotype of the lymphocyte populations, based on the correlation of results from different tubes. If a clonal excess of an immunoglobulin light chain or an excess of plasma cells was demonstrated, classification of the lymphoma type was performed manually by the operator using a given classification tree (Fig. 4), and, finally, data were compared to the morphological results as a reference (Table 3). All B-CLL samples exceeded the limit of 50% CD5 strong/CD23+/CD22+ and sIg weak B-cells by far, whereas no sample from the control group or from patients with hairy cell leukemia or plasmocytoma presented with comparably high proportions of this B-cell immunophenotype. By combining the results for % CD5+ B-cells and % CD22 low expressing B-cells, a scatter diagram could be drawn that clearly separated a cluster for the B-CLL immunophenotype from other B-cell immunophenotypes (Fig. 5). Six samples were correctly classified by the automated knowledge guided cluster analysis as containing hairy cells. At least 7 samples were classified as lymphoplasmacytoid lymphoma/immunocytoma or follicular center lymphoma based on morphological data. These samples differed in their immunophenotype, 4 presenting with the above mentioned B-CLL immunophenotype. The other 3 samples could not be precisely classified solely according to their immunophenotype, although the samples were recognized as containing pathological cells.

Discussion
The identification and characterization of non-Hodgkin lymphoma cells can be supported by flow cytometric immunophenotyping techniques, especially by analyzing (1) the κ/λ-ratio of immunoglobulin producing B-lymphocytes [21], as well as (2) the immunophenotype, which has been included in current classification schemes and is at least in part characteristic for distinctive disease entities. The relationship between antigen coexpression pattern, antigen density and cell size/granularity adds valuable information for the classification of single cell populations, but on the other hand contributes to a high complexity of data [22], and the time-consuming identification of normal and abnormal cell populations by conventional, non-standardized gating procedures demands special expertise of the operator.
In this article we present a new approach to data analysis, using a knowledge guided cluster analysis that enables adaptive, simultaneous multiparameter gating for the identification and characterization of B-cell lymphoma as well as physiological cells in blood and bone marrow. Antibody clones, combinations and conjugations were kept constant during the time of our study, resulting in reproducible locations of single cell populations in two-parameter dot plots. These comparable staining results supported the application of a knowledge guided cluster analysis, as multispace population boundaries had to be predefined for physiological as well as pathological immunophenotypes, based on the expected location of single cell populations in two-parameter dot plots. In addition, the software automatically adjusted these boundaries within user-specified limits to the populations' center, which may drift between different measurements, e.g., because of differences in antibody conjugation efficiency between different lots. As the result of the classification process, all classified cell populations are encoded by different colors for better visual presentation of results, and all events which cannot be assigned to a predefined region are displayed as black dots, leading to a markedly improved visual control of results, e.g., for staining deficits or abnormal cells. With regard to the reproducible processing of list mode files, the "Attractors" software has already been demonstrated to be capable of classifying normal T-cell subsets from peripheral blood [23]. As several antigens applied to the identification of B-cell phenotypes can also be found on T-cells or even myeloid cells, these populations could be characterized in parallel.
The simultaneous classification of B-cells, T-cells and myeloid cells within the sample supported the discrimination of abnormal B-cell phenotypes because of a higher purity of the B-cell defining attractor regions, and offered additional controls with regard to the plausibility of staining results. The analysis of all cellular elements within the sample should furthermore promote the analysis of concomitant secondary alterations of non-malignant bystander cells, which may add valuable information with regard to, e.g., concomitant bone marrow dysplasia or even for the classification of the lymphoma type itself [24].
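The idea of letting a predefined boundary follow a drifting population within user-specified limits can be illustrated as follows. This is a simplified sketch and not the algorithm implemented in the Attractors software: it assumes a rectangular region in a single two-parameter plot, moves the region's center toward the mean of the events it currently contains, and caps the shift per axis at a hypothetical max_shift. All names and values are illustrative.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Gate:
    """Rectangular region in one two-parameter plot (channel units)."""
    cx: float
    cy: float
    half_width: float
    half_height: float

    def contains(self, x, y):
        return abs(x - self.cx) <= self.half_width and abs(y - self.cy) <= self.half_height


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def recenter(gate, events, max_shift):
    """Move the gate center toward the mean of the events inside it.

    The shift along each axis is limited to +/- max_shift, so the gate
    cannot wander away from its predefined position.
    """
    inside = [(x, y) for x, y in events if gate.contains(x, y)]
    if not inside:
        return gate  # nothing to adapt to; keep the predefined position
    dx = mean(x for x, _ in inside) - gate.cx
    dy = mean(y for _, y in inside) - gate.cy
    return Gate(
        gate.cx + clamp(dx, -max_shift, max_shift),
        gate.cy + clamp(dy, -max_shift, max_shift),
        gate.half_width,
        gate.half_height,
    )


# Illustrative events: three fall inside the gate, one lies far outside.
gate = Gate(100.0, 100.0, 50.0, 50.0)
events = [(120.0, 110.0), (130.0, 110.0), (140.0, 110.0), (400.0, 400.0)]
print(recenter(gate, events, max_shift=20.0))
```

In this example the population center lies 30 channels to the right of the predefined center, but the gate moves only by the permitted 20 channels, which is the behavior the user-specified limit is meant to guarantee.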
The quantitative analysis of antigen expression densities, as currently established, e.g., for surface immunoglobulin, CD20 or CD22 in non-Hodgkin lymphomas [25][26][27][28] or CD45 and CD20 in childhood B-precursor acute lymphoblastic leukemia [29], enhances the information available for the characterization of lymphomas and other hematopoietic malignancies. Using the Attractors software, a semiquantitative analysis of antigen densities could be achieved by introducing regions fixed regarding their upper and lower limits of fluorescence signal intensity. This enabled the classification of cells into antigen strong and low expressing cells, adding valuable information especially for the diagnosis of B-CLL.
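A semiquantitative grading with fixed intensity limits could look as follows. The channel limits (on a 1024-channel scale), the function names and the example values are hypothetical; in practice they would have to be derived from the laboratory's own staining results.

```python
from collections import Counter

GRADES = ("negative", "low", "strong")


def grade_intensity(channel, negative_below=200, strong_from=600):
    """Assign one event to a fixed intensity region (limits are hypothetical)."""
    if channel < negative_below:
        return "negative"
    if channel < strong_from:
        return "low"
    return "strong"


def grade_fractions(channels, negative_below=200, strong_from=600):
    """Return the fraction of events falling into each fixed region."""
    if not channels:
        raise ValueError("no events to grade")
    counts = Counter(grade_intensity(c, negative_below, strong_from) for c in channels)
    return {grade: counts[grade] / len(channels) for grade in GRADES}


# Illustrative fluorescence channels for eight gated events.
print(grade_fractions([50, 150, 300, 450, 700, 900, 250, 800]))
```

Because the limits are fixed rather than adaptive, the resulting fractions of strong and low expressing cells remain comparable between samples as long as instrument settings and reagents are kept constant.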
Regarding the recognition of B-NHL based on an abnormal κ/λ-ratio (reference range 0.5 to 2.5), all pathological samples were flagged correctly, without need for reanalysis of single data files. Two out of 8 controls were flagged as false positives because of an abnormal κ/λ-ratio based on the above mentioned reference range, which is similar to reference ranges reported by others [22]. Seven samples classified as lymphoplasmacytoid lymphoma/immunocytoma or follicular center lymphoma according to the morphological data differed in their immunophenotype. Four samples presented with the above mentioned B-CLL immunophenotype, and 3 samples could not be classified based on their immunophenotype, although the samples were correctly recognized as containing pathological cells. These results mirror the current problems in obtaining a definitive diagnosis in some cases of non-Hodgkin lymphoma. As already recognized within the REAL classification, especially the group of lymphoplasmacytoid lymphomas/immunocytomas does not appear to define a single entity, as many B-cell lymphomas may occasionally show differentiation to plasmocytoid or plasma cells, including B-CLL, mantle cell, follicular center, and marginal cell lymphomas. Flow cytometric analysis may help to classify these disorders according to their major features, whereas the term lymphoplasmacytoid lymphoma/immunocytoma should be restricted to lymphomas lacking expression of CD5 and characteristic features of other lymphoma subtypes. In general, these problems may at least in part be resolved by the use of further improved immunological classification schemes and the availability of additional specific marker antigens, such as CD79a (B-cells), CD103 (hairy cells) or CD138 (plasma cells).
Using the possibilities of batch analysis, knowledge guided cluster analysis reduces the workload of time-consuming individual gating procedures on the lymphoma cell population, and, as the data analysis algorithms are predefined and constant, better comparability between different samples can be achieved. Transfer of data obtained by the cluster analysis directly into a worksheet allows rapid calculation of single populations' immunophenotype by correlation of results from different tubes, as well as reporting and documentation of results. Lymphomas of T-cell origin in particular can only be identified in flow cytometry based on the analysis of the antigen coexpression pattern, which may prove, e.g., the loss of a pan-T-cell antigen. Therefore, the reproducible correlation of different markers is a critical step in data analysis that can be standardized by the use of a knowledge guided cluster analysis. The definition of individual attractors to analyze the abnormal cell populations of a patient will also support the monitoring for residual malignant cells upon therapy, if the attractor combination is used for the evaluation of subsequently collected samples, and it can be performed even in acute leukemias, as the phenotypic features relevant for the identification of leukemic blasts seem to be usually unaltered at relapse [30].
In conclusion, the results demonstrate the ability of knowledge guided cluster analysis techniques to recognize complex immunological phenotypes in peripheral blood and bone marrow samples, allowing rapid detection and classification of non-Hodgkin lymphoma cells.
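The flagging of a sample by its κ/λ-ratio can be sketched as follows, using the reference range of 0.5 to 2.5 stated above. The function names and the event counts are hypothetical; how the κ- and λ-positive B-cells are counted is left to the preceding classification step.

```python
# Reference range for the kappa/lambda ratio as stated in the text.
KAPPA_LAMBDA_REFERENCE = (0.5, 2.5)


def kappa_lambda_ratio(kappa_count, lambda_count):
    """Ratio of kappa-positive to lambda-positive B-cells."""
    if kappa_count == 0 and lambda_count == 0:
        raise ValueError("no light-chain positive B-cells counted")
    if lambda_count == 0:
        return float("inf")  # pure kappa population
    return kappa_count / lambda_count


def is_abnormal(ratio, reference=KAPPA_LAMBDA_REFERENCE):
    """True if the ratio lies outside the reference range."""
    low, high = reference
    return ratio < low or ratio > high


# Hypothetical event counts for two samples.
print(is_abnormal(kappa_lambda_ratio(600, 400)))  # ratio 1.5, within range
print(is_abnormal(kappa_lambda_ratio(950, 50)))   # ratio 19.0, outside range
```

As the false positive controls reported above show, a ratio outside the reference range flags a sample for further review and does not by itself establish clonality.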