Evaluation of Residual Cellularity and Proliferation on Preoperatively Treated Breast Cancer: A Comparison between Image Analysis and Light Microscopy Analysis

Histopathology has been suggested as a reliable method for tumour reduction evaluation of preoperatively treated breast cancer. Immunocytochemistry can be used to enhance the visibility of residual tumour cellularity and in the evaluation of its proliferative activity. We compared Image Analysis (IA) with Light Microscopy Analysis (LMA) on sections of breast carcinomas treated with preoperative chemo‐ or chemo/radiotherapy in the evaluation of the Neoplastic Cell Density (NCD) (69 cases) and the Proliferation Index (PI) (35 cases). NCD was expressed as the immunoreactive area to cytokeratin over the total original neoplastic area and PI was expressed as the number of immunostained tumoural nuclei with MIB1 MoAb over the total of tumoural nuclei. The intraobserver agreement and that between IA and LMA for both indices were estimated by the common (κw) and the jackknife weighted kappa statistic (κ̃w). The extent of agreement of each considered category was also assessed by means of the category‐specific kappa statistics (κcs). The intraobserver agreement within LMA for NCD and PI and that between IA and LMA for PI were satisfactory. Upon evaluation of the NCD, the agreement between IA and LMA showed unsatisfactory results, especially when the ratio between the residual tumour cells and the background was critical.


Introduction
Preoperative chemotherapy has become an integral part of the primary treatment of breast carcinoma. Such treatment has been performed both in advanced tumours, to reduce tumour size and allow conservative breast surgery [8,22,47], and in small breast cancers, to reduce the risk of local recurrence in limited surgery [39,42,44]. The correct interpretation of tumour response to chemotherapy has long been recognized as an important concept in medical oncology [23], but it remains highly controversial because clinical and radiographic assessment is subjective and inaccurate. To improve the evaluation of tumour response to therapy, histopathology has been suggested as a more reliable method [38]. The pathologist, however, is faced with the problem of recognizing residual tumour cellularity on routine sections, because the large spectrum of cyto-histologic changes induced by chemotherapy [10,23,42] prevents an accurate evaluation of tumour cytoreduction. In this case, immunocytochemistry can be used to enhance the identification of residual tumour cellularity, taking advantage of the immunoreactivity of cytokeratins preserved within neoplastic cells even after preoperative treatment [23,24]. Moreover, because post-therapy cyto-histologic changes make it impossible to grade the tumour, immunocytochemistry permits the evaluation of the proliferative activity of the residual neoplastic cells, providing an important tool for the prediction of subsequent clinical behaviour.
One of the most important problems related to immunoreactivity evaluation is its subjectivity. Many authors, however, argue that this problem can be solved by combining immunocytochemistry with quantitative image analysis [2,3,41].
The aim of this study is to compare quantitative immunocytochemistry by Image Analysis (IA) with traditional semiquantitative immunocytochemistry by Light Microscopy Analysis (LMA) in the evaluation of both residual cellularity and its proliferative activity on breast cancer treated preoperatively with chemo- or chemo/radiotherapy. To evaluate the concordance pattern between the two methods, an extension of an already published statistical approach was adopted [48].

Patients
We considered 73 consecutive cases of female breast carcinoma less than 2.5 cm in diameter that underwent preoperative chemo- or chemo/radiotherapy at the Istituto Nazionale per lo Studio e la Cura dei Tumori of Milan (Italy) from February 1991 to August 1992. Patient records were retrieved from institutional clinical files. The diagnosis of breast carcinoma was initially established on the basis of a fine-needle biopsy. The tumours were treated with primary chemotherapy alone (Adriamycin: 9 cases, or Fluorouracil-Novantrone-Endoxan (FNC) regimen: 22 cases) or with primary chemotherapy (FNC: 22 cases, or Fluorouracil-Carboplatin-Endoxan (FCC): 20 cases) followed by radiotherapy before surgery (tumourectomy plus axillary node dissection), and were staged according to the residual pathological tumour node metastasis classification (pTNMR) [20,46].

Histology and immunocytochemistry
Surgical specimens were processed according to a standardized procedure. Histological classification was performed according to Rosen and Obermann [40]. Formalin-fixed (35 sections) or Bouin-fixed (38 sections) paraffin-embedded sections were routinely stained with Hematoxylin and Eosin and examined in order to determine the area of the tumour, the presence of residual tumour cells and their relationship with the surrounding stroma. Because the original tumour area is difficult to measure, we defined the portion of mammary gland showing definite histologic alterations (such as the presence of densely collagenized connective tissue) as the area previously occupied by the primary tumour, the so-called "bed of the tumour", in accordance with the literature [42]. This area contains residual tumour cells with an asymmetrical distribution, cells increased in size with vacuolated cytoplasm, or histiocyte-like cells. These findings are reported in prior studies describing tissue and cellular changes in the region of the original breast tumour mass [8,17,23]. Five-micron-thick sections on polylysine-coated slides were immunostained for Neoplastic Cell Density (NCD) evaluation, defined as the cytoplasmic immunoreactivity to a pool of low molecular weight cytokeratin MoAbs (35betaH11, dilution 1:100, Dako, Milan, Italy; CAM 5.2, dilution 1:50, Becton Dickinson, Mountain View, CA, USA), and for Proliferation Index (PI) determination, defined as the nuclear immunoreactivity to MIB1 MoAb (dilution 1:100, Immunotech, Marseille, France). Only formalin-fixed tumour samples were selected for the PI because of the unreliability of other fixatives [6]. Immunostaining was performed by a sensitive streptavidin-biotin immunoperoxidase method (streptavidin HRP: horseradish peroxidase), a modification of the avidin-biotin complex method. Briefly, the sections were incubated in methanol/H2O2 for 30 min to suppress endogenous peroxidase.
For MIB1 immunostaining, slides were placed in a 5 mM citrate buffer solution, boiled in a microwave (700 W; 15 min) and then allowed to cool down to room temperature (30 min); for cytokeratin immunostaining, slides were placed in a 0.05% trypsin solution at 37 °C (17 min). Subsequently, both procedures shared the same treatment: slides were briefly washed with Tris-Buffered Saline (TBS), blocked with normal goat serum (30 min), and then covered with primary antibody overnight (4 °C). Following rinses in TBS, biotinylated secondary antibody (30 min) (Dako) and streptavidin-peroxidase complex (Dako) were applied in succession (30 min). The sections were then exposed to a solution of aminoethylcarbazole (10 min), washed in distilled water, counterstained with hematoxylin and mounted with Kaiser's glycerol gelatin.

Light microscopy analysis (LMA)
For LMA, the sections immunostained for cytokeratin and for MIB1 were blindly submitted twice to an experienced pathologist (time elapsed between the first and the second examination: 2 months). We considered for quantitation all immunoreactive neoplastic cells, irrespective of staining intensity. The immunoreactivity was classified into 4 categories, according to the percentage of area immunoreactive to anti-cytokeratin antibodies for NCD and to the fraction of neoplastic nuclei immunostained with MIB1 MoAb for PI (category I: <25%, category II: 25-50%, category III: 50-75%, and category IV: 75-100%). In our experience, this categorization was the most appropriate scaling system for a semiquantitative evaluation of immunoreactivity.

Image analysis (IA)
Quantitative IA was performed on a VIDAS-CIRES cell image processor (Zeiss, Kontron Elektronik, Oberkochen, Germany). The analyzer was linked to a Zeiss Axioskop microscope; the images were scanned by a colour video camera (JVC; 3-CCD, KyF30) and digitized into 512 × 512 pixels with 256 grey levels per colour component. The analyzer was coupled with a computer for subsequent data processing. The software packages for immunoreactivity determination for cytokeratin and MIB1 MoAbs on the tissue sections were adapted for this purpose. After calibration of illumination, the threshold levels were calibrated to identify positive objects (areas for cytokeratin and nuclei for MIB1), and interactive grey-level segmentation was then performed to identify all objects (positive plus negative).
Sections immunostained with anticytokeratin MoAbs were evaluated at 25× magnification (2.5× microscope objective lens and 10× oculars). The number of fields evaluated for cytokeratin depended on the extension of the tumoural area. The results of each evaluation were expressed as the mean percentage of immunoreactive area. For the MIB1 analysis, the sections were evaluated at 400× magnification (40× microscope objective lens and 10× oculars). After exclusion of stromal and inflammatory components and the separation of overlapping particles, the count was expressed as the fraction of neoplastic nuclei immunostained with MIB1 MoAb over the total number of nuclei analyzed.
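The two indices measured by IA reduce to simple ratios, which can be sketched as follows. This is a minimal illustration and not the VIDAS software routine: the function names, the boolean-mask representation of a digitized field, and the averaging of the cytokeratin percentage over fields are assumptions made for the example.

```python
import numpy as np

def area_fraction(positive_mask, tumour_mask):
    """Percentage of the tumour area that is cytokeratin-positive in one field.

    Both arguments are boolean arrays of the same shape (hypothetical
    thresholded images): True marks positive pixels / tumour-bed pixels.
    """
    positive = np.logical_and(positive_mask, tumour_mask).sum()
    return 100.0 * positive / tumour_mask.sum()

def mean_ncd(fields):
    """Mean percentage over several fields (assumed aggregation rule)."""
    return float(np.mean([area_fraction(p, t) for p, t in fields]))

def proliferation_index(n_positive_nuclei, n_total_nuclei):
    """Percentage of neoplastic nuclei immunostained with MIB1."""
    return 100.0 * n_positive_nuclei / n_total_nuclei

# Toy field: a 4 x 4 tumour area in which only the first row is positive.
tumour = np.ones((4, 4), dtype=bool)
positive = np.zeros((4, 4), dtype=bool)
positive[0, :] = True
ncd = mean_ncd([(positive, tumour)])
pi = proliferation_index(30, 120)
```

In the toy field, 4 of 16 tumour pixels are positive, and 30 of 120 nuclei are stained, so both indices fall exactly on the boundary between categories I and II.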
To compare IA with LMA, we categorized the continuous IA results into four classes by dividing the range of distribution into four equally spaced classes (categories I-IV).
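The categorization of the continuous IA values can be sketched as an equal-width binning. The sketch assumes that "range of distribution" means the interval between the smallest and largest observed value, and that class boundaries belong to the lower class; neither detail is stated in the text, and the function name is hypothetical.

```python
import numpy as np

def equal_width_categories(values, n_classes=4):
    """Assign each value to one of n_classes equally spaced classes (1..n_classes)."""
    v = np.asarray(values, dtype=float)
    # Class edges span the observed range (assumption).
    edges = np.linspace(v.min(), v.max(), n_classes + 1)
    # Only interior edges are needed; right=True makes intervals right-closed,
    # so the maximum falls in the last class and the minimum in the first.
    categories = np.digitize(v, edges[1:-1], right=True) + 1
    return categories, edges

cats, edges = equal_width_categories([0, 10, 25, 26, 50, 75, 100])
```

With these example values the edges coincide with the LMA cut points (25, 50, 75), but on real data an observed range narrower than 0-100% would shift the IA class limits away from the fixed LMA limits.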

Concordance analysis
To correctly evaluate the agreement between IA and LMA it was first necessary to investigate the intraobserver agreement (reproducibility), because a poor intraobserver agreement excludes the possibility of a good agreement between IA and LMA [43]. To evaluate the concordance patterns (both the intraobserver agreement and the agreement between IA and LMA), we adopted an approach based on the weighted kappa statistic, which corrects the observed agreement for chance. Because it takes into account the relative seriousness of the different kinds of disagreement (i.e., the distance between the categories), this statistic is the most widely accepted measure of concordance when, as in our case, the data arise from an ordinal scale. Values of the weighted kappa statistic usually lie between zero (chance agreement) and 1 (perfect agreement); a negative value can be obtained when the agreement is lower than that expected by chance [9]. Due to the small sample size, besides the common estimate (κw) [14], the jackknifed estimate (κ̃w) [15,25] of the weighted kappa statistic was also computed. The jackknife is a useful statistical method for reducing the bias of an estimation procedure and for obtaining robust confidence intervals for a parameter in small samples [36]. Finally, in order to evaluate the contribution of each category to the overall unweighted agreement, the category-specific kappa statistics (κcs) and their weighted average (the unweighted kappa statistic, κu) were estimated [12]. All the kappa statistic values are presented with the relative Asymptotic Standard Error (ASE) and 95% Confidence Interval (CI).
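The three statistics described above can be sketched from a k × k concordance table as follows. This is an illustration under stated assumptions, not the authors' computation: the weighting scheme is not given in the text, so linear agreement weights are assumed; the jackknife is the standard leave-one-out pseudo-value version, and its standard error is not the ASE reported in the paper; the category-specific kappa uses the usual formula in which κu is the average of the κcs weighted by their denominators. All function names are hypothetical.

```python
import numpy as np

def weighted_kappa(table, weights=None):
    """Weighted kappa from a k x k table of counts (rows: rater A, columns: rater B)."""
    t = np.asarray(table, dtype=float)
    k = t.shape[0]
    if weights is None:
        i, j = np.indices((k, k))
        weights = 1.0 - np.abs(i - j) / (k - 1)   # assumed linear agreement weights
    p = t / t.sum()
    p_obs = (weights * p).sum()
    p_exp = (weights * np.outer(p.sum(axis=1), p.sum(axis=0))).sum()
    return (p_obs - p_exp) / (1.0 - p_exp)

def jackknife_weighted_kappa(table):
    """Leave-one-out jackknife estimate of weighted kappa and its standard error."""
    t = np.asarray(table, dtype=float)
    n = int(t.sum())
    full = weighted_kappa(t)
    pseudo = []
    for (i, j), count in np.ndenumerate(t):
        if count == 0:
            continue
        reduced = t.copy()
        reduced[i, j] -= 1                         # drop one case from this cell
        value = n * full - (n - 1) * weighted_kappa(reduced)
        pseudo.extend([value] * int(count))        # same pseudo-value for every case in the cell
    pseudo = np.array(pseudo)
    return pseudo.mean(), pseudo.std(ddof=1) / np.sqrt(n)

def category_specific_kappa(table):
    """Category-specific kappas and the unweighted kappa as their weighted average."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()
    row, col = p.sum(axis=1), p.sum(axis=0)
    num = np.diag(p) - row * col
    den = (row + col) / 2.0 - row * col            # zero only if a category is never used
    return num / den, num.sum() / den.sum()

# Hypothetical 4 x 4 concordance table (categories I-IV), not data from the study.
example = np.array([[10, 2, 0, 0],
                    [3, 8, 2, 0],
                    [0, 2, 7, 1],
                    [0, 0, 1, 9]])
k_w = weighted_kappa(example)
k_jack, se_jack = jackknife_weighted_kappa(example)
k_cs, k_u = category_specific_kappa(example)
```

A normal-approximation 95% interval for the jackknifed estimate would be `k_jack ± 1.96 * se_jack`. Passing `weights=np.eye(4)` to `weighted_kappa` gives the unweighted statistic, which equals `k_u`.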
Each kappa statistic value was interpreted in a qualitative manner adopting the Landis and Koch [26] classification criteria (Table 1).
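The qualitative interpretation can be written as a small lookup. The cut points below are the standard Landis and Koch bands; Table 1 is not reproduced in this text, so it is an assumption that it uses the same limits. The function name is hypothetical.

```python
def landis_koch(kappa):
    """Qualitative strength of agreement for a kappa value (Landis and Koch bands)."""
    if kappa < 0.0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"   # 0.81-1.00

labels = [landis_koch(k) for k in (-0.10, 0.15, 0.35, 0.55, 0.75, 0.90)]
```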

Concordance analysis
The pattern of agreement observed is reported in the concordance Table 3. For both cytokeratin and MIB1 determinations (Table 3(a) and (b)), the reproducibility level was satisfactory (almost perfect agreement). The weighted kappa statistic values obtained with the two estimation approaches (common and jackknifed) closely overlapped (Table 4). The contribution of each category to the κu was satisfactory (Table 5). The only exception was category II for the cytokeratin determination, which showed a moderate agreement with a κcs lower than the corresponding κu.
Regarding the agreement between IA and LMA, a different pattern for cytokeratin and MIB1 was found. For cytokeratin (Table 3(c)), the common and jackknifed estimates of the weighted kappa statistic differed, and neither was satisfactory (the smaller indicating fair and the larger substantial agreement) (Table 4). Considering the contribution of each category, only the first showed a substantial agreement, while the remaining three categories presented a moderate (category IV) or slight (categories II and III) agreement. In addition, for the "middle" categories (categories II and III) the κcs was lower than the corresponding κu. Conversely, for MIB1 the level of agreement between IA and LMA was almost perfect according to both the κw and κ̃w values. All categories contributed in a satisfactory manner to the κu. In particular, categories III and IV showed a perfect agreement (Table 5).

Discussion
The correct histological analysis of the residual cancer after induction chemotherapy provides useful information on chemotherapy efficacy and assists in predicting disease recurrence. In fact, it has been shown that patients in whom remission is complete are likely to remain free of disease [8,13,17,23,32,38].
[Table 6: PI evaluation methods reported in the literature, including the objective used, for Charpin et al. [11], Guillaud et al. [19], Isola et al. [21], Wintzer et al. [51], McCormick et al. [29], Veronese et al. [49], Barbareschi et al. [4], Makkink-Nombrado et al. [28], Mink et al. [31], Ostrowski et al. [33], Pinder et al. [34], Albonico et al. [1], Ghidoni et al. [18] and Querzoli et al. [37], among others.]
In assessing the presence or absence of residual tumour, the pathologist is faced with two major problems, i.e., the recognition and quantification of the residual tumour and the evaluation of the proliferative activity of residual tumour cells. The former arises mainly from the morphologic changes in residual tumour cells, which are often misinterpreted as reactive histiocytes [23]. Regarding the latter, it is well known that the proliferative activity of tumours assessed by mitotic indexing is the most reliable and best documented predictor of subsequent disease outcome. However, the validity of mitotic assessment or grading has been questioned in treated breast cancer [17,27,38,42]. In this context, the immunocytochemical evaluation of proliferative activity seems to be a useful method to obtain predictive information on the clinical behaviour of the tumour. Immunocytochemistry can also assist in and further complement the qualitative and quantitative evaluation of residual cellularity.
Regarding residual tumour cell evaluation, the moderate agreement observed in NCD assessment between IA and LMA was unsatisfactory. Considering that the human visual system is known to have difficulty in the detection of target objects (the search objects) when the background contains distracting objects, it is possible to provide an explanation for this unsatisfactory result by interpreting residual tumour cells as search objects, and non-neoplastic cells (macrophages, xanthic cells, fibroblasts, etc.) as well as stromal alterations (fatty changes, elastosis, necrosis and calcifications) as distracting objects [23,45]. This could be particularly true for categories II and III of this study, where the ratio between search and distracting objects is critical. However, the assessment of this hypothesis requires further studies, performed by stratifying the specimens according to the "distracting grade". To our knowledge, the present is the only study that obtained a concordance evaluation for cytokeratin immunoreactivity between LMA and IA on breast tissue sections. In another study, Mesker [30] evaluated the immunoreactivity for cytokeratin on breast cancer by both IA and LMA, finding a high correlation between the two approaches. However, this study was performed on a cell line (and not on tissue sections) and provided a correlation coefficient estimate not interpretable as a measure of concordance.
With respect to PI evaluation, we found an almost perfect agreement between IA and LMA for MIB1 immunoreactivity. In particular, the agreement was perfect for categories III and IV. IA required more time than LMA because the nuclei have to be analyzed with a high resolution microscope objective; fewer cells can therefore be analyzed per high-power field (HPF) and more fields must be assessed. However, even if this method does not provide considerable advantages over LMA in terms of speed, it might represent a standardized tool in PI evaluation. In fact, in breast carcinoma, as well as in other tumours, the highly divergent PI values reported in the literature might actually be explained by the different evaluation methods applied. The only standardized criterion, with a few exceptions or omissions, is the use of a 40× objective (Table 6).
In conclusion, our results show that, on preoperatively treated breast cancer, LMA and IA give overlapping results in PI evaluation, whereas this is not true for NCD. Standardization of procedures ensuring rigorous technical quality of tissue preparation, image cytometry instrumentation and measurements, as recently suggested [7], will favour the diffusion of the IA method [35].