Intrinsic Near-Infrared Spectroscopic Markers of Breast Tumors

We have discovered quantitative optical biomarkers unique to cancer by developing a double-differential spectroscopic analysis method for near-infrared (NIR, 650–1000 nm) spectra acquired non-invasively from breast tumors. These biomarkers are characterized by specific NIR absorption bands. The double-differential method removes patient-specific variations in molecular composition that are not related to cancer, and reveals these specific cancer biomarkers. Based on the spectral regions of absorption, we identify these biomarkers with lipids that are either present in tumors in a different abundance than in the normal breast or are new lipid components generated by tumor metabolism. Furthermore, the O-H overtone regions (980–1000 nm) show distinct variations in the tumor as compared to the normal breast. To quantify spectral variation in the absorption bands, we constructed the Specific Tumor Component (STC) index. In a pilot study of 12 cancer patients we found 100% sensitivity and 100% specificity for lesion identification. The STC index, combined with other previously described tissue optical indices, further improves the diagnostic power of NIR for breast cancer detection.


Introduction
Do intrinsic quantitative optical biomarkers exist for malignant tumors that can be measured non-invasively? If such biomarkers are found, optical biopsy using noninvasive methods could be performed with great advantages over existing methods in terms of cost, number of procedures, speed, and availability. Optical spectroscopy is not invasive and offers a unique view of tissue because it is sensitive to compositional and functional characteristics. Spectroscopic methods have employed absorption, scattering, and fluorescent contrast (endogenous and exogenous) to detect tumors in breast, cervix, skin, and esophagus over wavelengths ranging from the ultraviolet through the infrared. Depending upon the optical wavelengths, light can interrogate superficial to cm-thick tissues [34]. Although the penetration of light in tissues is strongly wavelength dependent, light may be delivered to most tissues either non-invasively (topical application) or by the use of intra-operative or minimally invasive probes (e.g., endoscopy). Our contribution deals exclusively with breast tissues exposed to NIR light, but the general principles of our method could be applied to any cancer type.
Within the NIR spectral region (650-1000 nm), light penetrates deeply enough to transilluminate cm-thick tissues such as the breast. NIR is sensitive to the four major absorbing components in breast tissues: oxyhemoglobin, deoxyhemoglobin, water, and bulk lipids [8,26].
The main hypothesis of NIR cancer detection has been that malignant transformation changes the balance of hemoglobin (oxygenation state), water, and bulk lipids in tissue. Differences in optical signatures between tissues are manifestations of multiple physiological changes associated with factors such as vascularization, cellularity, oxygen consumption, edema, fibrosis, and remodeling [32]. Several groups have employed NIR optical methods to measure subtle physiological alterations in healthy breast tissue [2,5,7,24,29], cancerous breast tissues [9,11,20,25,30,32,33,35,36], and malignant tumors treated with neoadjuvant chemotherapy [14,28,37]. The transport of NIR light in tissues is dominated by multiple scattering, which complicates the quantitative recovery of the amounts of these four components from tissue absorption spectra. Light transport can be modeled as a diffusive process where photons behave as stochastic particles, much like the bulk movement of molecules or heat. The most accurate tissue optical absorption spectra can only be obtained by separating light absorption from scattering, which can be performed using a model-based approach alongside time- or frequency-domain measurements [10,21,22]. These absorption spectra are then translated into hemoglobin, water, and bulk lipid concentrations using the Beer-Lambert law along with known component extinction coefficients (Fig. 1) (for a detailed description see [2]).
After many years of research, it has become clear that although the NIR-measured components are altered significantly by tumors, they are not specific for cancer; the same components are found in both normal and malignant tissues. Thus the detection of cancer is based upon crossing a threshold in hemoglobin, and sometimes water and bulk lipids. Complications can arise in separating malignant from benign lesions, since benign lesions are also known to change water, hemoglobin, and bulk lipids in a similar fashion to malignant lesions [2,30]. Exogenous contrast agents have been developed for both absorption and fluorescence contrast in order to increase sensitivity and specificity [12].
However, it is known that tumors alter the composition of tissues in other ways that are not accounted for in traditional NIR spectroscopic models. Alterations in protein expression, lipid and water states, and the presence of hemoglobin breakdown products can occur in malignant transformation [4,6,15,16]. These biochemical state alterations, or additional biochemical components, are likely to be small contributions to the overall tumor absorption spectrum. Even if small, these contributions should be unique to tumor tissues, and provide an endogenous spectroscopic signature that is specific of the lesion. To summarize: our hypothesis is that the optical absorption spectra of malignant tissues differ from normal spectra not only due to bulk absorber concentration differences, but also due to small changes of as yet unknown components.
In order to discern these subtle alterations in tumor absorption spectra that are not accounted for in traditional NIR component models of breast tissues, we have developed a double-differential method. The method nulls the effect of the four basis components to reveal the small spectral changes unique to cancer. Note that this method only works because we have two independent controls as will be discussed later. The results of our analysis will show only intrinsic, unique quantitative biomarkers of cancers which are not included in the basis spectra of the four components (oxyhemoglobin, deoxyhemoglobin, water and bulk lipids). We hypothesize the biochemical/physical origin of these spectral bands to reside in tumor specific changes in types and abundance of lipid components and changes in the O-H overtone region. Future work will determine the chemical origin of these spectral changes and improve breast cancer diagnostics by using these specific endogenous molecular markers to improve understanding of molecular changes in cancerous tissue.

Instrumentation
The detection and quantitation of subtle spectral shifts to fingerprint disease requires accurate reconstruction of tissue absorption spectra uncontaminated by the effects of scattering. For this purpose we have employed a technique known as Diffuse Optical Spectroscopy Imaging (DOSI). More specifically, we use the well-established method of Steady State Frequency Domain Photon Migration (SSFDPM), which can quantitatively measure scattering and absorption independently of each other [1,23,31]. The particular instrument used was the Laser Breast Scanner (LBS), a bedside-capable instrument that has been used in several clinical trials for breast cancer detection [3,13] and for monitoring response to neoadjuvant chemotherapy [14,28]. The LBS employs a handpiece similar to an ultrasound probe, which was placed on the surface of the breast to recover tissue optical spectra from 650 to 1000 nm. Typically DOSI samples a low number of spatial locations with a large spectral bandwidth.

Data processing
Data were analyzed in three stages: (1) calculation of absorption and scattering spectra, (2) calculation of absorber concentrations, and (3) calculation of unique tumor spectra by the double-differential method. Stages 1 and 2 have been detailed in the literature and are only summarized here [3]. A diffusive model constructed within the semi-infinite geometry is used to translate measured optical signals into absorption and scattering coefficients for each measured wavelength. Using independently measured molar extinction coefficients, the concentrations of the dominant absorbers found in breast tissue can be obtained: oxygenated hemoglobin [O2Hb], deoxygenated hemoglobin [HHb], water [H2O], and bulk lipids (Fig. 1). The concentrations of these chromophores are then used to diagnose the presence of disease; this approach was outlined in a previous review article in this journal [27]. All data processing was performed using custom software designed in the MATLAB platform.
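Stage 2 can be illustrated with a minimal sketch, assuming the Beer-Lambert relation is applied as an ordinary linear least-squares fit of the absorption spectrum to the component extinction spectra. The function name `fit_chromophores`, the synthetic extinction matrix, and the concentration values are hypothetical placeholders; the original analysis used custom MATLAB software and measured extinction coefficients, which are not reproduced here.

```python
import numpy as np

def fit_chromophores(mu_a, extinction):
    """Solve mu_a ~= extinction @ conc in the least-squares sense.

    mu_a       : (n_wavelengths,) absorption spectrum
    extinction : (n_wavelengths, n_components) extinction spectra,
                 one column per absorber (e.g. O2Hb, HHb, H2O, lipid)
    Returns the (n_components,) vector of fitted concentrations.
    """
    conc, *_ = np.linalg.lstsq(extinction, mu_a, rcond=None)
    return conc

# Synthetic stand-in data (NOT real extinction coefficients).
rng = np.random.default_rng(0)
wavelengths = np.linspace(650, 1000, 36)            # nm
extinction = rng.uniform(0.1, 1.0, size=(wavelengths.size, 4))
true_conc = np.array([20.0, 8.0, 15.0, 60.0])       # arbitrary units
mu_a = extinction @ true_conc                       # noiseless spectrum

recovered = fit_chromophores(mu_a, extinction)
print(recovered)
```

With noiseless synthetic data the fit returns the input concentrations. A real analysis might add a non-negativity constraint or weighting by measurement uncertainty; neither is stated in the text, so both are left out of this sketch.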
The unique spectral components were calculated using a double-differential technique applied to the measured absorption spectra, which is described in the Results and Discussion section. The absorbance data were analyzed by the Elantest program (available at ftp://ftp.lfd.uci.edu/lfd/egratton/elantest/elantest.exe).

Measurement procedure
We employed a measurement procedure which has been detailed in the literature [3]. Briefly, optical linescans were generated by moving the handheld probe to a set of discrete positions on the breast surface in 10 mm steps (Fig. 2a). Tumor locations were known a priori from mammography, ultrasound, and palpation. Absorption and reduced scattering spectra were measured at each grid location (Fig. 2b). A complete measurement of tissue absorption and reduced scattering spectra required approximately 10 seconds at each linescan location. Linescans were repeated twice at each grid location to evaluate placement variations. The fraction of tumor to normal tissue sampled by the light depended upon the tissue optical properties and the lesion depth.

Patient characteristics
All subjects provided written informed consent to participate in the studies under a protocol approved by the Institutional Review Board of the University of California at Irvine (#95-563 and #02-2306). Ultrasound and surgical pathology reports were utilized to determine lesion type, location, and extent. In this pilot study, patients ranged in age from 32 to 57 years; 6 were pre-menopausal and 6 post-menopausal. All had a pathologically confirmed diagnosis of invasive ductal carcinoma, except for one patient diagnosed with adenocarcinoma with lobular features.

Breast tissue spectra
Typical NIR diffuse optical spectra of normal and tumor-containing tissues are provided for Patient 30, a 45-year-old pre-menopausal patient with a 29 mm invasive ductal carcinoma in the right breast. In Fig. 3 we show representative spectra for the measured wavelength dependence of scattering and absorption from 650 to 990 nm. These spectra correspond to a single spatial location over the tumor and to the equivalent tissue region on the contralateral breast, which is considered normal, having no known lesion. Repeat measurements over the years of data acquisition have shown that the spectral shape of tumor and normal tissue is well conserved. Variation in the magnitude of the absorption values due to probe handling is on the order of 3-5% [3]. The wavelength dependence of the tissue scattering (i.e., the scatter power, SP) differs substantially between tumor and normal tissue, suggesting differences in the density and size of scattering centers between tumor and normal tissue regions (Fig. 2a) [3,18,19]. The measured NIR spectra demonstrate distinctive differences between tumor and normal tissues. Briefly, the spectra over tumor-containing tissue exhibit higher overall absorption along with notable spectral differences in the region above 900 nm. The distinctive peak in the normal absorption spectrum at 930 nm is representative of the vibrational overtones of lipid C-H bonds. Higher lipid absorption is typically seen in normal and post-menopausal breast tissue as compared to tumor-containing and pre-menopausal breast tissue, respectively. This prominent lipid peak results from the higher concentration of bulk lipid in normal adipose tissues relative to tumor tissues [3].
By applying the results of the four component basis spectra fit we determined concentrations for these components, and we calculated the Tissue Optical Index (TOI), which has been discussed in a previous manuscript in this journal [27] (Table 1).
Briefly, the TOI is a contrast function empirically designed using the traditional components (recovered concentrations of deoxyhemoglobin, water, and bulk lipids) to accentuate the differences between tumor and normal tissue. While the TOI has been shown to identify some types of tumors, it is not highly specific. In fact, for our set of 12 patients the TOI showed a sensitivity of 75% and a specificity of 75% based on a threshold of 3.5, which gives three false positives and three false negatives. This makes differential diagnosis based solely on the four traditional components problematic. Thus, using this approach, we had no unique "marker" or "signature" of cancer.
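The 75% figures follow from the stated error counts under one assumption that the text does not spell out: that each of the 12 patients contributes one tumor site and one normal control site, giving 12 positives and 12 negatives. A short sketch of the arithmetic, with the hypothetical helper `sensitivity_specificity`:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of tumor sites flagged
    specificity = tn / (tn + fp)   # fraction of normal sites cleared
    return sensitivity, specificity

# Assumed counts: 12 tumor sites with 3 false negatives,
# 12 normal sites with 3 false positives.
sens, spec = sensitivity_specificity(tp=9, fn=3, tn=9, fp=3)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Under these assumed counts both values are 9/12 = 75%, matching the figures quoted for the TOI at a threshold of 3.5.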

Double differential method
In the double-differential method we want to determine whether there are other spectral differences that cannot be accounted for by different amounts of the four basis components. We first calculate the average spectrum of the normal breast. Then we calculate the difference between this average spectrum and the spectrum at each location (including the normal breast). If the only components present are the ones included in the four basis spectra, then the difference spectrum at each location can be completely fitted using these four components. The coefficients of the fit provide the different amounts of the four components at each location (the same as for the TOI described in the previous paragraph). However, if the fit is not perfect, the residual of the fit (if highly correlated) provides the additional spectra that are not included in the four basis components. This residual is the Specific Tumor Component (STC). The double-differential method has two major advantages over the conventional spectroscopic modeling approach. Firstly, the method uses each patient as their own control, so that variations in the "known" extinction spectra are effectively cancelled out. Thus the double-differential method is insensitive to intrasubject variations that arise due to the four component spectroscopic model. Secondly, the method accounts for concentration changes within subjects between normal and tumor tissue if the only differences are in the abundance of the four components. The residual spectrum for different locations of the unaffected breast should essentially be a flat line with values close to zero. If the tumor differs from normal tissues only in concentration changes of the four components, the residual spectrum should again be essentially zero. However, if spectral components are not accounted for in the four component fit, they should appear only in the tumor regions.
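The procedure described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes an unconstrained linear least-squares fit, and the function name `double_differential_residuals`, the random basis matrix, and the Gaussian "extra" band are hypothetical stand-ins for real extinction spectra and tumor features.

```python
import numpy as np

def double_differential_residuals(spectra, normal_mask, basis):
    """Residuals left after fitting difference spectra with the basis.

    spectra     : (n_locations, n_wavelengths) absorption spectra
    normal_mask : (n_locations,) bool, True for normal-breast locations
    basis       : (n_wavelengths, n_components) basis spectra
    """
    # Step 1: average spectrum of the normal breast.
    normal_mean = spectra[normal_mask].mean(axis=0)
    # Step 2: difference between each location and that average.
    diff = spectra - normal_mean
    # Step 3: fit every difference spectrum with the basis components.
    coeffs, *_ = np.linalg.lstsq(basis, diff.T, rcond=None)
    # Step 4: what the basis cannot explain is the residual (STC).
    residuals = diff - (basis @ coeffs).T
    return residuals, coeffs.T

# Synthetic demonstration (NOT real tissue data).
rng = np.random.default_rng(1)
wavelengths = np.linspace(650, 1000, 36)
basis = rng.uniform(0.1, 1.0, size=(wavelengths.size, 4))
conc = rng.uniform(0.5, 1.5, size=(5, 4))      # 5 locations
spectra = conc @ basis.T                       # pure four-component spectra
extra_band = 0.01 * np.exp(-((wavelengths - 930.0) ** 2) / (2 * 15.0 ** 2))
spectra[4] += extra_band                       # location 4 plays the tumor
normal_mask = np.array([True, True, True, True, False])

residuals, _ = double_differential_residuals(spectra, normal_mask, basis)
print(np.abs(residuals[:4]).max(), np.abs(residuals[4]).max())
```

In this synthetic case the normal locations give residuals at the level of numerical round-off, while the location carrying the extra band leaves a visible residual, which mirrors the flat-versus-structured behavior described in the text.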

Double-differential method applied to breast spectra
Here we outline the double-differential method applied to Patient 30, the same patient presented in Fig. 3. In Fig. 4a we show residual spectra for all of the positions on the normal breast. In Fig. 4b we show residual spectra over a tumor-containing position and over positions of normal tissue surrounding the lesion in the tumor-containing breast.
As stated earlier, we have two internal controls: the contralateral normal breast tissue and the normal breast tissue surrounding the lesion on the tumor-containing breast. Note that all of the residual spectra over normal breast tissue are essentially featureless and provide random values around zero. This control is important in that it shows that variations in normal breast tissue are due only to the natural compositional differences in oxyhemoglobin, deoxyhemoglobin, water, and bulk lipids.

STC spectra in 12 cancer patients
In Fig. 5, we show the STC spectra for the 12 cancer patients of this pilot study at the location displaying the largest variation. These residual spectra have small amplitudes (about 1% of the original spectra). The STC spectra are highly reproducible across all patients in this pilot study. There are patient-dependent variations in several wavelength regions (Fig. 5); however, all tumor STC spectra display a similar spectral shape. Figure 6 compares the average of the 12 tumor spectra presented in Fig. 5 with the average of the normal spectra obtained from the equivalent position on the unaffected breast for each patient. From Fig. 6 it is clear that these residual spectra are not random, unlike the residuals for the normal breast tissue, which are featureless.
From inspection of the average STC spectra, we find there are roughly 5 spectral regions where systematic differences are observed. We have labeled these Region 1: 650-665 nm; Region 2: 730-800 nm; Region 3: 875-930 nm; Region 4: 930-960 nm; and Region 5: 980-990 nm (Fig. 6). To quantify the amount of STC spectra, which should indicate the amount of biomarker, we calculate the local residual variance L_k for each spectral region. The local variance L_k is a function of position on the breast, given by the x and y coordinates. The index k indicates a given spectral region and N_k indicates the total number of wavelengths in that region. STC_i(λ_i, x, y) is the value of the STC spectrum at a given wavelength. The STC index is the sum of all local variances L_k. Across the 12 patients in this pilot study, the STC index displays its maximum value over the tumor regions. This STC index value can be significantly greater than the value over the surrounding normal tissue as well as the normal contralateral breast tissue (Table 1). For this pilot study of 12 patients we were able to obtain 100% sensitivity and 100% specificity using the STC index. We set the threshold at 51.5, which separates the two groups. We emphasize that these sensitivity and specificity values are only provided for general reference. Data on a larger subject pool will be reported in another manuscript, thereby providing more significant values. We note that the double-differential method assumes that normal and diseased tissues coexist in the same patient. Furthermore, the method does not require a priori knowledge of the lesion location.
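As an illustration of how such an index could be computed at one (x, y) position, the sketch below assumes that the local variance of region k is the mean of the squared STC values over the N_k wavelengths in that region. That specific form is an assumption, as are the name `stc_index` and the inclusive region boundaries; the exact definition and any scaling behind the reported threshold of 51.5 are not recoverable from the text, so absolute values from this sketch are not comparable with Table 1.

```python
import numpy as np

# Spectral regions (nm) as listed in the text.
REGIONS_NM = [(650, 665), (730, 800), (875, 930), (930, 960), (980, 990)]

def stc_index(wavelengths, stc, regions=REGIONS_NM):
    """Sum of per-region local variances of one STC residual spectrum.

    ASSUMPTION: local variance L_k = mean of STC(lambda_i)**2 over the
    N_k wavelengths falling inside region k (bounds inclusive, so a
    wavelength of exactly 930 nm is counted in Regions 3 and 4).
    """
    local = []
    for lo, hi in regions:
        in_region = (wavelengths >= lo) & (wavelengths <= hi)
        local.append(float(np.mean(stc[in_region] ** 2)))
    return sum(local), local

wavelengths = np.arange(650, 991, 5)          # hypothetical 5 nm grid
flat = np.zeros(wavelengths.size)             # featureless residual
offset = np.full(wavelengths.size, 2.0)       # constant residual of 2

index_flat, _ = stc_index(wavelengths, flat)
index_offset, local_offset = stc_index(wavelengths, offset)
print(index_flat, index_offset)
```

A featureless residual gives an index of zero, and a constant residual of 2 gives a local value of 4 in each of the five regions, so the two toy cases bracket the behavior expected for normal and tumor positions.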

Origins of STC
We are investigating the biochemical/physical origin of the STC spectral signature, which will be the subject of a future manuscript. We believe Region 1 (650-665 nm) is mainly affected by changes in hemoglobin absorption. There may be other types of hemoglobin or hemoglobin breakdown products in tumors. With regard to Region 2 (730-800 nm), we see a distinct negative peak in the STC spectra. According to the definition of STC, a negative peak indicates the absence of a component or a broadening of the NIR band. Changes in this region may be indicative of changes in lipid metabolism, as lipids absorb in this region (Fig. 1). In Region 3 (875-930 nm) and Region 4 (930-960 nm), we observe a negative peak immediately followed by a positive peak. This may be indicative of a spectral shift toward longer wavelengths. This is also the spectral region in which lipids have characteristic absorption spectra (Fig. 1). Thus, if there are differences in lipid composition in the tumor with respect to the normal tissue, this spectral region is likely to be affected. In Region 5 (980-1000 nm) we only have a few points available. However, we do observe changes which may be due to differences in the O-H overtones. In this region there are two possible candidates: the O-H of water and lipid oxidation products.

Conclusion
We have developed a double-differential method to analyze the near-infrared absorption spectra of breast tumors. In this method we consider only the spectral differences between normal and diseased tissue by fitting the difference spectra using the basis component spectra and then analyzing the residuals of this fit. This differential approach can be performed by comparing regions of normal and tumor breast tissue. With this method we show intrinsic spectroscopic markers of breast tumors in the NIR. These are unique signatures related to the biochemical/physical properties of each type of lesion, as the changes in natural tissue components (oxyhemoglobin, deoxyhemoglobin, water, and bulk lipid) as well as the individual physiological variation have already been accounted for. Furthermore, these spectral signatures are biomarkers revealing characteristic absorption bands in the lipid regions near 760 and 930 nm that were unnoticed before. The O-H overtone band in the 980 nm region also shows distinct variations in the tumor region compared to the normal breast. By quantifying this information, we constructed the STC index, which gives 100% sensitivity and 100% specificity for lesion identification for the 12 patients investigated and has the potential to distinguish tumors on the basis of lipid composition and/or bound water or lipid oxidation products.
In this paper our focus was to describe the discovery of an intrinsic NIR spectroscopic marker of breast cancer. While here we only provide a description of the double-differential method, in another manuscript we describe the mathematical operations for obtaining the STC residual spectra, beginning with scatter-corrected absorption spectra of breast tissue obtained noninvasively using the SSFDPM method [17]. In the future we will also discuss the difference between the residual spectra for benign and malignant lesions, opening up a new approach for differential diagnosis using optical methods. While this discovery was based on a pilot study of 12 cancer patients, we are analyzing STC residual spectra for a larger population and comparing subjects who have malignant or benign lesions with normal subjects who have no known lesion.