Mimicking Multimodal Contrast with Vertex Component Analysis of Hyperspectral CARS Images

We show the applicability of vertex component analysis (VCA) of hyperspectral CARS images in generating a similar contrast profile to that obtained in “multimodal imaging” that uses signals from three separate nonlinear optical techniques. Using an atherosclerotic rabbit aorta test image, we show that the VCA algorithm provides pseudocolor contrast that is comparable to multimodal imaging, thus suggesting that under certain conditions much of the information gleaned from amultimodal nonlinear optical approach can be sufficiently extracted from the CARS hyperspectral stack itself. This is useful for unsupervised contrast generation on hyperspectral CARS implementations such as multiplex CARS that may not havemultimodal capabilities.The utility of VCA as a quantitative analysis tool in CARS is also addressed.


Introduction
The importance of nonlinear optical microscopy techniques based on vibrational resonances has rapidly grown over the past decade.At the forefront of this set of imaging tools is Coherent Anti-Stokes Raman Scattering (CARS) Microscopy [1][2][3][4].As depicted schematically in Figure 1, CARS is a four-wave mixing process involving the interaction of pump ( pu ), Stokes (  ), and probe ( pr ) photons to yield a higher energy anti-Stokes photon ( as ), the production of which is stimulated in a sample when the difference of the pump and Stokes angular frequencies matches a vibrational resonance (  =  pu −  ).Thus, CARS is a label-free imaging technique with broad chemical specificity.The stimulated nature of this nonlinear optical process means that in a wide range of conditions CARS signals are orders of magnitude larger than those from traditional spontaneous Raman scattering [5].Therefore, a primary advantage of CARS as a microscopic technique lies in vastly improved image acquisition times.While traditional CARS microscopy research laid the foundation for it as a powerful contrast-based (i.e., qualitative) technique, there has been a considerable recent thrust to develop the quantitative aspects of this and other coherent Raman microscopy techniques [6][7][8][9][10].
Traditionally, CARS microscopy has been implemented using pulsed picosecond lasers which are optimal for this technique because the picosecond pulse bandwidth closely matches the Raman linewidths of relevant molecular resonances [11,12].However, pulsed femtosecond laser approaches to CARS are currently popular for various reasons including the fact that their high peak powers enable "multimodal" operation that simultaneously integrates CARS with other nonlinear optical imaging tools such as second harmonic generation (SHG) and two-photon excitation fluorescence (TPEF), among others, to provide unique contrast information [13].SHG, for example, is sensitive to noncentrosymmetric molecular assemblies such as collagen, and TPEF can map endogenous fluorescent molecules such as elastin.Figure 2(a) shows such a multimodal image of an atherosclerotic rabbit aorta sample.
Another primary advantage of femtosecond-laser-based approaches to CARS lies in their inherently broadband pulses which allow for better harnessing of the spectroscopic power of vibrational imaging.The combination of spectroscopy and microscopy, termed "hyperspectral imaging, " can provide spectral information at every pixel with a multitude of unique chemically specific contrasts.Figure 2 atherosclerotic rabbit aorta tissue, with two representative spectra shown for a bright and a dark pixel.Typically, contrast is achieved either by on-peak (i.e., single frequency) imaging, as shown in this figure, or by comparing ratiometric differences in two or more peaks in the spectra that are diagnostic of the molecule being probed [14,15].There is a growing sense, however, that single-or few-frequency analysis of hyperspectral CARS images makes insufficient use of the rich spectral information inherent to the technique.Thus, more sophisticated spectral analysis tools-largely based on multivariate analysis (MVA) techniques-are currently gaining in prominence [16][17][18].Multivariate analysis is a mainstay in "chemometrics" and is increasingly being used for vibrational imaging of biological samples [7,14,16,18].For Raman and infrared absorption microscopy, MVA is used primarily for qualitative chemical identification using colormaps.By contrast, CARS and other coherent Raman microscopy techniques have faster acquisition rates that provide impetus for its use not only for qualitative contrast but also for quantitative concentration measurements that are useful for monitoring dynamic concentration changes in biological samples [8].Principal component analysis (PCA) is one of the popular clustering multivariate tools used in both Raman and CARS microscopy [16,17].PCA clusters spectra based on their similarities by choosing basis sets that have the most variance and which are linear combinations of the original spectral axes.PCA, however, traditionally only provides classification based on spectral similarities but not quantitative information.
Although the CARS signal increases with concentration, the presence of a nonresonant background (NRB) that coherently combines with the Raman lineshape yields nonlinear concentration dependence and is a major obstacle for quantitative CARS analysis [12].This may, however, be overcome using phase-retrieval algorithms that extract (or "retrieve") the Raman lineshape [19,20].A recent study uses a phase-retrieval algorithm and "nonnegative matrix factorization" to perform quantitative CARS microscopy [6].
Vertex component analysis (VCA) is another candidate for the quantitative analysis of hyperspectral CARS images.
VCA is a spectral unmixing MVA algorithm initially developed for remote sensing to extract the "purest" spectra [21].Each pixel in the hyperspectral image is expressed as a linear combination of the pure spectral components (known as endmembers) in an abundance fraction matrix that may be further explored for quantitative analysis.The abundance fraction matrix may also be used to form colormaps that aid in image visualization.
In this paper, we compare a multimodal image that makes use of single-frequency CARS, SHG, and TPEF modalities to extract contrast information from a complex biomedical tissue sample with an image generated from the VCA of its associated CARS hyperspectral image data.We chose the VCA algorithm (over PCA, e.g.) for two main reasons.First, unlike principal components which do not necessarily represent a spectroscopically identifiable object but are rather chosen as the most statistically distinct basis sets, the VCA endmembers represent identifiable spectral components and are thus comparable to other vibrational spectra.Second, via the abundance fraction matrix, it may be possible to extract spectral concentration information in VCA.Such concentration information may be useful towards expanding the technique towards quantitative CARS microscopy studies.

Atherosclerotic Rabbit Aorta Tissue Sample Preparation.
We revisit an archival image data stack from a previous multimodal CARS microscopy study of atherosclerotic rabbit aorta tissue [22].The main purpose of the previous study was to investigate the use of multimodal imaging (SHG, TPEF, and CARS) for the label-free diagnosis of luminal atherosclerosis.Hyperspectral CARS data on the same image also shows unique spectral information on regions with contrasting multimodal signal.This stimulated our interest to recreate a multimodal-like image using multivariate analysis of the hyperspectral image stack alone and compare that to the rich contrast information from the original multimodal image.
The tissue sample used in this work was provided by the National Research Council of Canada's Institute of Biodiagnostics.A detailed description of the sample preparation methodology, as well as the analysis of multimodal images for this sample, is provided by Ko et al. in [22].

Optical Setup for Multimodal and Hyperspectral CARS
Imaging.Multimodal imaging was performed at the National Research Council of Canada's CARSLab, using the single femtosecond oscillator light-source microscopy setup developed by Pegoraro et al. [23].A schematic representation of the experimental layout is shown in Figure 3.In short, a 60 fs transform-limited laser beam centered at ∼800 nm is split into two arms using a polarizing beamsplitter (PBS).The first beam (path A) becomes the pump and probe beam.The second beam (path B) generates a supercontinuum light in a photonic crystal fiber (PCF).The redshifted part of the supercontinuum that ranges from ∼950 to 1150 nm is used as the Stokes beam for the CARS process.An anti-Stokes signal is generated in the sample by collinearly overlapping pump/probe and Stokes beams whose temporaloverlap-dependent frequency difference corresponds to a vibrational resonance.For instance, the CH-stretching vibrational mode at ∼2850 cm −1 is probed by overlapping in time the pump beam's 800 nm (12,500 cm −1 ) light with the ∼ 1036 nm (9,650 cm −1 ) part of the Stokes beam.The pump beam is also used to concurrently stimulate other imaging modalities such as SHG and TPEF.The tightly focused beams are raster scanned on the sample using galvo mirrors and the signals are detected using photomultiplier tubes (PMTs).The forward-propagating CARS and SHG signals are filtered using a dichroic mirror and detected on separate PMTs.The TPEF signal is detected in the backward (epi−) direction on a built-in PMT of the Olympus Fluoview 300 microscope.
Spectral scanning is implemented by varying the temporal delay of the pump/probe beam such that it overlaps with different portions of the Stokes light, represented schematically in Figure 3.The spectral resolution (∼30 cm −1 ) is obtained by matching the chirps of the two pulses using high dispersion SF-6 glass, in a process known as "spectral focusing" [23][24][25].Image processing was done with ImageJ [26] and multivariate analysis with MATLAB.

Vertex Component Analysis of Hyperspectral CARS Image.
VCA is an algorithm for hyperspectral unmixing that was developed in the field of remote sensing [21].It assumes that each spectrum in a hyperspectral data set X (of size  × ) is a linear combination of "pure" spectra called endmembers.That is, where M (size  × ) is the mixing matrix containing the spectra of the endmembers.A (size  × ) is the abundance matrix containing the relative amounts of each endmember on each object in matrix X and N (size  × ) is the noise in the signals.In the case of a hyperspectral CARS stack,  is the number of pixels (or "objects"),  is the number of data points (in frequency space) for each spectrum, and  is the number of desired unique unmixed spectra (or "endmembers").
The hyperspectral image taken of the atherosclerotic rabbit aorta sample is a 256 × 256 pixel image with 272 data points for each spectrum which translates to a 65,536 × 272 matrix.Each 256 × 256 image was collected in 0.58 s, and the 272-image hyperspectral stack acquisition required ∼3 minutes.Thus,  is 65,536 pixels and  is 272 data points.The value for  may be chosen based on the number of pure chemical substances (or anticipated "unique" spectra) in the sample.PCA itself can be used to suggest an appropriate value for  based on how the data are clustered.We performed VCA with  values of 3 and 4 based on the results of PCA.Details of this are discussed in the next section.
Each of the  pixel objects is a vector in -dimensional space.Given a known value of , the goal of VCA is to determine the matrices M and A. VCA first extracts the  endmember spectra and stores the associated endmember Figure 3: Optical setup of the multimodal/hyperspectral CARS microscopy system.A 60 fs laser beam centered at 800 nm is split into two beams, one beam becomes the pump/probe at 800 nm and the other beam generates a supercontinuum using a photonic crystal fiber.The red part of the supercontinuum (∼950-1150 nm) becomes the "Stokes" beam.The two beams are chirp-matched using high dispersion SF6 glass and they are overlapped in time and space.Spectral scanning is done by overlapping the pump with different portions of the "Stokes" beam by delaying the arrival time of the pump/probe beam using a high-resolution linear stage.The two beams are sent to the microscope where high speed galvo mirrors raster-scan the beam over the sample.
data points on matrix M. A vector u is selected such that no object is orthogonal to it.All objects are projected unto u and the maximum of the projection is the first endmember.The succeeding endmembers are iteratively projected to a subspace orthogonal to the span of the endmembers already determined.This is done until the  endmembers are found.The abundance fraction A is calculated by multiplying X by M # , the pseudoinverse of M.
Dealing with the CARS signal dependence on concentration is a critical first step in implementing multivariate analysis algorithms as it affects the relative object distances in dimensional space that may lead to their incorrect clustering and classification.In our work, we normalized each spectrum to its peak (maximum) value.Another, typically robust, normalization procedure common in spectroscopic analysis involves normalizing over the area, rather than simply over the peak.Area normalization, however, does not seem to work well in CARS hyperspectral data analysis because of the NRB-distorted lineshape of the CARS spectrum compared to a Raman spectrum [27].
All data analysis was performed using MATLAB 2013b with the Statistical Toolbox Package.The calculated abundance fractions were used to generate a pseudocolor image using the RGB color scheme.

Results and Discussion
PCA is one of the more popular multivariate analysis tools and it is useful to compare its performance with that of  VCA.Furthermore, PCA can be used as an intermediate step for determining the number of endmembers to input in the VCA algorithm.the number of endmembers for VCA.With two PCs, the data can be visualized in a score plot where the number of distinct clusters may suggest an appropriate value for .For our data set, as shown in Figure 4(b), there are no distinct clusters so instead we arbitrarily based the color assignments on the four quadrants defined by the two PCs.The color assignments (denoted in Figure 4(b)) are then the basis for the contrast shown in Figure 4(c).This PCA image already appears somewhat similar to that of the multimodal image of Figure 2(a).For example, the regions with strongest SHG contrast, likely collagen-rich, are easily identified in the PCA image as well.However, little-to-no distinction is made in the PCA image between the lipid-rich regions that appear red in the multimodal image and the green elastin-rich regions that yield strong autofluorescence.The PCA image appears considerably coarser than the multimodal image.This is largely because of the loss of concentration information in PCA.
Others have used PCA to extract representative spectra by averaging all the spectra from pixels of the same cluster in an effort to identify compositionally similar regions [16,18].However, the validity of this method is limited to tightly clustered groups with distinctly identifiable boundaries.Because our 2-PC scatter plot, Figure 4(b), lacks any distinct clustering, PCA does not allow for the extraction of representative spectra [27].Because this single cluster is roughly centered at the origin, it motivates a 4-quadrant PCA and thus perhaps a 4-endmember VCA.Looking at the retrieved PCA image of Figure 4(c), however, the red and black pixels are intermingled and are not on distinct regions of the image.This further suggests that a 3-endmember VCA implementation might be more effective in correctly identifying the unique spectra.And so, we performed both 3-and 4-endmember analyses of the sample as shown in Figure 5.
Figure 5(a) shows the retrieved VCA image for a 3endmember implementation.It is immediately evident upon inspection that the VCA image shows a stark similarity to the multimodal image in Figure 2(a).The brightest red regions in the image match the strongest (blue) SHG regions in the multimodal image and are thus presumed to be vibrational signatures from collagen.In the VCA image, however, these regions are slightly more extensive than the collagen imaged by SHG in the multimodal image.This could be due to SHG's sensitivity to the molecular orientation and organization of collagen, whereas CARS is more sensitive to molecular density.This makes VCA of hyperspectral CARS imaging complementary to the SHG technique.Figure 5(c) shows the corresponding endmember spectra for the VCA image automatically extracted from the CARS hyperspectral data of the atherosclerotic aorta sample.In a multimodal CARS study of the same sample, Ko et al. highlighted similar spectra to that extracted by VCA [22].However, those spectra were extracted (somewhat by trial and error) using the multimodal image as a visual guide for suggested differences in tissue character.Thus, the method was highly supervised and potentially very time consuming.The unsupervised nature of the VCA spectral extraction, however, may be used to an advantage for hyperspectral CARS implementations that do not usually have multimodal imaging capability, such as multiplex CARS [2,4,28].
Figure 5(b) shows the retrieved VCA image for a 4endmember implementation with the corresponding fourendmember spectra in Figure 5(d).It should be noted that the VCA algorithm will always extract endmembers based on the input value for .However, care must be taken in judging the validity of the results.For instance, in our sample, the pixels corresponding to the "black" endmember spectrum in Figure 5(d), highlighted by the white arrow, are confined in a very small region on the sample, which suggests that it is not a useful endmember spectrum.Furthermore, by inspection of the 3-endmember spectra, it appears that the black spectrum is mostly a particular (but nonspecific) combination of the red and green endmember spectra.Thus, we conclude that despite the misleading detail found in the 4-endmember VCA image, the 3-endmember VCA analysis is a better comparator to the multimodal image of Figure 2(a).
The calculated abundance fractions using the three endmembers in Figure 4(a) were used to generate the pseudocolor image in Figure 4(b).The values of the calculated abundance fraction ranging from 0.0 to 1.0 are used as the pixel values in Matlab's RGB (red-green-blue) color scheme.Thus, pixels with combinations of these three colors represent spectra that combine two or three of the endmembers.Compared with PCA, which only assigns one color for each cluster, VCA assign color combinations based on the spectra producing a pseudocolor image similar to the original multimodal image.
The calculated abundance fraction also provides future prospects for use in quantitative analysis.Using available phase-retrieval algorithms [19,20], the Raman lineshapes could potentially be extracted by unmixing (in phase) the nonresonant background and then used as input hyperspectral data for VCA.It would also be interesting to compare quantitative analysis using VCA with the recently developed FSC 3 quantitative CARS method [6].

Conclusion
We succeeded in using an unsupervised VCA algorithm to generate contrast from a hyperspectral CARS image of an archival atherosclerotic rabbit aorta tissue sample that is comparable to that obtained from multimodal nonlinear optical microscopy.Furthermore, we discussed the potential of the technique for quantitative chemical imaging with the use of abundance ratios.The usefulness of VCA and other multivariate analysis techniques is particularly strong in powerful hyperspectral CARS techniques-such as multiplex CARS-that do not traditionally allow for multimodal nonlinear optical operation.

Figure 2 :
Figure 2: (a) Multimodal image of an atherosclerotic rabbit aorta sample where SHG is shown in blue, TPEF is shown in green, and CARS at 2850 cm −1 is shown in red.(b) Grayscale image of the 2850 cm −1 CARS channel shown as red in (a).The spectra on the right are representative spectra from two individual bright (tissue) and dark (background) pixels in the image.Scale Bar: 30 m.

Figure 4 :
Figure 4: Principal component analysis of the hyperspectral CARS sample image.(a) Cumulative variance plot shows that only two principal components (PCs) are needed to cover ∼90% of the variance in the data set.No appreciable increase in the cumulative variance is observed after PC2 as shown for PC3 to PC6.(b) Score plot of the first two principal components (PC1 and PC2).Each point on the plot corresponds to one pixel in the image.(c) The color assignments on the retrieved PCA image depend on which quadrant the pixel object is located on the score plot in (b).Scale Bar: 30 m.
Energy level diagram of the CARS process. pu ,   ,  pr ,  as , and   are the pump, Stokes, probe, anti-Stokes, and Raman vibrational mode frequencies, respectively.