Automatic Quantification of Microvessels Using Unsupervised Image Analysis

An automatic method for quantification of images of microvessels by computing area proportions and number of objects is presented. The objects are segmented from the background using dynamic thresholding of the average component size histogram. To be able to count the objects, fragmented objects are connected, all objects are filled, and touching objects are separated using a watershed segmentation algorithm. The method is fully automatic and robust with respect to illumination and focus settings. A test set consisting of images grabbed with different focus and illumination for each field of view, was used to test the method, and the proposed method showed less variation than the intraoperator variation using manual threshold. Further, the method showed good correlation to manual object counting (r = 0.80) on an other test set.


Introduction
Angiogenesis, or neovascularisation, is by definition formation of new capillaries from preexisting bloodvessels. The process of neovascularisation is regulated by numerous growth factors, vascular endothelial cells, and matrix proteins released from host stromal cells such as macrophages and mast cells [10]. There is evidence from in vivo and in vitro studies that tumour growth is dependent upon neovascularisation. Induction of angiogenesis is a prerequisite for tumours to grow beyond approximately 1-2 mm 2 [9].
Recently, the prognostic value of microvessel density (MVD) has been investigated in several tumour forms including bladder, prostate, breast and lung carcinoma. Most studies are indicating a significant correlation to prognostic factors such as survival, progression, metastasis and recurrence [8,17,20,25,27,28]. Some studies though, have failed to find any correlation [6,15].
The technique described by Weidner et al. in 1991 [26] is regarded as the "Golden standard" by many investigators. This is a simple technique where Factor VIII highlighted microvessels are manually counted in the most vascularised areas of the tumour at 200× magnification. The MVD is then expressed as the highest number of microvessels per mm 2 found in one microscopic field (0.74 mm 2 ).
However, different approaches are employed both regarding staining protocols and choice of endothelial cell marker [16] and how the quantification is carried out [11]. Two different ways of defining MVD could be distinguished: (a) manual counting techniques, either by simply counting the number of microvessels per microscopic view, or using grids or Chalkley eye-piece graticule, making it possible to calculate features as vessel length and vessel area besides the number of vessels; or (b) applying image analysis quantifying the number of objects and/or the stained endothelial cell area related to the surrounding area. Both measurements are thought to reflect the rate of vessel exchange of nutrients and oxygen with the surrounding tissue compartments.
This study is an attempt to make the quantification less subjectively influenced by using automatic classification and object counting, and thus combining reproducibility with a rapid performance.

Patients
A total of 27 cases bladder carcinomas obtained by transurethral resection (TUR) were randomly selected out of a larger multicenter study including 73 patients. All patients were subsequently cystectomised and the TUR specimens were all judged as stage T1 or T2, 17 and 10 cases, respectively. The material was fixed in 10% non-buffered formalin, and paraffin-embedded according to standard procedures.

Immunohistochemistry
Paraffin sections were cut at 4 µm thickness and placed onto Super frost/plus ® slides (Mentzel, Germany). Two mouse monoclonal antibodies, CD 31 (clone JC 70, DAKO, Glostrup, Denmark) and CD 34 (clone QBEND 10, Oxoid, Basingstoke, England) were used as a mixture, diluted 1/80 and 1/100, respectively, and incubated for 16 h at 4 • C. Prior to immunohistochemistry heat mediated antigen retrieval was performed by boiling the slides in 0.01 M citrate buffer, pH 6.0, for 16 min at 750 W [7,18] in a microwave oven (Whirlpool VIP34, Sweden). Blocking for endogenous peroxidase in 0.3% H 2 O 2 and preincubation in 10% normal rabbit serum, both diluted in PBS and incubated for 20 min. As link antibody a biotinylated rabbit anti mouse (DAKO) was applied, followed by a peroxidase labelled streptavidin biotin complex (DAKO), both diluted 1/200 and incubated for 30 min. The slides were developed in nickel sulphate (Merck, Darmstadt, Germany) enhanced DAB (Sigma, St. Louis, MO, USA) for 6 min [13], and counterstained in Lightgreen (Merck). Finally, the slides were dehydrated through graded alcohols to xylene and mounted in organic mounting medium. Reagents were diluted in 0.5% BSA-C (Aurion, Wageningen, The Netherlands) and incubations were performed at room temperature (unless otherwise stated above). Washings, for 3 × 10 min, between incubation steps were done in 0.05 M Tris, pH 7.6, containing 0.3 M NaCl and 0.1% Tween 20 ® .

Image acquisition
The specimens have very little colour, and thus could greyscale images very well be used. But we have found that the results improved slightly if we grabbed colour images and used the first principal component (PC) [14] image instead. The 756 × 572 pixels colour images with 3 × 256 greylevels were grabbed by a Sony DXC-151 colour video camera attached to a standard Olympus BH-10 optical microscope, using magnification 20×. This results in a pixel size of about 0.8 µm. The width of the thinnest microvessels is about 3 µm, and thus could 10× magnification just about suffice according to the Whittaker-Shannon sampling theorem [12]. Small studies have indicated that the proposed method could be used on 10× magnification, and thus cover a larger area. The method is also tested successfully using 40× magnification, but for this type of quantification 20× seems to be the best choice.
For all images Köhler illumination was maintained and the aperture iris diaphragm ring was fixed to 0.5.
Example of microvessel image is shown in Fig. 1.

Method
First the objects are segmented from the background according to the following scheme: 1. Perform a principal component transform [14] of the RGB data and use the first component image.
2. It is difficult to use the ordinary greylevel histogram to set a threshold level (see example in Fig. 2(a)). Therefore a more suitable type of histogram is used; average component size histogram. This histogram shows the average component size for each threshold level. The average component size histogram is computed by computing a cumulative greylevel histogram and a connected component histogram (see Appendix) and dividing each value in the cumulative greylevel histogram with the corresponding value in the connected component histogram (see example of average component size histogram in Fig. 2(b)). Both the cumulative greylevel histogram and the connected component histogram can be implemented as one pass over the image. Note that the histogram has been truncated at the greylevel corresponding to the peak of the greylevel histogram. This is done since the average component size will be very large at high thresholds.
3. Define two threshold levels, T 1 , T 2 , from this average component size histogram. T 1 = the local maximum before the peak of the greylevel histogram, where the peak of the greylevel histogram is computed for a histogram that is smoothed until the histogram has only one local maximum. The greylevel of this peak will correspond to the average background. T 2 = the local minimum after T 1 . 4. All pixels with value <T 1 are included in the object class, and all pixels with value >T 2 are included in the background class. 5. A pixel with value between T 1 and T 2 is included in the object class if the distance to its closest object pixel is less than the distance to its closest background pixel. This is done by computing the 3-4 chamfer distance [4] from the object and background pixels, respectively. At this stage it is possible to measure area and to compute the number of connected components. When computing the number of connected components objects with area <30 pixels are removed.
Fragmentation of the objects will sometimes cause overestimation of the number of connected components (see detail in Fig. 3(a)). Therefore the objects are connected, filled and separated using the following procedure: 1. Compute the 3-4 chamfer distance [4] from every background pixel to the object pixels. 2. On this distance image a watershed separation [19,21,24] is performed. The definition of a waist that should be separated is that the width is less than 30% of the width of the neighbouring local maxima. The result will be connected object structures (see detail in Fig. 3(b)). 3. The connection step might produce some undesired lines connecting the objects. These are opened by thresholding the image from 0 to T 2 and combining the resulting binary image with the connected structures using a pixelwise logical AND.
(a) (b) (c) (d) Fig. 3. The process from threshold to separated objects. 4. Fill the closed structures. This is done by making a connected component labelling of the background and only keeping the largest object as background (see detail in Fig. 3(c)). 5. Compute the 3-4 chamfer distance from every object pixel to the background pixels. 6. Separate touching object by a watershed separation on this distance image. Here the definition of a waist that should be separated is that the width is less than 50% of the width of the neighbouring local maxima. The final results will be filled separated structures, which can be counted (see detail in Fig. 3(d)).

Results
We have evaluated the proposed method by computing the correlation to manual object counting on a test set consisting of 54 images from 27 patients. Values from four different stages of the procedure are collected: • Area. The area after the initial segmentation.
• Number of objects (raw). The number of separate objects after removing objects with area <30 pixels. • Number of objects (filled). The number of separate objects after connecting and filling.
• Number of objects (separated). The number of separate objects after watershed separation.
As comparison we have also tried manual thresholding instead of automatic and computed the correlation to manual object counting. The results are shown in Table 1.
Another necessary criterion for a good quantification method is that it is robust and gives low variation. We have tested the robustness and variation of the proposed method by using images from the same field of view, captured with varying focus and illumination. For this test we have used a test set consisting of 7 fields of view. For each field of view 7 images were grabbed, differing in focus and illumination. Thus, in total we got 49 images. The difference in focus was approximately ±1.5 µm and the difference in illumination was about ±20 on a 0-255 scale. As measure of the variability of the results for a field of view, we have used the variation coefficient: v = σ/x, where is σ the standard deviation, and x is the mean for the 7 images from the same field of view, respectively. The variation coefficients for the proposed method are shown in Table 2.
The initial threshold method spends about seven seconds of CPU time per 752 × 572 image on a 233 MHz DEC Alpha 2000, and the following object separation procedure takes another eight seconds. No efforts were made to optimise for speed.
The present study describes a fully automatic quantification method of both the number of microvessels and the stained endothelial cell area. The method proved very stable to variations in light and focus settings and avoids the subjectively influenced training of a classifier.
Supervised training, in which reference points are chosen subjectively by the observer to define classes in the image, were shown to severely affect the reproducibility of the measurements. In a pilot study where stained endothelial cell area fraction was quantified in the same microscopic field at 5 different occasions by one observer, the results differed a factor 2 between some occasions (results not shown). Or when using interactive thresholding there are quite large differences between segmentations that could be accepted by the same operator at different times, or by different operators. An example of two such interactive segmentations of the same image by the same operator is shown in Fig. 4. The difference in area for these two images is about 50%.
Moreover, the correlation between automatic object-based quantification and manual counting of microvessels (r = 0.80) indicates a reliable estimation applying the automatic method. When using manual threshold followed by the proposed method for object counting, the correlation was slightly better, but the intra-operator variability using manual thresholding was significantly larger.
Another important aspect is the time factor. Using the described automatic quantification method, the operator time is dramatically reduced compared to manual counting or semi-automated techniques employing, e.g., manual training of a classifier. This is not only a matter of cost effectiveness but also a possibility to perform more extensive quantifications within a reasonable amount of time. There is also the possibility to store a large number of images and process them afterwards, e.g., during the night. If larger areas of the tumours are quantified a more general pattern of the microvessel distribution could be calculated as well as the "hot spot". The "hot spot" could also be determined in a more accurate way by separately studying focal tumour area fractions exhibiting the highest MVD within the general pattern.
Different image resolutions, using 10×, 20×, and 40× objective, were compared in a small number of cases, applying the automatic quantification method. The image resolution only marginally affected the accuracy of the quantifications of both the number of microvessels and the stained endothelial cell area fraction of the tissue (results not shown) suggesting a possibility to use the method in a "screening" manner.
The described automatic quantification method does not eliminate all the sources of errors and variations involved in MVD quantification, but could be a help in compensating for some. The minimised influence of observer subjectivity and the rapid performance makes the method reproducible and suitable for more extensive quantifications which, in turn, can decrease the impact of selectivity regarding the choice of where the MVD quantification should be applied in the tumours. As a consequence of the results described here, the automatic quantification method will be applied in an ongoing study comprising 73 bladder carcinomas, stages T1 and T2, selected for cystectomy, where the prognostic value of MVD will be assessed.

Appendix. Pseudo-code for connected component histogram
A connected component histogram is the number of connected objects in the image for each threshold level.
When computing the connected component histogram, an array, Changes, of linked lists is used; one list for each greylevel. Each node in the list tells at which greylevel a new label appears and at which greylevel this label gets connected to another label. For each label we are also keeping track of to which other label it changes, and at which greylevel. After the image is scanned, these lists are traversed and the histogram is computed.
Input: image f . Output: histogram h.