Evaluation and Validation of a Semiautomatic Image Analysis Method : Quantification of Nares Cell Proliferation of Rhinella arenarum ( Anura ) Detected with the Avidin-Biotin Complex / Peroxidase , DAB

The current report presents the development and application of a novel methodological approach for computer-based methods of processing and analysis of proliferative tissues labeled by ABC-peroxidase method using 3, 3-diaminobenzidine (DAB) as chromogen. This semiautomatic method is proposed to replace the classical manual approach, widely accepted as gold standard. Our method is based on a visual analysis of the microscopy image features from which a computational model is built to generate synthetic images which are used to evaluate and validate the methods of image processing and analysis. The evaluation allows knowing whether the computational methods applied are affected by the change of the image characteristics. Validation allows determining the method’s reliability and analyzing the concordance between the proposed method and a gold standard one. Additional strongness of this new approach is that itmay be a framework adaptable to other studiesmade on any kind ofmicroscopy.


Introduction
Pure visual image analysis of immunocytochemical reactions is very prone to subjective bias, since it is a comparative process that can be easily influenced by the presence of background information, or other image artifacts.The way to prevent the inherent subjectivities is to measure and express the results quantitatively.Quantification avoids the anecdotal analysis by means of a general model of the data representation [1,2].This means that it is possible to diminish both the bias and error, based on a purely visual classification obtaining, from a set of objects already classified by one or more expert users, statistics parameter estimators.These parameters could represent some of the features of the objects to be classified.Using these estimators and their variation ranges, it is possible to apply the visual method and following, to perform the classification according to the statistic information previously analyzed.These are currently named manual methods.
Since earlier times, there were numerous efforts to automate quantitative image analysis that have actually been used, as was previously showed by Bengtsson [3] and Swedlow et al. [4,5].Automation allows processing large volumes of information, making possible analysis like those conducted in high content screening [6].Additionally, the subjectivity of the visual analysis is avoided, since the applied criterion is always the same, and it also avoids the error introduced by eyestrain.Automatic image analysis is faster than any manual method of analysis.However, automation in microscopy image analysis is not absolute, because it depends on the automatism settings to images, which present an inherent strong variability.That is why it is preferred to call this process semiautomation.
In research and diagnosis, new methods of image analysis are often proposed as alternatives or replacements of universally accepted methods or gold standards.This change in the procedures involves both evaluation and validation steps in order to determine whether the results produced by a new method is at least comparable to those obtained with gold standards, which have been accepted as valid and reliable [7,8].However, to ensure that the proposed analysis method is reliable, it should be evaluated and validated with a known object.In microscopy, as well as other imaging techniques, these known objects (or phantoms) are often built from models and can be degraded and distorted as it commonly happens in real acquisition, yielding a synthetic (or simulated) image.These models allow deeply studying the performance of a particular analysis method and estimating the introduced errors by the successive transformations (image capture, processing, and analysis).They also reveal the limitations of the image processing and analysis methods, testing them for (controlled) extreme values of their parameters.This process represents a more objective replacement methodology, because it takes into account the evaluation and validation steps that classically have been used in the replacement of a method of (image) analysis, but additionally, we propose to apply this approach to synthetic images which model real situations [9].
In the present work, we show the outcome obtained using this methodology to replace a visual classic procedure of image analysis by a semiautomatic one, which is applied to the identification of DAB labeled nuclei for the cellular proliferation study in olfactory epithelium of Rhinella arenarum [10].This South American toad has been historically used in Argentinean biological research, such as physiology, histology, and embryology, since earlier studies of the first Argentinean Nobel Price laureate Bernardo Houssay [11].These amphibians show metamorphic changes and migratory behavior, starting approximately at stage 42 and ending at stage 46, when the toad passes from an aquatic to a terrestrial environment.During this period, the olfactory system is deeply remodeled, in part, as a response to the new habitat [12], making this system an interesting biological model of neuronal cell proliferation [13][14][15].
The olfactory epithelium of Rhinella arenarum, DAB immunostained using anti-PCNA monoclonal antibodies, exhibited under microscope three main structures: immunopositive and immunonegative nucleus and also an opaque pigment layer.The biological goal points out to quantify immunopositive nuclei corresponding to proliferating cells; therefore image analysis was focused on the counting of those nuclear bodies as measure of growing and remodeling of the olfactory epithelia and it was also important for the nuclear areas morphometric calculation in order to establish if there exist differences of the proliferative activity both at the beginning and at the end of metamorphosis.The main difficulty in purely visual object counting is the accurate classification of nuclear bodies of proliferating cells from those belonging to nonproliferating ones, because of their proximity and sometimes their staining similarities.
To reach these objectives, a first visual and statistical analysis was conducted in order to find parameters for object characterization in the images.This allowed establishing mean values and variation range of the parameters that define each object group.It also allowed the design and development of an objective method of manual and semiautomatic counting.Finally, those parameters were used to build object models of the real images, allowing so, to develop synthetic image generation software.The set of synthetic images were generated to represent several acquisition situations.In the following step, both image analysis methods, manual and semiautomatic, were applied to simulated images.After measuring, the quantitative results were statistically evaluated and validated.These steps allowed estimating the error between the measurements and phantom actual data.After that operation, both methods were applied to real images where a validation step of the new semiautomatic method versus the manual (gold standard) was used.

Materials and Methods
The following scheme summarizes the proceedings conducted for the completion of this work (Figure 1).

Biological Sample
Processing.Rhinella arenarum larvae were obtained by in vitro fertilization using the methods described by Casco et al. [16].Five specimens of stages 42 and 46 [17] were selected and fixed in Carnoy solution, washed in PBS, dehydrated, and cleared.In the following, samples were preincluded in paraffin and 8 m-thick transversal sections (lengthwise direction) were obtained using a Reichert Jung Hn 40 microtome.These sections were mounted on gelatincoated slides and dried at room temperature.The sections were dewaxed and cleared with xylene and rehydrated in ethanol-PBS series.The permeabilization was made with Triton X-100.The nonspecific immunoreaction was blocked by normal goat serum.Subsequently, the sections were incubated with anti-PCNA 1 : 50 (mouse monoclonal antibody, IgG2a, clone PC 10, NOVO Castro) overnight at 4 ∘ C in a moist chamber.After the incubation, sections were placed at room temperature and washed with PBS and then treated with secondary antibody IgG (H + L) conjugated goat antimouse biotin 1 : 100 (Vector Laboratories Inc.).After that, sections were washed with PBS and incubated with H 2 O 2 0.3% in methanol and then were washed once again and incubated with PBS.The samples were subsequently incubated in ABC reagent (Vectastain Universal Elite ABC Kit, Vector Laboratories Inc.) for 30 minutes.Finally, the reaction was revealed using 3, 3  -diaminobenzidine as chromogen and H 2 O 2 as substrate for peroxidase.The reactions developments were controlled under microscope.The sections were dehydrated in ethanol series, xylene cleared, and mounted with Canada's natural balsam (Biopack).All reactions were performed by duplicate.The negative control reactions were performed by incubating the sections with PBS, omitting the primary antibody.
Nostrils of five larvae (stages 42 and 46) were sectioned along dorsal-ventral direction.Three zones were defined in the organ in order to evaluate the homogeneity of the process: dorsal, medium, and ventral.Two slices were collected by zone, which results in six slices by specimen and a total of sixty slices for imaging.The left olfactory epithelium sections were photographed in lateral-flow direction.
Samples were carefully positioned always in the same way, by reshaping the entire paraffin block as trapezoid, to avoid the random errors of non-systematic sectioning.Sections were also mounted on slides following the same orientation.

Imaging.
The reactions were recorded with an Olympus PM20 camera, coupled to a BX50 Olympus microscope (objective 40x, NA 0.85) via 3.3x couple lens.Fuji Color film (Quality II) was used (35 mm, 100 loops).The negatives were digitized always preserving the same scanning conditions.In all cases, the  resolution of the images was 1909 × 1273 pixels square.The images were stored in TIF format, 24-bit RGB (red, green, and blue) color.The spatial calibration was made with an object of known dimensions.This procedure was performed in quintuplicate and calculated both mean and standard deviation.

Preliminary Analysis.
A first visual and statistical analysis was made on five randomly selected immunocytochemistry images from the total of sixty images acquired.On these images the following elements were identified (see Figure 2):  For each region and object of interest the following features were considered: mean and standard deviation of RGB intensities and areas of the N+ and N− objects.The set of RGB intensity means is a sample of the distribution of the parameter that would explain the amount of PCNA present, variability in the treatment of the tissue, and/or exposure time to DAB.The set of RGB intensity standard deviations is a sample of the distribution of the parameter that would be associated with acquisition noise between others.Additionally, sample mean and standard deviation of the areas of the N+ were considered because they are representative parameters of the size of the elements and are a useful filter in the (manual and automatic) identification procedure.Finally, sample mean and standard deviation of the number of N+ and N− were considered because they are quantitative parameters that indicate objectively the cell proliferation level.
Histograms of the  and  positions of the N+, N− and  identified showed that they could be modeled by known distributions.All elements were uniformly distributed along  direction, which were coincident with the lateral view of the nares and follow an approximated Gaussian distribution along the  axis which were coincident with the basoluminal direction (see Figure 2).The statistical analysis of data extracted over the identified regions was made in order to obtain the parameter's estimator and their ranges of variation.The areas of interest (AOIs) interior to the mentioned regions were rectangular areas of 553.30 m 2 and 97.96 m 2 for  and P and circular areas of 15.02 m 2 and 18.19 m 2 for N+ and N−, respectively.The XY positions of the centers of mass of each object, obtained as total intensity divided by the object area, were stored for subsecuent statistical analysis.
The fit of each data group extracted from images to their respective distribution allows setting the range of variation of each parameter in the model.Therefore, when synthetics images were generated each parameter value of each element on the image was a sample obtained from the corresponding distribution.Finally, with a statistical analysis of the intensities on line profiles over the identified elements it was possible to reduce information to intensity characteristic parameters and noise associated.Table 1 shows how these parameters were named and their respective noise.Table 2 shows a synopsis of the measurements.

Manual Method.
The procedure was carried out by visual analysis on digital images where ROIs were selected based on RGB intensity levels of the pixels.The criterion of selection was based on the range of the RGB intensities previously obtained.The selected region was bounded by a line located on the perimeter's object.All identified objects and their corresponding metadata were saved on spreadsheets for further analysis.3.4.Semiautomatic Method.RGB images were converted to HSI (hue, saturation, and intensity) color space; because it has advantages for distinguishing features on the segmentation process [18].Additionally, the use of HSI color space could overcome some difficulties that arise when histological slices contain more than one chromogen of pigments [19,20] as in the case of the olfactory tissue of Rhinella arenarum.Therefore, images were segmented by thresholding the HSI components.The levels of the threshold were selected by visual and statistical analysis of those specific signals that allow isolating the N+ objects from the remaining elements.The parameter , presented similar values in N+ and N−; and because of that, it was not used to characterize the image.Thus, segmentation by threshold levels was made on S and I images, yielding two binary images.They presented unwanted information randomly distributed, but relevant information was spatially coincident in both images.Then, by using the logical AND operation the objects of interest were isolated more specifically.An automatic counting tool was used count/size of Image-Pro Plus 5.1, where the filter of area dimension was used to limit the counting.Then, the watershed split algorithm for an automatic separation of overlapping objects was used.The resulting images were contrasted with the raw images and also were analyzed to be sure that the objects of interest were identified.Finally, data obtained from the application of semiautomatic method were saved on spreadsheets for further analysis.

Method's Evaluation.
The semiautomatic and manual methods were evaluated in order to analyze their stability against changes of the previously selected image features.This stage was done on simulated images taking advantage that each image feature can be controlled separately.In these images the number of N+ in phantoms and their respective areas were determined using both methods.As mentioned in the Preliminary Analysis, the selected parameter values to be analyzed were obtained by image analysis of the immunohistochemistry pictures features.In most cases, the selection of parameter values to test was random while others were kept fixed (see Table 3); and for each value test, three images were synthetized.
The results of the counting of N+ objects were expressed as median ± standard deviation, and an analysis of variance was applied on them.Kolmogorov-Smirnov's test was used to determine the normality of the data and Levene's test was carried out to analyze the variances homogeneity.The quantification of the areas was analyzed using histograms, which were constructed for each condition analyzed and for both methods.
3.6.Method's Validation.Data obtained from manual and semiautomatic analyses made on synthetic images were compared, where the number of elements representing N+ and their respective areas were quantified using a hit rate relationship.To match the detected elements in manual method with semiautomatic method each one was recognized as being the minimum Euclidean distance regarding from the phantom data.A -test of paired samples was made in order to determine the methods accuracy in the area quantification face to different conditions.Fifteen N+ objects per condition for five replications were compared, measurements versus actual values.In order to establish what percentage of the actual area accounted for these differences, we calculated the percentage error.Using the graphic method, the relationship between the measurements magnitude and differences with the actual values were discarded [21].

Validation on Immunohistochemical
Images.After the validation on phantoms images, the number of N+ objects and their areas were quantified by both methods in IHC ones.The results were compared using the manual method as the gold standard.The success rate for counting was determined by N+ counting with the semiautomatic method over the manual and applying a -test for paired samples to the areas measured by both methods.For statistical analysis we used the ANOVA test, using Kolmogorov-Smirnov's test to analyze the normality of the data distribution and using Levene's homogeneity of variance.

Computational Model: Quantification Method Design.
Because the image element positions were randomly defined, an overlapping between them was verified.These phenomena were between object-object and object-nonobject, where the overlapping area depicted RGB levels significantly different from the rest of objects.To determine the uncertainty introduced by these phenomena, the overlapping as a success was quantified.For this operation an overlapping index was defined as is showed in the next equation: ∘ Total Obj × 100. ( Overlapping index was determined in twenty five images for both phantoms and IHC images.The outcome of that operation was statistically analyzed.
The types of overlap considered were those between N+ and N+, N+ and N−, and N− and P, because they could influence the results of analysis.In all cases, the overlapped area presented RGB levels significantly different from the rest of the overlapping objects.The overlap was quantified as an event, in order to determine the uncertainty introduced by the phenomena.We used an overlap index which account as percentage by summing all overlapping over the total number of N+.This was statistically analyzed in twenty-five images.The results of overlap show 7.5% in mean term.
This was an important result because the area of overlap showed objects with two intensity zones clearly differentiated.In this sense, the quantification of the overlap allowed knowing how much of the methods uncertainty could be attributable to overlap (Figure 3).

Method's Evaluation:
Counting.Manual and semiautomatic methods were evaluated facing to change on the parameters of the phantoms as shown in Table 3.Under these conditions, there were no significant differences on counting with both manual and semiautomatic methods.Table 4 shows the percentage of number of N+ objects in phantoms.Standard deviations in Table 4 evidence that the counting procedure was stable because these values did not overpass the overlap index.This allows thinking that the small differences obtained in the number of N+ counted would not be related to the counting method, but the overlapping of the elements in phantoms.

Areas.
The performance of area quantification of both manual and semiautomatic methods was evaluated changing the parameters showed in Table 3.The analysis for each case was carried out on histograms, where it is possible to note shifting and spreading of the measurements.Shortly, only evaluation of the changes in levels of RGB N+ and  + is showed (Figures 4 and 5).
In Figure 6, it can be noted that when the overall change in the signal noise has high values ( + = 40 and 55), there is a shift of the whole histogram to higher values of areas.The counterpart results for the semiautomatic method are shown in Figure 7. Unlike the manual method, semiautomatic method seems to be unaffected by the change of the  + .The quantified values of areas in the phantoms generated with  + = 55 appear to be slightly lower than those measured in other conditions.
In Figures 6 and 7, other aspects of the evaluation of the area quantification are presented.In this case, the results when both methods were applied to phantoms when intensities levels of the N+ objects were changed are shown.In this case it is noticeable that histograms maintain its features except at high values of  + , where it is appreciable that they suffer shifting indicating that quantification of the area is not stable to those values or analysis of  + .This set of histograms also showed that there was a difference in the mean value of the areas between manual and semiautomatic  method.This uncertainty was quantified in the validations of the methods using phantoms.

Method's Validation.
The logic of the methodology proposed brings out to the validation of the methods by a two-step way.First, validation aims at analyzing the accuracy of the methods when quantifying the number of N+ objects on phantoms and their respective area, by determining the hit rate percentage just shown in     of pair samples over area quantification results (Table 5), respectively.Results showed, for most of the conditions evaluated, that the success in counting N+ objects with both methods is at least 95%.Differences with a negative tendency were found between actual areas and the quantified by both methods in most tested conditions.These differences did not exceed 5.5% of the full area quantified.Due to this, the values of areas had significant differences with respect to the actual values when they were statistically analyzed.In order to solve this, a correction term was applied, which was calculated as the median of the differences between the values obtained by the methods and the actual values.It was determined one for each condition evaluated for each method.We analyzed the differences between corrected and actual values using the -test for paired samples; these area values were applied correction factor calculated for each of the conditions.The corrected area values do not differ statistically with respect to the actual values of areas.The mean differences between corrected and actual areas become equal to zero.These results are summarized in Table 5.
Another validation step was made with the goal to replace the manual method by the semiautomatic one.This was carried out on real immunohistochemistry images obtained The number of immunopositive nuclei obtained with the manual and the semiautomatic method were analyzed by applying a -test for paired samples.There were no statistically significant differences between results obtained with manual and semiautomatic methods.Finally, a statistic analysis over the quantified areas was made in order to test whether measurements with each method had correspondence.For this, fifteen N+ of each image of each stage were randomly selected and compared the area values obtained with each method.Then a -test for paired samples was applied, and it was found that there was no statistically significant difference (Table 7).
We analyzed the results of counting of nuclei in proliferation with semiautomatic method in each of the stages by applying the ANOVA test, differences being highly statistically significant ( = 0.0038) between levels of cell proliferation stage 42 versus 46.

Discussion
The olfactory epithelium of amphibians, and other vertebrates, is the lining of the lumen of the main olfactory cavity.It presents a pseudo-stratified structure and is basically composed of three cell types: supporting cells, olfactory neurons, and basal cells (stem cells).Columnar support cells are localized from the basal lamina, to the epithelial surface, where most of their nuclei are located.In some species, depending on the degree of development, support ciliated and secretory cells can be observed [22].Olfactory neurons are the second nuclei layer, located approximately in the middle of the epithelium, some of which may also be ciliated or present microvilli [22].These are bipolar cells that have a unique thin dendrite, which reaches the outer surface of the epithelium, where their ends form a mosaic with external terminals supporting cells.Their axons extend beneath the epithelium among the basal cell and internal terminals of supporting cells, to penetrate the basement membrane to reach the brain where they synapse with dendrites of the mitral cells in the glomeruli of the olfactory bulb [23].
Like other intraepithelial neural elements, the axons of the olfactory neurons are intertwined with Schwann cells in the lamina propria, which are inserted into the basal lamina.The third and deepest layer of nuclei corresponds to the basal cells whose apical ends do not seem to reach the outer surface [24].At present, two types of basal cell have been described: basal cells themselves (horizontal basal cells) and goblet cells [22,25].Basal cells allow a constant cell turnover, due to its ability to autoreplace and lead to various differentiated cell types, keeping this feature in adulthood [23,[26][27][28][29][30][31].This also happens in other systems able to renew and regenerate the epithelium, as the intestinal epithelium [32], the epidermis [33], and the hematopoietic system [34], being in the olfactory system one of the first organs where neuronal turnover was detected [30].The properties of neurogenesis and regeneration of sensory neurons from basal cells are also applicable to the vomeronasal olfactory system [35].The cell turnover in the olfactory epithelium allows repairing the damage caused by contact with the environment agents [30].These features make the olfactory epithelium and excellent biological model for quantification of cell proliferation based on a classical immunohistochemical approach [36].However, this method due to its experimental rigor can be successfully applied to other biological systems.
Regarding the methods of morphological quantification, an initial point in consideration is the well-known assertion that when an universally accepted method of quantitative analysis has to be replace by other, it not only must be guaranteed to obtain comparable results, but must also be clearly shown that the new method has more advantages.In order to contribute to this line, in this work we have adapted and applied the new methodology to replace a manual method of image analysis, commonly used for immunohistochemical images quantification revealed with DAB, by a semiautomatic method.The new proposed method has the advantages of being more objective, because it is less prone to systematic errors.It is also faster than the manual one, since the hard work of recognition is carried out by a computer.
Other important and novel aspects of this approach were the use of a computer model for synthetic image generation.They allow testing image analysis methods where images characteristics can be finely controlled.At this point, it is important to mention that in the revised literature the use of computer models as an additional evaluation/validation phase when a method of analysis is going to be replaced by other has not been found.In most cases, the results obtained with the new method are only validated with the gold standard method.However, according to this new methodology, the importance of these two steps is at the same level.While on the one hand, it is important to compare results of the new method against the well accepted method, little can be said about the accuracy of the measurements.Thus, it is only possible to establish that the new proposed method gives results only comparable in terms of the images which were validated.And, by the other hand, if the methods are only tested on synthetic images nothing can be said on how the new method is going to work on real images.
A phantom gives a way of testing method for quantitative image analysis knowing how actually objects of interest are before being imaged.The model that produces these images could include information about the objects characteristics as well as any other relevant data transformation to be analyzed.Convolution with an optical transfer function, for instance, could be part of the model if degradations introduced by the acquisitions system are considered.This is a very important consideration in microscopical quantitative image analysis and it is going to be a short-term research of our group in the methodological line proposed in this work.As it is very well known, microscopes introduce degradations that change those characteristics of the object being imaged.In certain modes of microscopy, such as fluorescence microscopy, these degradations can be modeled theoretically or obtained experimentally.In such cases, phantoms would allow studying image characteristics and how they turn away from the real object.

Figure 1 :
Figure1: Outline of the methodology of replacement of image analysis methods.The main steps of this process correspond to image acquisition and analysis by manual and semiautomatic methods, where black lines indicate the classic process (yellow line corresponds to the gold standard) and red ones the proposed new steps with computational models as an additional tool.
(a) background (B), region of the image that is not covered by tissue, it presented a uniform color with relatively high levels of RGB; (b) a basal region of the tissue where the immunopositive nuclei (N+) are located for PCNA exhibiting intense brown color and whose most intense component was the red channel.(c) A region with nonproliferative cells, depicting immunonegative nuclei for PCNA (N−).(d) The pigmented region (P) was located below the basal layer and sometimes can be observed forming small clusters spread without any special arrangement throughout the epithelia.

3. 2 .
Computational Model.The computational model for synthetic image production was implemented in script-based calculus software, with the following inputs: number of N+, N−, and  objects and level and amplitude of the rapid variations of the RGB intensities of the N+, N−, B, and .As previously mentioned, the  positions of the N+, N−, and  objects were generated randomly from a normal distribution; and the  positions of these objects were randomly generated from a uniform distribution.Finally, the output of the program consisted of a synthetic image (phantom) and an additional data file containing mean levels of RGB intensities, positions -, and area for each N+ in the image (Figure3).

Figure 4 :
Figure4: Histograms of the areas of N+ objects when varying levels of RGB N+ when measurements are carried out using the manual method.

Figure 6 :
Figure6: Changes on histograms of the areas of N+ objects when varying levels of signal noise ( + ) when measurements were carried out using the manual method.

Figure 7 :
Figure7: Histograms of the areas of N+ objects when varying levels of signal noise ( + ) when measurements were carried out using the semiautomatic method.

Table 1 :
Main parameters drawing how they were found and named both in real immunohistochemistry images and their phantoms counterpart.

Table 2 :
Summarizing the measurements obtained from the images of the olfactory epithelium of Rhinella arenarum.

Table 3 :
Values of the parameters modeled in phantoms used in the evaluations and validations of the methods.

Table 4 :
Mean ± SD of the number of counted N+ objects with both methods under different conditions. values of the statistic test made are also showed.

Table 4
Figure 5: Histograms of the areas of N+ objects when varying levels of RGB N+ when measurements are carried out using the semiautomatic method.
and by a -test

Table 5 :
Average differences between actual and measured area values obtained with both methods of quantification.

Table 6 :
Average number of nuclei immunopositive area values and half obtained from the quantification of images of the Rhinella arenarum immunolabeled nares both with the manual and with semiautomatic methods.Indexes success and value of p obtained from comparing the number of immunopositive nuclei obtained with manual and semiautomatic method.

Table 6 .
In this table, we may see that the success rate in each one of the stages is close to 100%.

Table 7 :
Comparison of the values of areas of immunopositive nuclei obtained with manual and semiautomatic methods.