The Use of Fractal Features from the Periphery of Cell Nuclei as a Classification Tool

A polygonization‐based method is used to estimate the fractal dimension and several new scalar lacunarity features from digitized transmission electron micrographs (TEM) of mouse liver cell nuclei. The fractal features have been estimated in different segments of 1D curves obtained by scanning the 2D cell nuclei in a spiral‐like fashion called “peel‐off scanning”. This is a venue to separate estimates of fractal features in the center and periphery of a cell nucleus. Our aim was to see if a small set of fractal features could discriminate between samples from normal liver, hyperplastic nodules and hepatocellular carcinomas. The Bhattacharyya distance was used to evaluate the features. Bayesian classification with pooled covariance matrix and equal prior probabilities was used as the rule for classification. Several single fractal features estimated from the periphery of the cell nuclei discriminated samples from the hyperplastic nodules and hepatocellular carcinomas from normal ones. The outer 25–30% of the cell nuclei contained important texture information about the differences between the classes. The polygonization‐based method was also used as an analysis tool to relate the differences between the classes to differences in the chromatin structure.


Introduction
There is an increasing interest in using fractals in biology and medicine, particularly to describe single cells [10,15,32,39,47,53]. However, most studies of fractal features of cells are considering fractal descriptions of the curve outlining the cell contour in binary images, and are almost exclusively using only the fractal dimension [4,29,33,37]. Only a few studies [6,17,18,25,35] have utilized fractal geometry to characterize the chromatin structure in gray level images of the cell nuclei. The fractal approach may be helpful in detecting ultrastructural changes of nuclear components which occur in cell tissues during pathologic processes [34].
It is known that carcinogenesis can involve alterations of normal gene regulation [12], resulting not only in an increased amount but also structural differences in the chromatin [31,48,57]. Danielsen et al. [12] noted several variables that discriminated between liver cell nuclei from normal liver and hepatocellular carcinomas. Second order and higher order statistical texture features and texture features based on the Laws convolution matrices have earlier been extracted from the same liver cell material used in the present study [2,3,56], where mean texture features extracted from the whole 2D cell nucleus were used to classify samples as normal, regenerating, hyperplastic nodules and hepatocellular carcinomas. The results demonstrated the possible use of digital texture analysis as a diagnostic aid in tumour pathology.
The aim of the present study was to extract separate estimates of fractal features in the center and periphery of the cell nuclei and to select a small set of such features for classification. Of special interest for us was a quantification of the chromatin structure close to the nuclear membrane.
As a venue to separate estimates of fractal features in the center and periphery of a cell nucleus, we have transformed 2D gray level images of liver cell nuclei into 1D gray level signals by scanning the cell nuclei in a spiral-like fashion called "peel-off scanning". A 1D polygonization-based method [1] was then used to estimate fractal texture features in different segments of the 1D gray level curves.
An estimate of the fractal dimension of a 2D gray level surface is usually obtained by counting the number of boxes of different sizes that are needed to cover the surface. A number of refinements of this 2D scheme exist [19,28,44,50]. The successive "blanket" method [40] also gives good approximation to the fractal dimension of 2D surfaces [1].
For 1D signals such as those obtained by our scanning procedure, several algorithms are applicable to the problem of measuring the length of an irregular curve. Mandelbrot [36] suggested walking a yardstick of length along the curve. This constant-length divider-step method is not even theoretically exact for curves without self-similarity, and yields poor results in practice if the curve contains only a small number of points [38]. Normant and Tricot [38] proposed a constant-deviation variable-step method, using local convex hulls. However, this method involves a complicated algorithm, resulting in long computation times. We therefore proposed a fast and reliable polygonization-based method for estimation of fractal dimension [1] which we have applied in the present study.
Mandelbrot [36] also introduced lacunarity, Λ, to quantify texture. The lacunarity definition of Voss [50] has the form of a set of variance measures, computed from a set of distributions, each of which is obtained for a certain value of the box size. Keller et al. [28] introduced another measure of lacunarity, also based on a function obtained for different values of the box size. The significance of the lacunarity vector is that it describes the deviation from homogeneity in the texture. We suggested two new lacunarity definitions [1] which we have used in this study, and we now also introduce a small, compact set of scalar lacunarity features that represent these lacunarity vectors.
We have applied the polygonization-based method to digitized transmission electron micrographs (TEM) of mouse liver cell nuclei from normal and regenerating liver, hyperplastic nodules and hepatocellular carcinomas [14]. For comparison, we have estimated fractal dimension by a 1D " -blanket" method [40]. We have also computed Gray Level Coocurrence Matrix (GLCM) [24] and Gray Level Run Length Matrix (GLRLM) [21] features from the different segments of the 1D gray level curves.
Using fractal dimension and lacunarity, we found that the outer parts of the liver cell nuclei contained im-portant textural information that enabled us to differentiate normal and regenerating samples from noduli and tumour samples.
The polygonization-based method was also used as an analysis tool to give qualitative descriptions of the difference in chromatin texture between the classes.

Care and treatment of animals
C3H/HeJ mice were obtained from Bomholt gård, Ry, Denmark. They were housed in plastic cages on hardwood shavings and fed a commercial diet (R3, Ewos, Sweden) and tap water ad libitum. Except for controls, each mouse was given a single i.p. injection of 0.5 µmol DEN (diethylnitrosamine, Sigma Chemicals Co., St Louis, MO) dissolved in 0.9% NaCl/g body weight within 24 h of birth. The controls were given 0.9% NaCl only. At three weeks of age the mice were weaned, segregated by sex, and male mice were kept and housed in groups of five. Unless otherwise indicated by procedure, the mice were killed by cervical dislocation.

Preparation of liver specimens
The hepatocyte nuclei of normal liver, regenerating liver, nodules and hepatocellular carcinomas were studied [14]. Four to five-month-old mice were used in two different control groups. The first group was a normal (non-proliferating) control group and consisted of liver samples from five animals. The second group contained normal, but proliferating cells, and included liver samples from five animals killed 48 h after a 2/3 partial hepatectomy. Specimens were taken from each of the right lobes from each animal. The third group consisted of a total of 15 nodules (0.5-2 mm in diameter) isolated by liver perfusion of 15 mice, 5 months after the administration of DEN. The perfusion procedure has been described elsewhere [13]. The nodules were isolated from the cell suspensions by filtering through a 250 µm nylon mesh filter. For analysis of hepatocellular carcinomas (fourth group), 15 mice were killed 1 year after the administration of DEN. One well defined tumour was taken randomly from each animal, of which four were excluded from the study by a pathologist, due to a large number of necrotic cells.

Preparation of liver sections and electron
micrographs All specimens were cut into blocks less than 1 mm in size and fixed with 2% glutaraldehyde in 0.1 M cacodylate buffer with 0.1 M sucrose, pH 7.4 at 4 • C for 48 hours. They were rinsed overnight in 0.1 M cacodylate buffer with 0.2 M sucrose and post-fixed with 1% osmium tetroxide in 0.1 M cacodylate buffer with 0.1 M sucrose pH 7.4, at 20 • C. Thereafter the material was dehydrated in graded ethanol, rinsed three times in propylenoxide and embedded in Epon. Ultrathin sections (approximate thickness = 60−90 nm) were stained first with uranylacetate and then lead citrate, and studied at a magnification of 2500 in a JEOL EX1200 transmission electron microscope at 60 kV with a 20 µm aperture.
The cell images were recorded on Kodak 4489 EM photographic film and were examined using a Kontron (Germany) Image Processing System (IBAS), in which a Sony CCD video camera (XC-77CE, Japan) was used to capture the positive electron micrographs (with total magnification of 7500). The images of cell nuclei were selected at random from a larger set of cell images from each sample. Each nuclear image was stored in 512 × 512 × 8 bits, and the pixel resolution was 39 nm per pixel on the cell specimen.

Experimental design
In the first experiment the data was divided into a training set consisting of 5 normal, 5 regenerating, 5 noduli and 5 tumour samples and an independent test set consisting of 10 noduli and 6 tumour samples. This is the same experimental design as was used earlier [2,3,56].
Because of the small number of samples we also performed a second experiment where all samples were utilized in the feature evaluation and the error estimation.
Because the main aim was to identify samples from hyperplastic nodules and hepatocellular carcinomas, we have considered normal and regenerating samples as one (normal+regenerating) class in the statistical analysis. However, we have also checked if it is reasonable to pool these two classes for each of the features chosen. Thus we define the following classes: normal+regenerating, noduli and tumour. The value of each texture feature used here to classify each sample (lesion) was the mean value of about 100 cell nuclei from each sample.

Preprocessing
The image data set available for the present study had been pre-processed by a 3 × 3 median filter [42]. This was done in order to reduce possible noise without too much unwanted altering of the local texture [56]. For the present application, the region of interest includes the cell nucleus only. This was obtained by manual segmentation. After segmentation, the histograms of all images were normalized to the same mean value (127.5) and standard deviation (50.0). The gray level value 255 was used as background.

Spiral scanning
An 8-neighbour backtracking bug follower [42] has been used as a spiral scanning algorithm. Starting with the segmented cell nucleus, we follow the (outer) contour of the nucleus, and spiral inwards as we peel off layer by layer of the nucleus, forming a 1D gray level signal. In this type of spiral scanning, which we call "peel-off scanning", the resulting 1D gray level curve only reflects the size and contrast of structures inside the nucleus, not the morphology of the nuclear membrane.
The 1D gray level signal resulting from the "peeloff scanning" of each TEM image was subdivided into 10 segments, each representing approximately 10% of the total area of the nucleus (see Fig. 1). Each segment contained an average of 2468 pixels. Two features of the 1D signal were immediately visible for most of the cells: (1) The dark outer 10-15% of the area close to the nuclear membrane, corresponding to the accumulation of dark chromatin and (2) the repetitive pattern in the 1D signal, corresponding to one circumference of the remaining area of the cell nucleus in the "peel-off scanning" process. Figure 2 shows examples of 2D cell nuclei from the three classes, and the resulting 1D signals corresponding to the second peripheral segments. The second peripheral segments correspond to about five pixel layers of the nuclei in the "peel-off scanning". The intensely stained particles (dark pixels) represent heterochromatin and the lighter pixels represent euchromatin. To illustrate the differences in texture, the 1D signal is shown for cell nuclei chosen from the normal, noduli and tumour classes.  (1,3,5,7,9) and (c) the resulting 1D gray level signal.

Fractal dimension by the polygonization-based method
Our method for estimation of fractal dimension of 1D curves [1] is based on the polygonization method of Wall and Danielsson [52]. Wall-Danielsson's method steps from point to point through an ordered sequence of points (x i , f i ), and outputs the previous point as a new breakpoint if the area deviation A i per unit length of the approximating line segment s i exceeds a prespecified tolerance, T WD (see Fig. 3).
If |A i |/s i < T WD , i is incremented and (A i , s i ) is recomputed. Otherwise, the previous point is a new breakpoint and the previous value of s i is stored. This method is purely sequential and very fast. We approximate the 1D gray level signal obtained by the "peel-off scanning" process by polygonization with several values of the tolerance, T WD . For each tolerance value the total length of the line segments that approximate the curve is summed up by Here, the set of tolerance values is computed from a Fibonacci sequence T WD (k) = T WD (0)F ib(k + 1), where T WD (0) = 0.25, and F ib(1) = 1, F ib(2) = 2, F ib(n) = F ib(n − 1) + F ib(n − 2). The Fibonacci sequence is used in order to obtain more points within a certain {log(T WD ), log(S TWD )}-range, and thereby a more reliable value of the estimated fractal dimension, as compared to ordinary doubling of tolerance values. For a true fractal the slope of the log(S TWD ) versus log(T WD ) graph is independent of the scale parameter T WD . For real world physical objects, we merely expect it to be approximately constant within some range of scale. Finding this range of scale is a tricky problem for any fractal dimension estimation method, and often leads to visual inspection of the {log(T WD ), log(S TWD )} plot.
Using the variable length polygonization concept, the upper and lower limits of the useful range of T WD values may be found automatically, without visual inspection [1]. We simply assume that there is an upper and a lower limit to the number of line segments, l, in a useful approximation of a curve consisting of M points. After some testing on the liver cell images, we found that a lower limit of l = (M − 1) × 0.01 and an upper limit of l = (M − 1) × 0.15 will give {log(T WD ), log(S TWD )}-values that approximate a straight line. Thus, the minimum average support of the line segments is between seven and eight points, and the minimum number of line segments is 1% of the number of points.
Given a set of n remaining points in the {log(T WD ), log(S TWD )}-domain, we find the coefficients of a leastsquares fitted linear relation, as well as the uncertainty in the linear slope coefficient, and the linear correlation coefficient, r T ,S . Thus, we have an estimate of the validity of the fit as well as the uncertainty in the estimated fractal dimensionD TWD (the Hurst-parameter,Ĥ TWD = 2 −D TWD ).
In this study, the coefficient a from Eq. (2) is also used as a texture feature. This "prefactor" [36] is related to the "true" length of the 1D curve. It is therefore related to both the fractal dimension of the texture and the size of the cell nucleus, but it may also contain additional textural information [36]. Figure 4 shows the log-log plots for two liver cell images chosen from the normal and tumour classes, together with the corresponding regression lines estimated on the ba-  sis of the automatic selection of the useful linear range, as described above.

Lacunarity by the polygonization-based
method For each tolerance value T WD of the polygonization we first find the distribution of the length, s, of the line segments approximating the 1D gray level curve [1]. This is done as a simple accumulation during the polygonization. Examples of such distributions for the normal and tumour classes (averaged over all cells in each class) are shown in Fig. 5. We observe that the average distributions are different for the two classes, depending on the value of the tolerance T WD . In order to characterize these distributions, we have therefore computed first order lacunarity values (e.g., mean value µ s (T WD ), standard deviation σ s (T WD ), third moment m3 s (T WD ) and fourth moment m4 s (T WD )) from this distribution, for each tolerance value T WD .  and tumour ( ) class, using the µs(T WD ) lacunarity obtained by the 1D polygonization method. The lacunarity vectors were fitted in the log-log domain in a least-square sense by second order polynomials and the inverse of these curves are also plotted (solid lines). The lacunarity vectors were extracted from the second 10% peripheral segment of the cell nuclei.

From lacunarity vectors to scalar lacunarity
features A low dimensionality feature space will often prove beneficial from a classification performance point of view. It may therefore be wise to compress as much information as possible into only a few salient features. Thus, it is necessary to extract only a few scalar features from each lacunarity vector. In order to do this, we transformed the lacunarity vectors into the log-log domain, fitted them in a least-squares sense by secondorder polynomials, e.g., and used the coefficients α, β and γ describing each lacunarity vector (e.g., α µ , β µ , and γ µ from µ s (T WD )) as a new compact set of features to represent the lacunarity of the texture. Performing the least squares quadratic regression in the log-log domain resulted in a very good fit to the lacunarity vectors µ s (T WD ) and σ s (T WD ), even when transformed back to the linear domain (see Fig. 6). The regression coefficients α, β and γ from these two lacuarity vectors were therfore used as new lacunarity features in this study.

Fractal dimension by the " -blanket" method
Peleg et al. [40] used gray level erosion and dilation to estimate fractal signatures of a gray level intensity surface. This is equivalent to covering the gray level surface with a succession of "blankets" with increasing thickness. The covering "blanket" is defined by its upper surface u and its lower surface l . The scale-volume of the "blanket" around a 2D gray level intensity surface f (x, y) is defined by The surface area as a function of thickness is the volume of the added layer from radius − 1, divided by 2 to account for the upper and lower layers: The surface area decreases at coarser resolutions since fine details that contribute to the area disappear. Now the slope of A( ) on the log-log scale is an estimate of 2 −D (the Hurst parameterĤ = 3 −D ). The final estimateD is obtained by a linear regression over the range of for which the slope is approximately constant. We have used the 1D version of this method with -values from 1 to 25.

Lacunarity by the " -blanket" method
In its original version the method of Peleg et al. [40] only estimates the fractal dimension, summing up the vertical distances between the upper and the lower "blanket" surface for each value of . The information contained in the distribution of these distances is not utilized by Peleg et al. However, a histogram of the surface distance τ for each value of opens up the possibility of obtaining a number of first order lacunarity vectors (e.g., mean value µ τ ( ), standard deviation σ τ ( ), third moment m3 τ ( ) and fourth moment m4 τ ( )) [1]. These four lacunarity vectors were computed, and new scalar lacunarity features were extracted from each lacunarity vector in the same way as described above (Section 2.3.3).

Gray Level Cooccurrence Matrix and Gray Level Run Length Matrix features
A number of 2D Gray Level Coocurrence Matrix (GLCM) [24] and Gray Level Run Length Matrix (GLRLM) [21] features have given reasonably good results (correct classification rates ∈ (65-75%) on the training data) for the same material [56]. These 2D features were obtained by combining matrices from orthogonal directions, thus assuming directional isotropy in the texture, and using the whole 2D area of the cell nucleus, thus assuming a stationary nuclear chromatin structure. The best results were obtained when the number of gray levels in the images were reduced to G = 16 before the matrices were computed. In the case of GLCM, an inter-pixel distance d = 3 was demonstrated to be optimal.
From each 1D gray level scan obtained by the "peeloff scanning", a GLCM and a GLRLM were obtained for each of the 10 segments, from the nuclear membrane to the nuclear center. The directions in the cell nuclei that were utilized to extract the GLCM and GLRLM seems more natural than, e.g., the horizontal and vertical directions, particularly if one wants to separate estimates from the center and the periphery of the cell nuclei.
The following nine often used GLCM features were extracted: Angular Second Moment, Contrast, Correlation, Variance, Inverse Difference Moment, Entropy [24], Cluster Shade, Cluster Prominence [9], and Diagonal Moment [5]. From the GLRLM, the following seven features were extracted: Short Runs Emphasis, Long Runs Emphasis, Gray Level Nonuniformity, Run-Length Non-uniformity, Run Percentage [21], Low Gray Level Runs Emphasis and High Gray Level Runs Emphasis [8].

Feature evaluation
In this study, texture features have been evaluated according to the statistical probabilistic Bhattacharyya distance, J B (ω u , ω v ), between classes ω u and ω v [23]. We have assumed normal distribution of features within each class.
As suggested by, e.g., Kittler [30], we have used a multiclass feature evaluation criterion defined in terms of the above two-class probabilistic distance measure as an average of the pairwise distances, weighted by the a priori probabilities P (ω u,v ) We have assumed equal a priori probabilities.
J B (ω u , ω v ) was also used here to estimate upper and lower bounds on the pairwise Bayesian classification error, ε u,v (see, e.g., [23]).

Classification
Bayesian classification with equal prior probabilities for each class was used as the rule for classification. The value of each texture feature used here to classify each sample was the mean value of about 100 cell nuclei. The feature distribution within each class was assumed to be multivariate normal and the within-class covariance matrices were assumed equal.
We have used the leave-one-out method to estimate the misclassification rates in the training set of the first experiment and in the whole data set of the second experiment. In leave-one-out one merely classifies in turn each sample based on the classifier designed on the other N − 1 samples [23]. The set of features is obviously kept the same during the whole procedure. The total number of misclassifications relative to the design set size is then used as an estimate of the Bayes error rate.
In the first experiment the designed classifier was tested on the independent test set.

Results
The 1D gray level signal resulting from the "peeloff scanning" of each TEM image was subdivided into 10 segments, each representing approximately 10% of the total area of the nucleus. Features were extracted from each of the 10 segments, and the mean value of about 100 cell nuclei was used to represent each sample. Histograms of the features of all cells in each class verified that our assumption of approximately normally distributed features within each class was correct.

Training set
For the second 10% peripheral segment of the cell nuclei (see Fig. 2), the best single polygonizationbased features, ranked according to Eq. (6), are given in Table 1. Several single fractal features estimated from the periphery of the cell nuclei discriminated between normal+regenerating, noduli and tumour samples. The best feature, a, which gave a 0% error rate, was then tested on the independent test set. The classification errors (ERR) estimated by the leave-one-out method are also given. The features were extracted from the second 10% peripheral segment of cell nuclei. The estimates are based on the 20 samples of the training set (experimental design 1). The classification errors (ERR) estimated by the leave-one-out method are also given. The features were extracted from the second 10% peripheral segment of cell nuclei. The estimates are based on the whole data set (experimental design 2).

Independent test set
The error rate (ERR) on the test set was 70%. Five tumour samples were classified as noduli and three noduli samples were classified as tumour. One tumour and one noduli sample were classified as normals. The conclusion from this experiment was that the fractal features could not discriminate noduli samples from tumour. However, the fractal features seemed to discriminate normal+regenerating samples from noduli and tumour.

1D fractal features -experimental design 2
For the second 10% peripheral segment of the cell nuclei, the best single features, ranked according to Eq. (6), are now given in Table 2. Several single fractal features estimated from the periphery of the cell nuclei discriminated samples from the noduli and tumour classes from normal+regenerating ones.
The outer 25-30% of the nuclei contained most of the fractal textural information about the differences between the classes (see Fig. 7). Alternatively one may therefore utilize more data (e.g., the outer 20, 25, or 30% of the nucleus) in the estimates. Because of the uncertainty caused by the manual segmentation process, one may also remove the outer 5%, and then utilize, e.g., the next 20% of the nuclear area. In this study, we have chosen to use the second 10% peripheral segment, but the exact figures were not critical.

Fractal dimension by the polygonization-based
method An average number of five points was used in the linear regression (Eq. (2)) to obtain the estimated fractal dimension of each cell nucleus. The average linear correlation coefficient for the log-log plots was r T ,S = −0.99 with standard deviation 0.002, and the average uncertainty in the linear slope coefficient was 0.18 (with standard deviation 0.06).
The fractal dimension in the periphery of the cell nuclei (the outer 25-30%) was lower for samples in the noduli and tumour classes (Ĥ TWD was higher) than for normal+regenerating ones (see Fig. 7(top)). In the center of the nuclei, on the other hand, the fractal dimension of the three classes overlapped. We observe that the standard deviations vary, both between the classes Fig. 7. Average estimated Hurst parameter,Ĥ T WD , (top), regression coefficient a (middel) and scalar lacunarity parameter, αµ, (bottom) for samples from the normal+regenerating (--), noduli (-·) and tumour (-) classes, as obtained from the 1D polygonization method, are plotted versus segment. The standard deviation of each feature for each class and segment is indicated. This is based on the mean feature values obtained for each sample in each class. and between the segments. For all features, the normal class shows the smallest variation between samples, while the tumour class has a significantly greater inter-sample variation. For the features studied here the standard deviation of the cell-wise features within samples were a factor of up to six higher than the standard deviation of the mean sample values.
Note the relatively high value of J B for the feature a from Eq. (2) (see Table 2 and Fig. 7(middle)). Each of the featuresĤ TWD and a of segment number two discriminated noduli and tumour samples from normal+regenerating ones with an error rate (ERR) of 5.6%. One tumour sample and one noduli sample were classified as normal in both cases.

Lacunarity by the polygonization-based
method As in the case of fractal dimension, Table 2 indicates that several scalar lacunarity features extracted from the periphery of the cell nuclei may discriminate samples from the noduli and tumour classes from normal+regenerating ones. Figure 7(bottom) gives a more detailed picture for one of these features. The α µ feature of the outer segments was lower for samples from the noduli and tumour classes than for nor-mal+regenerating ones. Each of the features α µ and α σ discriminated normal+regenerating samples from noduli and tumour samples with an error rate (ERR) of 8.3%. One tumour sample and two noduli samples were classified as normal.

Fractal dimension by the " -blanket" method
The estimated J B (ω u , ω v ) for theĤ feature was lower than the corresponding J B (ω u , ω v ) forĤ TWD , except for the discrimination between noduli and tumour samples, where the class overlap is considerable anyway (see Table 2). However, the fractal dimensions estimated from the two different 1D methods correlated very well (rĤ T WD ,Ĥ = 0.95), although the estimated numerical values were different (see Fig. 8).

Lacunarity by the " -blanket" method
Of the scalar features α, β and γ representing each lacunarity vector, the feature α was the best to discriminate between the classes (see Table 2).

Gray Level Coocurrence Matrix and Gray Level Run Length Matrix features
A number of the GLCM and GLRLM features seemed to benefit from being applied separately to the outer parts of the cell nuclei (see Fig. 9). On the other hand, some features, e.g., Shade, seemed to be able to discriminate samples from the tumour class from the other classes, when estimated from the non-peripheral part (70%) of the nuclei (see Fig. 10).

Combination of 1D features
As shown by Table 2 and Figs 7 and 9, several fractal features as well as GLCM and GLRLM features were able to discriminate the normal+regenerating samples from the noduli and tumour samples, provided that a peripheral segment is used. Only a few features, like Shade, seemed to be able to discriminate the tumour class from the other classes, but now the nonperipheral part of the nuclei was used. Thus, when searching for good feature combinations, a full combinatorial search can be avoided.
The combination of a single fractal feature estimated from the periphery of the cell nuclei and the GLCM feature Shade, extracted from the non-periphery of the nuclei seemed to be useful to discriminate the three classes. Examples of such combinations of two features are given in Fig. 11, giving the mean feature values per sample for the GLCM feature Shade, and the two features a and α µ obtained by the 1D polygonization method. Figure 11 also gives the ellipses describing the pooled covariance matrix and the decision boundaries for the classifier computed from the whole data set. These combinations gave an error rate (ERR) of 5.6%. One tumour sample and one noduli sample was classified as normal.

Discussion
Chromatin structure has traditionally been regarded as an important diagnostic clue in pathology. The chromatin structure, which both reflects and partially controls genetic functions [54,58], changes within cell nuclei as cancer develops [41]. In routine pathology changes in cell nuclei are assessed subjectively and are used in diagnosis and in grading of malignancies. Quantification of such changes may therefore be a diagnostic aid, and may also contribute towards a further understanding of the biological processes involved.
For several reasons the liver has proved especially useful for experimental studies of carcinogenesis. The administration of a single dose of diethylnitrosamine to infant mice induces a sequential development of foci, hyperplastic nodules and hepatocellular carcinomas. The premalignant nature of foci and hyperplastic nodules is well documented [22,45,49]. The liver is normally a quiescent organ whereas both hyperplastic nodules and hepatocarcinomas are proliferative samples. Regeneration can be induced by partial hepatectomy and one may use regenerating liver as a positive control for proliferation. Furthermore, it is relatively easy to isolate a large number of intact living cells, both before and after the induced carcinogenesis.
An increased amount of condensed chromatin during carcinogenesis has been shown for liver as well as other tissues (see [11] and references therein). Danielsen et al. [11] observed a stepwise development of changes in the chromatin in liver carcinogenesis. Some of the early changes, observed in preneoplastic nodules, were reduction in nuclear size, increased size of condensed chromatin particles and a reduction in the number of chromatin particles.
Of particular interest is a quantification of the tendency of condensed heterochromatin to be located adjacent to the nuclear envelope [57]. Walker [51] proposed a small set of additional features to the Statistical Geometric Features of Chen et al. [7], tailored to detect some of these differences. However, features designed to measure the average displacement and average inertia of chromatin clumps from the center of the gravity of the nucleus, were not reported to capture chromatin changes during cell dysplasia, when tested on a set of cervical cell images.
Jagoe et al. [26] separated non-neoplastic and neoplastic liver biopsies by using a linear discriminant based on nuclear shape and texture. They found, however, that texture measurements designed to reflect aggregation of nuclear stained material close to the nuclear membrane did not appear to be of value when discriminating between the two groups of biopsies.
Irinopoulou [25] evaluated the relationship between morphometric features and prognosis in patients with prostatic carcinoma. Dimension-, form-, and texturerelated nuclear features, both classic and fractal, were computed. A discriminant function based on five texture-related features allowed complete separation between good and poor prognosis in the training set, but the fractal dimension was not among the selected features.
Einstein [17] applied fractal dimension to describe chromatin appearance in biopsies of the breast. While the mean Minkowski dimension did not differ significantly between the benign and malignant samples, the average mean spectral fractal dimension differed significantly. Using logistic regression to classify a training set of 19 patients based on their mean fractal texture dimension and nuclear areas, 16 cases (84%) were diagnosed correctly.
In the medical literature, estimates of the exact fractal dimension of, e.g., boundaries of tumours and cell profiles are reported [10]. Because of the preparation of liver sections, the imaging process and the preprocessing of the digitized images, it may be difficult to give the exact fractal dimension of the chromatin structure itself. Our aim has therefore not been to estimate the exact fractal dimension, but to see if there were sig- nificant differences in fractal dimension estimates between the classes.
In the present study, we have shown that the fractal dimension in the periphery of mouse liver cell nuclei was lower for samples in the noduli and tumour classes than for normal+regenerating ones. In the center of the cell nuclei, on the other hand, the fractal dimension of the three classes overlap. Thus, fractal dimension estimated from the whole 1D signal or from the whole 2D nucleus was not able to discriminate between samples from the normal, noduli and tumour classes.
The "prefactor" a (Eq. (2)) was correlated witĥ H TWD (r a,ĤT WD = −0.91) and with the area of the cell nucleus (r a,M = 0.80). Danielsen et al. [11] found that the area of nuclear profiles was lower for nuclei from the noduli and tumour classes than for normal and regenerating ones. We have found that the fractal dimension was lower (Ĥ TWD was higher) for samples from the noduli and tumour classes than for nor-mal+regenerating ones. The combination of these results explains the good results obtained by the feature a. The feature a is, however, not area-invariant. An area invariant feature would be a/log(area), but this feature is even more correlated with the fractal dimension (r = −0.96). In our study, the mean area of the cell nuclei per sample could not be used to discriminate between the three classes (J B = 0.3).
We have introduced a small, compact set of scalar fractal lacunarity features, based on the distribution of the length of the line segments or the distribution of distance between surfaces that approximate the 1D curves. We found that several such scalar fractal features estimated from the periphery of nuclei could discriminate noduli and tumour samples from normal+regenerating ones. We have also found that the classification performance of many of the GLCM and GLRLM features is also enhanced if separate estimates are made in the center and periphery of the cell nuclei. However, the features estimated from the periphery of the cell nuclei could not by themselves discriminate noduli from tumour. To discriminate between all three classes, it seems that we have to combine a feature estimated from the periphery of the nuclei with texture information extracted from the non-peripheral part of the nuclei.
The distributions of line lengths, s, of the line segments approximating the 1D gray level curves are skewed (see Fig. 5). We have characterized these distributions by computing first order statistical moments. Alternatively, median and percentiles could be used to characterize these distributions, but the results would be similar.
In this study, the linear discriminant functions gave satisfactory results. Linear discriminant functions have a variety of pleasant properties from an analytical point of view. Even when they are not optimal, one might be willing to sacrifice some performance to gain the advantage of simplicity [16]. We have assumed equal within-class covariance matrices in the classification, but we have observed greater variance in the noduli and tumour classes than in the normal+regenerating one (see Figs 7,9 and 10).
Small sample effects make the problem of designing a pattern classification system very difficult [27,43]. We have studied chromatin changes during experimental carcinogenesis in in-bred mice. In this way we have obtained well controlled noduli samples, which is difficult when using clinical material, and thus made it possible to work with a smaller data set. It has previously been observed that for the data set used here the statistical variation between the samples (normal liver) on measurement of, e.g., nuclear fraction of condensed chromatin was as low as 2-4%, whereas the variation between nuclear profiles from the same biopsy could be as high as 35% [12,56]. Because of the reported high variations between nuclear profiles we have used approximately 100 nuclei from each sample. Increasing the number of nuclear profiles above 100 is not expected to improve the results statistically, as previously shown [11]. This has also been demonstrated for light microscopy images of prostate lesions [55]. For the features studied here the standard deviation of the cell-wise features within samples were a factor of up to six higher than the standard deviation of the mean sample values. Hence the most important issue is a statistically sufficient number of cell nuclei per sample.
Ideally an independent data set should be used to estimate misclassification rates [46]. In the first experiment we used a training set consisting of five normal, five regenerating, five noduli and five tumour samples. Feature selection was done based on the results obtained from these samples. The classifier was then tested on an independent test set consisting of 10 noduli and 6 tumour samples. However, because of the small number of samples available, we have also chosen to utilize all samples in the estimation of the Bhattacharyya distances end error rates ( Table 2). The features selected, the range of the features, and the con-clusion were the same in both experiments. The fractal features estimated from the periphery of nuclei could not discriminate noduli from tumour, but discriminated normal and regenerating samples from noduli and tumour samples.
It is obviously of interest to analyze the results of the polygonization-based method, related to the differences between the classes expressed in terms of differences in the chromatin structure. Thus, 2D normalized histograms of line length s versus distance between breakpoints in the x-direction of the digital curve were generated. Mean histograms were computed over all cells in each class, and difference histograms (Fig. 12) for each tolerance were used as a tool to analyze the difference in texture between the classes. Line segments of the same length may have widely different gradients. From Fig. 12 one may observe that for a given horizontal distance between breakpoints in the 1D signal, shorter polygonization line segments are more probable in tumour cell nuclei, generally implying lower gray level gradients. Separate difference histograms were also computed for line segments in the dark area (gray level 127), in the bright area (gray level > 127) and for the transition between bright and dark. The main difference between the classes were found in the darker areas of the images and in the transitions between darker and lighter areas. The histograms were extracted from the second 10% peripheral segment of the cell nuclei. The lighter area corresponds to line segments that are more probable for the tumour class than for the normal class and the darker area corresponds to line segments that are less probable for the tumour class than for the normal class.
The above analysis leads directly to the observation that GLCM matrices will be more concentrated along the diagonal in the tumour class than in the normal. The same relation holds for the relation between the noduli and normal classes. GLCM features computed from the peripheral parts of cell nuclei using an interpixel distance d = 3 support the above analysis results. Thus, Inverse Difference Moment, which generally gives high values for homogeneous images, was higher for noduli and tumour samples than for nor-mal+regenerating ones. Contrast, on the other hand, will favour contributions away from the diagonal of the cooccurrence matrix. Consequently, Contrast was higher for the normal+regenerating class than for the noduli and tumour classes (see Fig. 9). We note that a lot of GLCM's would have to be accumulated to match the width of information contained in Fig. 12.
Difference histograms computed from the non-peripheral part (70%) of the cell nuclei confirmed that the main fractal information about the difference between the classes is contained in the peripheral parts of the cell nuclei. From these histograms it was hard to infer anything about the behaviour of the GLCM-features of the central parts of the cell nuclei. However, the results indicate that the feature Shade, when estimated from the non-peripheral part (70%) of the nuclei, will discriminate samples of the tumour class from the other classes (see Fig. 10). The weighting function of the GLCM feature Shade [9] is anti-symmetric with respect to the bi-diagonal of the GLCM matrix. This is completely different from Inverse Difference Moment and Contrast, which are both symmetric with respect to the diagonal. So Shade will be sensitive to skewness along the diagonal, i.e., if outliers are found predominantly on one side of the bi-diagonal. The combination of a fractal feature and Shade discriminated the three classes with a correct classification rate of 94.4%.
Considering the periphery of nuclei, discrete cosine transforms and autocorrelation functions indicated a higher content of high frequencies and smaller structures in the normal nuclei compared to nuclei from the tumour class. This supports the finding of higher fractal dimension in the normal class, and the finding that the GLRLM features Short Run Emphasis and Run Percentage gave higher values for the nor-mal+regenerating class than for the noduli and tumour classes (see Fig. 9). The best two-feature set of Yogesan et al. was Short Run Emphasis and High Gray Level Runs Emphasis with G = 16 [56].
Transforming the 2D images to 1D curves before estimating the fractal features may seem like a waste of contextual information. However, as we have shown, this gives us the possibility of separate and independent estimates of fractal dimension and lacunarity in the center and periphery of a cell nucleus.
This separate treatment of center and periphery would be hard to acheive when using fixed 2D windows or using GLCM or GLRLM matrices based on orthogonal directions. However, it is straightforward when using the "peel-off-scanning". And when assuming a directional isotropy in the local texture of the nuclear chromatine, the approximately tangential direction of the "peel-off-scanning" is the natural choice if we want to perform a radial differentiation of the texture description.
The "peel-off scanning" procedure is not start-point independent. To obtain an invariant description on the cell level, we could have started at several random points, and used an average/median result. However, as we are using the average feature values of about 100 cells to classify each sample, each with a random orientation and thus a random starting point, we have not adopted this procedure.
In the present study, the 1D features of segment number two gave higher Bhattacharyya distances than the corresponding features of the outermost segment. One possible reason for this may be the uncertainty caused by the manual segmentation process. When working with large volumes of cells, a (semi)-automatic segmentation of cell nuclei will be advantageous, as it both reduces this uncertainty and ensures the reproducibility of the results.
In conclusion we have shown that several scalar fractal features estimated from the periphery of nuclei in TEM images could discriminate noduli and tumour samples from normal+regenerating ones in a controlled mouse liver carcinogenesis experiment.
The spiral-like "peel-off-scanning" is a straight forward venue to separate estimates of fractal features in the center and periphery of a cell nucleus or any other object where a radial differentiation of the texture description is desired. Together with the subsequent variable length polygonization it not only provides several fruitful fractal texture features, but may also serve as an analysis tool.
We have demonstrated that the previous implicit assumption of a stationary nuclear texture is suboptimal, and that a number of texture features benefit from being applied separately to the outer parts of the cell nuclei.
The biological implications of the possibility to quantify the fractal parameters of the nuclear membranebound heterochromatin need to be investigated.
Work is in progress on testing the methods presented here on a larger set of clinical data from light microscopy.