Topographical Analysis of Spatial Patterns Generated by a Cellular Automaton Model of the Proliferation of a Cancer Cell Line In Vitro

A well‐suited model to simulate cellular population dynamics is the two‐dimensional cellular automaton model, which consists of a lattice of sites, the value ai,j of each site being updated in discrete time steps according to an identical deterministic rule depending on a neighbourhood of sites around it. A cellular automaton is described which mimics cell population proliferation by replacing the site values by the age and the cycle phase of cells. The model takes into account the size of the cells. It is used to simulate the proliferation of the human breast cancer cell line MCF‐7 and the results of the simulation are compared with experimental data obtained from a light microscopic image analysis of the proliferation process. The initial configuration of the cellular automaton is obtained from the discretization of the results of the initial stage of the image processing. After each day of proliferation the pattern obtained from the simulation is compared to the experimental result of the corresponding image analysis. The comparison is made from a topographical point of view through the concept of the minimal spanning tree graph. The agreement between experiment and model is a good starting point to complex models such as cell proliferation under growth effectors or drugs.


Introduction
Cell proliferation is a fundamental process in biological systems [24]. Cells can remain quiescent for long times or can increase rapidly in number, for example, during embryogenesis and in wound healing. Cell division is controlled by a variety of regulative mechanisms, including the availability of space and the secretion of stimulating and inhibiting factors by cells in their environment. Cancer cells can be shown to have lost many of these controls. The modelisation of cancer cell growth by computer simulation is a way to indicate how cellular functions and microenvironmental factors may influence the morphological patterns obtained through the proliferation process. We will describe a cellular automaton which mimics the proliferating process of cancer cells in vitro. Cells are put on a two-dimensional square lattice. The values of the sites are the age and the cycle phase of the cells. The cell cycle is an ordered sequence of biochemical events leading up to the cell division (mitosis). The cycle illustrated in Fig. 1, can be divided into four phases, called G 1 , S, G 2 and M. In G 1 (the gap period), RNA and proteins are synthetised to prepare for the DNA replication that takes place in S (the synthetic period); then the cell progresses through G 2 , which is the second gap period, and finally enters the mitosis (M). Each daughter cell can again traverse the cycle, or can shift to a quiescent state (G 0 ) during which cells do not divide for long periods. Quiescent cells retain for a long time the ability to recycle under proper conditions or stimuli [2,14]. Before a typical cell can divide it must double its mass and duplicate all of its contents. Only in this way will the two new daughter cells contain all of the components that they need to begin their own cycle of cell growth followed by division. Our model takes the size and the changes in the cell size during the cell cycle and cell division into account. The general structure of the simulation is deduced from the deterministic cellular automata theory [26], but some parameters introduced in the model have a probabilistic nature.
The initial configuration of the cellular automaton is obtained from the discretization of the results of the image analysis of a culture of the breast cancer cell line MCF-7 after Feulgen staining [5]. The results of the simulation are compared daily to the images of time course analysis of the cell proliferation. The comparison is made from a topographical point of view through the concept of the minimal spanning tree (MST) graph. The MST is a graph which provides several ways to analyse the topography (spatial relationships) of objects sets [6,27]. Starting from the hypothesis that the analysis of the spatial patterns of cells may lead to display and determine the control processes between the cells which have induced those patterns, the MST method is well-suited to analyse these mechanisms.

Initial configuration
The initial configuration of the cellular automaton was obtained from the digitization of the results of an image analysis of a culture of the breast cancer cell line MCF-7 after Feulgen staining [5]. Images were recorded via a colour tri-CCD camera (JVC) attached to an inverted microscope (×4 objective, IMT2 Olympus, SCOP, Rungis, France) and then digitized through a SAMBA 2005 (TITN, Grenoble, France) image processing environment with a spatial resolution of 512 × 512. Images were processed on a VAX station 4000 (Digital Equipment Corporation, Nashua, NH, USA) in the MIDAS (Munich Image Data Analysis System) environment. The different steps of image segmentation combine filtering algorithms and mathematical morphology [18,19]. From the density image a 2D Laplacian of Gaussian filter (LOG) was performed. A binary map of the nuclei centers was extracted by the computation of an 8-connexity detection algorithm on the LOG image. Four dilations of this mask using logic to prevent merging of features was then performed. A logical AND between the resulting mask and the one obtained by thresholding the LOG image at the zero crossings ended the segmentation process. This mask of the segmented nuclei was used with the density image to bring out for each nucleus: the (x, y) coordinates of its barycenter which allows us to perform topographical analysis; A, the surface of the nucleus projection on the (x, y) plane, and IOD, the Integrated Optical Density of nuclei stained by Feulgen-Rossenbech stoichiometric nuclear reaction [9,25]. IOD is proportional to the amount of DNA content of cells and consequently related to the cell cycle phase. The IOD histogram reflects the cell distribution in the different phases of the cell cycle. The cells with the least amount of DNA are in G 0 or G 1 , those with double this amount are either in G 2 or M, while those in S have intermediate amounts. By fitting the IOD histogram by gaussians, it is possible to define the three subpopulations G 0 /G 1 , S and G 2 /M [5]. The characteristics of each subpopulation are recorded (position, nuclear surfaces, nuclear integrated densities). The initial phase of each cell is then determined from the values of the IOD histogram in arbitrary units. Because the cells get larger when they traverse the cycle, the age of a cell in each phase is A/(A max − A min )T , where T is the phase duration, and A, A max and A min , the actual, maximal and minimal values of the nuclear area in this phase as measured on experimental images.
Each cell is put on a square lattice of 512 × 512 sites. Opposite sides of the lattice are considered to be continuously linked, in order to minimize side effects. The site (i, j) of each nucleus is the node of the grid which is the nearest neighbour of the real coordinates of the nuclear barycenter. In the model, to take the size of the cells into account, the cell nuclei are considered as disks and the nuclear radius of a cell located at where R max represents the maximum value of all the nuclear radii as measured on experimental images. Figure 2a shows a part of the initial configuration obtained by our image analysis.

Automaton rules
The time step size is one hour. From t to t + 1, the cellular automaton operations are applied to the sites individually according to the following rules.

Grid cells management
(1) During its proliferation, the tumor cell passes the distinct cell cycle phases, G 1 , S, G 2 and M. For the cell cycle phases, a normal distribution (Gaussian distribution) of each of the phase durations T G 1 , T S , T G 2 and T M with standard deviation σ is assumed.
(2) The cell size increases when the cell ages. The initial nuclear radius R 0 (i, j) rises by a factor k, determined by: k = (αR max /R min ) 1/T , where R max and R min are the maximum and minimum values of the nuclear radius in a phase measured on the image, T the phase duration, and α a correcting factor because the experimental values measured could not be the real maximum and minimum values. The factor α decreases from 1.100 to 0.935 when the cells reach confluence.
(3) At any time of the G 1 phase a cell is able to enter the resting phase G 0 . A random number x is compared to the probability of quiescence of the G 1 cells (p(quies)). If x < p(quies), then the cell enters the resting phase G 0 . Otherwise the cell progresses through S. At the end of the G 1 phase, the cell passes into the G 2 phase.
(4) At the end of the S phase, the cell passes automatically into the G 2 phase. (5) At the end of the G 2 phase, the cell passes automatically into the M phase. (6) Mitosis: at the end of its M phase, a cell C located at (i, j) duplicates itself. When division occurs, the radius of each nucleus of the daughter cells returns to its initial value R 0 (i, j). One of the daughter cells will occupy the original position (i, j) and the other will be put on one of its neighbouring sites. In this stage the overlapping of the cells must be prevented. In the neighbourhood of each possible vacant grid site (k, l), one calculates the Euclidian distance d between the site (k, l) and an occupied site (m, n). If d R 0 (i, j)+R(m, n), the site (k, l) may be selected. If there is more than one possible site for the daughter cell, one site is randomly selected. Figure 3 is an illustration of the process of duplication. If there is no free position around the dividing cell, it may detach from the grid according to the probability p(det). A random number y is compared to the probability of detachment (p(det)). If y < p(det), then the cell dies. Otherwise both the daughter cells are released into the culture medium.

Culture medium cell management
A test is performed to know if a detached cell is able to bind again to the grid. A random number z is compared to a parameter called the probability of binding of the cells p(bind). If z < p(bind) and if there is a free site on the grid, the cell is put on the grid in the G 1 phase. If z < p(bind) and if there is no free site on the grid, the cell remains in suspension. The time during which a cell remains in the culture medium is limited to the value T max (susp). If z > p(bind) and if the age of the cell in the culture medium is less than T max (susp), the cell remains in suspension. If z > p(bind) and if the age of the cell in the culture medium is superior to T max (susp), the cell dies.
The flow chart of the whole simulation procedure is given in Fig. 4. After each day of proliferation the pattern obtained from the simulation is compared to the experimental result of an image analysis of the proliferation process [5]. The comparison is made from a topographical point of view through the concept of the minimal spanning tree graph.

Topographical analysis
The power of graph-theoretical methods in the context of detecting and describing the inherent cluster structure in arbitrary point sets with a distance function is well known. The graph enjoying the widest application in pattern recognition is the minimal spanning tree (MST) [27]. An edge-weighted linear graph G is composed of a set of points called nodes and a set of node pairs called edges, with a number called weight (in this paper the Euclidean distance) assigned to each edge. A tree is a connected graph with no loop, and an MST is a tree which contains all the nodes and where the sum of the edges weight is minimal. There exist several algorithms enabling the MST to be computed. In this paper we used Rohlf's algorithm [20]. Depending on the starting point there may be more than one MST for a given set of points, but all the MSTs have the same edge-length histogram. At any time of the G1 phase a cell is able to enter the resting phase G0 with a probability p(quies). When division occurs, one daughter cell stays on the original site and the other will be put on one of its neighbouring free sites. If there is no free site, the daughter cell may detach from the grid with a probability p(det). B: culture medium cells management. A cell in the culture medium is able to bind a free site on the grid with a probability p(bind). The time during which a cell remains in the culture medium is limited to Tmax(susp). It follows that statistical information deduced from this histogram such as the average edge-length m and the standard deviation σ may be used as characteristics for the corresponding distribution. Figure 2b shows the MST of the initial configuration in Fig. 2a. In order to be able to compare different distributions regardless the particle density and the sampling window, the values of m and σ must be normalized. For this purpose, the process described by Hoffman and Jain [11] has been used. In the case of small samples, borders effects on the MST become important [16], but when a large number of points is used, stationarity is assumed [18]. The use of a diagram involving both the values of m and σ enables us to determine the degree of order of any distribution by taking a simple reading in the (m, σ) plane on which well-characterized distributions have previously been located (Fig. 5A). Perfectly orderly distributions are obviously characterized by σ = 0. There exist eleven possibilities permitting the arrangement of a set of points on a plane along regular or semi-regular mosaics [1]. The normalized m values corresponding to each model are shown on Fig. 5B. Each of those arrangements may be randomized by giving each point a new position deduced from its previous position by using a Gaussian distribution with a standard deviation of ω [6]. Then some trajectories in the (m, σ) plane are obtained, each point of the trajectories corresponding to a certain degree of order. In particular the full line in Fig. 5A corresponds to the trajectory related to the triangular lattice. For ω 1.075, the uniform random distribution is reached. The corresponding values m = 0.650 and σ = 0.300 shown on Fig. 5A have been corroborated by Monte Carlo simulations of random distributions generated using the linear congruential method [15]. On the (m, σ) plane are also marked in different areas: cluster structures (small m, σ = 0), gradients of concentration (large σ) and two-dimensional quasiperiodic tiling (large m, σ = 0). Thanks to the normalization process any actual distribution can be plotted in the (m, σ) plane and easily compared to the above mentioned distributions.

Results
The values adopted for the different parameters are listed in Table 1. The phase durations refer to cyto-kinetic literature [3,10,12,21]. In these studies the G 1 and S phase durations are similar, while the G 2 and M phase lengths are distinctively different. The probability of quiescence has been deduced from the computation of a model of a cell population in exponential growth [4,12]. There is a lack of experimental data for the detachment and binding probabilities and for the maximum time of suspension. The adjustments of these parameters as well as G 2 and M phase lengths has been performed by varying their values over a wide range and testing for each set of values the goodness of fit between order degrees of the resulting simulation and the real cell distribution. In order to reach the optimal values (best fit), the goodness of fit was maximized by minimizing an energy function through a simulated annealing process [13]. One starts a simulation with a given parameter set (for example random), and an "energy" function: where C i is the current simulation and C 0 the experimental pattern to be simulated, m and σ the order degrees.
The parameters set of the simulation is then randomly modified and the new parameter configuration, C j , is accepted as the starting state for further rearrangement depending on the Metropolis criterion: where T is a decreasing parameter. The process terminates when no further improvement can be found. In this approach, although the multivariate function have some local minima, the unique global minimum is reached. Figure 6 shows the two proliferating curves obtained from the experimental cultures and from the simulation process. The data were compared by chi-square analysis. The value of the chi-square χ 2 = 2.075 corresponds to a probability of about 0.75. Table 1 Values adopted for the parameters of the simulation   The results of the computation of the degree of order of the different distributions obtained from image analysis and the simulation is shown in Fig. 8. The agreement between experiment and simulation is acceptable because the differences in the degrees of order is of the order of the stochastic fluctuations amplitude (∆m = 0.015, ∆σ = 0.020).

Discussion
Modeling complex physical phenomena by computer simulations has become a common tool for understanding biological phenomena. Realistic models based on physical laws generally result in large systems of non-linear integro-and partial differential equations. Such models lead to formidable numerical problems that require huge memories and fast computers. A way to simplify these numerical systems is to mimic the physical laws by simple rules computed in parallel by using cellular automata. Cellular automata are discrete in time, space and state. The abstract properties of general deterministic cellular automata have been the object of many studies [26]. Recently, Ermentrout et al. [8] have presented a review on cellular automata approaches to biological modeling. In this study our objective was twofold: (i) to simulate the proliferation process of a cancer cell line in vitro based on the cell cycle. (ii) to compare the patterns obtained through the simulation process with those of an experimental time course of the cell proliferation via a topographical concept: the minimal spanning tree graph.
Most of the work done by computer simulation of cancer cell growth employed a rigid twodimensional square tesselated space, where each site represented a cell. Division occurred by placing a daughter cell into one neighbouring site [5,6,19,23,25]. These models cannot be used to simulate changes in cell size [7,17]. Matela and Ransom [17] used another approach, a triangulated planar graph and its dual structure, a trivalent map taking into account cellular changes. As discussed by Duvdevani-Bar and Segel [7], this topological model gives no information on the geometric pattern of which it is an abstraction. Our model uses a rigid two-dimensional tessellation, but cell sizes and changes in cell sizes are simulated.
The simulation-experiment comparison of the proliferation curves and patterns supports the accuracy of the model. Beyond the subjective evaluation, it is important to have a quantitative estimation of the degree of similarity between computed and experimental patterns. Smolle et al. [22] use morphological descriptors to assess this point. Our choice is the minimal spanning tree graph and a good agreement has been obtained by this approach. Since so much of cell biology is, at best, qualitative, cellular automata and minimal spanning tree analysis are good tools in which one needs to specify precise values of parameters before obtaining some results from the model to fit the experiment.
The implemented model is a good start and may be rendered complex in order to take into account cellular phenotypes and cell communication (contact or signal emitting). Then we should be able to simulate cancer cell proliferation under growth effectors such as hormones, anti-hormones, growth factors or anti-tumor agents. Co-cultured cell growth such as for normal and malignant cells can also be simulated with this model.
The analysis of the results obtained from cellular automata through the minimal spanning tree graph is a good tool to quantify hidden informations in cell biology experiments. This approach is expected to provide valuable help in bioassays for understanding of the mechanisms of action of drugs on cellular populations.