Agent-based modelling is being used to represent biological systems with increasing frequency and success. This paper presents the implementation of a new tool for biomolecular reaction modelling in the open source Multiagent Simulator of Neighborhoods framework. The rationale behind this new tool is the need to describe interactions at the molecular level in order to grasp emergent and meaningful biological behaviour. We are particularly interested in characterising and quantifying the various effects that facilitate biocatalysis. Enzymes may display high specificity for their substrates and this information is crucial to the engineering and optimisation of bioprocesses. Simulation results demonstrate that molecule distributions, reaction rate parameters, and structural parameters can be adjusted separately in the simulation, allowing a comprehensive study of individual effects in the context of realistic cell environments. A higher percentage of collisions that result in reaction increases the affinity of the enzyme for the substrate, whereas a faster reaction (i.e., a higher turnover number) corresponds to a smaller number of time steps. Slower diffusion rates and molecular crowding (physical hurdles) decrease the collision rate of reactants, hence reducing the reaction rate, as expected. The random distribution of molecules also affects the results significantly.
Microbial chemical factories have become an increasingly important industrial platform, with numerous applications in the food, agriculture, chemical, and pharmaceutical industries [
Recent advances in protein engineering, metabolic engineering, and synthetic biology have revolutionised our ability to discover and design new biosynthetic pathways and engineer industrially viable strains [
The interplay of mathematical modelling and
Researchers are looking into novel approaches for abstraction, for modelling bioprocesses that follow different biochemical and biophysical rules, and for combining different modules into larger models that still allow realistic simulation with the computational power available today.
This paper explores the potential application of agent-based modelling to such complex modelling tasks. Notably, the aim is to develop a computational infrastructure for multiscale biomolecular modelling and simulation based on common biochemical and biophysical rules. The novelty of the work lies in fully considering the spatial location of the molecules and allowing for the description of intricate microscale structures, which enables the modelling of microbial behaviour in more realistic and complex environments. The prototype of the agent-based cellular simulator was developed in the open source Multiagent Simulator of Neighborhoods (MASON) [
Test and validation experiments addressed the correct formulation of diffusion coefficient and reaction rate principles. Then, a simple cellular system was formulated, encompassing most of the rules previously validated and accounting for a realistic number of participants. This experiment exposes the computational requirements imposed by a realistic scenario and raises discussion about future lines of research and development for agent-based biomodelling.
The next sections of the paper describe the biological and computational rationale behind our simulator. Section
Agent-based modelling (ABM) is a relatively new paradigm for engineering complex and distributed intelligent systems [
Generally, agents can be defined as computer systems that are situated in some environment and that are capable of autonomous action in this environment, based on mechanisms and representations somehow incorporated. The early work of Wooldridge [
In ABM, the purpose is to “monitor” the behaviour of the agent from the perspective of the agent itself, rather than the system as a whole [
An agent-based model (i.e., the automaton) is thus composed of agents (autonomous entities), rules (logic or mathematical), a simulation environment (source of local information), and a set of initial and boundary conditions. Agents may be defined at multiple scales, and the model can formalise the various behaviours through which individuals interact with one another, directly or indirectly, through the shared environment. This requires the preparation of plausible and adequately detailed design plans for how components at various system levels are thought to fit and function together.
Such individual-based modelling has the potential to replicate cellular systems at the level of their minimal components and thus help to understand the linkage from molecular-level events to the emergent behaviour of the system [
It is reasonable to say that ABM has become a popular biomodelling approach and that new models are reaching out for increasingly complex and higher-resolution problems. The key challenge is to be able to reproduce different scales realistically, in terms of the number and type of participants involved and the events taking place, whilst balancing the requirements of extendible model granularity with computational tractability. So far, general-purpose graphics processing unit (GPGPU) technology and multicore CPUs are the favoured approaches to parallelising simulation algorithms [
A multiscale agent-based model mimicking the biology of biochemical reactions was developed using MASON version 16 [
The agent-based model is created on a continuous two-dimensional environment, which corresponds to 5
The intracellular environment can be populated by enzymes, some metabolites, and cofactors (e.g., NAD+ and NADH). Once the simulation starts, other types of agents, such as other metabolites, appear in the model in accordance with the behavioural rules. So, the simulation only requires defining the particle radius and diffusion coefficient for each species and the initial number of the molecules. Agents are then distributed randomly and may circulate freely.
Every agent, except obstacles, is randomly initialised with a given orientation. The behaviour of each agent is determined by the corresponding set of behavioural rules and most notably its spatial location (Figure
Decision process for the movement and rotation of an agent during the simulation.
Given the circular shape of the agents, the detection of a collision between agents is based on the Pythagorean theorem. That is, the distance between the centres of two agents is computed from their coordinates, and a collision is detected whenever this distance is less than the sum of their radii.
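The distance test described above can be sketched in a few lines. This is a minimal illustration rather than the simulator's code: the function name and argument order are hypothetical, and squared distances are compared so that no square root is needed.

```python
def agents_collide(x1, y1, r1, x2, y2, r2):
    """Return True when two circular agents overlap.

    Hypothetical helper: (x, y) are centre coordinates and r is the radius,
    all expressed in the same length unit.
    """
    dx = x2 - x1
    dy = y2 - y1
    # Pythagorean theorem: squared centre-to-centre distance is dx^2 + dy^2.
    # Comparing squared quantities gives the same result as comparing the
    # distance with the sum of the radii.
    return dx * dx + dy * dy < (r1 + r2) ** 2
```

With this strict comparison, two agents that merely touch (distance exactly equal to the sum of the radii) are not reported as colliding; whether touching should count is a design choice.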
In the event of a collision, the simulator identifies the types of the agents involved and looks for any behavioural rules that may apply. Either no rule applies and agents should be reoriented or the matching rule should be executed and agents should be affected accordingly. Agents are reoriented based on the angle of collision and the corresponding diffusion rate [
Most of the rules applicable in a scenario of collision involve enzymes and metabolites, that is, the possible occurrence of an enzymatic reaction (see Section
Particularly, the number of agents representing metabolites and enzymes needs to be compared with values reported in the literature. For this purpose, at model construction, we established a conversion mechanism between the number of agents in the simulation and the number of moles measured in laboratory experiments. In the literature, amounts of molecules are typically reported as concentrations (e.g., mM). Molar concentrations can be converted to numbers of molecules per unit volume by multiplying the concentration by the Avogadro number. Because this is a 2D simulator, a height also had to be specified. This was assumed to be 0.005
Each type of agent can be tracked continuously in one run of simulation. To facilitate inspection and convey relative dimensions, distinct agent types are associated with different colours and sizes.
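The conversion between concentration and agent count can be sketched as follows. This is an illustration under stated assumptions, not the simulator's code: the function name is hypothetical, and all lengths are assumed to be in micrometres (so that one cubic micrometre equals 1e-15 litres).

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def agents_from_concentration(conc_mM, area_um2, height_um):
    """Number of agents equivalent to a millimolar concentration.

    Assumption: the 2D environment of area `area_um2` (square micrometres)
    is given a nominal height `height_um` (micrometres) to obtain a volume.
    """
    volume_litres = area_um2 * height_um * 1e-15  # 1 um^3 = 1e-15 L
    moles = conc_mM * 1e-3 * volume_litres        # mM -> mol/L, times litres
    return round(moles * AVOGADRO)
```

For example, under these assumptions `agents_from_concentration(1.0, 1.0, 0.005)` corresponds to roughly three thousand agents per square micrometre at 1 mM.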
The behavioural rules are twofold: interaction of agents with their environment and responses to the presence of other agents (Table
Agents, behavioural rules, and interacting agents.
Agent | Rules | Interacts with |
---|---|---|
Enzyme (apoenzyme) | Moving and binding | Metabolite and cofactor |
Metabolite | Moving, binding, and death | Enzyme-cofactor complex |
Cofactor | Moving, binding, and reconverting | Enzyme and cell membrane |
Enzyme-cofactor complex (holoenzyme) | Moving, reaction, and decoupling | Metabolite |
Obstacle | Preventing movement | All agents in movement |
The obstacles aim to mimic the presence of other lower level molecules in the intracellular space. They are not represented individually to preserve computational tractability (e.g., a bacterial cell contains approximately
In certain cases, the type of agent can be changed and consequently, the corresponding behavioural rules of the new type would be applied to the agent. This transition is typically based on the spatial location of the agent, its type, and the local environment. One example of agent type reassignment is the “recycling” of NADH molecules to NAD+ molecules whenever NADH agents collide with the cell membrane.
Regarding agent interaction, enzymes interact with cofactors and metabolites. Many enzymes require the assistance of cofactors in biochemical transformations. When an enzyme agent and a cofactor agent collide, the enzyme checks whether it requires the cofactor to operate. If so, a new agent representing the enzyme-cofactor complex (holoenzyme) is created in replacement of the two agents.
The interaction between the enzymes (or enzyme-cofactor complexes) and metabolites represents catalysis and was modelled according to the Michaelis-Menten equation (see kinetic parameters section). In general, metabolites and enzymes react with a certain probability whenever they collide, and the enzymatic reaction may be concluded in the same time step or after a number of time steps.
Once the enzymatic reaction has taken place, that is, after the enzyme-cofactor complex has collided with a substrate and reacted, the complex is destroyed and the agents representing the enzyme and the cofactor are created again. Likewise, the agents representing the substrate disappear and new agents are created for the products of the reaction.
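The two simulation parameters involved in this rule, the probability that a collision is reactive and the number of time steps a reaction takes, can be illustrated with a small sketch. The class and method names are hypothetical, and the choice that a busy complex ignores further substrates is an assumption made for the example; the decoupling of the complex into enzyme and cofactor is left out for brevity.

```python
import random

class EnzymeComplex:
    """Hypothetical enzyme-cofactor complex with probabilistic catalysis."""

    def __init__(self, reactive_fraction, reaction_steps, rng):
        self.reactive_fraction = reactive_fraction  # probability in [0, 1]
        self.reaction_steps = reaction_steps        # 0 = same time step
        self.rng = rng
        self.steps_left = None                      # None means idle

    def collide_with_substrate(self):
        """Return 'product', 'bound', or 'bounce' for one collision."""
        if self.steps_left is not None:
            return "bounce"                         # assumption: busy complex
        if self.rng.random() >= self.reactive_fraction:
            return "bounce"                         # non-reactive collision
        if self.reaction_steps == 0:
            return "product"                        # concluded immediately
        self.steps_left = self.reaction_steps
        return "bound"

    def tick(self):
        """Advance one time step; True when the product is released."""
        if self.steps_left is None:
            return False
        self.steps_left -= 1
        if self.steps_left == 0:
            self.steps_left = None
            return True
        return False
```

A caller would replace the substrate agent with product agents whenever `collide_with_substrate` returns `"product"` or `tick` returns `True`, and reorient the agents on `"bounce"`.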
The critical part of developing our model was incorporating kinetic information on cellular dynamics, especially on the kinetics of different enzymatic reactions.
Generally, the kinetic scheme representing an enzymatic reaction under steady-state conditions is written as
Since the rate constants for the binding and unbinding reactions are either often unknown or difficult to determine, modelling has to rely on approximations, also called aggregate rate laws, such as the Michaelis-Menten kinetics [
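For reference, the textbook form of the single-substrate scheme and of the Michaelis-Menten rate law is given below. The symbols ($k_1$, $k_{-1}$, $k_{cat}$, $[E]_0$) follow the usual convention and are an assumption here; they may differ from the notation used elsewhere in this paper.

```latex
% Standard single-substrate scheme: reversible binding, irreversible catalysis
E + S \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} ES
      \overset{k_{cat}}{\longrightarrow} E + P

% Michaelis-Menten rate law under the steady-state assumption
v = \frac{V_{max}\,[S]}{K_m + [S]}, \qquad
V_{max} = k_{cat}\,[E]_0, \qquad
K_m = \frac{k_{-1} + k_{cat}}{k_{1}}
```

In this form, $K_m$ is the substrate concentration at which $v = V_{max}/2$, and $k_{cat}$ is the number of substrate molecules converted per enzyme molecule per unit time at saturation.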
The turnover number
Having in mind the biological meaning of the parameters, we hypothesised that the percentage of reactive collisions between enzyme and metabolite and the number of simulation time steps could together be used to mimic the rates expressed by
To actually demonstrate that our design plan is functionally plausible, we recreated different scenarios of enzymatic activity to show that the constructed model exhibits behaviours that match those observed in the laboratory experiments.
We present results obtained for the simulation of a simple scenario where an enzyme catalyses one substrate and releases one product. These results are discussed theoretically in terms of enzyme affinity to substrate and catalytic efficiency and further tested against experimentally calculated kinetic parameters.
Then, we show that the tool is able to model biochemical pathways, accounting for biochemical and biophysical laws adequately. We describe the computational costs of representing more complex biomolecular scenarios. We discuss a number of present commitments and simplifications necessary to ensure computational tractability and point out ongoing lines of work.
This process of analysis is somewhat similar to that performed in laboratory experiments. That is, we studied the behaviour of an identical amount of enzyme in the presence of increasing concentrations of substrate and measured the velocity of reaction by determining the rate of product formation. Furthermore, we tested different (combinations of) simulation parameters, namely, the percentage of collisions producing reaction and the number of time steps taken by a reaction. Based on the interpretation of the Lineweaver-Burk plot, which describes the Michaelis-Menten laws for kinetic dynamics, we calculated the theoretical values of
Substrate concentrations ranged from
The model assumes that a simulation tick corresponds to a configurable, specific amount of time in the system. Notably, our approach to real time-time step conversion focused on framing realistic values of velocity of reaction and hence of
Figure
The Lineweaver-Burk plots obtained while simulating different percentages of reactive collision and a time step of 1. The markers represent the outputs of simulation and the lines represent the corresponding trend lines based on linear regression (equation also shown).
From the Lineweaver-Burk plots and, in particular, considering the linear regression model that approximates the equation
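The estimation of kinetic parameters from a Lineweaver-Burk plot can be sketched as an ordinary least-squares fit of 1/v against 1/[S]. In the standard double-reciprocal form, 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, the intercept gives 1/Vmax and the slope gives Km/Vmax. The function name and the synthetic data in the usage example are hypothetical; this is not the analysis script used for the results reported here.

```python
def lineweaver_burk_fit(substrate, velocity):
    """Estimate (Km, Vmax) by linear regression on the reciprocal data.

    `substrate` and `velocity` are equal-length sequences of positive
    numbers (substrate concentrations and measured reaction velocities).
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least squares for y = slope * x + intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept   # intercept = 1 / Vmax
    km = slope * vmax        # slope = Km / Vmax
    return km, vmax


# Usage with noise-free synthetic Michaelis-Menten data (Km = 0.5, Vmax = 2).
conc = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [2.0 * s / (0.5 + s) for s in conc]
km_est, vmax_est = lineweaver_burk_fit(conc, rates)
```

Because the fit is performed on reciprocals, measurements at low substrate concentration weigh heavily, and with noisy data the fitted intercept can turn out negative, in which case both estimates come out negative.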
Approximation of kinetic parameters by different percentages of reactive collision and time steps.
% reactive collision | Time step | | |
---|---|---|---|
1 | 0 | −0.850576595 | −0.028514453 |
5 | 0 | −2.996153951 | −6.87 |
5 | 25 | −4.656418568 | −9.77 |
5 | 50 | 3.439111157 | 8.08 |
5 | 75 | 2.918555171 | 6.25 |
5 | 100 | −21.79373575 | −3.92 |
10 | 0 | −14.6001318 | −6.631944616 |
15 | 0 | 11.51508927 | 7.623330464 |
25 | 0 | 0.87739 | 1.08 |
25 | 5 | 1.36829 | 1.44 |
25 | 25 | 2.45375 | 2.23 |
25 | 50 | 1.20869 | 1.06 |
25 | 75 | 0.70036 | 6.22 |
25 | 100 | 0.531257304 | 4.52 |
34 | 1 | 3.18977 | 3.93 |
50 | 0 | 0.71725 | 1.27 |
75 | 0 | 3.65866 | 6.49 |
75 | 5 | 3.06473 | 5.33 |
75 | 25 | 1.13387 | 2.01 |
75 | 50 | 0.60205 | 1.07 |
75 | 75 | 0.37565 | 6.69 |
75 | 100 | 0.29693544 | 5.06 |
90 | 0 | 4.43222 | 8.17 |
95 | 0 | 4.49601 | 8.44 |
95 | 25 | 1.277963717 | 2.396621333 |
95 | 50 | 0.56747 | 1.11 |
95 | 75 | 0.34196 | 6.88 |
95 | 100 | 0.237537579 | 4.88 |
100 | 0 | 0.75547 | 1.60 |
To further validate the approximation of the kinetic parameters made by our model, we tested them against experimentally validated data. Six enzyme records falling within the range of kinetic values calculated were randomly selected from the BRENDA database [
We selected the simulation scenarios producing the most similar approximation to the experimentally validated kinetic parameters (Table
An approximation between experimentally calculated kinetic parameters and the parameters simulated by our model.
Enzyme identification | Experimental kinetics | | Simulation parameters | | Approximated kinetics | |
---|---|---|---|---|---|---|
 | | | % reactive collision | Time step | | |
EC 1.8.1.9: glutathione reductase activity | 0.404 | 0.39 | 25 | 100 | 0.531257304 | |
EC 4.1.1.11: aspartate 1-decarboxylase | 0.219 | 0.65 | 75 | 75 | 0.37565 | |
EC 1.1.1.1: alcohol dehydrogenase | 0.41 | 1 | 95 | 75 | 0.34196 | |
EC 1.1.1.205: IMP dehydrogenase | 1.7 | 1.9 | 25 | 25 | 2.45375 | |
EC 3.4.13.22: D-Ala-D-Ala dipeptidase | 1 | 4.7 | 75 | 25 | 1.13387 | |
EC 4.1.1.1: pyruvate decarboxylase | 1.8 | 1.2 | 25 | 5 | 1.36829 | |
The Lineweaver-Burk plots for experimentally calculated kinetic parameters (represented by a solid line) and kinetic parameters simulated by our model (represented by a dash-dotted line). From top to bottom, and from left to right, the plots represent the activity of the following enzymes: glutathione reductase, aspartate 1-decarboxylase, alcohol dehydrogenase, IMP dehydrogenase, D-Ala-D-Ala dipeptidase, and pyruvate decarboxylase.
After validating our model for situations where only one enzymatic reaction occurs, we then studied the behaviour of our simulator when two enzymes are present in a two-stage process.
Specifically, we studied the catalytic activity of two enzymes commonly present in aromatic aldehyde production: the aryl-alcohol dehydrogenase (EC number 1.1.1.90) and the benzaldehyde dehydrogenase (EC number 1.2.1.28).
In particular, the model encompasses the following two equations:
The model represented an area of approximately 1
Agent size and velocity of movement were adjusted according to the molecular weight of the biological species (Table
The weight, size, and diffusion rate of the molecules represented in the two-step enzymatic system.
Species | Molecular weight (g/mol) | Particle radius ( | Diffusion rate ( |
---|---|---|---|
Benzyl alcohol | 108.14 | 0.323 × 10^−3 | 4.018 × 10^−14 |
NAD+ | 661.41 | 0.657 × 10^−3 | 1.975 × 10^−14 |
NADH | 663.43 | 0.658 × 10^−3 | 1.973 × 10^−14 |
Benzaldehyde | 106.121 | 0.321 × 10^−3 | 4.047 × 10^−14 |
Benzoate | 121.12 | 0.338 × 10^−3 | 3.843 × 10^−14 |
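As a quick consistency check on the tabulated values, the product of particle radius and diffusion rate is nearly the same for every species, which is what an inverse relation between diffusion and radius (as in the Stokes-Einstein equation) would give. The sketch below only verifies this property of the numbers in the table; it is not the formula used to derive them, and the names are hypothetical.

```python
# (particle radius, diffusion rate) copied from the table, units as tabulated
SPECIES = {
    "Benzyl alcohol": (0.323e-3, 4.018e-14),
    "NAD+": (0.657e-3, 1.975e-14),
    "NADH": (0.658e-3, 1.973e-14),
    "Benzaldehyde": (0.321e-3, 4.047e-14),
    "Benzoate": (0.338e-3, 3.843e-14),
}

def radius_diffusion_products(species):
    """Return radius * diffusion for each species."""
    return {name: r * d for name, (r, d) in species.items()}

def relative_spread(values):
    """(max - min) / mean of a collection of positive numbers."""
    vals = list(values)
    return (max(vals) - min(vals)) / (sum(vals) / len(vals))

def scale_diffusion(d_ref, r_ref, r_new):
    """Diffusion of a particle of radius r_new, assuming D is proportional to 1/r."""
    return d_ref * r_ref / r_new

products = radius_diffusion_products(SPECIES)
spread = relative_spread(products.values())
# Predict the NAD+ diffusion rate from the benzyl alcohol row.
nad_pred = scale_diffusion(4.018e-14, 0.323e-3, 0.657e-3)
```

Under this assumed proportionality, the diffusion rate of a new species could be estimated from its radius and any one row of the table.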
To facilitate visual inspection, distinct agent types are associated with different colours and proportional sizes. As such, it is possible to observe visually how the simulation evolves and, to some extent, how the agents move and interact with each other. Specifically, it is possible to see how different agents traverse the environment and how behavioural rules are triggered or take precedence over each other.
As illustrated in Figure
Visual illustration of the evolution of the agent population over the simulation steps.
Numeric outputs detail these visual insights and present data about the movement of the different species, the velocity at which the reactions are taking place and the behavioural system as a whole (Figure
The number of agents of different species interacting in the environment during 15000 simulation steps. From top to bottom, and from left to right, the plots represent the number of agents of benzyl alcohol, NAD+, benzaldehyde, NADH, aryl-alcohol dehydrogenase holoenzyme, and benzaldehyde dehydrogenase holoenzyme. The bottom plots show the number of occurring reactions and the number of benzoate molecules excreted to the extracellular medium.
This work describes the first phase of development of the biomolecular simulator. That is to say that focus was set on identifying and implementing the main biochemical and biophysical laws that would govern the model rather than implementing a realistic picture of the molecular landscape.
As far as we know, there has not been a previous attempt to simulate the effects of spatial localisation and temporal scales of individuals in the modelling of biomolecular systems. So, before engaging with more complex scenarios, it was pivotal to take advantage of available experimental data and ensure that the tool was able to account adequately for basic cellular dynamics, such as those governing enzymes. Now that we have obtained a successful proof of concept, we will work on system scalability in order to address more complex problems.
For this purpose, we ran preliminary performance tests to find out the current scalability of our system. Figure
Performance of the tool in scenarios of increasing computational complexity.
So, in the near future, we will investigate the use of distributed and high-performance computing in the simulation of more complex biomolecular systems. Namely, we are investigating the potential of the new distributed environment of MASON, the DMASON (
ABM is increasingly popular in biology due to its natural ability to represent multiple scales of system decomposition, intertwine complicated behaviours, and deal with spatial-temporal constraints.
The agent-based tool developed in this work aims to support biomolecular simulations and, most notably, provide insights into catalytic efficiency in scenarios of industrial interest. Hence, the proof of concept focused on the approximation of kinetic parameters. The models correctly simulated known enzymatic characteristics and yielded useful predictions that may guide future experimental design. The tool also provides simulation variability that may reproduce the variation observed in laboratory experiments.
Future development of the models presented here will include three-dimensional representation, metabolic pathway simulation, and accounting of extracellular substances. Moreover, we plan to take advantage of the new DMASON platform to engage in distributed, affordable simulation and study more complex scenarios. Other recent high-performance computing frameworks, such as Biocellion, will also be evaluated.
After consolidation, our tool will provide several resources and services for the investigation of bacterial cells, for the benefit of the research and industry communities.
The authors do not have any competing interests.
The authors thank the Agrupamento INBIOMED from DXPCTSUG-FEDER unha maneira de facer Europa (2012/273). The research leading to these results has received funding from the European Union’s Seventh Framework Programme FP7/REGPOT-2012-2013.1 under Grant Agreement no. 316265 (BIOCAPS) and the [14VI05] Contract-Programme from the University of Vigo. This document reflects only the authors’ views and the European Union is not liable for any use that may be made of the information contained herein.