This paper presents the MPI parallelization of a new algorithm, the DPD-B thermostat, for molecular dynamics simulations. The results presented were obtained with the Martini coarse-grained water system. Molecular dynamics simulations are time consuming: in some cases the running time ranges from days to weeks or even months. Parallelization is therefore one way to reduce the execution time. The paper describes the new algorithm, the main characteristics of its MPI parallelization, and the simulation performance.

Molecular dynamics (MD) is a useful tool for studying the physicochemical properties of molecular systems. Nowadays, MD is one of the methods used by the scientific community to analyze the properties of polymers, proteins, lipids, and other cellular systems. It is also used as an alternative to laboratory experiments for the design of new materials. Molecular dynamics is based on the Newtonian equations of motion.

The purpose of this paper is to present an MPI parallelization of a new molecular dynamics algorithm DPD-B that resulted from the combination of a dissipative particle dynamics (DPD) algorithm [

It should be noted that, ideally, the force fields used in molecular dynamics to describe particle interactions should be calibrated such that the desired reference temperature of a simulated system is maintained. In practice, however, the average temperature of a simulated system deviates from the desired reference temperature. Thermostats are therefore used to ensure that a simulated system maintains the reference temperature. Global thermostats apply a collective correction to all particles of a system, while dissipative particle dynamics applies temperature corrections to pairs of particles. In this paper we present the parallelization of a new algorithm that combines a global thermostat with a DPD one. The new thermostat has better properties in eliminating ice-cube effects in multiscaling simulations [

This paper is organized as follows. The next section presents the DPD-B theory, the Gromacs engine of molecular dynamics, design, and implementation of the new algorithm in Gromacs. Section

This section is organized as follows. The first subsection presents the global Berendsen thermostat and the next one the new DPD-B thermostat; the Gromacs engine for molecular dynamics is then described, and design and implementation issues are discussed at the end of the section.

In what follows we present an outline of the relevant theory for the global Berendsen thermostat. For an exhaustive presentation we refer to [

A coupling between a molecular system with temperature

The strength of the coupling to the bath is determined by the

Through the Langevin equation the system is locally subjected to random noise and couples globally to a heat bath. In order to impose global coupling with minimal local disturbance we should modify (

We will analyze how the system temperature,

For an easier analytical computation we chose the friction constant to be the same for all particles:
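As an illustration of global weak coupling, the sketch below uses the commonly quoted Berendsen velocity-scaling factor, lambda = sqrt(1 + (dt/tau)(T0/T - 1)). This textbook form is assumed here and is not taken from the equations of this section; the function names and the numerical values are hypothetical.

```python
import math


def berendsen_lambda(t_current, t_ref, dt, tau):
    """Velocity scaling factor for Berendsen weak coupling (textbook form, assumed)."""
    return math.sqrt(1.0 + (dt / tau) * (t_ref / t_current - 1.0))


def relax_temperature(t_start, t_ref, dt, tau, n_steps):
    """Apply the scaling repeatedly and return the final temperature.

    Scaling all velocities by lambda scales the kinetic temperature by
    lambda**2, so the deviation from t_ref shrinks by (1 - dt/tau) per step.
    """
    t = t_start
    for _ in range(n_steps):
        lam = berendsen_lambda(t, t_ref, dt, tau)
        t *= lam * lam
    return t


if __name__ == "__main__":
    # Hypothetical values: start at 340 K, couple to a 300 K bath.
    print(relax_temperature(340.0, 300.0, dt=0.02, tau=1.0, n_steps=200))
```

With this form, a larger coupling time `tau` gives a weaker coupling and a slower relaxation toward the reference temperature.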

In what follows we present an outline of the relevant theory for the combined DPD-Berendsen thermostat. For an exhaustive presentation of the combined thermostat and critical tests we refer to [

The main idea of the DPD-like friction and noise is that they are applied to relative velocities between pairs of particles. Theoretically, if the pairwise application is well resolved it should ensure the conservation of total linear momentum. The friction and noise can be applied isotropically to the velocity difference vector. This is a simple and easy form for a DPD thermostat that will be used in this work. For DPD there is a velocity reduction factor

In order to compute the forces of the system, a list of particle pairs is obtained. With this list, the interparticle distances are determined. In DPD the damping factor is scaled with a factor that depends on the distance between the two particles of a pair; the original DPD chooses a linear bound between

A first substep is the selection of a pair of particles that will be subject to the impulsive friction and noise. This pair selection can be done in several ways. The selection can be made at random, but it can also be based on a distance-weighted probability, for example, proportional to

For all particles

For

choose the velocity reduction factor

determine the velocity noise factor

where

is the reduced mass of the two particles,

construct the relative velocity vector

for DPD-B,

choose 3 random numbers

construct the vector

Proceed to step (I.5).

Distribute the relative velocity change over the two particles:

In this way the velocities of the particles are updated while the total momentum

The particle velocities are updated after each impulsive event, not at the end of each step. This is necessary, as a single particle may be involved in more than one pair event.
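The impulsive pair step described above can be sketched as follows. Because the expressions for the two factors are not legible in this section, the sketch takes the velocity reduction factor `f` as an input in [0, 1] and assumes the noise factor g = sqrt(f (2 - f) k_B T / mu), which is the value that leaves the Maxwell distribution of the relative velocity unchanged. The isotropic variant is shown, with three random numbers per pair. All names are hypothetical, and this is not the Gromacs implementation.

```python
import math
import random

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K), Gromacs-style units


def pair_impulse(v_i, v_j, m_i, m_j, f, temperature, rng=random):
    """Apply one impulsive friction-and-noise event to a pair of particles.

    v_i, v_j: velocity vectors (sequences of 3 floats); returns new vectors.
    """
    mu = m_i * m_j / (m_i + m_j)                       # reduced mass
    g = math.sqrt(f * (2.0 - f) * K_B * temperature / mu)  # assumed noise factor
    new_i, new_j = [], []
    for a, b in zip(v_i, v_j):
        v_rel = a - b                                  # relative velocity component
        dv = -f * v_rel + g * rng.gauss(0.0, 1.0)      # friction plus noise
        # Distribute the change so that m_i*dv_i + m_j*dv_j = 0.
        new_i.append(a + (mu / m_i) * dv)
        new_j.append(b - (mu / m_j) * dv)
    return new_i, new_j


if __name__ == "__main__":
    random.seed(0)
    print(pair_impulse([1.0, 0.0, 0.0], [0.0, 0.5, 0.0], 72.0, 72.0, 0.3, 300.0))
```

The mass-weighted distribution of `dv` over the two particles is what keeps the total momentum of the pair unchanged, regardless of the random numbers drawn.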

In the next section we briefly outline the Gromacs tool.

Gromacs (GROningen MAchine for Chemical Simulations) is a molecular dynamics package primarily designed for simulations of proteins, lipids, and nucleic acids at the molecular level. It implements the Newtonian equations of motion for different systems [

Gromacs is widely used in the Folding@home project for simulations of protein folding, which underlines the popularity and usage of this software [

Several innovative elements in the implementation of the new algorithm result in reduced complexity and more efficient use of computational resources. As explained earlier, the implementation was done in Gromacs. Gromacs was chosen first because it is open-source software that is widely used and well known, and second because it originated in the Molecular Dynamics Group, which participates in the research described here.

A first innovative element is the use of a random selection of one neighbor per particle for the DPD component. This is made possible by the physical and theoretical design of the algorithm. The selection can be made at random, but it can also be based on a distance-weighted probability, for example, proportional to

The parallelization scheme is based on domain decomposition: the simulation box is divided into small boxes, each of which is assigned to one processor. This model of parallelization has the advantage that the messages exchanged during the simulation travel mostly between neighboring processors (nine in number). As a consequence, the number of messages per processor is usually constant, and scalability is therefore almost linear in the number of processors. Figure

Domain decomposition. The simulation box is decomposed into small boxes assigned to different processors.
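To make the idea of domain decomposition concrete, the sketch below maps a particle position to the small box that owns it and then to a processor rank. It assumes a rectangular periodic box split into a regular grid, with ranks numbered in row-major order; these choices and all names are hypothetical simplifications and do not reproduce the Gromacs decomposition.

```python
def domain_index(position, box, grid):
    """Return the (ix, iy, iz) index of the domain that owns a particle.

    position: particle coordinates, box: box edge lengths,
    grid: number of domains along each axis.
    """
    idx = []
    for x, length, n in zip(position, box, grid):
        x_wrapped = x % length                     # periodic wrap into the box
        # min() guards against round-off placing a particle at index n.
        idx.append(min(int(x_wrapped / length * n), n - 1))
    return tuple(idx)


def domain_rank(index, grid):
    """Row-major rank of a domain index (assumed numbering)."""
    ix, iy, iz = index
    return (ix * grid[1] + iy) * grid[2] + iz


if __name__ == "__main__":
    box, grid = (5.0, 5.0, 5.0), (2, 2, 1)
    idx = domain_index((0.5, 2.5, 4.9), box, grid)
    print(idx, domain_rank(idx, grid))
```

Each processor then only needs to exchange information about particles near the faces of its own box with the processors that own the adjacent boxes.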

Another innovative element introduced for the parallelization of the DPD-B algorithm is that the neighbor-particle list is restricted to the particles handled by one processor. We will show in the results section that this choice preserves system properties. Because particles are selected at random from the neighbor particles assigned to a processor, no extra communication messages are needed, which reduces the overall computation complexity. This fact is underlined in Figure

Pairs of particles are randomly selected only from the neighbor-list particles assigned to a processor.
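The restriction to processor-local pairs can be sketched as follows: for every particle owned by a processor, one partner is drawn uniformly at random from those of its neighbors that are owned by the same processor. The data layout (a dictionary of neighbor lists and a set of local particle ids) and the function name are hypothetical, and uniform selection is only one of the options mentioned in the text.

```python
import random


def pick_local_partners(neighbors, local_ids, rng=random):
    """Pick one random same-processor partner per local particle.

    neighbors: dict mapping particle id -> list of neighbor ids
    local_ids: set of particle ids owned by this processor
    Returns a list of (particle, partner) pairs.
    """
    pairs = []
    for i in sorted(neighbors):
        if i not in local_ids:
            continue                      # particle belongs to another processor
        candidates = [j for j in neighbors[i] if j in local_ids]
        if candidates:                    # skip particles with no local neighbor
            pairs.append((i, rng.choice(candidates)))
    return pairs


if __name__ == "__main__":
    random.seed(0)
    nbrs = {0: [1, 2, 5], 1: [0, 5], 5: [0, 1]}
    print(pick_local_partners(nbrs, local_ids={0, 1, 2}))
```

Since both members of every selected pair live on the same processor, the velocity update of a pair never requires a message to another processor.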

First we will discuss the efficiency gain obtained through the random selection of a pair of particles as compared to the situation of using all-neighbor lists. The simulations are done on a molecular system of Coarse Grained Martini [

For DPD-B, we compared the random selection of one pair of particles with the usage of all particles from the neighbor list. We also reported the times for existing thermostats from Gromacs such as Berendsen, Langevin type, and no thermostat. In Table

The speed is reported as a number of nanoseconds simulated within 24 h.

Method | ns/day |
---|---|
Berendsen | 34.63 |
No thermostat | 34.69 |
DPD-B, all-neighbor list | 6.94 |
DPD-B, random selection of pairs | 29.31 |
Langevin | 30.01 |
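The relative cost of each method follows directly from the reported speeds. The short sketch below recomputes the ratios from the table values; the dictionary keys and function name are chosen here for illustration only.

```python
# Simulation speed in ns/day, copied from the table above.
ns_per_day = {
    "Berendsen": 34.63,
    "No thermostat": 34.69,
    "DPD-B, all-neighbor list": 6.94,
    "DPD-B, random selection of pairs": 29.31,
    "Langevin": 30.01,
}


def relative_speed(method, baseline="No thermostat"):
    """Speed of a method as a fraction of the baseline speed."""
    return ns_per_day[method] / ns_per_day[baseline]


# Gain of random pair selection over the all-neighbor list variant.
gain = (ns_per_day["DPD-B, random selection of pairs"]
        / ns_per_day["DPD-B, all-neighbor list"])

if __name__ == "__main__":
    print(f"gain from random pair selection: {gain:.2f}x")
    for name in ns_per_day:
        print(f"{name}: {relative_speed(name):.1%} of the no-thermostat speed")
```

Random pair selection is thus about 4.2 times faster than using the full neighbor list, and it runs at roughly 84% of the speed of a simulation without a thermostat, close to the Langevin thermostat.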

For comparing molecular properties for the parallelization-optimization of using only particles pairs assigned to a processor (see Section ^{2}/s for serial and of ^{2}/s for parallel. It can be concluded that properties are similar and preserved by the optimization.

Temperatures for the serial and parallel runs of a Martini coarse-grained system.

Below we present the computational speed-up for the Martini CG water system. The simulations were run on a machine with 4 processors of type Intel(R) Core(TM) i7 CPU 920 at 2.67 GHz, with 2 GB of memory per core. As can be seen in Figure

Speed-up as a function of the number of processors.
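For reference, speed-up and parallel efficiency are usually defined as S(p) = T(1)/T(p) and E(p) = S(p)/p, where T(p) is the wall-clock time on p processors. The sketch below uses these standard definitions; the timings in the usage example are hypothetical and are not measurements from this work.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S(p) = T(1) / T(p)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E(p) = S(p) / p; 1.0 means ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds for 1, 2, and 4 processors.
    timings = {1: 100.0, 2: 55.0, 4: 30.0}
    for p, t in timings.items():
        print(p, round(speedup(timings[1], t), 2),
              round(efficiency(timings[1], t, p), 2))
```

When performance is reported as throughput (ns/day) rather than wall-clock time, the same speed-up is obtained as the ratio of the parallel throughput to the serial throughput.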

Finally, we show how the temperature relaxes to 300 K from an initial temperature gradient. The box is split into seven zones; the temperature increases symmetrically from 310 K (the first and the last zone) to a peak of 340 K (the middle zone); see Figure

Gradient of temperature.

We performed the simulations on 1 (serial case), 2, and 4 processors. The measured temperature of each zone is within the normal statistical error in each case (see Figure

Temperature per thermal zone (group) after the temperature gradient is allowed to relax to 300 K, for 1, 2, and 4 processors. All temperatures are within statistical errors.

The purpose of this paper is to present the MPI parallelization of a new molecular dynamics algorithm DPD-B that resulted from the combination of a dissipative particle dynamics (DPD) algorithm and Berendsen (B) thermostat. It should be taken into account that molecular dynamics simulations are time consuming. In some cases the running time varies from days to weeks and even months. Therefore, parallelization is one solution for reducing the execution time.

The paper describes the new algorithm, the main characteristics of its MPI parallelization, and the simulation performance. The work was carried out as a research collaboration between the Molecular Dynamics Group of the University of Groningen, one of the well-known groups in the MD domain, and researchers from Politehnica University of Bucharest. The parallelization was done in Gromacs [

Several innovative elements related to the algorithm implementation are reported in the paper. A first innovative element is the use of a random selection of one neighbor per particle for the DPD component. The selection can be made at random, but it can also be based on a distance-weighted probability, for example, proportional to

Another innovative element introduced for the parallelization of the DPD-B algorithm is that the neighbor-particle list is restricted to the particles handled by one processor. We show in the results section that this choice preserves system properties. Because particles are selected at random from the neighbor particles assigned to a processor, no extra communication messages are needed, which reduces the overall computation complexity.

When the system is started from an initial temperature gradient and the gradient is allowed to relax, the measured temperatures are within statistical errors for 1, 2, and 4 processors.

As future work we intend to implement further DPD algorithms.