A circuit that evaluates and selects among alternatives is considered a reliable model in neurobiology. The prominent contributions of the literature to this topic are reported. In this study, the valuation and choice stages of a decisional process during a Two-Alternative Forced-Choice (TAFC) task are represented as a two-layered network of computational cells, where information accrual and processing proceed through nonlinear diffusion dynamics. The evolution of the response-to-stimulus map is thus modeled by two linked diffusive modules (2LDM) representing the neuronal populations involved in the valuation-and-decision circuit of decision making (DM). Diffusion models are naturally suited to describing the accumulation of evidence over time. This allows the computation of the response times (RTs) in valuation and choice, under the hypothesis of an ex-Wald distribution. A nonlinear transfer function integrates the activities of the two layers. The input-output map based on the infomax principle makes the 2LDM consistent with the reinforcement learning approach. Results from simulated likelihood time series indicate that 2LDM may account for the activity-dependent modulatory component of effective connectivity between the neuronal populations. Rhythmic fluctuations of the estimated gain functions in the delta-beta bands also support the compatibility of 2LDM with the neurobiology of DM.

Even simple decisions involve higher cognitive functions that integrate noisy sensory stimuli, prior knowledge, and the costs and benefits of possible actions as a function of their time of occurrence. Accumulation of noisy information is a reliable pattern performed by neural pools in cortical circuitry during the decision making (DM) process. This process is time consuming, especially when the quality of information is poor and there are many possible alternatives to be evaluated and compared. There is broad consensus in studies of DM that a phase of evidence accumulation precedes the decision [

In natural environments several sensory stimuli produce different alternatives and hence demand the evaluation of different possible responses, that is, a variety of behaviors. In other words, a selection problem also arises [

The main purpose of this work was to lay theoretical, neurobiologically plausible foundations for representing the two stages of valuation and choice of DM during a Two-Alternative Forced-Choice (TAFC) task in terms of two distinct layers of neuronal populations performing diffusive dynamics (2LDM). The underlying assumption is that, in DM among alternative options, the cortical areas (lateral prefrontal and parietal cortex) integrate the weighted evidence for each alternative, whilst the ventromedial prefrontal cortex and the striatum encode the value of the different options [

Drift diffusion model. Diffusion models are characterized by the randomness of the path taken under the influence of noisy stimuli. A stimulus is represented in a diffusion equation by its influence on the drift rate of a random variable. This random variable, say the difference between the evidence for the two alternatives, accumulates the effects of the inputs over time until one of the boundaries is reached. The decision process ends when the evidence reaches the threshold, and the time at which this occurs is called the response time (RT). RT depends on (a) the distance between the boundaries and the starting point, (b) the drift, that is, the rate at which the average (trend) of the random variable changes, and (c) the diffusion, that is, the variability of the path around the trend. These elements characterize the so-called drift diffusion model (DDM). The accumulation of evidence is thus driven both by a deterministic component (drift) that is proportional to the stimulus intensity and by a stochastic noise component (diffusion) that makes the evidence deviate from its own trend. The rationale of DDM is that, since the transmission and codification of the stimuli are inherently noisy, extracting features from such inputs may require accumulating a sufficiently large sequence of stimuli to obtain information [
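
The three ingredients listed above (boundary distance, drift, and diffusion) can be illustrated with a minimal simulation sketch. It is not the 2LDM itself: it assumes a single accumulator with constant drift, symmetric boundaries at plus and minus `bound`, and an Euler discretization with step `dt`. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_ddm(drift, noise, bound, start=0.0, dt=0.001, max_time=10.0, rng=None):
    """Simulate one trial of a two-boundary drift diffusion process.

    Assumed (illustrative) dynamics: x += drift*dt + noise*sqrt(dt)*N(0, 1).
    Returns (choice, rt) where choice is +1 (upper boundary), -1 (lower
    boundary), or 0 if no boundary is reached before max_time.
    """
    rng = rng or random.Random()
    x, t = start, 0.0
    step_sd = noise * math.sqrt(dt)  # standard deviation of one noise increment
    while t < max_time:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= bound:
            return 1, t
        if x <= -bound:
            return -1, t
    return 0, t


def summarize(trials):
    """Return (fraction of upper-boundary choices, mean RT) over decided trials."""
    decided = [(c, rt) for c, rt in trials if c != 0]
    frac_upper = sum(1 for c, _ in decided if c == 1) / len(decided)
    mean_rt = sum(rt for _, rt in decided) / len(decided)
    return frac_upper, mean_rt


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [simulate_ddm(drift=1.0, noise=1.0, bound=1.0, rng=rng) for _ in range(200)]
    print(summarize(trials))
```

Raising `drift` in this sketch shortens the mean RT and increases the share of upper-boundary choices, whereas widening `bound` lengthens the RT, which mirrors dependencies (a) to (c).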

It has been shown [

However, the canonical diffusion models assume that momentary evidence is accumulated continuously and at

Although evidence accumulation and choice formation are usually described as a one-stage process, in which a decision is made as soon as the decision variable reaches a threshold, it is not yet known empirically whether decision making is performed in a single neuronal circuit [

The most intriguing two-stage models have been proposed in terms of integrate-and-fire attractor networks [

The paper is organized as follows.

Section

Section

Section

The appendix deals with statistical theory on distances between features.

As long as the cells in a neural population have similar response properties, that is, they act in a statistically similar way [

After the signal

Example of binary encoding of information. The threshold value

The importance of IPI arises from the hypothesis that the information transferred within the nervous system is usually encoded also by the timing of spikes [

Let us consider the input-output map between input

The two-layered diffusion model (2LDM) for decision making. Both stages (valuation and choice) are affected by noise. In the valuation stage the critical threshold indicates the firing rate of the neuronal populations involved, which corresponds to the expected reward. The outputs of this stage are the differences between the target and the observed neuronal responses to the stimuli provided by the alternatives. These measurements enter the next stage, where the decision is taken so as to optimize some utility criterion (reward). Hence, the attainment of the threshold in the decision stage indicates the preferred alternative. Feedback information flows from the decision stage to elicit the adaptation of the boundary in the valuation layer. In this way, a reinforcement mechanism determines the competition between the alternatives, and the valuation is biased toward the alternative most likely to be rewarded.

To test the ability of the model to detect effective interactions between the neuronal populations, the 2LDM was simulated by resampling time series of conditional probabilities from a previous eye-tracking experiment. Nine subjects were asked to look at two abstract images displayed on a screen for 5 seconds (s) at randomly assigned locations (left or right side). Each subject performed ten trials. The two images were balanced in extension and in photometric characteristics (color, luminance, and contrast). Eye movements were recorded throughout the 5 s period (one sample every 50 ms), and at the end of that time subjects declared which of the images they preferred. The likelihood, that is, the probability of visual targeting towards one of the two images conditional on the finally chosen stimulus, was then calculated over the total of 90 choices. One hundred surrogates of this likelihood time series were obtained by using the iAAFT technique (iterated amplitude adjusted Fourier transform) [
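
For orientation, the general iAAFT scheme can be sketched as follows. This is the commonly described form of the algorithm, not necessarily the exact implementation used in this study: starting from a random shuffle, it alternates between imposing the amplitude spectrum of the original series and restoring the original value distribution by rank ordering. The function name `iaaft_surrogate`, the iteration cap `n_iter`, and the synthetic input series are assumptions made for illustration.

```python
import numpy as np


def iaaft_surrogate(x, n_iter=100, rng=None):
    """Generate one iAAFT surrogate of the 1-D series x (illustrative sketch).

    The surrogate keeps exactly the same values as x (in a different order)
    and approximately the same Fourier amplitude spectrum.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    sorted_x = np.sort(x)                   # target amplitude distribution
    target_amp = np.abs(np.fft.rfft(x))     # target Fourier amplitudes
    s = rng.permutation(x)                  # random shuffle as starting point
    for _ in range(n_iter):
        # Step 1: impose the target amplitude spectrum, keep current phases.
        phases = np.angle(np.fft.rfft(s))
        s_spec = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # Step 2: restore the original values by rank ordering.
        ranks = np.argsort(np.argsort(s_spec))
        s_new = sorted_x[ranks]
        if np.array_equal(s_new, s):        # fixed point reached
            break
        s = s_new
    return s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-in for a likelihood time series (100 samples).
    series = np.cumsum(rng.normal(size=100))
    surrogates = [iaaft_surrogate(series, rng=rng) for _ in range(100)]
    print(len(surrogates), surrogates[0].shape)
```

Because the last step of each iteration is the rank-ordering step, every surrogate produced by this sketch is a permutation of the original values, while its spectrum matches the original only approximately.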

To use the phase locking indices in a meaningful way, we need to know their distribution under the null hypothesis of independent pairs of oscillatory activity. Only values that depart significantly from what would be expected for independent oscillators can be considered as revealing the presence of synchronization. The distribution of the index, computed for pairs drawn randomly from the surrogate ensembles, can be considered as an approximation of the distribution under the null hypothesis [
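
The surrogate-based null distribution described above can be sketched as follows. The index used here is the mean resultant length of the phase difference (a common phase locking value), chosen only as a simple stand-in for whichever index is actually computed; the function names, the number of random pairs `n_pairs`, and the significance level `alpha` are likewise assumptions.

```python
import numpy as np


def phase_locking_value(phi_a, phi_b):
    """Mean resultant length of the phase difference, between 0 and 1."""
    diff = np.asarray(phi_a, dtype=float) - np.asarray(phi_b, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * diff))))


def null_critical_value(ens_a, ens_b, index_fn, n_pairs=500, alpha=0.05, rng=None):
    """Approximate the null distribution of index_fn from surrogate pairs.

    ens_a and ens_b are sequences of surrogate phase series. Pairs are drawn
    at random (with replacement), the index is computed for each pair, and the
    (1 - alpha) quantile is returned together with the sampled null values.
    """
    rng = np.random.default_rng() if rng is None else rng
    ia = rng.integers(0, len(ens_a), size=n_pairs)
    ib = rng.integers(0, len(ens_b), size=n_pairs)
    null = np.array([index_fn(ens_a[i], ens_b[j]) for i, j in zip(ia, ib)])
    return float(np.quantile(null, 1.0 - alpha)), null


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical surrogate ensembles: independent uniform random phases.
    ens_a = [rng.uniform(-np.pi, np.pi, size=200) for _ in range(100)]
    ens_b = [rng.uniform(-np.pi, np.pi, size=200) for _ in range(100)]
    crit, null = null_critical_value(ens_a, ens_b, phase_locking_value, rng=rng)
    print(crit)
```

Under this sketch, an observed index would be regarded as evidence of synchronization only if it exceeds the returned critical value.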

To test the null hypothesis that the mean of the distances between features (

The resampled time series of the likelihood of visual targeting at the final selected image ranged over

Gain functions. The plot displays the gain functions of the neuronal populations of the two layers. Both showed prominent rhythmic activity in the delta band. Increased oscillations up to the beta band characterized population

Time course of the correntropy coefficient between the phase signals in

Cumulative distribution function of distance between

Distance of the correntropy coefficients measured for the phase signals and their surrogates. Synchronized interaction between the two neuronal populations was identified where the values of the correntropy distance vector exceeded the critical value (0.756), which was calculated from the distribution of a Weibull random variable with parameters

The model presented in this study assumes that the trajectories of an observable variable

There is a theoretical linkage between 2LDM and the well-recognized integrate-and-fire attractor network model [

Simulation was used to test the ability of 2LDM to represent interactions between the neuronal populations on reliable time series and did not aim at investigating the underlying cognitive process. Synchronous interaction was present within a restricted median time interval, where, supposedly, the dynamics of the two neuronal populations were mutually reinforcing [

Optimization of the 2LDM parameters is expected to improve if error functions other than RMSE are considered when the distribution of the residuals is not Gaussian but heavy-tailed, with large skewness and kurtosis. A challenging task would be the implementation of further layers for studying the subcircuits possibly involved in the valuation or choice stage of DM (e.g., the direct and indirect pathways in the basal ganglia, BG). Finally, applying 2LDM to a specific cognitive experimental task would yield information about how speed-and-accuracy performance varies on the basis of some psychometric or behavioral smoothing parameter.
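
To illustrate why the choice of error function matters for heavy-tailed residuals, the following sketch compares RMSE with two alternatives that are less sensitive to outliers. The mean absolute error and the Huber loss are assumptions introduced here as examples; the text does not specify which alternative error functions would be used, and the residual values are invented for illustration.

```python
import math


def rmse(residuals):
    """Root mean squared error: large residuals dominate quadratically."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def mae(residuals):
    """Mean absolute error: every residual contributes linearly."""
    return sum(abs(r) for r in residuals) / len(residuals)


def huber(residuals, delta=1.0):
    """Mean Huber loss: quadratic for |r| <= delta, linear beyond it."""
    total = 0.0
    for r in residuals:
        a = abs(r)
        total += 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)
    return total / len(residuals)


if __name__ == "__main__":
    # Hypothetical residuals: three small errors and one outlier.
    residuals = [0.1, -0.2, 0.1, 5.0]
    print(rmse(residuals), mae(residuals), huber(residuals))
```

In this toy example the single outlier inflates RMSE far more than the other two measures, so a parameter search driven by RMSE would be pulled toward fitting that one point.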

To measure the similarity between two feature vectors, many distance measures have been proposed [

For nonidentical and correlated random variables

If in Lemma

For nonidentical, correlated, and upper bounded variables

For finite length feature vectors with nonidentical, correlated, and upper bounded values, the

The authors declare that there is no conflict of interests regarding the publication of this paper.