Reward Processing in the Brain: A Prerequisite for Movement Preparation?

In the last decade, expanding animal studies on the cerebral organization of reward processing toward human in vivo situations has become possible. In this review, we define some of the concepts associated with reward, summarize the crucial importance of the dopaminergic system, and discuss the currently available neuroimaging studies in man. We will show that abstract concepts of human behavior like emotions, drive, arousal, and reinforcement are now open for further exploration in man at the level of neuronal circuit organization. The cerebral dopaminergic neurotransmitter circuitry does play an important role in the organization of both the motor and motivational system.


Reprint requests to: M. Keitz, Department of Neurology, University Hospital Groningen, The Netherlands

INTRODUCTION
From an evolutionary perspective, a rewarding stimulus can be considered a directional force toward a higher survival value for the species. Supposedly a complex species could not have survived if it could not learn from experience. A condition for the survival of vertebrate species is the ability to learn from past experience, or in other words to distinguish between rewarding and non-rewarding stimuli to know which stimuli or situations should be approached and which should be avoided. Currently, it is widely accepted that the dopaminergic system in the brain plays an important role in the processing of rewarding stimuli. Since the discovery of a "reward pathway" by Olds and Milner (1954), a large body of empirical research has been performed to gain more insight into the neurobiology of reward processing.
Here we provide a short review concerning the processing of reward in the brain with particular emphasis on neuroimaging studies in man. An impaired motivational background might be the basis for the existence of slowness of movement in dopamine deficiency conditions. This effect should be distinguished from the consequences of other neuronal impairments like those seen in central motor neuron or cerebellar diseases. Clumsiness is usually associated with the latter conditions.

REWARD
In everyday life, reward is a common word. In scientific or neurobiological language, however, reward refers to the cause of learning and adaptive behavior (Martin-Soelch, 2002). This concept is used differently in various fields of research. Rolls (1999) defines a reward as anything for which an animal will work. According to Schultz (1997), reward stimuli have three main properties. First, they elicit consummatory and preparatory or goal-directed behavior. Second, they increase the probability of the recurrence of a goal-directed behavior; in other words, they work as a positive reinforcer, initiating the phenomenon of operant conditioning (learning). Third, they elicit pleasurable subjective feelings in man (like hedonia).
In the next section we will briefly mention four concepts that are related to reward, namely pleasure, drive, arousal, and reinforcement, which are necessary for the comprehension of reward in a behavioral and neurobiological framework.

PLEASURE, DRIVE, AROUSAL, AND REINFORCEMENT
In a behavioral context, the word pleasure is defined as a desire or as a state of gratification. In an emotional context, pleasure is seen as a source of joy. Pleasure is difficult to define because it is associated with a high degree of subjectivity. Pleasure can be conceptualized in terms of perception of stimuli as an initial source of pleasure, in terms of behavioral activity in relation to the stimulus-response theory, or in terms of meaningful relationships (Martin-Soelch, 2002). The pleasurable effect associated with a stimulus is determined by the environmental context, as well as by the internal motivational state of the organism at the moment of stimulation. As postulated by Thorndike (1913), reward is associated with pleasure because it can be considered a pleasant consequence of behavior.
The theories concerning the drive concept try to explain motivated, consummatory, and learning behavior. An imbalance of homeostasis creates a need in the individual and produces a drive to satisfy that need. For example, Hull (1943) in his "drive reduction theory" proposed a link between motivation and reinforcement. The organism or individual tries to restore homeostasis, which in turn will reduce drive.
Arousal, a general state of activation in any organism, also plays a role in reward mechanisms. Berlyne (Berlyne, 1971; Martin-Soelch, 2002) developed a theory about the relation between the aroused state and pleasure, in which he postulated that every individual has their own "arousal potential" where the pleasure experienced is maximal.
Skinner (1938) considered reward a positive reinforcer. Reinforcement is the strengthening of the relation between a response and an event (namely, a stimulus). In Skinner's theory, a behavioral response to a stimulus can be reinforced, leading to conditioning or to a learning process.

Rat studies
In 1954, Olds and Milner were the first to discover a direct method for the study of neuronal mechanisms underlying reinforcement and learning using intracranial self-stimulation (ICSS). After having implanted electrodes in certain regions of the brain, the investigators found that the animal would stimulate itself by pressing a lever. In some cases, the animals still stimulated themselves by pressing a lever when they were in a deprived state (Aou et al., 1983). One of the primary findings was that stimulation of many brain regions elicited pedal-pressing behavior, whereas in other regions the effect was the opposite: the animals tried to avoid stimulation. The most salient ICSS regions were the lateral hypothalamus and parts of the brain stem in the vicinity of the medial forebrain bundle.
In later experiments, more brain sites supporting ICSS were discovered, like parts of the frontal cortex, basal ganglia, septal area, the hippocampus, and amygdala. This phenomenon was generally considered a brain stimulation reward, and the ICSS regions were considered brain sites involved in reward processing (Rolls, 1999). Many ICSS sites seemed to follow the course of the dorsal noradrenergic bundle, starting from the locus coeruleus, through the hypothalamus and toward the end-point in the neocortex. This concept formed the basis of the "noradrenalin hypothesis", which postulated that noradrenalin plays an important role in the mediation of reward processing during ICSS. Evidence against this hypothesis emerged, however. For example, Rolls (Rolls, 1999; Rolls et al., 1974) found that rats treated with disulfiram, a substance depleting noradrenalin in the brain, could self-stimulate in an aroused state, but the rats were usually too drowsy to do so. In addition, Clavier and Routtenberg (1976) found that lesions of the locus coeruleus did not attenuate self-stimulation along the course of the dorsal noradrenergic bundle.
The "dopaminergic hypothesis" has been proposed as an alternative to the noradrenalin hypothesis, postulating that dopaminergic pathways are involved in ICSS. This hypothesis was supported by findings from at least five areas of research. The first arguments are from mapping studies (Redgrave & Dean, 1981). Many brain sites eliciting ICSS were located within the dopaminergic system. Thus, sites with high response rates and low thresholds during ICSS were those with the highest density of dopaminergic neurons. Second, microdialysis studies showed increases of dopamine release in regions that are projection areas of the ventral tegmental area (VTA), for example, the nucleus accumbens (Nakahara et al., 1989). Third, more evidence emerged from dopaminergic drug studies. Dopamine agonists increase ICSS response rates, whereas antagonists decrease them. For example, dopamine-receptor blockade with spiroperidol administered into the nucleus accumbens or into the hypothalamus attenuates amygdala self-stimulation. Spiroperidol also attenuates hypothalamic self-stimulation without producing an arousal deficit (Rolls, 1999; Mora et al., 1976b). Infusion of d-amphetamine into the rat caudal nucleus accumbens significantly decreases ICSS thresholds in the VTA (Ranaldi & Beninger, 1994). Additional evidence was obtained from experiments carried out by Mora et al. (1976a) with apomorphine, a dopamine receptor agonist, which attenuated self-stimulation in the prefrontal cortex in rats and in the orbitofrontal cortex in monkeys. The fourth research area comprises the lesion studies, in which lesions of brain sites with high dopaminergic neuron density abolish the effect of ICSS. For example, 6-hydroxydopamine lesions of ascending fibers of the mesotelencephalic dopamine projections result in a decreased ICSS effect of the ipsilateral part of the VTA (Fibiger et al., 1987). Finally, self-administration studies also helped to show the involvement of dopamine in reward processing. Yokel and Wise (1975), for example, showed that amphetamine self-administration was affected by dopamine receptor blockade by pimozide. Pettit and Justice (1989) measured increased extracellular dopamine levels in the nucleus accumbens during cocaine self-administration, and Di Ciano et al. (1995) did so during cocaine as well as d-amphetamine self-administration. The results of these studies demonstrated dopamine's reinforcing role.
Natural rewards like food and sex are processed in similar brain regions, and dopamine has again been found to be involved in these regions. Salamone et al. (1989), for example, found increased dopamine metabolism in response to a food reward in an in vivo microdialysis study in rats. Furthermore, Becker et al. (2001) found elevated dopamine concentrations in female rat dorsal and ventral striatum during sexual behavior.
Currently it is accepted that noradrenergic pathways are involved in a general increase in reactivity to all sorts of stimuli, whereas dopaminergic pathways are particularly involved in an increase in reactivity to specifically rewarding stimuli.

Primate studies
In the search for the neuronal organization of reward processing, Schultz and coworkers (Schultz & Romo, 1990) used single-cell recordings in macaque monkeys. Using this method, the group correlated the amplitude and the phase of the separate action potentials with different stimulus conditions. They found that the dopamine neurons in the monkey brain were activated in response to the delivery of a food reward and not to visual stimuli. Preparatory instruction signals also elicited such a response. In a further study, Apicella et al. (1991) investigated the activity of single striatal neurons of macaque monkeys in response to a primary reward during a go-nogo task. The principal regions showing an increase of activity were located in the ventral striatum. A later study by Schultz et al. (1992) showed that the expectation of rewarding events activated the dopaminergic neurons of the ventral striatum, indicating that this region is involved in the control of goal-directed behavior. The findings in another study by Schultz et al. (1997) suggested that the dopamine neurons in the midbrain and striatum code for reward prediction. Thus, when an unpredicted reward occurred, dopamine neurons started firing; they were also activated when a conditioned stimulus (CS) predicted a reward; if the reward did not occur, then the activity of the dopamine neurons was depressed, exactly at the time the reward would have taken place. On the other hand, Schultz et al. (2000) found that the ventromedial frontal cortex and the orbitofrontal cortex were active only if the reward was delivered. This result suggests that the anticipation of reward recruits a distinct neuroanatomical and neurochemical mechanism from the consumption or the delivery of reward. These reports and others indicate that dopamine is involved in reward processing but that the appetitive and consummatory stages of reward differ in their neuronal trajectory.
Other questions about the selectivity of the ventral striatum in the anticipation of reward rather than punishment are still unanswered. Salamone et al. (1994), for example, reported that nucleus accumbens lesions made with the neurotoxin 6-hydroxydopamine impaired not only approach behavior but also active avoidance.
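The three-part firing pattern attributed above to Schultz et al. (1997) is often summarized as a reward prediction error: the difference between the reward received and the reward expected. The following minimal sketch illustrates that idea with a simple error-driven learning rule. It is a toy model and not the analysis used in the cited studies; the function names, the learning rate, the number of trials, and the reward magnitude of 1.0 are all assumptions made for illustration.

```python
def train_cs_value(n_trials=200, learning_rate=0.1, reward=1.0):
    """Learn how much reward a conditioned stimulus (CS) predicts.

    Hypothetical error-driven update: on every trial the CS is followed
    by the reward, and the learned value moves a fraction of the way
    toward the reward actually received.
    """
    value = 0.0
    for _ in range(n_trials):
        value += learning_rate * (reward - value)
    return value


def prediction_errors(cs_value, cs_shown, reward):
    """Return the prediction error at CS onset and at the reward time.

    Assumption: the CS itself arrives unpredictably, so the error at CS
    onset equals the reward the CS predicts. At the time of reward the
    error is the received reward minus the expected reward.
    """
    expected = cs_value if cs_shown else 0.0
    error_at_cs = expected
    error_at_reward = reward - expected
    return error_at_cs, error_at_reward


if __name__ == "__main__":
    v = train_cs_value()
    # Unpredicted reward: positive error when the reward arrives.
    print("unpredicted reward:", prediction_errors(v, cs_shown=False, reward=1.0))
    # Predicted reward: positive error at the CS, almost none at the reward.
    print("predicted reward:  ", prediction_errors(v, cs_shown=True, reward=1.0))
    # Omitted reward: negative error at the time the reward was expected.
    print("omitted reward:    ", prediction_errors(v, cs_shown=True, reward=0.0))
```

In this sketch a positive error stands in for increased firing and a negative error for depressed firing. The error moves from the time of reward to the time of the CS once the association has been learned, and it becomes negative when a predicted reward is omitted.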

Human studies
The phenomenon of brain stimulation reward was demonstrated not only in animals but also in humans, as reported by Sem-Jacobsen (1976). Patients with electrodes implanted in different brain regions for therapeutic reasons reported experiencing pleasant or unpleasant smells. Even sexual responses were elicited by some sites.
The development of neuroimaging methods opened a new way for the investigation of reward in the brain. Positron Emission Tomography (PET) and functional Magnetic Resonance Imaging (fMRI) have made it possible to track the location of cognitive processes in the brain. In the study by Thut et al. (1997), healthy subjects performed a go-nogo task with two different forms of reinforcement: a non-monetary reinforcement and a monetary reward. The trials were identical except for these two conditions. The authors found activation in cortico-subcortical networks, including regions in the dorsolateral prefrontal cortex, the orbitofrontal cortex, the thalamus and the midbrain in response to monetary reward (Thut et al., 1997).
More insight into reward circuitry was gained using a paradigm similar to that of Thut et al., including three different forms of reinforcement (no feedback, a non-monetary reinforcement signal and a monetary reward), to measure regional cerebral blood flow (rCBF) in drug addicts and smokers (Martin-Soelch et al., 2001a, 2001b) and in parkinsonian patients (Kuenig et al., 2000) who were performing a pattern recognition task with delayed response. The results of these studies demonstrated that different groups showed different rCBF patterns in response to reward, suggesting that different types of subjects use different brain circuits. In opiate addicts, for example, the mesolimbic and mesocorticolimbic regions, which were associated with monetary and non-monetary reinforcement in healthy subjects, responded to monetary reward but not to non-monetary reinforcement (Martin-Soelch et al., 2001a). The comparison between smokers and non-smokers also showed different activation patterns in response to reinforcement. This difference concerned principally the striatum, which was not activated at all in smokers (Martin-Soelch et al., 2001b). In addition, an fMRI study by Stein et al. (1998) showed that direct intravenous administration of nicotine to smokers induces a dose-dependent increase in neuronal activity in the nucleus accumbens, the amygdala, the cingulate, and the frontal lobes. Kuenig et al. (2000) compared rCBF changes associated with monetary reward in parkinsonian patients and in healthy subjects, using the same experimental task as in the studies on smokers and opiate addicts. The authors found that Parkinson patients show less or no activation in the mesolimbic regions and instead seem to use more cortical regions for processing reward information, thus showing an activation pattern similar to that of smokers and opiate addicts (Fig. 1).

Fig. 1: Reward-related brain activation in parkinsonian patients and in age-matched controls. Coronal glass-brain projections of the brain areas significantly activated when subjects are monetarily rewarded (panels: older controls, PD patients; labeled regions: midbrain, striatum, cingulate gyrus, frontal cortex). Regions activated in older controls: bilaterally in the striatum, caudate nucleus and anterior cingulate gyrus, and unilaterally in the left cerebellum, midbrain and medial frontal gyrus. Regions activated in parkinsonian patients: right medial frontal cortex, left superior parietal lobule, medial temporal gyrus, thalamus and right and left cerebellum (adapted from Kuenig et al., 2000).

Knutson et al. (2001) performed an fMRI study on reward processing in healthy subjects, in which they investigated the activation related to the anticipatory component of monetary reward and punishment. That study replicated in humans the work done by Schultz et al. (1997) in primates. The results of the Knutson et al. (2001) study showed that the ventral striatum was more involved in the expectation of positive outcomes of reward, whereas the medial caudate was activated in reaction to reward feedback and to punishment. The authors then suggested that the caudate nucleus coded for expected incentive value in general. A further fMRI study (O'Doherty et al., 2001) investigating reward processing recorded brain activity during a gambling task. The results implicated brain regions that are directly connected with parts of the basal ganglia, like the orbitofrontal cortex. The authors could also differentiate between response patterns within the orbitofrontal cortex. Thus, the medial orbitofrontal cortex seemed to be activated when a correct choice was followed by a reward feedback, whereas the lateral orbitofrontal cortex was activated when an incorrect choice was followed by a punishment feedback.
In addition, another gambling task experiment (Elliott et al., 2000) showed a positive correlation between a positive outcome and activation in the midbrain and in the ventral striatum. Furthermore, Delgado et al. (2000) showed that monetary gains, a form of reward, and monetary losses, a form of punishment, were associated with different neuronal responses. The dorsal and ventral striatum showed increased activation in response to positive feedback and decreased activation after negative feedback, suggesting that the striatum can differentiate between gains and losses.

CONCLUDING REMARKS
Reward is a difficult concept in theory and in practice. A large body of evidence has demonstrated that the dopaminergic system plays a crucial role in the processing of rewarding stimuli in the brain. This role has been shown in many animal studies and lately also in human neuroimaging studies. In man, pathologic behavior or neuropsychiatric diseases like addiction, and neurological diseases like Parkinson's disease, are associated with a change in the cerebral activity related to reward processing, especially in dopaminergic regions. Exploring specific human aspects of reward and its associated concepts, like pleasure, drive, arousal and reinforcement in human behavior, is now possible.
A disturbed cerebral dopamine system in adults may lead, apart from the motivational problems summarized above, to a movement disorder known as parkinsonism, the hallmark of which is slowness of movement. The model disease here is the idiopathic form of Parkinson's disease, but many other brain diseases are accompanied by parkinsonism too. The symptoms akinesia and bradykinesia could be interpreted as a result of faulty central organization of the innate movement patterns and are thus an adult form of "clumsiness", in contradistinction to the accepted corticospinal forms of clumsiness. From the work of Ballermann (2001), we know that rats become more "clumsy" during a reaching task when they are treated with 6-hydroxydopamine, which causes dopamine depletion in the nigrostriatal bundle. Whether the slowness of movement in Parkinson's disease, rather than being a motor disturbance in itself, is actually based on an internal motivational deficit (not consciously perceived) in marshalling motor patterns in real time must be investigated. This outcome should then be distinguished from clumsiness as the consequence of the more commonly present neuronal impairment of the central motor neuron. It can be argued whether the clinical description of "clumsiness" is an adequate term to point automatically to the central motor neuron and related systems, bringing us back to the question: What is clumsiness?