Norepinephrine and Dopamine as Learning Signals

The present review focuses on the hypothesis that norepinephrine (NE) and dopamine (DA) act as learning signals. Both NE and DA are broadly distributed in areas concerned with the representation of the world and with the conjunction of sensory inputs and motor outputs. Both are released at times of novelty and uncertainty, providing plausible signal events for updating representations and associations. These catecholamines activate intracellular machinery postulated to serve as a memory-formation cascade. Yet, despite the plausibility of an NE and DA role in vertebrate learning and memory, most evidence that they provide a learning signal is circumstantial. The major weakness of the data available is the lack of a specific description of how the neural circuit modulated by NE or DA participates in the learning being analyzed. Identifying a conditioned stimulus (CS) representation would facilitate the identification of a learning signal role for NE or DA. To identify how NE and DA participate in memory creation, it is necessary to describe how the CS representation comes to relate to learned behavior, either through sensory-sensory associations, in which the CS acquires the motivational significance of reward or punishment and thus drives appropriate behavior, or through direct sensory-motor associations. As described here, evidence consistent with a direct learning signal role for NE and DA is seen in the changing of sensory circuits in odor preference learning (NE), defensive conditioning (NE), and auditory cortex remodeling in adult rats (DA). Evidence that NE and DA contribute to normal learning through unspecified mechanisms is extensive, but the details of that support role are lacking.


INTRODUCTION
The projections of locus coeruleus (LC) norepinephrine (NE) cells and of midbrain dopamine (DA) cells interact with large regions of the vertebrate forebrain. Neuronal activity in these cell groups in behaving animals suggests they are active when new environmental contingencies occur.
Norepinephrine and DA also engage the cyclic adenosine monophosphate (cAMP) cascade and, ultimately, activate the cAMP response element binding protein (CREB), a promoter of new protein transcription that is proposed to be universally involved in long-term memory formation (Silva et al., 1998). The present review examines the hypothesis that NE and DA provide learning signals through the activation of their respective cAMP-coupled receptors, the β-adrenergic receptor for NE and the D1/D5 receptor for DA.
The LC-NE neurons project to all cortical forebrain regions, as well as to the cerebellum, spinal cord, and limbic and hypothalamic nuclei (Moore & Bloom, 1979). Midbrain DA neurons project heavily to the frontal cortex, striatum, and limbic areas; other cortical areas also receive DA innervation (Moore & Bloom, 1978).
The diffuse projection pattern of NE axons was part of Kety's (1970) initial rationale for proposing, more than 30 years ago, that NE would serve as a signal to produce persistent facilitation of synaptic inputs when those inputs occurred in conjunction with significant consequences for the organism. Livingston's (1967) proposal of a widely projecting "Now Print" message was similar to Kety's NE learning signal. To mediate the learning effects of unconditioned stimuli (UCS), the UCS should activate NE and DA neurons such that the release of NE and DA occurs in the appropriate temporal sequence to strengthen associated inputs. Cellular recording studies indicate that both NE and DA neurons show patterns of activation that are consistent with a role as learning signals. The neurons do not invariably respond to unconditioned rewards or punishments, however, but instead are affected by the degree of predictability of such signals.
Midbrain DA neurons fire to unpredicted rewards and are depressed by the absence of predicted rewards (Hollerman & Schultz, 1998). This DA cellular firing pattern has been described as a 'teaching' signal because it occurs before reliable cue and reward associations have been made and disappears as such associations become established (Waelti et al., 2001). Dopamine cell firing then becomes associated with the conditioned stimuli (CS) signaling reward and ultimately becomes associated only with the earliest CS in the temporal chain of events leading to reward (Schultz, 1998). Thus, DA cellular activity signals the UCS when learning is initiated and remains available as a signal to link temporal contingencies leading to reward, but dissociates from the primary reward event. The DA cellular signal can plausibly initiate learning induced by a reward UCS. Dopamine neurons are also activated by novelty (Ljungberg et al., 1992). Aversive events are not potent activators of DA neurons (Mirenowicz & Schultz, 1996).
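The firing pattern described above is often formalized as a reward prediction error. The following Python sketch is an illustration only and is not drawn from the studies cited: it assumes a simple delta rule in which the error is the delivered reward minus the current prediction, and the function name, the learning rate, and the trial counts are all hypothetical choices.

```python
def prediction_errors(rewards, learning_rate=0.3):
    """Trial-by-trial prediction error for a single cue (illustrative delta rule).

    Assumption: the error is reward minus the current prediction, and the
    prediction moves toward the reward by a fixed fraction on each trial.
    """
    value = 0.0  # current reward prediction carried by the cue
    errors = []
    for reward in rewards:
        error = reward - value          # > 0: better than predicted; < 0: worse
        value += learning_rate * error  # update the prediction
        errors.append(error)
    return errors


# Hypothetical session: 20 rewarded trials, then one trial with the reward omitted.
errors = prediction_errors([1.0] * 20 + [0.0])
print(f"first rewarded trial: {errors[0]:+.3f}")
print(f"last rewarded trial:  {errors[19]:+.3f}")
print(f"omitted reward:       {errors[20]:+.3f}")
```

Under these assumptions the error is large on the first, unpredicted reward, shrinks toward zero as the association is established, and becomes negative when a predicted reward is omitted, which parallels the excitation, loss of response, and depression reported for DA neurons. The sketch does not capture the transfer of the response to the earliest CS.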
When rewarding UCS are presented and are unpredictable, LC-NE neurons also fire (Sara et al., 1994; Sara, 1998). When conditioning to such stimuli is established, the neurons no longer fire unless the reward contingency changes. For example, if a reward is omitted during extinction, then the neurons will fire again (Sara & Segal, 1991; Sara et al., 1994), in contrast to DA neurons, which decrease their firing rate when a predicted reward is omitted (Hollerman & Schultz, 1998). Possibly related to the activation of NE neurons by reward omission is the recent report that extinction of taste aversions, an active learning process, depends on β-adrenergic receptor activation (Berman & Dudai, 2001).
The NE neurons are activated by both appetitive and aversive UCS (Sara & Segal, 1991), by novel sensory events (Vankov et al., 1995; Aston-Jones & Bloom, 1981b), or by any change in environmental contingencies that might cause an animal to orient or notice (Aston-Jones & Bloom, 1981b; Vankov et al., 1995). Such neurons are tonically active as a function of arousal (Aston-Jones & Bloom, 1981a) but produce burst responses (Grant et al., 1988; Aston-Jones et al., 1994) to significant events, as do DA neurons. Thus, the NE cellular signal is well timed to mediate the updating of representations or the acquisition of adaptive responses to important environmental events, as first discussed by Kety (1970). Both anatomical and signaling characteristics of NE and DA neurons in the vertebrate brain are consistent with a role for these neurotransmitters as learning signals.
Heterosynaptic facilitation by NE or DA of informational (usually glutamate) connections could promote a change in the response to that information. The change might represent sensory-sensory or sensory-motivational associations, such that a previously neutral input calls up a second, behaviorally potent, representation, or a sensory-motor change such that a sensory input directly elicits a motor response. Connection change, functional and structural, is the current vision of the underpinning of memory in all nervous systems.
Burst activation of the LC has recently been shown to produce long-term heterosynaptic facilitation at the perforant-path synapse, which depends on β-adrenergic receptor activation and on protein synthesis (Walling & Harley, 2004).
Unexpectedly, long-term synaptic facilitation occurs independently of short-term synaptic facilitation, but an early increase in cell excitability is observed. This result suggests that in the vertebrate brain, NE can selectively promote long-term memory, as proposed from behavioral observations of rodents (Kobayashi et al., 2000; Izquierdo et al., 1998) and humans (Quevedo et al., 2003).
Dopamine application produces enduring heterosynaptic facilitation of glutamatergic inputs to the Mauthner cell in fish (Kumar & Faber, 1999) and of muscarinic inputs to sympathetic ganglia in rabbits (Libet, 1992).

In the infant rat pup, learning to prefer odors associated with maternal care helps the pup maintain proximity to the mother. Stroking and licking the pup produces a prolonged activation of LC neurons (Kimura & Nakamura, 1985; Nakamura et al., 1987) and the release of NE in the olfactory bulb (Rangel & Leon, 1995). When stroking is preceded by exposure to a novel odor, the pups learn a preference for the novel odor (Sullivan & Hall, 1988). Backward pairings do not produce conditioning (Sullivan & Hall, 1988). A β-adrenergic receptor agonist in the olfactory bulb can act as the UCS (Sullivan et al., 2000), whereas a β-adrenergic receptor antagonist in the olfactory bulb prevents odor preference learning to a stroking UCS (Sullivan et al., 1992). Thus, NE release and β-adrenergic receptor activation in the olfactory bulb are both necessary and sufficient for rat pup odor preference learning (Wilson & Sullivan, 1994). Experiments in our laboratory have shown that intracellular cAMP elevation is essential for inducing odor preference memory. Phosphorylation of CREB, which modulates DNA transcription, is also essential in rat pup odor preference learning (Yuan et al., 2003a). A similar role has been shown for CREB in odor aversion learning in Drosophila (Yin et al., 1994, 1995). Metabolic increases (Sullivan & Leon, 1986; Sullivan et al., 1990) and CREB phosphorylation changes (McLean et al., 1999) associated with odor learning are localized to the olfactory bulb region, where the odor stimulus is encoded by mitral cells. We have suggested that the mitral cell is the locus of learning changes (Yuan et al., 2003b).
Thus, although the specific circuit remains to be characterized, changes in the motivational significance of the odor, mediated by the changed patterns of mitral cell activity, produce learned odor-preference behavior.
The neuronal circuitry for odor preference learning in the olfactory bulb remains intact in adult rats, but LC signaling is altered after the neonatal period, providing only brief responses to tactile stimuli that, when paired with odor, do not produce odor preference learning or olfactory bulb change (Moriceau & Sullivan, in press). Nevertheless, the pharmacological activation of the LC designed to reinstate the firing response pattern of the neonate rat reinstates LC mediation of odor preference learning in older rat pups.
In adult sheep, the NE release pattern associated with giving birth mediates the learning of an odor preference in the ewe for its own lamb after parturition (Brennan & Keverne, 1997).
Lamb odor preference learning depends on the activation of the β-adrenergic cascade in the olfactory bulb. The data from sheep and older rat pups support the hypothesis that the magnitude and duration of NE release are critical for its role in inducing long-term learning. This view is consistent with the proposed requirement for higher synaptic NE levels to induce long-term as opposed to short-term spike potentiation to glutamate input in the dentate gyrus (Harley et al., 1996).

LC activation also transmits the learning effects of a UCS in classically conditioned heart rate in the pigeon (Wall et al., 1985; Wild & Cohen, 1985; Gibbs et al., 1986; Elmslie & Cohen, 1990). A defensive response is conditioned by pairing light and shock. The CS and UCS pathways, their sites of interaction, and the behavioral circuit mediating the learned response have been identified. The first modification of sensory responses by conditioning along the CS pathway occurs in a subset of neurons in the lateral geniculate nucleus. The LC mediates shock-induced cellular changes in the lateral geniculate nucleus neurons, which are seen subsequently in response to the light CS (Elmslie & Cohen, 1990). As the input and the output pathways for light-evoked heart rate conditioning in the pigeon are known, and the LC appears to provide the learning signal in this paradigm, further experiments with the pigeon model might further illuminate NE's role as a learning signal.
Vibrissae activation paired with shock produces a conditioned arousal to later vibrissae stimulation in the rat pup. This somatosensory conditioned response, which is not acquired in the presence of a β-adrenergic receptor antagonist, is mimicked by pairing vibrissae activation with a β-adrenergic receptor agonist (Landers & Sullivan, 1999). In these appetitive and aversive paradigms, NE release acts as a signal to initiate changes that support learning and memory. The changes do not depend on the continued presence of NE for their expression (e.g., Sullivan & Wilson, 1991).
Dopamine release is associated with natural reward, brain stimulation reward, and drugs of abuse (Wise, 2002). The activation of the DA system, as discussed by Wise, "somehow" serves to establish response habits. Yet, direct evidence for DA as a learning signal in classical conditioning is sparse. Dopamine reward signals in rodents energize and promote approach behaviors and enhance cue salience (Robinson & Berridge, 2000). Although rats readily self-administer drugs that increase DA signals, the animals do not continue to bar-press in the absence of the signal, suggesting that DA is continuously needed to maintain or to motivate such behavior (e.g., Ranaldi & Wise, 2001).
Reviewers concerned with drug addiction have argued that although DA motivational effects are important, cue or context learning dependent on D1 receptor activation in the striatum is likely to contribute to enduring changes in the response to drug-associated cues and environments (e.g., Berke & Hyman, 2000). Nevertheless, the striatal experiments reviewed here show that whereas a DA psychostimulant like amphetamine can, when injected into the striatum, enhance the learning of a conditioned response to a visual or to an olfactory CS (Viaud & White, 1989), amphetamine injections into the striatum cannot act as a UCS (Vezina & Stewart, 1990).
The stimulation of midbrain DA neurons, when paired with an auditory tone, produces an enlarged cortical representation of the paired tone in auditory area and a novel representation of the paired tone in auditory area 2, together with a diminution in response to the adjacent unpaired tones. The localization of the interaction between tone input and DA release has not been identified, nor is it known if the D receptor is critically involved.
Footshock paired with odor increases the odor synaptic input to basolateral amygdala neurons in the anesthetized rat. This associative change, which is localized to the neurons of the basolateral nucleus, requires DA (Grace & Rosenkranz, 2002; Rosenkranz & Grace, 2002).

The greater effectiveness of forward than of backward pairings of CS and UCS is a primary feature of classical conditioning. It is critical for organisms to learn 'what' leads to 'what' rather than to associate events nonspecifically in any order. Learning signals using the cAMP cascade offer a mechanism for explaining the greater effectiveness of forward rather than backward pairings of CS and UCS. In Aplysia, several groups have demonstrated that a CS allowing calcium entry primes adenylate cyclase such that higher levels of cAMP are achieved when the UCS arrives (Clark et al., 1994; Abrams et al., 1998; Yovell & Abrams, 1992). Such facilitation of cAMP levels occurs only with the forward pairing of CS and UCS. Higher levels of cAMP in Aplysia are associated with a longer duration of synaptic plasticity (Bernier et al., 1982; Schacher et al., 1988; Schacher et al., 1993; Sun & Schacher, 1996). In our rat pup model, we found that odor paired with UCS induces cAMP patterns that are not induced by UCS alone. In the vertebrate, patterns of cAMP rather than levels of cAMP might be the key to temporal-order effects in learning.
Homosynaptic glutamate N-methyl-D-aspartate (NMDA) mechanisms do not have a forward pairing requirement. Activation of the NMDA receptor requires a UCS-induced postsynaptic depolarization either before or concurrent with the arrival of the putative CS (Brown et al., 1988 (Malva et al., 1994; Krebs et al., 1991; Wang et al., 1992). Prolonged elevation of NE when learning occurs, e.g., pairing of a novel environment with shock, suggests that heightened activation in glutamate circuits coding for the novel environment might sustain NE release.
Learning is unlikely to occur with a familiar stimulus associated with a lesser level of sensory activation, even when paired with shock, and in this instance stimulus-associated glutamate release would presumably be insufficient to sustain NE release. Sustained NE levels can be critical for NE's role as a learning signal. Consistent with this hypothesis, McIntyre et al. (2002) also showed that the level of prolonged NE increase in the amygdala associated with aversive conditioning predicts the strength of learning measured 24 hours later in a conditioned avoidance task.
Prolonged increases in the catecholamines could also account for the ability of postacquisition infusions of β-adrenergic or of D1/D5 antagonists to disrupt learning and memory (see review by Izquierdo et al., 2004, this issue). Such memory-impairing effects argue that prolonged activation of the cAMP-coupled receptors is needed to produce stable learning. The learning-signal events that trigger acquisition might be inseparable from those associated with consolidation. Other studies (Sara et al., 1999) suggesting that the requirement for receptor activation can be markedly delayed argue for a separate catecholamine-associated consolidation event. Dopamine release in specific brain areas is also seen with aversive stimuli (e.g., Wilkinson et al., 1998), in contrast to weaker evidence for DA cell responses to aversive stimuli (but see Schultz & Romo, 1987). Dopamine elevation with aversive events would be important if DA is to act as a learning signal in, for example, the odor followed by shock model described in the basolateral amygdala.
A caveat with respect to the foregoing discussion is that microdialysis measurements might not be sensitive to the learning signal events of primary interest. A recent study argues that microdialysis results for DA reflect different patterns of firing in the midbrain DA cell population (Floresco et al., 2003). A general increase in DA levels is associated with an overall increase in the number of DA cells firing. Burst responses associated with signaling do not initiate measurable DA increases because the release is synaptically targeted and reuptake mechanisms effectively remove synaptic DA. Burst responses, however, produce higher levels of local DA release than do increases in the DA cell population firing.

REWARD AND PUNISHMENT
Norepinephrine can mediate learning signals for both reward, as in odor preference learning in the rat pup, and punishment, as suggested by the light-shock conditioning paradigm in the pigeon. Dopamine, although traditionally associated with reward, has also been shown to contribute to aversive learning (e.g., Guarraci et al., 1999) and, as noted above, is elevated in aversive learning. Might these catecholamines act as affectively neutral learning signals such that their role is to bind associations but not to determine the 'quality' of those associations?
In the honeybee, the cAMP cascade is involved in both appetitive and aversive odor learning. The nature of the UCS neurotransmitters, octopamine or dopamine, determines the appetitive or aversive nature of the learning signal (Schwaerzel et al., 2003). The cAMP cascade mediates the UCS learning signal in both kinds of learning, but each transmitter is thought to recruit a different output pathway. In the vertebrate brain, cAMP cascades have also been implicated in appetitive (e.g., odor preference conditioning) and aversive (e.g., fear conditioning) learning. Thus, NE and DA could participate as UCS mediators for both types of learning if other factors like the structures mediating the representations or the outputs were distinct.

NOREPINEPHRINE AND DOPAMINE AS LEARNING MODULATORS
Rather than mediating the UCS learning signal, NE and DA might interact synergistically with learning signals mediated by other mechanisms. A homosynaptic glutamate NMDA mechanism and a heterosynaptic monoamine mechanism are both required for the full expression of conditioning in the invertebrate Aplysia (Antonov et al., 2003; Glanzman, 1995). In Aplysia, monoamine facilitation is presynaptic, whereas NMDA mechanisms are postsynaptic, although their co-activation leads to an increase in synaptic strength at the same loci. In vertebrates, an interaction of the two mechanisms in postsynaptic cells is common. Most likely, in the odor preference model discussed earlier, a novel odor signal normally produces a calcium influx through NMDA channels, and the calcium signal interacts with cAMP signals to restrict memory changes to the cell groups representing the odor. The NMDA channels are activated normally as part of the odor input in the rat pup. In other models, postsynaptic depolarization would be necessary for their participation.
Homosynaptic glutamate mechanisms have been well characterized in the vertebrate brain. Such mechanisms alone could support associative learning and could interact with cAMP cascade mechanisms as well. Two kinds of interaction might be envisioned: (1) NE and DA are required at basal or permissive levels to support the normal function of glutamate pathways; (2) NE and DA are required as synergistic learning signals to generate long-term memory in conjunction with homosynaptic glutamate-mediated plasticity.
The permissive requirement is exemplified in the role of 5-hydroxytryptamine (5-HT) 5-HT2A/2C receptor subtypes in rat-pup odor learning, in which a 5-HT2A/2C receptor antagonist (McLean et al., 1996) or 5-HT depletion (McLean et al., 1993) prevents learning. At the level of mechanism, however, 5-HT2A/2C receptors are acting to support the normal β-adrenergic receptor promotion of cAMP. On its own, 5-HT2A receptor activation cannot produce associative change, although its absence prevents learning (Price et al., 1998). Higher levels of β-adrenergic receptor activation overcome the requirement for 5-HT and reinstate odor preference learning (Langdon et al., 1997), confirming the UCS role of NE in this paradigm.
If NE and DA are required for normal cell excitability and normal intracellular signaling to glutamate inputs, then a blockade of β-adrenergic or D1/D5 receptors could impair learning without NE or DA acting as learning signals. Alternatively, as mentioned earlier, NE and DA could be, together with homosynaptic glutamate mechanisms, synergistic learning signals. The effects of receptor blockade might be indistinguishable in the two conditions, but the effects of increases in NE and DA receptor activation could be distinct, with additional NE and DA release promoting learning or homosynaptic glutamate-induced synaptic change.
A specific role for the cAMP cascade in the conversion of short-term memory to long-term memory has been proposed (Bailey et al., 1996).
The primary tests of this hypothesis in vertebrates use tetanic stimulation to activate a short-duration, homosynaptic glutamate-synaptic potentiation. If manipulations like agonists of β-adrenergic or D1/D5 receptors are added to increase the activation of the cAMP cascade, then the hypothesis predicts the conversion from shorter-duration potentiation (early long-term potentiation, or early LTP) to longer-duration potentiation (late LTP). Experiments of this type provide the most direct evidence that NE and DA act as synergistic learning signals with homosynaptic glutamate mechanisms.

HOMOSYNAPTIC LONG-TERM POTENTIATION AND NOREPINEPHRINE AND DOPAMINE
Reward or punishment recruits a change from weak to enduring LTP in the dentate gyrus. Such change does not occur in the presence of a β-adrenergic receptor antagonist. Because NE-cell activity is associated with reward or punishment, the antagonist result is consistent with NE acting as a synergistic learning signal to facilitate homosynaptic potentiation (Seidenbecher et al., 1997). Exploration of novel environments transforms early LTP in the dentate gyrus into late LTP, requiring the activation of β-adrenergic receptors (Straube et al., 2003). Novel environments also recruit a change from weak to enduring LTP in area CA1 of the hippocampus, where blockade of DA (both D2 and D1/D5) receptors prevents the effect (Li et al., 2003). As reviewed earlier, novelty triggers both NE and DA cell activity.
The exogenous application of cAMP-coupled NE and DA agonists induces a switch from short-term to long-term homosynaptic plasticity at the glutamate synapses. Applying a β-adrenergic agonist lowers the threshold for LTP in CA3 (Hopkins & Johnston, 1988), whereas D1/D5 agonists switch early-phase to late-phase LTP in the frontal cortex (Gurden et al., 2000) and the hippocampus (Kusuki et al., 1997; Swanson-Park et al., 1999), and enhance early-phase LTP magnitude in the hippocampus as well (Otmakhova & Lisman, 1996). These effects differ from the direct heterosynaptic effects described earlier, which did not require the tetanization of glutamate pathways.

OTHER ASPECTS OF MODULATOR FUNCTION
Norepinephrine and dopamine cAMP-coupled receptor activation could promote homosynaptic glutamate mechanisms in direct ways, as well as interacting through second messenger cascade synergy. The activation of D1/D5 receptors in the frontal cortex (Lavin & Grace, 2001; Dong & White, 2003), striatum (West & Grace, 2002; Kitai & Surmeier, 1993), and hippocampus (Pedarzani & Storm, 1995) can induce increased cell excitability. The NE activation of β-adrenergic receptors also increases cell excitability (Lacaille & Schwartzkroin, 1988; Foehring et al., 1989; Stanton, 1992; Pedarzani & Storm, 1996). Both DA and NE have been reported to reduce feed-forward inhibition concomitant with DA (Bissiere et al., 2003) or NE (Brown, 2003) pathway activation. Norepinephrine can also transiently suppress the higher beta and gamma frequency EEG oscillations that are associated with binding stable representations while promoting plasticity by enhancing theta rhythms (Brown, 2003). Dopamine suppresses higher frequency oscillations in certain models (Weiss et al., 2003). Finally, both NE (Stanton et al., 1989) and DA (Flores-Hernandez et al., 2002) facilitate NMDA currents. Together, these actions would directly promote glutamate-associated plasticity and new learning. The release of either catecholamine, however, engages a much more complex suite of actions than those enumerated here, with the involvement of multiple receptor types. The net effect would be dependent on the cells and on the circuits that were influenced.

THE INVERTED U-CURVE
One feature of NE and DA in the mediation of learning and memory is an inverted U-curve relation with the neurotransmitter level. This relation is illustrated in odor-preference memory in the rat pup.
If a low dose of a β-adrenergic agonist is paired with odor, then no learning occurs. If a medium dose is paired with odor, then learning is successful. If a high dose is paired with odor, then learning fails (Sullivan et al., 1989). A recent study suggests that excessive cAMP levels, created by removing an inhibitory constraint on adenylyl cyclase, enhance homosynaptic LTP in the hippocampus but impair normal spatial learning and memory (Pineda et al., 2004). The authors suggest that the system becomes too plastic to be functionally useful. Whether a similar explanation will account for other inverted U-curve relations of cAMP to memory remains to be discovered. In the odor preference learning model, greater β-adrenergic receptor activation does not produce enhanced odor-nerve excitatory post-synaptic potentials (EPSPs) or learning (Yuan et al., 2000).
In the dunce mutation in Drosophila, a decrement in the breakdown of cAMP through the loss of a phosphodiesterase gene prevents normal avoidance learning to odor-shock pairing. Although an elevation in cAMP is critical for acquiring avoidance responses to odor-shock pairing in Drosophila, excessive elevation appears deleterious (Davis, 1996). Thus, when this cascade is part of the learning signal, inverted U-curve relations between cAMP and learning and memory appear to occur in both invertebrate and vertebrate nervous systems. Optimal requirements for cAMP signaling in memory remain to be defined.
Depleting 5-HT makes medium doses ineffective and high doses necessary, shifting the inverted U-curve to the right (Langdon et al., 1997). A weak stroking input summates with a low dose to produce an effective learning stimulus, but the same stroking input pushes a medium dose into the ineffective range. Thus, signaling 'windows' exist for initiating memory. Similar signaling windows have been described for DA and NE in prefrontal working-memory models (Arnsten, 1997), but working memory requires a transient representation rather than the sustained connection changes considered here. It is unlikely that the bases for the inverted U-curves in these two types of memory are similar.
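The dose relations reported for rat pup odor preference learning (low dose fails, medium dose succeeds, high dose fails; 5-HT depletion shifts the curve to the right; weak stroking summates with the agonist) can be restated as a simple signaling window. The Python sketch below is only a toy restatement of those qualitative findings. The function name, the arbitrary dose units, the window limits, and the size of the depletion shift are hypothetical values chosen for illustration and are not measurements from the studies cited.

```python
def learning_occurs(agonist_dose, stroking=0.0, serotonin_depleted=False,
                    window=(1.5, 2.5), depletion_shift=1.0):
    """Toy inverted-U rule: learning only if total drive falls inside a window.

    All quantities are in arbitrary, hypothetical units. Stroking is assumed
    to add linearly to the agonist dose, and 5-HT depletion is assumed to
    shift the whole window to the right by a fixed amount.
    """
    drive = agonist_dose + stroking
    low, high = window
    if serotonin_depleted:
        low, high = low + depletion_shift, high + depletion_shift
    return low <= drive <= high


LOW, MEDIUM, HIGH = 1.0, 2.0, 3.0   # hypothetical agonist doses
WEAK_STROKING = 1.0                 # hypothetical tactile contribution

for label, dose in (("low", LOW), ("medium", MEDIUM), ("high", HIGH)):
    print(label,
          "| alone:", learning_occurs(dose),
          "| 5-HT depleted:", learning_occurs(dose, serotonin_depleted=True),
          "| with stroking:", learning_occurs(dose, stroking=WEAK_STROKING))
```

With these assumed values, only the medium dose supports learning on its own, only the high dose does so after 5-HT depletion, and weak stroking rescues the low dose while pushing the medium dose out of the window. A hard-edged window is a simplification; the underlying relation is presumably graded.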