Disturbed Anterior Prefrontal Control of the Mesolimbic Reward System and Increased Impulsivity

Article title:

Disturbed anterior prefrontal control of the mesolimbic reward system and increased impulsivity in bipolar disorder

Author names:

Sarah Trost1, MD, Esther Kristina Diekhof1,2, PhD, Kerstin Zvonik1, PhD, Mirjana Lewandowski1, Juliana Usher1, MD, Maria Keil1, David Zilles1, MD, Peter Falkai3, MD, Peter Dechent4, PhD, Oliver Gruber1, MD

Corresponding Author:

Sarah Trost, MD

Centre for Translational Research in Systems Neuroscience and Clinical Psychiatry

Department of Psychiatry and Psychotherapy

Georg August University Goettingen, Germany

Telephone (-fax): +49 551 39-10115/-6615 (-8952)

e-mail:

Supplementary materials

Material and Methods

Participants: All patients were recruited from the Department of Psychiatry and Psychotherapy at the University Medical Center Goettingen, and healthy control subjects were recruited from the local population and the hospital staff. All participating subjects were Caucasians of self-reported European ancestry. The study was performed in accordance with the ethical standards laid down in the current version (dating from 2008) of the Declaration of Helsinki.

Task/Experimental Procedure: First, participants underwent a training session outside the scanner, which took place on the day before the fMRI measurement. The training started with an operant conditioning task that lasted 5 minutes. Subjects were seated in front of a monitor in a separate room with the index and middle fingers of their right hand on two different buttons of a keypad. Squares of eight different colors were then presented on the monitor in shuffled order. Subjects were instructed to respond once to each presented colored square (stimulus) by pressing one of the buttons on the keypad. The left button (index finger) was assigned to collecting the color of the square, while the right button (middle finger) was assigned to rejecting it. Button choice was free in the conditioning phase, and subjects were encouraged to explore the stimulus-response-reward contingencies. Through this exploration, subjects were conditioned to associate two of the eight colors (red and green) with an immediate reward (bonus of +10 points), while the other colors were associated with a neutral outcome. The goal of this operant conditioning task was to establish stimulus-response-reward contingencies for the next phase of the experiment.

Subsequently, subjects were familiarized with the actual reward task, a sequential forced-choice task. During task blocks of 4-7 trials, subjects had to pursue a superordinate long-term goal, namely to acquire 50 points at the end of each block. To reach this long-term goal (50 points), subjects had to collect two target stimuli by pressing the left button.

Two different types of blocks had to be performed. In the first type, the “desire context” (DC), subjects were allowed to collect the previously conditioned stimuli in addition to the two target colors. In the second type, the “reason context” (RC), the conditioned stimuli had to be rejected by right button presses in order to successfully pursue the long-term goal. The context changed after every second block and was indicated by a cue (DC: “B” for Bonus; RC: “Z” for the German “Zielverfolgung”, i.e. target pursuit). If a subject collected a non-target or falsely rejected a target, the block stopped immediately in both types of blocks. Feedback appeared on the screen (“goal not achieved”) and the subject gained 0 points for the overall outcome of this block. In the DC, subjects were allowed to collect the conditioned stimuli and were immediately rewarded with +10 points each, while rejection of a conditioned stimulus did not lead to termination of the block. In the RC, by contrast, falsely collecting a conditioned stimulus led to termination of the block with an outcome of 0 points (feedback “goal not achieved”). Thus, during the RC, subjects were forced to overcome the tendency to acquire immediate reward in order to reach the superordinate long-term goal. This condition therefore constituted a “desire-reason dilemma” (DRD) (Diekhof and Gruber, 2010; Diekhof et al, 2012b).
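
For illustration, the response rules of the two contexts can be summarized in a short Python sketch. This is not the software used in the study; all names (Context, Stimulus, Response, evaluate_trial, block_goal_points) are hypothetical. The sketch covers button choices only, and it assumes that a block which runs through without an error contains both target stimuli and therefore yields the 50 goal points.

```python
from enum import Enum


class Context(Enum):
    DESIRE = "DC"  # conditioned stimuli may be collected for a bonus
    REASON = "RC"  # conditioned stimuli must be rejected


class Stimulus(Enum):
    TARGET = "target color"
    NEUTRAL = "neutral non-target color"
    CONDITIONED = "conditioned reward color"


class Response(Enum):
    COLLECT = "left button"
    REJECT = "right button"


BONUS_POINTS = 10  # immediate reward per collected conditioned stimulus (DC)
GOAL_POINTS = 50   # long-term goal per successfully completed block


def evaluate_trial(context, stimulus, response):
    """Return (immediate_points, block_continues) for a single trial."""
    if stimulus is Stimulus.TARGET:
        # Targets must be collected; rejecting one ends the block.
        return 0, response is Response.COLLECT
    if stimulus is Stimulus.NEUTRAL:
        # Neutral non-targets must be rejected in both contexts.
        return 0, response is Response.REJECT
    if context is Context.DESIRE:
        # Collecting pays the bonus; rejecting is allowed and unrewarded.
        bonus = BONUS_POINTS if response is Response.COLLECT else 0
        return bonus, True
    # Reason context: collecting a conditioned stimulus ends the block.
    return 0, response is Response.REJECT


def block_goal_points(context, trials):
    """Return the goal outcome (50 or 0) for a list of (stimulus, response).

    Assumption: a block without errors includes both targets.
    """
    for stimulus, response in trials:
        _, block_continues = evaluate_trial(context, stimulus, response)
        if not block_continues:
            return 0
    return GOAL_POINTS
```

In this sketch, the desire-reason dilemma corresponds to the single case in which the same response (collecting a conditioned stimulus) is rewarded in one context and terminates the block in the other.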

On the following day, subjects performed two sessions of the task in the scanner. Each session consisted of 20 blocks (120 trials). Depending on the behavioral success, subjects could perform up to 60 target color trials, 30 non-target color trials and 30 conditioned color (Bonus) trials in each context per session.

Half of the blocks were performed in the DC and half in the RC. Cues indicating the context appeared for 1800ms. Blocks started with presentation of the two target colors for 1500ms. Subsequently, a blank screen was displayed for 200ms before individual squares were shown for 900ms in alternation with blank screen intervals of 200ms duration. Subjects had to press the response buttons within the 900ms period in which the individual squares were displayed. After correct answers, “0” was displayed within the colored squares as immediate feedback. Failure to implement the superordinate task goal or failure to answer within the 900ms led to the termination of the current block and zero outcome (feedback “goal not achieved”).

At the end of each block, feedback on the outcome was given for 700ms, followed by a blank screen (100ms). The total feedback, which indicated the overall outcome of the session, was presented at the end of each session.
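
The timing parameters given above allow a rough estimate of the duration of a completed block. The following Python sketch is an illustration only; the function name and the constants are hypothetical labels for the reported durations. It assumes that every square, including the last one, is followed by a 200ms blank screen, and it treats the cue as optional because the text does not state whether a cue preceded every block or only a change of context.

```python
CUE_MS = 1800                 # context cue ("B" or "Z")
TARGET_DISPLAY_MS = 1500      # presentation of the two target colors
BLANK_MS = 200                # blank screen between displays
STIMULUS_MS = 900             # individual square (response window)
BLOCK_FEEDBACK_MS = 700       # outcome feedback at the end of a block
POST_FEEDBACK_BLANK_MS = 100  # blank screen after the feedback


def block_duration_ms(n_trials, with_cue=True):
    """Estimate the duration of a completed block in milliseconds.

    Assumption: each square is followed by a blank interval, also the last.
    """
    if n_trials < 1:
        raise ValueError("a block needs at least one trial")
    duration = CUE_MS if with_cue else 0
    duration += TARGET_DISPLAY_MS + BLANK_MS
    duration += n_trials * (STIMULUS_MS + BLANK_MS)
    duration += BLOCK_FEEDBACK_MS + POST_FEEDBACK_BLANK_MS
    return duration


if __name__ == "__main__":
    # Blocks comprised 4-7 trials according to the task description.
    for n in range(4, 8):
        print(n, block_duration_ms(n), block_duration_ms(n, with_cue=False))
```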

Participants could gain a maximum of 1150 points per session. Points acquired during the two sessions in the scanner were converted into real money. Subjects could receive up to 30€, which was added to the general allowance of 30€ for participation.

fMRI data analyses: SPM5 preprocessing comprised coregistration, correction of movement-related artifacts (realignment and unwarping), corrections for slice-time acquisition differences and low-frequency fluctuations, normalization into standard stereotactic space (skull-stripped EPI template by the Montreal Neurological Institute (MNI)), and spatial smoothing with an isotropic Gaussian kernel filter of 9mm full-width half-maximum. Statistical analyses used a general linear model (GLM), which comprised 3 regressors (i.e., goal-relevant targets, neutral non-targets, conditioned reward non-targets) for each of the “desire context” and the “reason context”. The cues and the block feedback for either successful goal completion or overall goal failure were also modeled as independent regressors, which resulted in a total of 9 onset regressors.
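
The count of 9 onset regressors can be made explicit with a small Python sketch. The regressor names are hypothetical and do not reflect the labels used in the SPM5 design. The sketch further assumes that the cues entered as a single regressor and the two feedback types as one regressor each, which is the reading that matches the stated total.

```python
from itertools import product

CONTEXTS = ("desire", "reason")
STIMULUS_TYPES = (
    "target",
    "neutral_nontarget",
    "conditioned_reward_nontarget",
)
# Assumption: one cue regressor plus one regressor per feedback type.
ADDITIONAL = ("cue", "feedback_goal_achieved", "feedback_goal_failed")


def onset_regressor_names():
    """List the onset regressors: 3 stimulus types x 2 contexts + 3 others."""
    names = [f"{ctx}_{stim}" for ctx, stim in product(CONTEXTS, STIMULUS_TYPES)]
    names.extend(ADDITIONAL)
    return names


if __name__ == "__main__":
    for name in onset_regressor_names():
        print(name)
```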

Incorrectly answered trials and trials in which a conditioned reward stimulus was not collected for an immediate bonus in the “desire context” were excluded from the analyses. A vector representing the temporal onsets of stimulus presentation was convolved with a canonical hemodynamic response function (hrf) to produce a predicted hemodynamic response to each experimental condition.
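
The convolution step can be illustrated with a minimal Python sketch. It is a simplified stand-in for the SPM5 routines, not a reimplementation of them: the hrf is approximated by a double-gamma function (peak shape 6, undershoot shape 16, undershoot ratio 1/6, length 32 s), sampled directly at the repetition time rather than at a finer time grid. The repetition time of 2 s, the onsets, and all function names are assumptions chosen for the example and are not taken from the study.

```python
import math


def gamma_pdf(t, shape):
    """Gamma probability density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return math.exp((shape - 1) * math.log(t) - t - math.lgamma(shape))


def canonical_hrf(tr, length_s=32.0, peak=6.0, undershoot=16.0, ratio=1 / 6):
    """Double-gamma hrf sampled every `tr` seconds, normalized to sum 1."""
    n_samples = int(round(length_s / tr)) + 1
    values = [
        gamma_pdf(i * tr, peak) - ratio * gamma_pdf(i * tr, undershoot)
        for i in range(n_samples)
    ]
    total = sum(values)
    return [v / total for v in values]


def convolve_onsets(onset_scans, n_scans, hrf):
    """Convolve a stick function of onsets (in scans) with the hrf."""
    predicted = [0.0] * n_scans
    for onset in onset_scans:
        for lag, weight in enumerate(hrf):
            if onset + lag < n_scans:
                predicted[onset + lag] += weight
    return predicted


if __name__ == "__main__":
    hrf = canonical_hrf(tr=2.0)          # assumed repetition time of 2 s
    regressor = convolve_onsets([0, 10], n_scans=30, hrf=hrf)
    print([round(v, 3) for v in regressor])
```

Each experimental condition would receive its own onset vector and therefore its own predicted response.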

Psychophysiological interaction analyses (PPI analyses): We assessed the functional interactions of the ventral striatum (vStr)/nucleus accumbens (Nacc) with the anteroventral prefrontal cortex (avPFC) in the “desire-reason dilemma” situation, i.e. when immediate reward contingencies and the superordinate goal competed for action control. We used PPI analyses as introduced by Friston et al. (1997) to assess functional connectivity patterns. We selected the bilateral vStr (+/-12 12 -3) as seed areas for the PPI. Individual blood oxygenation level-dependent (BOLD) signal time courses were extracted from the right and left vStr coordinates, which served as physiological vectors in the PPI analyses. The psychological vector consisted of the contrast that compared conditioned reward stimuli presented in the “reason context” with those in the “desire context”.

Using MATLAB and SPM5, the hemodynamic signals were first deconvolved using a parametric empirical Bayesian formulation and mean-corrected. Then the PPI term was built separately for each of the two regions (right and left vStr) by multiplying the deconvolved and mean-corrected BOLD signal with the respective psychological vector. After convolution with the hrf, mean correction, and orthogonalization, the three regressors (PPI term, physiological vector, and psychological vector) entered the statistical analysis to determine context-dependent changes of functional connectivity over and above any main effect of task or any main effect of activity in the corresponding brain areas. In the PPI contrasts, the PPI term was tested against the implicit baseline. Random-effects analyses (p < 0.001, uncorrected) were performed on single-subject PPI contrast images.
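
The construction of the interaction term can be sketched in Python as follows. The sketch starts from an already deconvolved seed signal and shows only the mean correction and the element-wise multiplication; the empirical Bayesian deconvolution, the subsequent convolution with the hrf, and the orthogonalization performed in SPM5 are deliberately omitted. The condition labels, the weights of +1 and -1, and the function names are assumptions made for the example.

```python
def mean_center(values):
    """Subtract the mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def psychological_vector(labels):
    """Code the contrast: conditioned stimuli in RC (+1) versus DC (-1).

    All other time points are set to 0. Labels and weights are hypothetical.
    """
    weights = {"reason_conditioned": 1.0, "desire_conditioned": -1.0}
    return [weights.get(label, 0.0) for label in labels]


def ppi_term(deconvolved_seed, psychological):
    """Multiply the mean-corrected seed signal with the psychological vector."""
    if len(deconvolved_seed) != len(psychological):
        raise ValueError("vectors must have the same length")
    centered = mean_center(deconvolved_seed)
    return [s * p for s, p in zip(centered, psychological)]


if __name__ == "__main__":
    labels = ["reason_conditioned", "desire_conditioned", "target",
              "reason_conditioned"]
    seed = [3.0, 1.0, 2.0, 2.0]  # made-up values for one seed region
    print(ppi_term(seed, psychological_vector(labels)))
```

One such term would be built for the right and one for the left vStr, and each would enter the model together with its physiological and psychological vector.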

Results

Behavioral results: Subjects who achieved less than 70 percent of correctly accepted target stimuli and correctly rejected non-target stimuli and who accepted fewer than 10 out of 30 conditioned bonus stimuli in the DC were excluded a priori from the analysis because of insufficient performance. On this basis, three bipolar patients were excluded.
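
The exclusion rule can be written as a short Python check. The function and parameter names are hypothetical. The sketch follows the wording above literally and requires both criteria to be met for exclusion; if either criterion alone was meant to be sufficient, the final `and` would have to be replaced by `or`.

```python
def is_excluded(percent_correct, bonuses_collected_dc,
                min_percent_correct=70.0, min_bonuses=10):
    """Return True if a subject meets the a priori exclusion criteria.

    percent_correct: percentage of correctly accepted targets and
        correctly rejected non-targets.
    bonuses_collected_dc: accepted conditioned bonus stimuli in the DC
        (out of 30).
    Assumption: both criteria must hold, as in the literal wording.
    """
    poor_accuracy = percent_correct < min_percent_correct
    few_bonuses = bonuses_collected_dc < min_bonuses
    return poor_accuracy and few_bonuses


if __name__ == "__main__":
    print(is_excluded(65.0, 8))   # both criteria met
    print(is_excluded(85.0, 25))  # neither criterion met
```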