SUPPLEMENTAL DATA
Demographic, neuropsychological, and medication information for Parkinson’s and control subject groups
The majority of Parkinson’s patients (n = 19) were recruited and diagnosed by a neurologist at Columbia Presbyterian Hospital in New York. Other patients were recruited and diagnosed by neurologists at Robert Wood Johnson University Hospital in New Jersey and Beth Israel Hospital in New York. Disease severity was determined by these neurologists to be Hoehn-Yahr 2 or 2.5 for all patients. All patients were taking levodopa (L-DOPA) for the treatment of Parkinson’s disease. All patients were also taking carbidopa, which inhibits peripheral metabolism of L-DOPA. Seventeen patients were additionally taking a D2 receptor agonist (Mirapex, 12 patients; Permax, 1; Requip, 4). Eleven patients were taking one or more additional drugs to treat Parkinson’s disease (Amantadine, 5 patients; Comtan, 3; Selegiline, 7). One patient was taking Seroquel, an antipsychotic that is used to treat the side effects of Parkinson’s medications and that may itself have serotonergic side effects. Nine patients were taking serotonergic drugs (Amitriptyline, 1 patient; Buspar, 1; Citalopram, 1; Nortriptyline, 1; Paxil, 1; Remeron, 2; Zoloft, 2) and four patients were taking cholinergic drugs (Aricept, 1 patient; Kemadrin, 2; Orphenadrine, 1). The mechanism of action for many of these drugs is poorly understood. However, it is thought that some may have adrenergic or noradrenergic side effects (Amantadine, Amitriptyline, Buspar, Nortriptyline, Remeron, Selegiline) or cholinergic side effects (Amantadine, Amitriptyline, Nortriptyline).
Subjects were typically tested in their homes. Fifteen patients were tested on medication first and eleven were tested off medication first. Four patients were tested with an earlier version of the instructions, training, and questionnaires in which they made choices in a stable environment until they reached a performance criterion (34-227 trials) rather than for a fixed number of trials. These four patients then completed the same 800-trial experiment as other subjects. On-medication scores on the motor scale (section III) of the Unified Parkinson’s Disease Rating Scale (UPDRS) were not measured for those patients, and the L-DOPA dosage is unknown for one patient. One included patient was able to complete only 464 trials (6 blocks) of the experiment on medication due to equipment failure.
The majority of elderly control subjects (n = 19) were recruited and tested at the Cresskill Senior Center in Cresskill, New Jersey. Young and elderly subjects were also recruited by flyers in the New York University area. No control subject was taking dopaminergic, serotonergic, or cholinergic medications. Young subjects did not complete neuropsychological tests and were not matched to patients or elderly subjects in any way.
We matched Parkinson’s patients and elderly subjects for gender, age, and education. The two groups did not have significantly different scores on average for any neuropsychological test (see supplemental Table 1). No subject had been diagnosed with clinical depression and, consistent with that, no subject had a score higher than 18 on the Beck’s Depression Inventory. Two Parkinson’s patients and two matched elderly controls were excluded from calculation of verbal IQ because they were not native English speakers.
Pre-experiment and post-experiment questionnaires
Pre-experiment questions
- When you start the game, there are more crabs near one of the two traps. Is that one trap better for the entire game?
- If a crab is caught in one of your traps, how long does it stay in the trap?
- At any time in the game, there will be more crabs near one of the traps. Does the worse trap still catch crabs?
- Will you catch more crabs if you play the game slowly?
- If you catch several crabs in a row from one trap, are your chances of catching more crabs from that same trap worse?
Post-experiment questions
- In the training session (NOT the game), how did you make your choices?
- In the game, sometimes there were more crabs near the red trap and sometimes there were more near the green trap. How many times do you think the crabs moved from one side to the other?
- How often did you feel that you knew where the crabs were?
- The computer told you how many crabs you caught in your last 40 choices. Did you use that information to make your choices?
- In the game (NOT the training session), how did you make your choices?
- If you know that the red trap is 5 times more likely to catch a crab than the green trap, which strategy should catch the most crabs? (If the crabs do not move.)
Questions were multiple-choice. Four patients were given slightly different questionnaires with similar content. Subjects correctly answered 4.67 ± 0.05 (mean ± SEM, n = 96 sessions) of the 5 pre-experiment questions. The last post-experiment question had a correct answer (assuming reward probabilities sum to less than 0.36): “Choose the red trap 5 times, then the green trap 1 time and only one time, then go back to the red trap”. The correct answer was identified by 13 of 26 young subjects, by 6 of 26 elderly subjects, and by 6 of 22 patients in one, but not both, of their two sessions. This suggests that young subjects might have understood the task better than the other subject groups, but young subjects were not matched to the other groups in any way, and we draw no conclusions from differences in decision making between young subjects and the other subject groups.
Matching law analysis and human vs. monkey reinforcement learning
We used the generalized matching law (Baum, 1974) to estimate reward sensitivity in each group, using a 50-trial analysis period after allowing 20 trials for choice behavior to stabilize. Parameter estimates are listed in supplemental Table 2. A minority of blocks from each group was excluded because no rewards were received from one of the two targets during the analysis period. Results from two monkeys tested in a similar task are included for comparison (Lau and Glimcher, 2005). The monkeys made choices between red and green targets for liquid rewards. Reward contingencies were similar, but blocks were longer (130 trials on average). Humans and monkeys matched with comparable reward sensitivities in this reinforcement-learning task. Reward and choice effects estimated by linear regression were also similar between humans and monkeys (compare Figure 9A,C with Lau and Glimcher, 2005). Electrophysiological results suggest that, in monkeys performing a similar task, striatal activity is correlated with trial-by-trial action values computed using the linear regression approach (Lau and Glimcher, 2008). When facing a similar reinforcement-learning task, the steady-state and trial-by-trial choice behavior of humans and monkeys was well described by the same models. This similarity in choice behavior suggests that the two species may employ a similar reinforcement-learning mechanism in this task, and that the results of experiments with one species may generalize to the other.
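As an illustration of the steady-state analysis, the generalized matching law can be written as log(C1/C2) = a·log(R1/R2) + c, where C and R are the choices and rewards for each target in the analysis period, a is the slope (reward sensitivity), and c is the bias. The following is a minimal sketch, assuming an ordinary least-squares fit across blocks; the function name, the input layout, and the use of the natural logarithm are assumptions for illustration (the slope does not depend on the log base), not the authors' code. Blocks with no rewards from one target are excluded as described above; skipping blocks with zero choices of one target is an added guard, also an assumption.

```python
import math

def fit_generalized_matching(choices_1, choices_2, rewards_1, rewards_2):
    """Least-squares fit of log(C1/C2) = a * log(R1/R2) + c across blocks.

    Each argument is a list with one count per block (hypothetical layout).
    Returns (slope a, bias c, number of excluded blocks).
    """
    xs, ys = [], []
    excluded = 0
    for c1, c2, r1, r2 in zip(choices_1, choices_2, rewards_1, rewards_2):
        # Log ratios are undefined when any count is zero, so skip the block.
        if r1 == 0 or r2 == 0 or c1 == 0 or c2 == 0:
            excluded += 1
            continue
        xs.append(math.log(r1 / r2))
        ys.append(math.log(c1 / c2))

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                 # reward sensitivity a
    bias = mean_y - slope * mean_x    # intercept c
    return slope, bias, excluded
```

A slope below 1 corresponds to undermatching: choice ratios are less extreme than reward ratios.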
Can subjects predict block transitions and change choice behavior accordingly?
Due to the regularity in block length (70-90 trials), it is possible that subjects might learn to predict the unsignaled block transitions and that this might have affected some of our conclusions. To address this possibility, the second post-experiment question asked subjects: “How many times do you think the crabs moved from one side to the other?” (1-5 / 6-15 / 16-30 / 31-50 times). Subjects completed 10 blocks of trials, but this question was answered correctly (6-15 times) in the minority (41%, 39/95) of sessions, suggesting that many subjects did not have a clear idea of how many blocks they faced.
While subjects played the game, the display indicated the number of crabs caught over the past 40 trials. We also considered whether this information could be useful for detecting block transitions, for example, if the number tended to drop after a block transition. We computed the average number displayed for the 10 trials before and after all block transitions faced by our subjects and found that they were the same (mean ± SD, 9.94 ± 2.52 before, 9.94 ± 2.45 after, n = 932 block transitions), suggesting that this element of the display provided little or no information to our subjects about block boundaries. In the fourth post-experiment question, we also asked subjects whether they used this information to make their choices and the answer was yes in only 16% (15/96) of sessions.
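The before-and-after comparison of the displayed count can be sketched as follows. This is a minimal illustration, assuming that the display shows the number of rewarded trials among the most recent 40 and that a transition is indexed by the first trial of the new block; the function names and the handling of the first 40 trials are assumptions, not details taken from the experiment code.

```python
from statistics import mean

def displayed_counts(rewards, window=40):
    """Count of rewarded trials among the most recent `window` trials,
    evaluated after every trial. `rewards` is a list of 0/1 outcomes."""
    counts = []
    running = 0
    for t, r in enumerate(rewards):
        running += r
        if t >= window:
            running -= rewards[t - window]  # drop the trial leaving the window
        counts.append(running)
    return counts

def mean_around_transitions(counts, transitions, span=10):
    """Average displayed count over `span` trials before and after each
    transition. Transitions too close to either end are skipped."""
    before, after = [], []
    for t in transitions:
        if t - span < 0 or t + span > len(counts):
            continue
        before.append(sum(counts[t - span:t]) / span)
        after.append(sum(counts[t:t + span]) / span)
    return mean(before), mean(after)
```

If the two averages are close, as reported above, the displayed count carries little information about where block boundaries fall.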
Finally, we checked whether parameter estimates based on the choice behavior we observed differed in early (trials 1-30) and late (trials 31-60) block phases. To accomplish this we fit separate learning rate parameters for positive and negative outcomes and a choice parameter for each subject group in each block phase using the previously estimated group noise parameter (β = 1.12). All learning rate parameters were similar in early and late phases (Wald test, all p > 0.23). In both phases, learning rates were higher in Parkinson’s patients on medication than off for positive outcomes (both p < 0.004) but not negative outcomes (both p > 0.17) as in the overall fits. Elderly subjects perseverated marginally more in the late than early phases (p = 0.047), but choice parameters were similar in early and late phases in the Parkinson’s groups (both p > 0.12). All differences between group choice parameters in the overall fits were maintained in early and late block phases (all p < 0.003). Thus we found little evidence that choice behavior differs in early and late block phases, or that subjects were able to predict block transitions or use information displayed to identify block transitions.
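For readers unfamiliar with the comparison used here, a Wald test of whether two parameter estimates differ can be sketched as below. This is a minimal sketch, assuming the two estimates are independent and approximately normally distributed; the exact form of the test used in the analysis is not specified in this supplement, and the function name is hypothetical.

```python
import math

def wald_test(estimate_1, se_1, estimate_2, se_2):
    """Two-sided Wald z-test for the difference between two estimates,
    assuming independence (no covariance term)."""
    z = (estimate_1 - estimate_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

If early-phase and late-phase estimates were correlated, the denominator would need a covariance term, so this simplified version should be read as illustrative only.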
REFERENCES
Baum WM (1974) On two types of deviation from the matching law: bias and undermatching. J Exp Anal Behav 22:231-242.
Lau B, Glimcher PW (2005) Dynamic response-by-response models of matching behavior in rhesus monkeys. J Exp Anal Behav 84:555-579.
Lau B, Glimcher PW (2008) Value representations in the primate striatum during matching behavior. Neuron 58:451-463.
Simuni T, Jaggi JL, Mulholland H, Hurtig HI, Colcher A, Siderowf AD, Ravina B, Skolnick BE, Goldstein R, Stern MB, Baltuch GH (2002) Bilateral stimulation of the subthalamic nucleus in patients with Parkinson disease: a study of efficacy and safety. J Neurosurg 96:666-672.
Measure / Young / Elderly / Parkinson’s
Number of subjects / 26 / 26 / 26
Female:Male / 14:12 / 12:14 / 12:14
Age / 22.8 (3.1) / 67.3 (10.0) / 65.7 (8.7)
Years of education / - / 15.6 (2.9) / 16.8 (2.5)
NAART Verbal IQ / - / 114.4 (6.7) / 114.4 (7.8)
COWAT-FAS / - / 46.4 (12.3) / 45.5 (13.3)
Digit span / - / 17.2 (4.1) / 16.6 (4.5)
Beck’s Depression Inventory II / - / 4.6 (3.7)a / 6.9 (4.8)a
Logical memory I / - / 20.0 (6.4)b / 23.4 (6.2)b
Logical memory II / - / 16.4 (7.6) / 19.7 (6.9)
Logical memory difference / - / 3.7 (2.8)c / 3.7 (2.6)c
Disease duration / - / - / 8.3 (4.3)
UPDRS motor scale on medication / - / - / 19.9 (7.5)d
UPDRS motor scale off medication / - / - / 29.9 (10.5)d
UPDRS motor scale difference / - / - / 10.0 (5.4)d
L-DOPA (mg/day) / - / - / 381 (187)
L-DOPA equivalent dosage (mg/day) / - / - / 473 (211)e
Supplemental Table 1. Values represent means (and standard deviations) of demographic variables for Parkinson’s patients, healthy elderly control subjects, and healthy young subjects. There were no significant differences between Parkinson’s and elderly subjects for age, years of education, verbal IQ, FAS scores, or digit span. (a) There was a trend toward higher Beck’s Depression Inventory II scores for Parkinson’s patients than elderly subjects (unpaired t test, t(50) = 1.94, p = 0.059), although no subject had ever been diagnosed with depression. (b) There was a trend toward higher Logical Memory I and II scores in Parkinson’s patients than elderly subjects (Logical Memory I, t(50) = 1.95, p = 0.057; Logical Memory II, t(50) = 1.64, p = 0.11). (c) As expected, individual Logical Memory scores were lower after a 15-minute interval in both Parkinson’s patients and elderly subjects (paired t test, both t(25) > 6.55, p < 0.0001). (d) The ranges for UPDRS motor scale scores were 10-39 on medication and 10-58 off medication. UPDRS motor scale scores were significantly higher when Parkinson’s patients were tested off medication than on medication (paired t test, t(23) = 9.07, p < 0.0001). (e) L-DOPA equivalent dosages were calculated according to Simuni et al. (2002): 100 mg L-DOPA = 130 mg controlled-release L-DOPA = 1 mg Permax = 1.5 mg Mirapex = 9 mg Requip. Non-dopaminergic drugs including Amantadine, Comtan, Selegiline, and Seroquel were not included in these calculations. (NAART, North American Adult Reading Test; COWAT, Controlled Oral Word Association Test; UPDRS, Unified Parkinson’s Disease Rating Scale.)
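The conversion in note (e) can be illustrated with a short sketch. The equivalence factors below are copied from the caption; the dictionary layout, the drug labels used as keys, and the function name are assumptions for illustration, and drugs without a listed factor are simply ignored, consistent with the exclusion of non-dopaminergic drugs.

```python
# Daily dose (mg) of each drug treated as equivalent to 100 mg of L-DOPA,
# taken from note (e) of Supplemental Table 1.
MG_EQUIVALENT_TO_100MG_LDOPA = {
    "L-DOPA": 100.0,
    "L-DOPA CR": 130.0,  # controlled-release L-DOPA (key name is hypothetical)
    "Permax": 1.0,
    "Mirapex": 1.5,
    "Requip": 9.0,
}

def ldopa_equivalent_dose(daily_doses_mg):
    """Sum the L-DOPA equivalent dose (mg/day) over a patient's drugs.

    `daily_doses_mg` maps drug name to daily dose in mg. Drugs without a
    conversion factor (e.g., Amantadine) contribute nothing.
    """
    total = 0.0
    for drug, dose in daily_doses_mg.items():
        factor = MG_EQUIVALENT_TO_100MG_LDOPA.get(drug)
        if factor is not None:
            total += 100.0 * dose / factor
    return total
```

For a hypothetical patient taking 300 mg/day of L-DOPA and 1.5 mg/day of Mirapex, this gives an equivalent dosage of 400 mg/day.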
A
Group / Subjects / Blocks / Excluded / Slope / Bias / R2
Young / 26 / 260 / 22 / 0.43 (0.02) / 0.03 (0.05) / 0.65
Elderly / 26 / 260 / 25 / 0.26 (0.02) / -0.01 (0.04) / 0.50
PD off / 26 / 260 / 30 / 0.23 (0.02) / -0.07 (0.04) / 0.42
PD on / 26 / 256 / 18 / 0.29 (0.02) / 0.01 (0.05) / 0.46
B
Subject / Blocks / Excluded / Slope / Bias / R2
HC2011 / 10 / 3 / 0.52 (0.06) / -0.22 (0.19) / 0.94
HC2620 / 10 / 1 / 0.33 (0.05) / 0.17 (0.09) / 0.88
PD2710off / 10 / 0 / 0.29 (0.03) / 0.07 (0.07) / 0.92
PD2710on / 10 / 1 / 0.35 (0.05) / 0.05 (0.12) / 0.87
C
Monkey / Blocks / Excluded / Slope / Bias / R2
Monkey H / 265 / 6 / 0.51 (0.01) / 0.05 (0.03) / 0.88
Monkey B / 191 / 10 / 0.54 (0.01) / -0.04 (0.03) / 0.89
Supplemental Table 2. Fits of the generalized matching law to steady-state choice data. A, Group-level results. B, Results for example subjects. C, For comparison, results for two monkeys from an experiment with similar reward contingencies (data from Lau and Glimcher, 2005). The data are plotted in Figure 4A-D. Parameter estimates (with standard errors) are listed for slope a (reward sensitivity) and spatial bias c. Each block contributes 50 trials, taken after discarding the initial 20 trials during which choice behavior stabilized. Blocks in which no rewards were received from one of the two options are excluded (number indicated).