\chapter*{2.1 $\bigm|$ Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias}
\chaptermark{2.1 $\bigm|$ Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias}
\addcontentsline{toc}{chapter}{2 $\bigm|$ Methods and Results}
\addcontentsline{toc}{chapter}{2.1 $\bigm|$ Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias}
% restart
\setcounter{equation}{0}
\setcounter{figure}{0}
\begin{flushright}
Urai AE, Braun A, Donner TH. (2017) \textit{Nature Communications}, 8:14637
\end{flushright}
\vfill
\section*{Abstract}\label{abstract}
While judging their sensory environments, decision-makers seem to use
the uncertainty about their choices to guide adjustments of their
subsequent behaviour. One possible source of these behavioural
adjustments is arousal: Decision uncertainty might drive the brain's
arousal systems, which control global brain state and might thereby
shape subsequent decision-making. Here, we measure pupil diameter, a
proxy for central arousal state, in human observers performing a
perceptual choice task of varying difficulty. Pupil dilation, after
choice but before external feedback, reflects three hallmark signatures
of decision uncertainty derived from a computational model. This
increase in pupil-linked arousal boosts observers' tendency to alternate
their choice on the subsequent trial. We conclude that decision
uncertainty drives rapid changes in pupil-linked arousal state, which
shape the serial correlation structure of ongoing choice behaviour.
\smallskip
% footnote for citation information
\blfootnote{This chapter was reprinted under a CC-BY 4.0 license. doi: 10.1038/ncomms14637. \fauxsc{Author contributions:}
Conceptualization, A.E.U. and T.H.D.; Investigation, A.E.U.; Formal Analysis, A.E.U. and A.B.;
Software, data curation and visualization, A.E.U.; Writing, A.E.U. and T.H.D.; Supervision, T.H.D.}
\clearpage
% ========================================== %
\section*{Introduction}\label{introduction}
In perceptual and sensory-motor tasks, humans and animals behave as if
they make use of decision uncertainty -- the probability that a choice
is incorrect, given the sensory evidence \parencite{kepecs2008,ma2014,pouget2016}.
Theoretical accounts postulate that decision uncertainty should shape
subsequent decision processing and, thereby, subsequent choice
behaviour \parencite{kepecs2012,meyniel2015a,pouget2016}. But how decision uncertainty is
transformed into subsequent behavioural adjustments has, so far,
remained elusive.
One prominent idea is that the brain broadcasts uncertainty signals
across brain-wide neural circuits via low-level arousal
systems \parencite{dayan2000,meyniel2015a,yu2005a}. Arousal systems might be driven by
uncertainty \parencite{aston-jones2005,deberker2016,dayan2006,meyniel2015a,nassar2012,yu2005a},
and they profoundly shape the global state of the brain through the action of modulatory
neurotransmitters \parencite{harris2011,lee2012,mcginley2015}. Uncertainty-dependent changes
in global brain state, in turn, might translate into adjustments of
choice behaviour. The goal of our study was to investigate whether
arousal (1) reflects decision uncertainty in a perceptual choice task;
and (2) predicts changes in subsequent choice behaviour.
Changes in central arousal state (as assessed by various measures of
cortical dynamics) are tightly coupled to fluctuations in pupil diameter
under constant luminance \parencite{eldar2013,reimer2014,vinck2015,mcginley2015a,mcginley2015}. We here built on
this connection and monitored pupil diameter as a proxy for central
arousal state. We used a model based on statistical decision theory,
illustrated in \autoref{fig:natcomm_Figure1}, in which decision uncertainty is defined as the
probability a choice is incorrect, given the available
evidence \parencite{pouget2016,sanders2016}. This operationalization of decision
uncertainty obviates the need for subjective confidence
reports \parencite{kepecs2012}, bridging to the insight from animal
physiology that neurons in a number of brain regions encode decision
uncertainty, as defined in \autoref{fig:natcomm_Figure1} \parencite{kepecs2008,komura2013,lak2014,teichert2014}.
\begin{figure}
\centering
\includegraphics{figures/pupilUncertainty_figure1.eps}
\caption{\textbf{Operationalizing decision uncertainty.}
(\textbf{a}) Computations underlying choice and decision uncertainty.
Due to noise, repeated presentations of a generative stimulus produce a
normal distribution of internal responses centered at the mean of this
generative stimulus. The internal response on each trial
dv\textsubscript{i} is a sample drawn from this distribution. It is
compared to a decision bound or criterion c, to compute the binary
choice as well as a graded measure of decision confidence (or its
complement: uncertainty). (\textbf{b}) For two example levels of
evidence strength, the average confidence is indicated by the shaded
regions, separately for correct (blue) and error (red) trials.
(\textbf{c}) Confidence (top) and uncertainty (bottom) as a function of
evidence strength (100 bins), separately for correct and error trials.
The two levels of evidence indicated by symbols (circles, triangles)
correspond to the two example levels of evidence strength in panels
\textbf{a}, \textbf{b}. (\textbf{d}) Accuracy as a function of decision
uncertainty (100 bins). (\textbf{e}) Accuracy as a function of evidence
strength (100 bins), separately for trials with high and low decision
uncertainty (median split). For details, see Methods and
\parencite{kepecs2008,sanders2016}.}
\label{fig:natcomm_Figure1}
\end{figure}
The model assumes that observers base their judgment of each stimulus on
a noisy decision variable, sampled from a distribution that depends on
the identity and strength of the stimulus (\autoref{fig:natcomm_Figure1}a). Two-alternative
forced choice tasks entail comparing this decision variable with a
decision bound. When the decision variable happens to fall on the wrong
side of the bound, errors occur. This happens more often for weaker
stimuli, because the distributions corresponding to the two possible
stimuli show higher overlap (\autoref{fig:natcomm_Figure1}b). A monotonic function of the
distance between the decision variable and the bound is a metric of
decision confidence; uncertainty is its
complement \parencite{hebart2016,kepecs2008,sanders2016} (\autoref{fig:natcomm_Figure1}a and Methods).
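As a minimal sketch of these computations, in notation that is assumed here for illustration (stimulus mean $\mu$, internal noise s.d. $\sigma$, and the cumulative standard normal $\Phi$ as one possible choice of monotonic function; the exact form is given in Methods):
\[
\mathrm{dv}_i \sim \mathcal{N}(\mu, \sigma^2), \qquad
\mathrm{choice}_i = \mathrm{sign}(\mathrm{dv}_i - c),
\]
\[
\mathrm{confidence}_i = \Phi\!\left(\frac{|\mathrm{dv}_i - c|}{\sigma}\right), \qquad
\mathrm{uncertainty}_i = 1 - \mathrm{confidence}_i .
\]
Under this sketch, a decision variable that falls exactly on the bound yields a confidence of 0.5 and hence maximal uncertainty, whereas a decision variable far from the bound yields a confidence close to 1.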
This model predicts three signatures of decision
uncertainty \parencite{kepecs2008,sanders2016}: (1) uncertainty decreases with
evidence strength for correct choices (blue line in \autoref{fig:natcomm_Figure1}c) but,
counter-intuitively, increases with evidence strength for incorrect
choices (red line in \autoref{fig:natcomm_Figure1}c); (2) uncertainty predicts a monotonic
decrease in choice accuracy from 100 to 50\% (\autoref{fig:natcomm_Figure1}d); (3) higher
uncertainty predicts lower choice accuracy, even for the same evidence
strength (\autoref{fig:natcomm_Figure1}e). The opposite, monotonic scaling of uncertainty
with evidence strength for correct and error trials (\autoref{fig:natcomm_Figure1}c) also
emerges from a variety of dynamic decision-making models, including race
models \parencite{kepecs2008}, Bayesian attractor
models \parencite{bitzer2015}, and biophysically detailed circuit models of
cortical dynamics \parencite{insabato2010,wei2015a}.
We systematically manipulated the strength of sensory evidence and
tested whether pupil responses exhibited the three signatures derived
above. We then quantified the predictive effects of pupil-linked arousal
on subsequent behaviour in terms of the key elements of the perceptual
decision process: response time (RT), perceptual sensitivity, lapse
rate, and choice bias. Choice bias was decomposed into an overall bias
for one choice, and a serial bias dependent on the history of previous
choices or stimuli. We found a predictive effect of pupil-linked arousal
responses on serial choice bias.
% ========================================== %
\section*{Results}\label{results}
\subsection*{Pupil responses reflect decision uncertainty}\label{pupil-responses-reflect-decision-uncertainty}
Twenty-seven human observers performed a 2-interval forced choice visual motion
coherence discrimination task (\autoref{fig:natcomm_Figure2}a and Methods). We applied motion
energy filtering \parencite{adelson1985} to the stochastic random dot motion
stimuli, yielding a more fine-grained estimate of the decision-relevant
sensory evidence contained in the stochastic stimuli than the nominal
level of motion coherence (\autoref{fig:natcomm_Figure2}b,c and Methods). The absolute value
of this sensory evidence served as a single-trial measure of evidence
strength (\autoref{fig:natcomm_Figure2}b). As expected, stronger evidence yielded higher
choice accuracy and faster responses (\autoref{fig:natcomm_Figure2}d and \autoref{fig:figureS2}a).
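In symbols that are assumed here for illustration, with $\overline{\mathrm{ME}}$ denoting the mean motion energy within an interval on trial $i$, the single-trial sensory evidence $e_i$ and the evidence strength can be sketched as:
\[
e_i = \overline{\mathrm{ME}}_{\mathrm{test},i} - \overline{\mathrm{ME}}_{\mathrm{ref},i}, \qquad
\mathrm{evidence\ strength}_i = |e_i| .
\]
Under this sign convention, a positive $e_i$ indicates that the test interval contained more motion energy than the reference interval.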
\begin{figure}
\centering
\includegraphics{figures/pupilUncertainty_figure2.eps}
\caption{\textbf{Perceptual choice task and behaviour.}
(\textbf{a}) Behavioural task. Dynamic random dot patterns were
displayed throughout each trial. In two successive intervals (onset cued
by beeps), the dots moved in one of the four diagonal directions (fixed
per observer): A first `reference' interval with always 70\% motion
coherence, and a second `test' interval with varying levels of motion
coherence, larger or smaller than the reference. Observers reported
whether the test stimulus contained stronger or weaker motion than the
reference by pressing one of two buttons. They received auditory
feedback after a variable delay. (\textbf{b}) Quantifying evidence
strength. Each random dot stimulus was convolved with a set of
spatio-temporal filters \parencite{adelson1985} to obtain a time course of
motion energy. The difference between mean motion energy during test and
reference intervals was used as a single-trial measure of
evidence strength. (\textbf{c}) Probability distribution of evidence
strength as a function of difference in nominal motion coherence.
(\textbf{d}) Accuracy and median RT as a function of evidence strength
(6 bins). (\textbf{e}) Median RT as a function of evidence strength (6
bins), split by correct and error trials. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:natcomm_Figure2}
\end{figure}
In line with previous work \parencite{sanders2016}, RT exhibited all three
signatures of decision uncertainty derived in \autoref{fig:natcomm_Figure1} above (\autoref{fig:natcomm_Figure2}e
and \autoref{fig:figureS1}b,c). This was true despite the interrogation
protocol \parencite{bogacz2006}, in which the test stimulus had a fixed
duration, its offset prompted the choice, and observers were instructed
to maximize accuracy without speed pressure (response deadline was 3
seconds after test offset). Specifically, RT decreased with evidence
strength on correct trials but increased with evidence strength on
errors (\autoref{fig:natcomm_Figure2}e). Further, RT predicted accuracy over a wide range,
but not below 50\% (\autoref{fig:figureS1}b), indicating that RT
reflected decision uncertainty rather than error
detection \parencite{kepecs2008}. We next assessed whether decision
uncertainty also affected pupil-linked arousal.
The pupil dilated during decision formation, peaking just after the
choice (button press) as observed in previous work \parencite{degee2014},
and then dilated again after feedback (\autoref{fig:natcomm_Figure3}a). Between these two
peaks, dilation amplitudes diverged between different conditions, as
predicted by decision uncertainty (compare with \autoref{fig:natcomm_Figure1}c): Pupil
responses were smallest after correct decisions based on strong
evidence, overall larger after errors than after correct choices,
and largest after errors made on trials with strong evidence (\autoref{fig:natcomm_Figure3}a).
\begin{figure}
\centering
\includegraphics{figures/pupilUncertainty_figure3.eps}
\caption{
\textbf{Pupil dilation after choice and before feedback
reflects decision uncertainty.}
(\textbf{a}) Time course of pupil responses throughout the trial. Time
courses were baseline-corrected and split by correct and error as well
as three bins of evidence strength. Mean pupil dilation in the 250 ms
before feedback (grey box) was used as a single-trial measure of pupil
response. (\textbf{b}) Time course of uncertainty scaling in the pupil,
computed as sample-by-sample regression of baseline-corrected pupil
dilation onto evidence strength. Lower bars indicate p \textless{} 0.05
from a cluster-corrected permutation test of the difference between
each time course and zero, and between the two time courses.
(\textbf{c}) Regression weights for the linear relationship between
evidence strength and pupil responses. (\textbf{d}) Individual
perceptual sensitivity, separately for lowest and highest pupil
tertiles. (\textbf{e}) Individual logistic regression weights, using
pupil responses to predict single-trial choice correctness. In
\textbf{b}-\textbf{e} z-scored, log-transformed RTs were removed from
the pupil signal via linear regression. *** p \textless{} 0.001, ** p
\textless{} 0.01, permutation test. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:natcomm_Figure3}
\end{figure}
To quantify the temporal evolution of uncertainty scaling in the pupil,
we regressed baseline-corrected pupil time courses against each trial's
evidence strength, separately for correct and error trials. From choice
onwards, pupil dilation scaled positively with evidence strength on
error trials, and negatively on correct trials (\autoref{fig:natcomm_Figure3}b,c and
\autoref{fig:figureS3}a). In other words, the scaling of the pupil
response with evidence strength diagnostic of decision
uncertainty emerged in the interval between choice and feedback. Consequently, this
uncertainty scaling was not a response to the information about
choice correctness provided by the external feedback, but rather
reflected internal decision-related computations as described in
\autoref{fig:natcomm_Figure1}. For simplicity, we refer to the single-trial pupil dilation averaged
across the 250 ms interval before feedback as `pupil response' in the
following.
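A minimal sketch of the sample-by-sample regression described above, in assumed notation ($p_i(t)$ for the baseline-corrected pupil signal of trial $i$ at time $t$, and $|e_i|$ for that trial's evidence strength), fitted separately for correct and error trials:
\[
p_i(t) = \beta_0(t) + \beta_1(t)\,|e_i| + \varepsilon_i(t).
\]
In this notation, the uncertainty signature corresponds to $\beta_1(t) < 0$ on correct trials and $\beta_1(t) > 0$ on error trials.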
The pupil response also exhibited the other two signatures of decision
uncertainty predicted by the model in \autoref{fig:natcomm_Figure1}. Larger pupil responses
were accompanied by an overall lower choice accuracy (\autoref{fig:natcomm_Figure3}e and
\autoref{fig:figureS3}c), and psychophysical sensitivity was lower on
trials with a larger pupil response (\autoref{fig:natcomm_Figure3}d and \autoref{fig:figureS3}b).
Specifically, the pupil response did not predict choice accuracy
below 50\%, suggesting that it did not signal the detection of errors
(\autoref{fig:figureS3}c).
The scaling of the pupil response with decision uncertainty was not
inherited from the analogous scaling of RT, but was also present after
first removing (via linear regression) the trial-to-trial variations
accounted for by RT (\autoref{fig:figureS3}d-f). Indeed, trial-to-trial
correlations between pupil responses and RTs were generally small
(Pearson correlation, average r: 0.087, range: -0.042 to 0.302, for
log-transformed RT). For all subsequent analyses reported in this paper,
we removed RT-fluctuations from the trial-to-trial fluctuations of
single-trial pupil responses via linear regression (see Methods).
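This step can be sketched as follows, in assumed notation where $p_i$ is the single-trial pupil response, $z(\cdot)$ denotes z-scoring, and $\hat{a}$, $\hat{b}$ are the least-squares estimates from regressing $p_i$ onto $z(\log \mathrm{RT}_i)$:
\[
\tilde{p}_i = p_i - \hat{a} - \hat{b}\, z(\log \mathrm{RT}_i).
\]
The residual $\tilde{p}_i$ is, by construction, linearly uncorrelated with the RT regressor, so that effects attributed to the pupil response cannot be explained by a linear dependence on log-transformed RT.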
\subsection*{Pupil-linked arousal alters subsequent choice
behaviour}\label{pupil-linked-arousal-alters-subsequent-choice-behaviour}
We proceeded to test whether uncertainty-related pupil responses
predicted changes in subsequent choice behaviour. It has been proposed
that arousal signals control various aspects of learning and
decision-making \parencite{aston-jones2005,dayan2006,dayan2000,meyniel2015a,yu2005a}. In the context of our task,
the choice parameters of interest were perceptual sensitivity (measured
as the slope of the psychometric function, \autoref{fig:figureS4}a),
lapse rate (measured as the vertical distance of the asymptotes of the
psychometric function from 0 or 1, \autoref{fig:figureS4}a), bias
(measured as the horizontal shift of the psychometric function,
\autoref{fig:figureS4}a), and RT. For RT, we focussed on increases after
error trials, an effect referred to as post-error
slowing \parencite{dutilh2012}, which was found to be modulated by
pupil-linked arousal in a speeded RT task \parencite{murphy2016}. Choice
bias was further decomposed into two parameters: overall bias (i.e., a
general tendency towards one choice option, averaged across the entire
experiment, \autoref{fig:figureS4}b) and serial bias (i.e., a local,
choice history-dependent tendency towards one option that becomes
evident when conditioning the psychometric function on the preceding
choice, \autoref{fig:figureS4}c) \parencite{fernberger1920,frund2014,yu2008}. Because in our
task (as common in laboratory choice tasks), the sensory evidence was
independent across trials, any serial bias was maladaptive, reducing
observers' performance below the optimum they could achieve given their
perceptual sensitivity.
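For illustration, one common parameterization of such a psychometric function (an assumption here; the exact form used is given in Methods) is a logistic function with symmetric lapses, where $x_t$ denotes the signed sensory evidence on trial $t$:
\[
P(\mathrm{choice}_t = 1 \mid x_t) = \lambda + (1 - 2\lambda)\,\frac{1}{1 + \exp[-(\beta_0 + \beta_1 x_t)]}.
\]
In this sketch, $\beta_1$ is the slope (perceptual sensitivity), $\beta_0$ is the intercept (bias, corresponding to a horizontal shift of $-\beta_0/\beta_1$), and $\lambda$ is the lapse rate, that is, the distance of each asymptote from 0 or 1. Serial bias can then be read off as the difference in bias between fits conditioned on the two possible preceding choices.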
\begin{figure}
\centering
\includegraphics{figures/pupilUncertainty_figure4.eps}
\caption{
\textbf{Pupil responses and RT predict reductions in serial
choice bias.}
(\textbf{a}) Serial choice bias, quantified as the history-dependent
shift of the psychometric function, for tertiles of previous trial pupil
responses. (\textbf{b}) Absolute choice bias, measured as the intercept
of a logistic psychometric function, for tertiles of previous trial
pupil responses. (\textbf{c}) Perceptual sensitivity, measured as the
slope of a logistic psychometric function, for tertiles of previous
trial pupil responses. (\textbf{d}) Lapse rate, measured as the
probability of stimulus-independent guesses, for tertiles of previous
trial pupil responses. (\textbf{e}) Post-error slowing, for tertiles of
previous trial pupil responses. (\textbf{f}-\textbf{j}) as in
\textbf{a}-\textbf{e}, but for tertiles of previous trial RT. *** p
\textless{} 0.001, * p \textless{} 0.05, main effect of pupil/RT bin on
repetition probability computed from a one-way repeated measures ANOVA.
Unfilled markers indicate p \textgreater{} 0.05, with
Bf\textsubscript{10} from a Bayesian repeated measures ANOVA written in
panel. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:natcomm_Figure4}
\end{figure}
The pupil response predicted a reduction of serial choice bias (\autoref{fig:natcomm_Figure4}a and \autoref{fig:figureS5}). When a choice was followed by a small
pupil response, observers tended to repeat this choice on the next
trial; when the previous pupil response was large, this serial bias was
abolished (\autoref{fig:natcomm_Figure4}a). This predictive effect was similar for correct
and error trials (\autoref{fig:figureS6}a). An analogous pattern of
predictive effects was observed when binning by previous trial RT: Fast,
but not slow, RTs were followed by a tendency to repeat the previous
choice (\autoref{fig:natcomm_Figure4}f and \autoref{fig:figureS6}b).
The pupil response predicted none of the other choice parameters on the
next trial (assessed by one-way repeated measures ANOVA), neither
overall choice bias (signed overall bias: F\textsubscript{(2,52)} =
0.939, p = 0.398, Bf\textsubscript{10} = 0.221; absolute value of
overall bias: F\textsubscript{(2,52)} = 1.817, p = 0.173, \autoref{fig:natcomm_Figure4}b),
nor perceptual sensitivity (F\textsubscript{(2,52)} = 1.936, p = 0.155,
\autoref{fig:natcomm_Figure4}c), nor lapse rate (F\textsubscript{(2,52)} = 2.213, p = 0.120,
\autoref{fig:natcomm_Figure4}d), nor RT (overall RT: F\textsubscript{(2,52)} = 3.232, p =
0.048, Bf\textsubscript{10} = 1.207; post-error slowing:
F\textsubscript{(2,52)} = 2.056, p = 0.138, \autoref{fig:natcomm_Figure4}e). Variations in
RT, likewise, did not predict a change in any of the other parameters of
the decision process (\autoref{fig:natcomm_Figure4}g-j, all p \textgreater{} 0.05). The
overall pattern of results implies that observers did not simply act
more randomly after large pupil responses or long RTs. Random button presses
would have reduced sensitivity, in other words, decreased the slope of
the psychometric function, contrary to our observations (\autoref{fig:natcomm_Figure4}c,h).
Rather, the pattern of results implies that, after large pupil responses
or long RTs, observers' tendency towards one or the other choice became less
history-dependent.
In sum, large pupil responses and slow RTs were neither followed by
improved processing of sensory evidence (a common effect of
attention, \cite{ress2000}), nor a change in overall response bias.
Large pupil responses and slow RTs were followed by only minor (and
statistically not significant) changes in stimulus-independent lapses as
well as small adjustments in speed-accuracy trade-off, as observed after
response conflict, errors, or large pupil responses in speeded RT
tasks \parencite{botvinick2001,gao2009,murphy2016}. The weak effect on post-error slowing
might be due to the use of an interrogation protocol in our study, which
did not require observers to optimize their speed-accuracy
trade-off \parencite{bogacz2006}. However, both RT and pupil-linked arousal
had a robust effect on serial choice bias, reducing an overall
repetition bias that predominated across the group of observers. This
effect of both uncertainty-related measures on the serial correlation
structure of choice behaviour has so far been unknown. We therefore
proceeded to model and comprehensively quantify this effect at the level
of individual observers.
\subsection*{Pupil-linked arousal predicts choice
alternation}\label{pupil-linked-arousal-predicts-choice-alternation}
To this end, we extended a previously established regression model of
serial choice biases \parencite{frund2014} with pupil- and RT-dependent
modulatory effects. The basic model (i.e., without modulatory terms)
quantified the impact of the previous seven choices and stimuli on the
current choice bias in terms of linear combination weights (\autoref{fig:natcomm_Figure5}a,
see Methods and \parencite{frund2014}). We added to this model
multiplicative interaction terms that quantified how much the effect of
previous stimuli and choices was modulated by either pupil response or
RT on those same trials (\autoref{fig:natcomm_Figure5}a). Simultaneously modeling the
effects of both pupil responses and RT enabled us to estimate their
independent impact on serial choice bias; we found the same results when
fitting a separate regression model for each modulatory variable
(\autoref{fig:figureS7}).
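As a simplified sketch of this model (the notation is assumed here, a logit link without lapses is used, and the modulatory terms are written for the immediately preceding trial only; see Methods and \parencite{frund2014} for the full specification), let $x_t$ be the signed sensory evidence on the current trial, $c_{t-k}, s_{t-k} \in \{-1, 1\}$ the choice and stimulus category $k$ trials back, and $p_{t-1}$, $r_{t-1}$ the previous trial's pupil response and RT:
\[
\mathrm{logit}\, P(c_t = 1) = \delta + \alpha x_t + \sum_{k=1}^{7} \left( \omega^{c}_{k}\, c_{t-k} + \omega^{s}_{k}\, s_{t-k} \right) + m_t ,
\]
\[
m_t = \left( \beta^{c}_{p}\, p_{t-1} + \beta^{c}_{r}\, r_{t-1} \right) c_{t-1}
    + \left( \beta^{s}_{p}\, p_{t-1} + \beta^{s}_{r}\, r_{t-1} \right) s_{t-1} .
\]
In this sketch, the effective weight of the previous choice is $\omega^{c}_{1} + \beta^{c}_{p}\, p_{t-1} + \beta^{c}_{r}\, r_{t-1}$, so a negative $\beta^{c}_{p}$ means that a larger pupil response shifts the next choice away from the previous one.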
The model fits revealed robust, and idiosyncratic, patterns of serial
choice biases in most observers (\autoref{fig:natcomm_Figure5}c,d; see \autoref{fig:figureS2}b,c for individual sessions). As expected, the contribution of past
stimuli and choices to current behaviour was strongest when sensory
evidence was weak and decayed strongly with evidence strength (\autoref{fig:natcomm_Figure5}b). The weight of the immediately preceding choice was generally
stronger than the weight of the previous stimulus (\autoref{fig:natcomm_Figure5}d). The
effect of previous choices lasted up to seven trials in the past
(corresponding to about 60 s, \autoref{fig:natcomm_Figure5}c), but had the largest absolute
magnitude on the preceding trial (\autoref{fig:natcomm_Figure5}c, grey dashed line). There
was large inter-individual variability in choice weights (\autoref{fig:natcomm_Figure5}c,d).
While the majority of observers systematically repeated their choices
(purple symbols; 12 significant at p \textless{} 0.05), a good fraction
tended to alternate their choices (orange symbols; 7 significant at p
\textless{} 0.05).
\begin{figure}
\centering
\includegraphics{figures/pupilUncertainty_figure5.eps}
\caption{
\textbf{Modeling the modulation of serial choice bias.}
(\textbf{a}) Schematic of the regression model with modulatory terms.
(\textbf{b}) The contribution of history terms (past choices and
stimuli) as a fraction of the total variance in the decision
variable \parencite{frund2014}, decreased with stronger sensory evidence.
(\textbf{c}) Choice weights for the previous 7 trials, obtained from the
history model without modulatory terms. Each line corresponds to one
observer. Purple, `repeaters' with positive choice weight for lag 1.
Orange, `alternators' with negative choice weights for lag 1. Black
line, group mean. Grey dashed line, group mean of absolute choice
weight. (\textbf{d}) Choice weights at lag 1 plotted against the
corresponding stimulus weights. Colored dots and error bars indicate
individual observers $\pm$ 68\% confidence intervals obtained from a
bootstrap. See Methods for an interpretation of this graph in terms of
behavioural strategy. (\textbf{e}) Regression weights for the
interaction between previous pupil response or RT and previous choices
or stimuli. N=27, group mean $\pm$ s.e.m. (\textbf{f}) Correlation between
choice weights and their modulation by pupil dilation or RT. Colors
indicate the choice weight as derived from the basic model in
\textbf{d}. Error bars are 68\% confidence intervals obtained from a
bootstrap. The intercept of the least-squares regression line,
corresponding to the mean beta weight across the group, is indicated
with a triangle on the y-axis. (\textbf{g}) Beta weights for interaction
between previous pupil response or RT and previous choices. Group split
based on the sign of individual choice weights. *** p \textless{} 0.001,
** p \textless{} 0.01, * p \textless{} 0.05, n.s. p \textgreater{} 0.05,
Pearson's correlation coefficient or permutation test.}
\label{fig:natcomm_Figure5}
\end{figure}
Observers' serial choice biases were unrelated to the (small) serial
correlations between stimuli. The transition probabilities between
stimulus categories (i.e., s2\textless{}s1 or s2\textgreater{}s1) were
close to 0.5 (range across observers: 0.475 to 0.508), and did not
correlate with individual choice weights (Pearson correlation r = 0.010,
p = 0.960, Bf\textsubscript{10} = 0.149) or stimulus weights (Pearson
correlation r = -0.176, p = 0.381, Bf\textsubscript{10} = 0.217).
Likewise, the auto-correlation of absolute motion coherence differences
(i.e., absolute levels of evidence strength) was close to 0 (range
across observers: -0.061 to 0.028) and did not correlate with individual
choice weights (Pearson correlation r = 0.123, p = 0.541,
Bf\textsubscript{10} = 0.179) or stimulus weights (Pearson correlation r
= -0.142, p = 0.480, Bf\textsubscript{10} = 0.190).
Critically, pupil responses and RT both negatively interacted with the
effect of previous choices (\autoref{fig:natcomm_Figure5}e), in line with the observation
that large pupil responses or long RTs were followed by less choice
repetition (\autoref{fig:natcomm_Figure4}a,f). By contrast, neither pupil responses nor RT
interacted with the effect of the previous stimulus (\autoref{fig:natcomm_Figure5}e). Pupil
responses beyond one trial in the past, as well as baseline pupil
diameter on the current trial, did not predict a modulation of serial
biases (\autoref{fig:figureS8}). Moreover, these results were not
accounted for by trial-to-trial variations in trial timing or the
passage of time between trials (\autoref{fig:figureS9}).
The pupil response after feedback did not contain information predictive
of serial choice bias, beyond the information already present during the
pre-feedback interval. The post-feedback pupil responses similarly
predicted modulation of serial choice biases, but no longer did so when
removing (via linear regression) variance explained by pre-feedback
pupil responses from the post-feedback pupil signal (\autoref{fig:figureS10}).
While the modulatory effects associated with pupil responses and RT were
both negative on average, such an overall reduction of the group-level
repetition bias (\autoref{fig:natcomm_Figure4}a,f) might be due to two alternative scenarios
at the level of individual observers: either a reduction of each
observer's intrinsic serial choice bias for repetition or alternation
(referred to as `bias reduction' hereafter); or, alternatively, a
general boost of choice alternation, regardless of the observer's
intrinsic serial bias (referred to as `alternation boost'). We
quantified intrinsic serial bias as each observer's choice weight (i.e.,
the main effect of the previous on the current choice estimated by our
model). The bias reduction scenario predicts a negative correlation
between choice weights and modulation weights across observers. The
alternation boost scenario predicts negative individual modulation
weights for all observers, independently of their corresponding choice
weights (i.e., no correlation).
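The two scenarios can be made explicit in assumed notation, with $w_j$ the choice weight of observer $j$, $m_j$ the corresponding modulation weight, and $v$ the value of the modulatory variable (pupil response or RT) on the previous trial, so that the effective choice weight is $w_j + m_j v$. Bias reduction predicts
\[
\mathrm{sign}(m_j) = -\mathrm{sign}(w_j), \qquad \mathrm{corr}(w_j, m_j) < 0 ,
\]
whereas alternation boost predicts
\[
m_j < 0 \ \ \mathrm{for\ all}\ j, \qquad \mathrm{corr}(w_j, m_j) \approx 0 .
\]
Under bias reduction, a positive $v$ moves the effective weight towards zero for repeaters ($w_j > 0$) and alternators ($w_j < 0$) alike; under alternation boost, it moves the effective weight in the negative direction for every observer.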
The analysis of these individual behavioural patterns revealed
dissociable effects of pupil-linked arousal and RT (\autoref{fig:natcomm_Figure5}f,g).
Modulation weights for the pupil were negative for most observers,
irrespective of their individual choice weight. We found no correlation
between individual modulation and choice weights (\autoref{fig:natcomm_Figure5}f, Pearson
correlation r = -0.017, p = 0.935, Bf\textsubscript{10} = 0.149).
Further, when splitting all 27 observers into `alternators' and `repeaters'
based on the sign of their intrinsic bias (i.e., choice weight), the
modulation weights were negative for both subgroups, and
not significantly different between them (\autoref{fig:natcomm_Figure5}g). These observations
are consistent with the idea that pupil-linked arousal generally boosted
observers' tendency to alternate their choice on the next trial.
The alternation boost scenario for pupil responses was further supported
by a striking contrast to RT-linked modulations, which were in line with
the bias reduction scenario. The RT-linked modulation weights exhibited
a strong negative correlation with individual choice weights (\autoref{fig:natcomm_Figure5}f,
Pearson correlation r = -0.634, p \textless{} 0.001,
Bf\textsubscript{10} = 76.359), were negative only for the group of
repeaters, and differed significantly between alternators and repeaters
(\autoref{fig:natcomm_Figure5}g). Correspondingly, the correlations with individual choice
weights were significantly different for pupil- and RT-modulation
weights (\autoref{fig:natcomm_Figure5}f). Moreover, RT-dependent bias reduction was most
pronounced after incorrect choices, whereas the pupil-dependent
alternation boost was most pronounced after correct choices
(\autoref{fig:figureS11}).
In sum, the modulatory effects associated with post-decision
pupil-linked arousal and RT both shaped the serial correlation structure
of choices, but in distinct ways: pupil-linked arousal generally
promoted choice alternation, regardless of the observer's intrinsic
bias, whereas RT-linked processes generally reduced observers' intrinsic
bias.
% ========================================== %
\section*{Discussion}\label{discussion}
Decisions about an observer's sensory environment do not only depend on
the momentary sensory input, but also on the behavioural
context \parencite{gold2007}. One such contextual factor is the history
of preceding choices and stimuli, which robustly biases even highly
trained decision-makers \parencite{frund2014}. Although such serial choice
biases were first identified in psychophysical tasks about a century
ago \parencite{fernberger1920}, their determinants have remained poorly
understood. Previous treatments of serial choice biases have
conceptualized experimental history as sequences of binary external
events (past stimulus identities, choices, or
feedback) \parencite{abrahamyan2016,frund2014}. We here established that these serial
biases were also modulated by the decision-maker's pupil-linked arousal
state on the previous trial, which, in turn, reflected the uncertainty
about the observer's choice.
Several important features of our approach allowed us to move beyond
previous work linking human pupil dynamics to uncertainty and
performance monitoring. First, different from most previous studies, we
here unravelled the temporal evolution of uncertainty information in the
pupil response, enabling inferences about not only the existence, but
also the time course of this information (see \cite{oreilly2013}
for a similar approach). Second, the model-based definition of decision
uncertainty we used helped dissociate decision uncertainty from error
detection, which has previously been linked to pupil
dilation \parencite{wessel2011}. In a two-choice task, a signal encoding
decision uncertainty should predict behavioural performance over a range
from 100\% to 50\% correct (corresponding to 50\% for the maximum
uncertainty signal, or larger when encoding is imprecise). By contrast,
an error detection signal should predict performance over the range
100\% to 0\% correct \parencite{kepecs2008}. Our measurements were more
consistent with decision uncertainty than error detection (\autoref{fig:figureS3}c). Third, in our task, decision uncertainty critically depended
on internal noise (the primary source of the variance in \autoref{fig:natcomm_Figure1}a). By
contrast, previous studies linking uncertainty to pupil
dynamics \parencite{deberker2016,nassar2012,oreilly2013,preuschoff2011} have used tasks in which the
primary source of uncertainty was in the observers' environment. Last,
in contrast to most previous pupillometry
studies \parencite{degee2014,lempert2015,preuschoff2011} we comprehensively quantified the
predictive effects of pupil-linked arousal on the parameters of choice
beyond the current trial, thereby complementing recent work on the
effects of pupil-linked arousal on learning \parencite{nassar2012,oreilly2013}. Taken
together, our results critically advance the understanding of how
internal decision uncertainty is encoded in pupil-linked arousal in
humans, in a way that builds a direct bridge to single-unit recording
studies of decision uncertainty in animals \parencite{kepecs2008,komura2013,lak2014,teichert2014}.
The neural sources of task-evoked pupil responses at constant luminance
are not yet fully identified \parencite{mcdougal2008}, but mounting evidence
points to the noradrenergic locus coeruleus (LC) \parencite{joshi2016,murphy2014,varazzani2015}
(a core component of the brain's arousal system \parencite{aston-jones2005}) as
well as the superior and inferior colliculi \parencite{wang2015}.
Microstimulation of all three structures triggers pupil
dilation \parencite{joshi2016}. Among these structures, activity of the LC
(spontaneous or evoked by electrical stimulation) is followed by pupil
dilation at the shortest latency \parencite{joshi2016}. The LC also has
widespread, modulatory projections to the cortex implicated in
regulating central arousal \parencite{aston-jones2005}. Dopaminergic and
cholinergic systems, which are closely connected with the
LC \parencite{sara2009}, are likewise implicated in central arousal
state \parencite{mcginley2015} and may also contribute to task-evoked pupil
responses.
We propose that decision-makers' uncertainty about their choices might
shape serial choice biases by recruiting pupil-linked neuromodulatory
systems. Frontal brain regions encoding decision uncertainty send
descending projections to several of these
systems \parencite{aston-jones2005,sara2009}, which in turn project to large parts of
the cortex, including networks of regions involved in perceptual
inference and decision-making \parencite{siegel2011}. Neuromodulators like
noradrenaline can profoundly alter the dynamics and topology of cortical
networks \parencite{eldar2013,mcginley2015,polack2013,marder2012}. Thus, these brainstem arousal
systems might be in an ideal position to transform variations in
decision uncertainty into adjustments of choice
behaviour \parencite{meyniel2015a,yu2005a}.
The behavioural effect of pupil-linked arousal might be explained by at
least two (not mutually exclusive) scenarios. First, arousal responses
might promote choice alternation at the level of response preparation,
by altering the state of the motor system \parencite{delange2013}. Second,
the arousal response might modulate the decision stage -- specifically
the dynamic updating of beliefs about the upcoming evidence, for example
by shifting the criterion (assumed to be constant in signal detection
theory, \autoref{fig:natcomm_Figure1}) from one choice to the next. When this criterion is
shifted in the direction opposite to the last choice, alternation
ensues. In line with these ideas, changes in pupil-linked arousal state
can indeed translate into specific behavioural
effects \parencite{eldar2013,degee2014}, presumably by interacting with selective
cortical circuitry \parencite{donner2013a}.
Our current observations are not easily reconciled with existing
theoretical accounts of the impact of phasic arousal on decision-making.
One account posits that threshold crossing of the decision variable
triggers phasic noradrenaline release, facilitating the translation of
the decision into a behavioural response \parencite{aston-jones2005}. In
contrast to our observations, this framework focuses on functional
effects of phasic arousal within the same trial, rather than subsequent
ones, and it predicts improvements in sensitivity and/or
RT \parencite{cavanagh2014}, rather than changes in bias. Other accounts have
proposed that phasic noradrenaline release facilitates a `network
reset' \parencite{bouret2005}, enabling the transition of neural decision
circuits to a new state \parencite{dayan2006}. Our group-level finding that
high pupil-linked arousal reduces serial biases might be interpreted as
the discarding of post-decisional activity traces due to network
reset \parencite{karlsson2012,tervo2014}. However, our analysis of individual choice
patterns revealed that pupil-linked arousal boosted alternation also in
those observers who already exhibited a tendency to alternate their
choices, which is not easily reconciled with the network reset idea.
Previous theories of arousal and neuromodulation have coarsely
distinguished between two timescales of arousal fluctuations: Tonic
fluctuations over the course of seconds to hours, and phasic responses
on a sub-second timescale, time-locked to rapid cognitive
acts \parencite{aston-jones2005,dayan2006,yu2005a}. Changes in tonic arousal occur
spontaneously \parencite{mcginley2015,steriade2000}, and might also track changes in
task utility or uncertainty \parencite{aston-jones2005,deberker2016,nassar2012,yu2005a}. Pupil-linked
changes in tonic arousal strongly shape the operating mode of cortical
circuits, including early sensory cortices, on slow
timescales \parencite{mcginley2015}. Phasic pupil-linked arousal responses,
on the other hand, predict behaviour related to the same transient
cognitive processes that drive them \parencite{einhauser2008,degee2014,preuschoff2011}. The
uncertainty-linked pupil responses we identified here built up slowly
after choice and predicted choice behaviour several seconds later. Thus,
our current results suggest that pupil-linked arousal systems are driven
by, and interact with, cognitive processes also at intermediate
timescales; faster than tonic arousal, but more sustained than
task-evoked phasic responses.
The dissociation between pupil- and RT-linked modulatory effects (\autoref{fig:natcomm_Figure5}f and \autoref{fig:figureS11}) on serial choice bias suggests that
decision uncertainty signals were propagated along distinct central
neural pathways, one linked to pupil responses and the other to RT,
which then shaped serial choice biases in different ways. Even if the
same uncertainty signals fed into these pathways, they might have become
decoupled through independent internal noise. Specifically, it is
tempting to speculate that the pupil-linked alternation boost reflected
neuromodulator release from brainstem centers (such as noradrenaline
from the LC, \cite{tervo2014}), whereas RT-linked bias reduction was
driven by frontal cortical areas involved in explicit performance
monitoring and top-down control (such as anterior cingulate
cortex, \cite{botvinick2001,ebitz2015,yeung2004}). Top-down effects of prefrontal cortex
on decision-making \parencite{botvinick2001,miller2001} are commonly associated with
explicit strategic effects that are adaptive within the experimental
task. Indeed, the RT-linked modulation of serial bias was adaptive, in
that it generally reduced observers' intrinsic serial bias. By contrast,
pupil-linked arousal modulated serial choice patterns in a way that was
maladaptive for part of the observers (the alternators). This finding
might be related to the observation that maladaptive serial choice
biases remain prevalent even in highly trained observers who know the
statistics of the task \parencite{fernberger1920,frund2014}. Taken together, the
dissociation between pupil- and RT-linked effects suggests that serial
choice biases result from a complex interplay between low-level,
pupil-linked arousal systems and higher-level systems for strategic
control. Future studies should pinpoint the neural systems underlying
these distinct effects, as well as their
interactions \parencite{tervo2014}.
In conclusion, our study identified decision uncertainty as a high-level
driver of phasic arousal, and it uncovered a role of this pupil-linked
arousal response in shaping the dynamics of serial choice biases -- a
pervasive but often ignored characteristic of human decision-making.
These insights shed new light on the link between decision uncertainty,
pupil-linked arousal state, and serial dependencies in decision-making.
They set the stage for further investigations into the neural bases of
arousal-dependent modulations of serial choice behaviour.
% ========================================== %
\section*{Methods}\label{methods}
\subsection*{Operationalizing decision uncertainty}\label{operationalising-decision-uncertainty}
In signal detection theory, a decision variable $dv_i$ is drawn on
each trial from a normal distribution $\mathcal{N}(\mu,\sigma)$ with $\mu$
corresponding to that trial's sensory evidence and $\sigma$ reflecting
the internal noise. In \autoref{fig:natcomm_Figure1}, we used the range of single-trial
motion energy values $\lbrack - 6,6\rbrack$ as our $\mu$. We
estimated $\sigma$ from the data using a probit psychometric function
fit on data combined across observers. The probit slope was
$\beta = 0.367$, and its inverse, $\sigma = 2.723$, reflected the
standard deviation of the $dv$ distribution. The decision bound
$c$ was set to $0$, reflecting an observer without overall choice
bias. The two pairs of distributions in \autoref{fig:natcomm_Figure1} were generated using
$\mu = - 1$ and $\mu = 1$ for weak evidence, and
$\mu = - 4$ and $\mu = 4$ for strong evidence. To calculate
the relationship between evidence strength and decision uncertainty
(\autoref{fig:natcomm_Figure1}c), we simulated a normal distribution of $dv$ for
each level of evidence strength, with $\mu \in \lbrack 0,6\rbrack$ and
$\sigma = 2.723$. Since these uncertainty computations are symmetrical
with respect to choice identity, we visualized only the pattern
corresponding to $\mu > 0$ (stimulus B in \autoref{fig:natcomm_Figure1}a). All samples from
such a distribution were split into correct and error parts based on
their position with respect to the decision bound $c$. For each
combination of evidence strength and choice, the average uncertainty
level is
\begin{equation}
\text{uncertainty} = 1 - \frac{1}{n} \sum_{i = 1}^{n} f(|dv_i-c|)
\end{equation}
where $f$ is the cumulative distribution function of the normal distribution
\begin{equation}
f(x) = \frac{1}{2} \left[1 + \text{erf} \left ( \frac{x}{\sigma\sqrt{2}} \right) \right]
\end{equation}
which transforms the distance between $dv$ and $c$ into the
probability of a correct response \parencite{lak2014}.
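
As a purely illustrative calculation (the two values of $dv$ below are hypothetical single trials, not quantities reported in the analyses), take $c = 0$ and $\sigma = 2.723$, and note that $f(x) = \Phi(x/\sigma)$, with $\Phi$ the standard normal cumulative distribution function. A decision variable close to the bound, $dv = 1$, and one far from it, $dv = 4$, then give approximately
\[
1 - \Phi\!\left(\frac{1}{2.723}\right) \approx 1 - 0.643 = 0.357
\qquad \text{and} \qquad
1 - \Phi\!\left(\frac{4}{2.723}\right) \approx 1 - 0.929 = 0.071 ,
\]
so that uncertainty decreases as the decision variable moves away from the bound.
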
We simulated ten million trials based on the range of evidence in the
data, and for each we computed a binary choice, the corresponding level
of decision uncertainty, and the accuracy of the choice. \autoref{fig:natcomm_Figure1}c-e
visualises the relationship between evidence strength, uncertainty and
choice accuracy in these simulated data.
\subsection*{Participants and sample size}\label{participants-and-sample-size}
Twenty-seven healthy human observers (10 male, aged 23 $\pm$ 5.2 years)
participated in the study. The ethics committee at the University of
Amsterdam approved the study, and all observers gave their informed
consent. We included all observers in all analyses presented in the
paper. Each observer participated in one practice session and five main
experimental sessions, each of approximately two hours and comprising
500 trials of the task. The number of observers was selected to allow
for robust analyses of individual differences, as in previous
pupillometry work from our lab \parencite{degee2014}, and the total number
of trials per observer was chosen to allow for robust psychometric
function fits and detection of subtle changes in the fit parameters.
\subsection*{Task and procedure}\label{task-and-procedure}
Observers performed a 2-interval forced choice motion coherence
discrimination task at constant luminance (\autoref{fig:natcomm_Figure2}a). Observers judged
the difference in motion coherence between two successively presented
random dot kinematograms (RDKs): a constant reference stimulus (70\%
motion coherence) and a test stimulus (varying motion coherence levels
specified below). The intervals before, in between, and after (until the
inter-trial interval) these two task-relevant stimuli had variable
duration (numbers in \autoref{fig:natcomm_Figure2}a) and contained incoherent motion. A beep
(50 ms, 440 Hz) indicated the onset of each (test and reference)
stimulus. After offset of the test stimulus, observers had 3 seconds to
report their judgment (button press with left or right index finger,
counterbalanced across observers). After a variable interval (1.5-2.5
s), a feedback tone was played (150 ms, 880 Hz or 200 Hz, feedback-tone
mapping counterbalanced across observers). Dot motion was stopped 2-2.5
s after feedback, with stationary dots indicating the inter-trial
interval, during which observers were allowed to blink their eyes.
Observers self-initiated the next trial by button press (range of median
inter-trial intervals across observers: 0.68 to 2.05 s).
The difference between motion coherence of test and reference was taken
from three sets: easy (2.5, 5, 10, 20, 30), medium (1.25, 2.5, 5, 10,
30) and hard (0.625, 1.25, 2.5, 5, 20). All observers started with the
easy set. We switched to the medium set when their psychophysical
thresholds (70\% accuracy defined by a cumulative Weibull fit) dropped
below 15\%, and to the hard set when thresholds dropped below 10\%, in a
given session.
Motion coherence differences were randomly shuffled within each block.
We applied a counterbalancing scheme ensuring that within a block, each
stimulus category (s2 \textgreater{} or \textless{} s1) was followed by
itself or its opposite equally often \parencite{brooks2012}. The algorithm
generated sequences of 53 trials, of which the first 50 were used per
block.
\subsection*{Random dot kinematograms}\label{random-dot-kinematograms}
Stimuli were generated using PsychToolbox-3 \parencite{kleiner2007}
and presented on a 22-inch CRT monitor with a resolution of 1024 $\times$ 768
pixels and a refresh rate of 60 Hz. A red `bulls-eye' fixation
target \parencite{thaler2013} of 1.5$^\circ$ diameter was present in the centre of
the screen. Dynamic random noise was presented in a central annulus
(outer radius 12$^\circ$, inner radius 2$^\circ$) around fixation. The annulus was
defined by a field of dots with a density of 1.7
dots/degrees$^2$, resulting in 768 dots on the screen in
any given frame. Dots were 0.2$^\circ$ in diameter and had 100\% contrast from
the black screen background. All dots were divided into `signal dots'
and `noise dots', whose proportions defined the varying motion
coherence levels. Signal dots were randomly selected on each frame, and
moved with 11.5$^\circ$/s in one of four diagonal directions (counterbalanced
across observers). Signal dots that left the annulus wrapped around and
reappeared on the other side. Signal dots had a limited `lifetime' and
were re-plotted in a random location after being on the screen for 4
consecutive frames. Noise dots were assigned a random location within
the annulus on each frame, resulting in `random position noise' with a
`different' rule \parencite{scase1996}. Three independent motion
sequences were interleaved \parencite{roitman2002}, making the effective
speed of signal dots in the display 3.8$^\circ$/s.
\subsection*{Motion energy filtering}\label{motion-energy-filtering}
Due to the stochastic nature of the dynamic random dot kinematograms,
the sensory evidence fluctuated within and across trials, around the
nominal motion coherence level. To quantify behaviour and pupil
responses as a function of the actual, rather than the nominal,
single-trial evidence, we used motion energy filtering to estimate those
fluctuations \parencite{adelson1985}.
Two spatial filters, resembling weighted sinusoids in opposite phase,
were defined by
\begin{equation}f_1(x,y) = \cos^4(a) \cos(4a) \exp({-\frac{y'^2}{2\sigma^2_g}}) \end{equation}
\begin{equation}f_2(x,y) = \cos^4(a) \sin(4a) \exp({-\frac{y'^2}{2\sigma^2_g}}) \end{equation}
where $a = \tan^{-1}(x'/\sigma_c)$. The
parameters $\sigma_c = 0.35$ and $\sigma_g = 0.05$ defined the
carrier sinusoid and the Gaussian envelope, respectively, in line with
the response properties of MT neurons \parencite{kiani2008}. The
coordinate system $(x,y)$ was rotated to match the stimulus' target
direction or its 180$^\circ$ opposite. Two temporal filters were defined by
\begin{equation}g_1(t) = (kt)^{n_{s}} \exp(-kt) \ \left[ \frac{1}{n_{s}!} - \frac{(kt)^2}{(n_{s}+2)!} \right] \end{equation}
\begin{equation}g_2(t) = (kt)^{n_{f}} \exp(-kt) \ \left[ \frac{1}{n_{f}!} - \frac{(kt)^2}{(n_{f}+2)!} \right] \end{equation}
where $k = 60$ reflected the envelope of the temporal filters, and
$n_s = 3$ and $n_f = 5$ controlled the width of the slow and
fast filters, respectively \parencite{kiani2008}. A quadrature pair of
spatio-temporal filters was obtained by
$f_1g_1 + f_2g_2$ and $f_2g_1 - f_1g_2$. We
convolved each filter with the single-trial random dot movies. The
resulting values were squared, and summed together across the two
filters \parencite{adelson1985}.
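
Written out, and using notation introduced here only for illustration ($I$ for the random dot movie of a single trial, $*$ for spatio-temporal convolution, and $E$ for the resulting motion energy), the squaring and summing step corresponds to
\[
E = \left[ (f_1 g_1 + f_2 g_2) * I \right]^2 + \left[ (f_2 g_1 - f_1 g_2) * I \right]^2 .
\]
Because the two filters form a quadrature pair, this sum is largely insensitive to the spatial phase of the stimulus, which is the motivation for the energy computation \parencite{adelson1985}.
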
This filtering procedure was performed for each observer's individual
target direction as well as its 180$^\circ$ opposite. To avoid cardinal biases
in motion perception, we used the four diagonals as target directions
counterbalanced across observers. Outputs of the two filtering
operations were subtracted to yield a direction-selective signal over
time \parencite{kiani2008}. To obtain a single measure of sensory evidence
per trial, we averaged over all timepoints within each stimulus
interval, and took the difference between motion energy in the first and
second interval as our measure of single-trial sensory evidence.
Evidence strength was defined by taking the absolute value of this
sensory evidence, collapsing over the two stimulus identities (\autoref{fig:natcomm_Figure2}b).
\subsection*{Pupillometry}\label{pupillometry}
Observers sat in a dark room with their head in a chinrest at 50 cm from
the screen. Horizontal and vertical gaze position, as well as the area
of the pupil, were monitored in the left eye using an EyeLink 1000
desktop mount (SR Research, sampling rate: 1,000 Hz). The eye tracker
was calibrated before each block of 50 trials.
Missing data and blinks, as detected by the EyeLink software, were
padded by 150 ms and linearly interpolated. Additional blinks were found
using peak detection on the velocity of the pupil signal and linearly
interpolated. We estimated the effect of blinks and saccades on the
pupil response through deconvolution, and removed these responses from
the data via linear regression, following a procedure detailed
elsewhere \parencite{knapen2016}. The residual pupil time series were band-pass
filtered using a 0.01 to 10 Hz second-order Butterworth filter, z-scored
per run, and resampled to 100 Hz. We epoched trials, and baseline
corrected each trial by subtracting the mean pupil diameter during the 500 ms
before onset of the reference stimulus.
We included all trials from all five main sessions (i.e., excluding the
practice session) in the analyses reported in this paper. The time
series of consecutive trial-wise stimuli, choices, RTs and pupil
responses was necessary for the regression model of serial bias
modulation. Observers were well-practiced in the task structure after
the practice session. As a consequence, they made few blinks during the
trial intervals (on average across observers, only 7.7\% of trials
contained more than 50\% interpolated samples). The percentage of
interpolated trials did not correlate with the estimated effect of pupil
responses on serial choice bias (r = -0.268, p = 0.175,
Bf\textsubscript{10} = 0.369).
\subsection*{Quantifying pupil timecourses and single-trial
responses}\label{quantifying-pupil-timecourses-and-single-trial-responses}
To characterize the time-course of uncertainty encoding in the pupil
response, we regressed across-trial evidence strength onto each sample
of the baseline-corrected pupil signal, separately for correct and error
trials (\autoref{fig:natcomm_Figure3}b). The design matrix for this regression also included
an intercept and three nuisance covariates: (i) log-transformed RTs,
(ii) sample-by-sample horizontal gaze coordinates and (iii) sample-by-sample
vertical gaze coordinates. We tested the significance of this
regression timecourse using cluster-based permutation
statistics \parencite{blair1993}.
We took the mean baseline-corrected pupil signal during 250 ms before
feedback delivery as our single-trial measure of pupil response. Because
of the temporal low-pass characteristics of the sluggish peripheral
pupil apparatus \parencite{hoeks1993}, trial-to-trial variations in RT can
cause trial-to-trial variations in pupil responses, even in the absence of
amplitude variations in the underlying neural responses. To specifically
isolate trial-to-trial variations in the amplitude (not duration) of the
underlying neural responses, we removed components explained by RT via
linear regression
\begin{equation}\mathbf{y'} = \mathbf{y} - (\mathbf{y}^{T}\mathbf{r})\mathbf{r} \end{equation}
where $\mathbf{y}$ was the original vector of pupil responses, $\mathbf{r}$
was the vector of the corresponding single-trial RTs (log-transformed
and normalized to a unit vector), and $T$ denoted matrix transpose.
The residual $\mathbf{y'}$ thus reflected pupil responses, after
removing variance explained by trial-by-trial RTs. This residual pupil
response was used for all analyses reported in the main text.
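
A short check confirms that this projection removes the component of the pupil responses that is linear in RT. Because $\mathbf{r}$ has unit length, $\mathbf{r}^{T}\mathbf{r} = 1$, and therefore
\[
\mathbf{r}^{T}\mathbf{y'} = \mathbf{r}^{T}\mathbf{y} - (\mathbf{y}^{T}\mathbf{r})\,\mathbf{r}^{T}\mathbf{r} = \mathbf{r}^{T}\mathbf{y} - \mathbf{y}^{T}\mathbf{r} = 0 ,
\]
so that the residual $\mathbf{y'}$ is orthogonal to the normalized RT vector.
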
\subsection*{Quantifying post-error
slowing}\label{quantifying-post-error-slowing}
We quantified post-error slowing, for tertiles of previous trial pupil
responses, as described by \cite{dutilh2012}. Briefly, we
selected those error trials that were both preceded and followed by a
correct trial, and subtracted the pre-error RT from the associated
post-error RT. This procedure ensured that estimates of post-error
slowing could not be driven by error-unrelated, intrinsic fluctuations
in RT over the course of a session. Before this subtraction, we removed
trial-by-trial evidence strength from RTs using linear regression, to
account for the large variations in RT with stronger sensory evidence
(\autoref{fig:natcomm_Figure2}d).
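
In symbols (notation introduced here for illustration only), let $E$ index the $n$ selected error trials within a given pupil tertile, each preceded and followed by a correct trial. Assuming that the pairwise differences are averaged, the estimate of post-error slowing can be sketched as
\[
\text{PES} = \frac{1}{n} \sum_{E} \left( RT_{E+1} - RT_{E-1} \right) ,
\]
where $RT$ denotes the RT after removal of evidence strength.
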
\subsection*{Quantifying the psychometric
function}\label{quantifying-the-psychometric-function}
We modeled the psychometric function (\autoref{fig:figureS4}a) as
follows. The probability of a particular response $r_t = 1$ on trial
$t$ was described as:
\begin{equation}
\label{eq:psychfunc}
P(r_t = 1 | \tilde{s_t}) = \gamma + (1 - \gamma - \lambda) g(\delta + \alpha \tilde{s_t})
\end{equation}
where $\lambda$ and $\gamma$ were the probabilities of
stimulus-independent errors (`lapses'), $\tilde{s_{t}}$ was the
signed stimulus intensity (here: signed sensory evidence as in
\autoref{fig:natcomm_Figure2}b), $g\left( x \right) = 1/(1 + e^{- x})$ was the logistic function,
$\alpha$ was perceptual sensitivity, and $\delta$ was a bias term.
The free parameters $\gamma,\lambda,\alpha$ and $\delta$ were
estimated by minimizing the negative log-likelihood of the data (using
Matlab's \emph{fminsearchbnd}). We constrained $\gamma$ and
$\lambda$ to be identical, so as to estimate a single,
choice-independent lapse rate.
For the quantification of serial choice bias (\autoref{fig:figureS5}),
we binned the data by previous choices and by previous pupil responses
or RT. For each of those subsets of trials, we fit the psychometric
function (equation \ref{eq:psychfunc}) to the choices on the subsequent trials. The
resulting bias term $\delta$ was transformed from log-odds into
probability by $p = e^{\delta} / (1 + e^{\delta})$.
This quantified $P(r_{t} = 1)$ for ambiguous evidence
(i.e., strength of zero). Collapsing these values across the two choice
options (shown separately in \autoref{fig:figureS5}) yielded the pooled
measure of choice repetition probability in \autoref{fig:natcomm_Figure4}a,f.
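
For illustration with hypothetical values, a fitted bias of $0.2$ log-odds would correspond to
\[
\frac{e^{0.2}}{1 + e^{0.2}} \approx \frac{1.221}{2.221} \approx 0.55 ,
\]
that is, a probability of about 55\% of making the response $r_t = 1$ when the evidence is ambiguous, whereas a bias of $0$ maps onto exactly $0.5$.
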
\subsection*{Quantifying perceptual sensitivity using cumulative Weibull function
fits}\label{quantifying-perceptual-sensitivity-using-cumulative-weibull-function-fits}
In \autoref{fig:natcomm_Figure3}d, S1c and S3b, we fit a cumulative Weibull function to
accuracy as a function of evidence strength. The probability of a
correct response $c_{t} = 1$ on trial $t$ was defined as:
\begin{equation}P(c_t = 1|s_t) = 0.5 + (0.5 - \lambda) f\left( \left(\frac{s_t}{\theta}\right)^{\beta} \right) \end{equation}
where $s_{t}$ was the absolute evidence strength,
$f(x) = \ \left( 1 - e^{- x} \right)$ was the cumulative Weibull
function, $\lambda$ was the lapse rate, $\theta$ was the threshold
indicating at which level of evidence strength an accuracy of
\textasciitilde{}80\% is achieved, and $\beta$ was the slope of the
cumulative Weibull function. The free parameters $\theta,\beta$ and
$\lambda$ were estimated by minimizing the negative log-likelihood of
the data (using Matlab's \emph{fminsearchbnd}). Perceptual sensitivity
was then defined as $1/\theta$.
\subsection*{Modeling the modulation of serial choice bias by
uncertainty-dependent
variables}\label{modelling-the-modulation-of-serial-choice-bias-by-uncertainty-dependent-variables}
We modeled the pupil- and RT-linked modulation of serial choice bias by
extending an established regression approach \parencite{frund2014}. The
basic regression model extended the psychometric function model (equation \ref{eq:psychfunc})
by means of a history-dependent bias term
$\delta_{\text{hist}}\left( \mathbf{h}_{t} \right)$, which was a
linear combination of previous stimuli and choices
\begin{equation}
P(r_t = 1|\tilde{s_t},\mathbf{h}_t) = \gamma + (1-\gamma-\lambda)g(\delta(\mathbf{h}_t) + \alpha\tilde{s_t})
\end{equation}
with
\begin{equation}
\delta(\mathbf{h}_t) = \delta' + \delta_{hist}(\mathbf{h}_t) = \delta' + \sum_{k = 1}^{K} \omega_k h_{kt}
\end{equation}
where the bias term $\delta(\mathbf{h}_{t})$ was the sum
of the overall bias $\delta'$ (see equation \ref{eq:psychfunc}) and the history bias
$\delta_{hist}(\mathbf{h}_{t}) = \sum_{k = 1}^{K}\omega_{k}h_{kt}$,
where $\omega_{k}$ were the weights assigned to each previous stimulus
or choice. We here modeled
%\begin{equation}
\begin{equation}
\mathbf{h}_{t} = ( r_{t - 1},\ r_{t - 2},\ r_{t - 3},\ r_{t - 4},\ r_{t - 5},\ r_{t - 6},\ r_{t - 7},
\ z_{t - 1},\ z_{t - 2},\ z_{t - 3},\ z_{t - 4},\ z_{t - 5},\ z_{t - 6},\ z_{t - 7})
\end{equation}
as a concatenation of the last seven responses and stimuli \parencite{frund2014}. This procedure allowed us to quantify
the effect of past trials on current choice processes (\autoref{fig:natcomm_Figure5}c). We
convolved every set of seven past trials with three exponentially
decaying basis functions \parencite{frund2014}. Positive history weights
$\omega_{k}$ indicated a tendency to repeat the previous choice, or to
make a choice that matched the previous stimulus. Negative weights
described a tendency to alternate the corresponding history feature.
To model the effect of pupil-linked uncertainty on history biases, we
extended this model by adding a multiplicative interaction term
$\sum_{k = 1}^{K}{\omega_{k}'h_{kt}p_{kt}}$, which
described the interaction of pupil responses with the choice and
stimulus identities at the last seven lags:
\begin{equation}
P(r_t = 1|\tilde{s_t},\mathbf{h}_t,\mathbf{p}_t) = \gamma + (1-\gamma-\lambda)g(\delta(\mathbf{h}_t, \mathbf{p}_t) + \alpha\tilde{s_t})
\end{equation} \\
with
\begin{equation}
\delta(\mathbf{h}_t,\mathbf{p}_t) = \delta' + \delta_{hist}(\mathbf{h}_t,\mathbf{p}_t) = \delta' + \sum_{k = 1}^{K} \left( \omega_k h_{kt} + \omega'_k h_{kt}p_{kt} + \omega_k'' p_{kt} \right)
\end{equation} \\
where $\omega_{k}'$ were the history $\times$ pupil interaction weights,
$\omega_{k}''$ were the pupil weights and
\[\mathbf{p}_{t} = (p_{t - 1},\ p_{t - 2},\ p_{t - 3},\ p_{t - 4},\ p_{t - 5},\ p_{t - 6},\ p_{t - 7})\]
was a concatenation of the last seven pupil responses. The term
$\omega_{k}'' p_{kt}$ acted as a nuisance covariate. To
simultaneously model the effects of pupil responses and log-transformed
RT, our model also included RT and history $\times$ RT terms, generated using
the same procedure.
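
Spelled out, the bias term of this full model can be sketched as follows. The symbols $q_{kt}$ (log-transformed RT at the corresponding lag) and $\nu_{k}'$, $\nu_{k}''$ (history $\times$ RT and RT weights) are introduced here for illustration only and do not appear elsewhere in the text:
\[
\delta(\mathbf{h}_t,\mathbf{p}_t,\mathbf{q}_t) = \delta' + \sum_{k = 1}^{K} \left( \omega_k h_{kt} + \omega_k' h_{kt} p_{kt} + \omega_k'' p_{kt} + \nu_k' h_{kt} q_{kt} + \nu_k'' q_{kt} \right) .
\]
In this form, the pupil and RT interaction weights compete for the same variance in choice behaviour, so that each reflects the modulation of history bias that is not explained by the other.
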
All parameters were fit using an expectation maximization algorithm. To
assess whether individual observers were significantly influenced by
their experimental history, we ran 1,000 iterations of permuting all
trials, fitting the full model, and subsequently comparing the
likelihood of the intact model to this null distribution (where
permutation nullifies true history effects) \parencite{frund2014}.
Confidence intervals for individual regression weights were obtained
from a bootstrapping procedure.
\subsection*{Serial bias and outcome-dependent choice
strategies}\label{serial-bias-and-outcome-dependent-choice-strategies}
The history weights for past stimuli and responses allowed us to
characterize different decision strategies \parencite{frund2014}
(\autoref{fig:natcomm_Figure5}d). Positive weights associated with the previous choice, or the
previous stimulus category, indicate a tendency to repeat this previous
choice, or to make a choice corresponding to the previous stimulus,
respectively. Negative weights correspond to a tendency to alternate
previous choice or stimulus. In the left and right triangle of this
strategy space, the magnitude of the response weight is larger than the
magnitude of the stimulus weight. Consequently, strategies are dominated
by the previous choice and can be simply defined as choice alternation
(left) or choice repetition (right).
In the upper and in the lower triangle, the magnitude of the stimulus
weight is larger than the magnitude of the response weight, so
strategies are dominated by the identity of the previous stimulus (which
is only known to the observer as a function of their previous response
and feedback). In the upper and lower triangle, strategies are thus
defined by the sign of the stimulus weight. In the upper triangle
stimulus weights are positive, indicating a tendency to repeat the
previous stimulus. On a correct trial, previous choice and stimulus are
equal and therefore, repeating the previous stimulus is equal to
repeating the previous choice (a win-stay strategy). On errors, the
previous choice is opposite to the previous stimulus and repeating the
previous stimulus is equal to alternating the previous choice
(lose-switch strategy). Conversely, in the lower triangle stimulus
weights are negative, reflecting a tendency to alternate the previous
stimulus. This implies a tendency to alternate the previous choice if
the previous choice was correct (win-switch strategy) and a tendency to
repeat the previous choice in case of a previous error (lose-stay
strategy).
The weights for previous choices and stimuli can easily be combined to
obtain weights reflecting a tendency to repeat previous correct or
incorrect choices (\autoref{fig:figureS6}). Specifically, correct
weights are defined by $choice + stimulus$, and error weights by $choice -
stimulus$ \parencite{frund2014}. The same holds for modulation weights.
This transformation is identical to fitting a model with regressors for
previous successes and failures \parencite{abrahamyan2016,busse2011}.
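As a brief illustration of why this holds (the symbols below are notation assumed here for the sketch, not taken from the original text), let $c_{t-1}, s_{t-1} \in \{-1, 1\}$ code the previous choice and the previous stimulus category, with history weights $w_{c}$ and $w_{s}$. Their joint contribution to the bias on the current trial is
\[
w_{c}\,c_{t-1} + w_{s}\,s_{t-1} =
\left\{
\begin{array}{ll}
(w_{c} + w_{s})\,c_{t-1} & \mbox{if } s_{t-1} = c_{t-1} \mbox{ (previous trial correct)},\\
(w_{c} - w_{s})\,c_{t-1} & \mbox{if } s_{t-1} = -c_{t-1} \mbox{ (previous trial error)},
\end{array}
\right.
\]
so that the sum and the difference of the two weights act as weights on repeating a previous correct or a previous incorrect choice, respectively.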
\subsection*{Statistical tests}\label{statistical-tests}
We used non-parametric permutation testing to test for the group-level
significance of individually fitted parameter values (\autoref{fig:natcomm_Figure3} and
\autoref{fig:natcomm_Figure5}e,g). We randomly switched labels of individual observations either
between two paired sets of values, between one set of values and zero,
or between two unpaired groups. After repeating this procedure 10,000
times, and computing the difference between the two group means on each
permutation, the p-value was the fraction of permutations that exceeded
the observed difference between the means. All p-values reported were
computed using two-sided tests.
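One common way to write such a two-sided p-value, given here only as a sketch (the notation, and the use of absolute values to implement the two-sided test, are assumptions not spelled out in the text), is
\[
p = \frac{1}{N} \sum_{k=1}^{N} \mathbf{1}\left[\, |d^{(k)}| > |d_{\mathrm{obs}}| \,\right], \qquad N = 10{,}000,
\]
where $d_{\mathrm{obs}}$ is the observed difference between the means, $d^{(k)}$ is the difference obtained on the $k$-th permutation, and $\mathbf{1}[\cdot]$ equals 1 when its argument is true and 0 otherwise.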
In \autoref{fig:natcomm_Figure4}, we split the data into tertiles of pupil response or RT,
and computed next trial serial choice bias, signed choice bias, overall
choice bias, perceptual sensitivity, lapse rate, RT and post-error
slowing in each bin. We used a repeated-measures ANOVA to test for the
main effect of bin on each dependent variable. We further used Bayes
Factors (Bf), obtained from a Bayesian one-factor
ANOVA \parencite{rouder2012}, to support conclusions about null effects
observed. Bf\textsubscript{10} quantifies the evidence in favour of the
null or the alternative hypothesis, where Bf\textsubscript{10} $< \frac{1}{3}$
or $> 3$ is taken to indicate substantial
evidence for H\textsubscript{0} or H\textsubscript{1}, respectively.
Bf\textsubscript{10} = 1 indicates inconclusive evidence. We similarly
computed Bf\textsubscript{10} for correlations, based on the Pearson
correlation coefficient \parencite{wetzels2012}.
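For reference, Bf\textsubscript{10} follows the standard definition of a Bayes factor as the ratio of the marginal likelihoods of the data under the two hypotheses (the symbol $D$ for the data is introduced here for illustration):
\[
\mathrm{Bf}_{10} = \frac{p(D \mid \mathrm{H}_{1})}{p(D \mid \mathrm{H}_{0})},
\]
so that, for example, $\mathrm{Bf}_{10} = 0.1$ means that the data are ten times more likely under H\textsubscript{0} than under H\textsubscript{1}.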
The p-value for the difference between the two correlation coefficients
(choice weight by pupil modulation weight vs. choice weight by RT
modulation weight), shown in \autoref{fig:natcomm_Figure5}f, was obtained through permutation
testing. To generate a null distribution of no difference, we randomly
switched (or not, dependent on a simulated coin flip) each observer's RT
and pupil modulation weights, after which we computed the
between-subject correlation between choice weights and pupil modulation
weights as well as between choice and RT modulation weights. Repeating
this procedure 10,000 times generated a distribution of the difference
in correlation, under the null hypothesis of no difference.
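In symbols (a sketch; the notation is assumed here for illustration), writing $w^{c}$ for the vector of choice weights across observers and $m^{\mathrm{pupil}}$ and $m^{\mathrm{RT}}$ for the corresponding vectors of modulation weights, the test statistic is
\[
\Delta r = r\left(w^{c}, m^{\mathrm{pupil}}\right) - r\left(w^{c}, m^{\mathrm{RT}}\right),
\]
where $r(\cdot,\cdot)$ denotes the between-subject correlation coefficient. On each permutation $k$, the pair $(m^{\mathrm{pupil}}_{i}, m^{\mathrm{RT}}_{i})$ of each observer $i$ is swapped with probability $\frac{1}{2}$, and $\Delta r^{(k)}$ is recomputed from the resulting vectors.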
\subsection*{Data availability}\label{data-availability}
All raw and processed data, as well as the code to reproduce all
analyses and figures, are available at \url{http://dx.doi.org/10.6084/m9.figshare.4300043}.
\subsection*{Acknowledgements}
We thank O’Jay Medina for assistance with data collection, all members of the Donnerlab for valuable discussions, and Konstantinos Tsetsos, Jan Willem de Gee, Niklas Wilming, Camile Correa, Florent Meyniel and Sander Nieuwenhuis for helpful comments on the manuscript. We acknowledge computing resources provided by NWO Physical Sciences.
This research was supported by the German Academic Exchange Service (DAAD) and G.-A. Lienert Foundation (to A.E.U.) and the German Research Foundation (DFG), SFB 936/A7, SFB 936/Z1, DO 1240/2-1 and DO 1240/3-1, and European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 604102 (Human Brain Project) (to T.H.D.).
\clearpage
\section*{Supplementary Figures}
\renewcommand{\thefigure}{S\arabic{figure}}
\setcounter{figure}{0}
\vfill
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS1.eps}
\caption{
\textbf{RTs scale with decision uncertainty}. (\textbf{a}) RT distributions from stimulus offset, shown for all trials (left), split into five bins of evidence strength (middle), and separately for correct and error trials (right). For each observer, the number of trials was counted in each 40-ms wide bin from 0 to 1.5 seconds after stimulus offset, and normalized by the total number of trials. Shaded error bars indicate group median and inter-quartile range. Dotted line indicates group mean of individual RT medians. (\textbf{b}) RT predicted choice accuracy over a range from about 85\% to about 60\% correct, and not below chance level (50\%). This relationship is consistent with decision uncertainty, but not error detection, which predicts accuracies of a range from 100\% to 0\% correct. Left: Accuracy for 12 bins of RT, shaded error bars indicate group mean $\pm$ s.e.m. Right: Individual logistic regression weights, using RT to predict single-trial accuracy. (\textbf{c}) Slow RTs reflected lower perceptual sensitivity. Left: Average cumulative Weibull psychometric function fits and data points (group mean $\pm$ s.e.m.), separately for the lowest and highest RT tertiles. Right: Individual perceptual sensitivity, separately for lowest and highest RT tertiles. In \textbf{b-c}, we z-scored and log-transformed RTs within each block and removed trial-to-trial variability shared with pupil responses via linear regression before computing statistics. *** p \textless{} 0.001, permutation test. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:figureS1}
\end{figure}
\vfill
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS2.eps}
\caption{\textbf{Behaviour over sessions.} Data from each observer were collected over 5 main experimental sessions of 500 trials. Data from the practice session are not shown here. After discarding trials in which no response was recorded, each session contained an average of 498 trials (range 465-500). (\textbf{a}) Psychometric and chronometric functions, as in \autoref{fig:natcomm_Figure2}d, separately for each session. (\textbf{b}) History kernels as in \autoref{fig:natcomm_Figure5}c, separately for each session. (N=27, group mean $\pm$ s.e.m.) (\textbf{c}) Individual history kernels as in \autoref{fig:natcomm_Figure5}c, separately for each session. Colors indicate the choice weight as derived from the model in \autoref{fig:natcomm_Figure5}c-d, fit across all sessions combined. To complement these visual representations of behaviour over sessions, we computed repetition probability for three bins of pupil responses (\autoref{fig:natcomm_Figure4}a), separately in each of the five sessions. Using a repeated measures ANOVA, we found no main effect of session (F\textsubscript{4,104} = 1.591, p = 0.182, Bf\textsubscript{10} = 0.078) nor an interaction between session and pupil bin (F\textsubscript{8,208} = 1.333, p = 0.229, Bf\textsubscript{10} = 0.023) on repetition probability. This analysis indicates that history biases do not detectably change over the course of learning, adding further evidence to the idea that serial choice biases are stable, individual traits.}
\label{fig:figureS2}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS3.eps}
\caption{
\textbf{Pupil responses scale with decision uncertainty.} Testing for all three signatures of decision uncertainty derived from the model in \autoref{fig:natcomm_Figure1}. (\textbf{a}) Pupil responses scaled oppositely with evidence strength on correct and error trials. (\textbf{b}) High pupil responses reflected lower perceptual sensitivity. Average cumulative Weibull psychometric function fits (see Methods) and data points, separately for the lowest and highest tertiles of pupil responses. (\textbf{c}) Accuracy as a function of pupil responses (12 bins). Pupil responses predicted uncertainty over a range between 100\% and 50\% correct, but clearly not below 50\% correct. This scaling is more consistent with decision uncertainty than with error awareness (which predicts accuracies down to 0\%). Note that the analysis is limited by noise corrupting the single-trial pupil measurements. To address this issue, we fit a line to the data in \textbf{c}, extended its negative range to reach 100\% accuracy, and then extended its positive range, with an equal distance. The result, shown in the inset, provided a rough estimate of the relationship expected, based on our result, if single-trial pupil-linked arousal could be measured without noise. Again, this analysis indicates that the scaling of pupil responses with accuracy is more consistent with decision uncertainty than with error awareness. (\textbf{d-f}) Same as \textbf{a-c}, after removing trial-by-trial fluctuations in log-transformed RT from the pupil signal using linear regression. The scaling of the pupil response with decision uncertainty was not inherited from the analogous scaling of RT. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:figureS3}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS4.eps}
\caption{
\textbf{Quantifying choice bias using psychometric function fits.} (\textbf{a}) A logistic psychometric function quantifies separate aspects of choice behaviour. The slope of the function indicates the observer's perceptual sensitivity. The intercept indicates a horizontal shift of the psychometric function, reflecting a bias towards a specific choice independent of the sensory evidence. The vertical offsets from the two asymptotes indicate the fraction of stimulus-independent errors (`lapses'). See also Methods. (\textbf{b}) Example psychometric functions with corresponding data points, for an example observer with a bias towards $choice_{1}$ (left) and an observer with a bias towards $choice_{-1}$ (right). (\textbf{c}) History-dependent choice bias. Example observer with a tendency to repeat the previous choice.}
\label{fig:figureS4}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS5.eps}
\caption{
\textbf{Pupil modulation of history-dependent choice bias.} (\textbf{a}) Modulation of repetition probability by previous trial pupil response. $P(choice = 1)$ was computed from the intercept of the logistic function (see Methods), for tertiles of previous trial pupil responses. (\textbf{b}) As in \textbf{a}, but for tertiles of previous trial RT. The two choice identities were collapsed to obtain the measure of repetition probability in \autoref{fig:natcomm_Figure4}a,f. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:figureS5}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS6.eps}
\caption{
\textbf{Pupil responses and RT modulate serial choice bias within categories of accuracy.} (\textbf{a}) Repetition probability, for tertiles of previous trial pupil responses, separately for correct (left) and error (right) trials. (\textbf{b}) As in \textbf{a}, but for tertiles of previous trial RT. (\textbf{c}) Beta weights for repeating a previous correct vs. incorrect choice (see Methods). (\textbf{d}) Pupil- and RT modulation weights for repeating a previous correct vs. incorrect choice. Beta weights were obtained from the model shown in \autoref{fig:natcomm_Figure5}e-g, with pupil- and RT-linked modulatory terms included in the same regression model. Statistics indicate the main effect of a one-way ANOVA (\textbf{a, b}) or a permutation test (\textbf{c, d}). *** p \textless{} 0.001, * p \textless{} 0.05, n.s. p \textgreater{} 0.05. (N=27, group mean $\pm$ s.e.m.)
These results indicate that the modulatory effect of pupil responses (and RT) on serial choice biases was not purely driven by higher pupil responses on error trials. Instead, serial choice bias was modulated by trial-to-trial fluctuations in pupil-linked arousal within categories of trial outcomes.
}
\label{fig:figureS6}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS7.eps}
\caption{
\textbf{Modeling results do not depend on simultaneous fitting of both pupil and RT.} Running two separate regression models, one including only pupil response and one only including RT as a modulatory variable, gives the same results as shown in \autoref{fig:natcomm_Figure5} (where the two were included in the regression model simultaneously). (\textbf{a-c}) as in \autoref{fig:natcomm_Figure5}e-g, but with data obtained from two separate regression models.}
\label{fig:figureS7}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS8.eps}
\caption{
\textbf{Predictive effect is specific for pupil response on the preceding trial.} (\textbf{a}) Baseline pupil diameter on the current trial did not predict a modulation of serial choice bias. Repetition probability, for tertiles of current trial baseline pupil diameter (main effect of one-way repeated measures ANOVA, F\textsubscript{2,52} = 1.164, p = 0.320). (\textbf{b}) Pupil modulation of choice bias was only significant (** p \textless{} 0.01) across the group of observers at lag 1 (same data as \autoref{fig:natcomm_Figure5}e), and did not reach significance beyond one trial in the past. This finding indicates that the modulation of choice biases by pupil responses was more short-lived than the overall serial choice biases shown in \autoref{fig:natcomm_Figure5}c. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:figureS8}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS9.eps}
\caption{
\textbf{Serial choice biases are not explained by variations in interval timing.} (\textbf{a}) To measure the passage of time between trials, we computed the latency between the onset of each test stimulus and the onset of the next trial's test stimulus. These latencies correlated with RTs (mean Spearman's $\rho$ 0.296, range 0.086 to 0.726). Removing these trial-by-trial latencies from RT (using linear regression) did not abolish the effect of RTs on serial choice bias (main effect of RT bin, F\textsubscript{2,52} = 10.846, p \textless{} 0.001, Bf\textsubscript{10} = 225.756). (\textbf{b}) Latencies did not predict a modulation of serial choice bias (main effect of latency bin, F\textsubscript{2,52} = 1.541, p = 0.224, Bf\textsubscript{10} = 0.349). These results suggest that the uncertainty component of RTs, rather than the passage of time between trials, modulated serial choice bias. (\textbf{c}) We tested whether the modulation of serial bias by pupil response could be explained by trial-to-trial variations in the jittered interval between s1 and s2, or between button press and feedback delivery. When these random variations were long, they could cause larger pupil responses, irrespective of the amplitude of the underlying neural input, by driving the peripheral pupil apparatus for a longer duration. We removed these trial-to-trial interval durations from pupil responses using linear regression, and reran the analysis shown in \autoref{fig:natcomm_Figure5}e. Although pupil responses were weakly correlated with the interval between s1 and s2 (mean Spearman's $\rho$ -0.007, range -0.055 to 0.047, significant in 3 out of 27 observers) and the interval between button press and feedback (mean Spearman's $\rho$ 0.056, range -0.025 to 0.290, significant in 13 out of 27 observers), removing this variance from trial-by-trial pupil responses did not change the predictive effect of pupil responses on serial choice bias.
Statistics indicate the main effect of a one-way ANOVA (\textbf{a, b}) and permutation test (\textbf{c}). *** p \textless{} 0.001, ** p \textless{} 0.01, * p \textless{} 0.05, n.s. p \textgreater{} 0.05. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:figureS9}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS10.eps}
\caption{
\textbf{Serial choice biases are not explained by post-feedback pupil responses.} To test whether serial choice biases were modulated by pupil responses to the feedback tone beyond pre-feedback uncertainty signaling, we computed post-feedback values as the mean pupil diameter 515-765 ms after feedback tone delivery. This window was defined as the peak of the grand average pupil response, and its length set equal to our pre-feedback window. (\textbf{a}) Serial choice bias for tertiles of previous trial post-feedback pupil responses (main effect of pupil bin, F\textsubscript{2,52} = 5.479, p = 0.007, Bf\textsubscript{10} = 6.014). (\textbf{b}) Beta weights for the interaction between previous trial post-feedback pupil response and choice or stimulus, as in \autoref{fig:natcomm_Figure5}e. (\textbf{c}) We removed the effect of single-trial pre-feedback from the post-feedback signal using linear regression. The residual reflected the effect of feedback on uncertainty scaling in the pupil, after taking into account the scaling already present before the feedback tone. Serial choice bias, for tertiles of residual pupil responses (main effect of pupil bin, F\textsubscript{2,52} = 1.063, p = 0.353). (\textbf{d}) Modulation weights for post-feedback pupil responses, with pre-feedback pupil responses added as a covariate in the same regression model. The information about serial biases was already contained in the pupil signal before feedback delivery. Statistics indicate the main effect of a one-way ANOVA (\textbf{a, c}) and permutation test (\textbf{b, d}). *** p \textless{} 0.001, ** p \textless{} 0.01, n.s. p \textgreater{} 0.05. (N=27, group mean $\pm$ s.e.m.)}
\label{fig:figureS10}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics{figures/pupilUncertainty_figureS11.eps}
\caption{
\textbf{Differential gating of individual choice modulation by trial outcome.} Pupil- and RT-linked modulations of serial choice bias were differentially gated by trial outcome. We computed correct and error modulation weights from choice and stimulus modulation weights (see Methods). (\textbf{a}) Correlation between choice weights and pupil modulation weights, separately for correct and incorrect choices. (\textbf{b}) Correlation between choice weights and RT modulation weights, separately for correct and incorrect choices. Colors indicate the choice weight as derived from the basic model in \autoref{fig:natcomm_Figure5}c. Error bars indicate a 68\% confidence interval obtained from a bootstrap. Triangles mark the intercept of a linear regression line; filled triangles indicate a group-level effect different from zero (as in \autoref{fig:figureS6}d). *** p \textless{} 0.001, ** p \textless{} 0.01, * p \textless{} 0.05, n.s. p \textgreater{} 0.05.
\autoref{fig:natcomm_Figure5}f shows that RT reduced observers' intrinsic serial biases while pupil responses generally promoted choice alternation. These results further dissociate these modulatory effects, in showing that they were `gated' by trial outcome in distinct ways: Large pupil-linked arousal pushed observers to increase their intrinsic serial bias after correct trials, as indicated by the positive correlation in \textbf{a}. After error trials, on the other hand, a correlation of the opposite sign was observed, indicating that, across trial outcomes, these two effects nullified each other and led to an overall boost in alternation. This stood in sharp contrast to the group-level effect of RT, which predicted a reduction in intrinsic serial bias across the group. This effect was strongly present after error trials (\textbf{b}), suggesting that an adaptive control mechanism could be at work only after negative feedback is received. After correct trials, high RTs indicated a slight reduction in bias, but this negative correlation was not significant across the group.}
\label{fig:figureS11}
\end{figure}
%\end{document}
\renewcommand{\thefigure}{\arabic{figure}}