Abstract

Decision confidence is a forecast about the probability that a decision will be correct. From a statistical perspective, decision confidence can be defined as the Bayesian posterior probability that the chosen option is correct based on the evidence contributing to it. Here, we used this formal definition as a starting point to develop a normative statistical framework for decision confidence. Our goal was to make general predictions that do not depend on the structure of the noise or a specific algorithm for estimating confidence. We analytically proved several interrelations between statistical decision confidence and observable decision measures, such as evidence discriminability, choice, and accuracy. These interrelationships specify necessary signatures of decision confidence in terms of externally quantifiable variables that can be empirically tested. Our results lay the foundations for a mathematically rigorous treatment of decision confidence that can lead to a common framework for understanding confidence across different research domains, from human and animal behavior to neural representations.

1  Introduction

Previous theoretical studies have offered a number of different approaches to understand the statistical and algorithmic issues involved in computing and deploying decision confidence. For instance, a signal detection theory framework is often employed for probing decisions under uncertainty and can provide a strong basis for understanding decision confidence as well (Fleming & Dolan, 2010; Kepecs, Uchida, Zariwala, & Mainen, 2008; Ma, 2010; Maniscalco & Lau, 2012; Ratcliff & Starns, 2009). Sequential sampling models have been used to understand how decisions are reached based on noisy evidence across time (Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006). These can be readily extended with a computation of confidence. Perhaps the most intuitive extension is within the race model framework, where the difference between decision variables for the winning and losing races provides an estimate of confidence (Vickers, 1979; Kepecs et al., 2008; Moreno-Bote, 2010; Pleskac & Busemeyer, 2010; Zylberberg, Barttfeld, & Sigman, 2012; Drugowitsch, Moreno-Bote, & Pouget, 2014; Schustek & Moreno-Bote, 2014). Mechanistically, neural network models based on attractor dynamics have also been used to study how confidence can be computed by neural circuits (Insabato, Pannunzi, Rolls, & Deco, 2010; Rolls, Grabenhorst, & Deco, 2010; Wei & Wang, 2015).

Such computational models have also helped to interpret experimental studies on the neural basis of decision confidence. Confidence is a variable internal to the decision maker, and hence it remains unclear what the appropriate way is to identify a confidence computation, or how confidence in nonhuman animals can be quantified without verbal reports of their subjective feelings. To resolve this quandary, previous studies employed quantitative models that could provide a formal prediction for what a representation of the internal variable of “confidence” would look like in terms of observable and quantifiable parameters. For instance, the orbitofrontal cortex of rats, a region implicated in the prediction of outcomes, was found to carry neural signals related to confidence (Kepecs et al., 2008). This was established by identifying unique signatures of confidence common to signal detection theory and the race model of decision making. Similarly, signal detection theory predictions have been used to understand correlates of decision confidence in the dorsal pulvinar (Komura, Nikkuni, Hirashima, Uetake, & Miyamoto, 2013) and sequential sampling models in the parietal cortex (Kiani & Shadlen, 2009). Without such computational foundations, it would not be possible to identify and rigorously study representations of confidence in neurons. Beyond a description of how confidence could be computed, signal detection theory has also been used as a starting point for evaluating the metacognitive sensitivity of human confidence reports (Ferrell & McGoey, 1980; Higham & Arnold, 2007; Higham, Perfect, & Bruno, 2009; Kunimoto, Miller, & Pashler, 2001; Lachman, Lachman, & Thronesbery, 1979; Nelson, 1984) and implicit behavioral reports of confidence in rats (Lak et al., 2014).

Here we approached the well-studied topic of decision confidence from a mathematical statistics perspective. The starting point of our framework is a normative definition of confidence that relates confidence to evidence through conditional probability (Kahneman & Tversky, 1972). We had two main goals. First, compared to prior studies, we attempted to make as few assumptions as possible about the structure of noise, the decision rule, and the specific algorithm used for estimating confidence. Second, we approached the question of confidence from a psychophysical perspective so it may be useful for psychological and neural studies that often manipulate evidence discriminability by varying the uncertainty of stimulus in a graded manner.

We began from first principles in statistics by positing that confidence is a probability estimate describing a belief (Cox, 2006). Thus, confidence can be related to the available evidence supporting the same belief through a conditional probability. As such, Bayes’ rule provides a way to understand confidence in terms of quantifiable evidence (Ferrell & McGoey, 1980; Griffin & Tversky, 1992). Formally, decision confidence can be defined as a probability estimate that the chosen hypothesis is correct, given the available internal evidence—referred to as the percept. The difficulty with this definition of decision confidence is that it depends on a variable internal to the decision maker. Therefore, it is unclear how useful it is to make general predictions without strong assumptions about perception, that is, how the internal percept is generated from the external evidence. Here we show that it is possible to analytically derive several novel predictions that interrelate confidence with choice correctness and evidence discriminability with few or no assumptions about the percept distribution or about the transfer functions between stimulus, percept, and choice.

2  Results

From a statistical perspective, a decision process can be viewed as a hypothesis test that evaluates the outcome of a choice against a null hypothesis representing its collective alternatives. Statistical decision confidence can then be defined as a Bayesian posterior probability, which quantifies the degree of belief in the correctness of the chosen hypothesis. In this view, both choice and confidence depend on the quality and amount of evidence informing the particular choice. Therefore, we mathematically formalized evidence discriminability and used ideas from psychophysics to measure the quality of evidence presented to a decision maker.

Based on these definitions, we derived four general properties of statistical decision confidence. First, confidence predicts accuracy: the level of confidence predicts the expected fraction of correct choices, as often intuitively posited. Second, confidence increases with the discriminability of presented evidence for correct choices, but counterintuitively, for incorrect choices, confidence decreases with increasing evidence discriminability. Third, when presented with a zero-discriminability choice (i.e., an equal amount of evidence supporting each hypothesis, implying chance decision accuracy), the mean decision confidence is precisely 0.75. Fourth, while evidence discriminability itself predicts accuracy (a property referred to as the psychometric function), confidence provides further information, improving the prediction of accuracy for any given level of discriminability.

2.1  Defining Statistical Decision Confidence

To provide the most general statistical model of a decision process, we define all relevant components (stimulus, percept, choice, confidence) as random variables and the functions that link them (perception, decision) as probabilistic functions. This way the theory we present in this letter applies to both stochastic and deterministic decision rules potentially involving multidimensional stimuli and multiple choices. Decisions are based on an internal variable (decision variable or percept, $\hat{D}$), which is the decision maker's estimate of a corresponding external variable (stimulus or evidence, $D$). (See Figure 1.)

Figure 1:

A framework for statistical decision confidence. A stochastic framework of perceptual decision making can be formalized by introducing a small set of random variables. Random variables are denoted by capital letters and their realizations in lowercase.


Definition 1.

Let us denote the external variable by $D$ and realizations of this random variable by $d$ (referred to as evidence). Let us denote the corresponding internal variable by $\hat{D}$ and realizations of this random variable by $\hat{d}$ (referred to as percept; often referred to elsewhere as the decision variable). We define another random variable called the choice, denoted by $\theta$ (realizations denoted by $\vartheta$). The choice is a probabilistic function of the percept: $\theta = \theta(\hat{D})$.

The choice can be evaluated in terms of a hypothesis testing problem:

  1. Null hypothesis ($H_0$): The choice is incorrect;

  2. Alternative hypothesis ($H_1$): The choice is correct.

Thus, the choice is designated correct if the alternative hypothesis is true and incorrect otherwise. The evaluation can equivalently be defined as a binary random variable (outcome, $O$) that is a probabilistic function of choice. Next, confidence ($c$) can be defined as the probability of the alternative hypothesis being true (i.e., $H_1$) provided the percept and the choice.

Definition 2.
Define confidence as
$$c = P(H_1 \mid \hat{D} = \hat{d},\ \theta = \vartheta). \tag{2.1}$$
Equivalently,
$$c = P(O = 1 \mid \hat{D} = \hat{d},\ \theta = \vartheta). \tag{2.2}$$

As previously, the random variable will be denoted by $C$ and its realizations by $c$. Note that for deterministic choice models, $\hat{d}$ determines $\vartheta$, so $c = P(H_1 \mid \hat{D} = \hat{d})$. We can now define a function that determines confidence from percept and choice.

Definition 3.
Define the belief function as
$$\xi : \hat{\mathcal{D}} \times \Theta \to [0, 1], \qquad \xi(\hat{d}, \vartheta) = P(H_1 \mid \hat{D} = \hat{d},\ \theta = \vartheta), \tag{2.3}$$
where $\hat{\mathcal{D}}$ denotes the percept space and $\Theta$ denotes the range of all possible choices (i.e., the choice space).
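
To make definitions 2 and 3 concrete, here is a minimal sketch (our illustration; the flat-prior assumption, the sign-based choice rule, and the noise level are ours, not part of the formal framework) of the belief function for a one-dimensional gaussian noise model, where $\xi(\hat{d}, \vartheta)$ has a closed form:

```python
import numpy as np
from scipy.stats import norm

SIGMA = 1.0  # standard deviation of the perceptual noise (hypothetical value)

def belief(percept, choice):
    """Belief function xi(percept, choice) for a 1D gaussian noise model.

    Assuming a flat prior over the evidence and a deterministic sign choice,
    the posterior P(D > 0 | percept) is Phi(percept / sigma); confidence in
    the chosen hypothesis is that posterior or its complement.
    """
    p_positive = norm.cdf(percept / SIGMA)
    return p_positive if choice == +1 else 1.0 - p_positive

# A percept of +0.5 supports the positive choice with confidence ~0.69;
# the same percept paired with the negative choice yields only ~0.31.
print(belief(0.5, +1), belief(0.5, -1))
```

Note how the same percept maps to different confidence values depending on the choice, which is why the belief function is defined on percept-choice pairs rather than on percepts alone.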

2.2  Choice Accuracy Equals Statistical Decision Confidence

Intuitively, confidence, being defined as an estimate of choice correctness, should predict the expected outcome. We provide a formal treatment of the relationship between confidence and accuracy below.

Definition 4.
Accuracy is the expected proportion of correct choices:
$$A = P(H_1) = E[O]. \tag{2.4}$$

We seek to determine the following function: $f : [0, 1] \to [0, 1]$, $f(c) = A_c$, where $A_c$ is the accuracy for choices with a given confidence $c$. Our claim is that this function is the identity.

Theorem 1.
Accuracy equals confidence:
$$A_c = c. \tag{2.5}$$
Proof.
For every given value of confidence, there is a set of percept-choice pairs leading to the same confidence value. Let us denote the image of $c$ by the inverse belief function as $I = \xi^{-1}(c)$, the set of percept-choice pairs mapping onto $c$. Let us first assume that $I$ is a countable set. Accuracy for confidence $c$ is determined by the probability of a correct choice when $(\hat{d}, \vartheta) \in I$ over the probability of encountering the confidence level $c$ (i.e., $(\hat{D}, \theta) \in I$):
$$A_c = \frac{P\big(H_1 \wedge (\hat{D}, \theta) \in I\big)}{P\big((\hat{D}, \theta) \in I\big)}.$$
From the definition of joint probability,
$$A_c = \frac{\sum_{(\hat{d}, \vartheta) \in I} P(H_1 \mid \hat{d}, \vartheta)\, P(\hat{d}, \vartheta)}{\sum_{(\hat{d}, \vartheta) \in I} P(\hat{d}, \vartheta)}.$$
As we know that $P(H_1 \mid \hat{d}, \vartheta) = c$ for all $(\hat{d}, \vartheta) \in I$,
$$A_c = \frac{c \sum_{(\hat{d}, \vartheta) \in I} P(\hat{d}, \vartheta)}{\sum_{(\hat{d}, \vartheta) \in I} P(\hat{d}, \vartheta)} = c.$$
However, $I$ is not necessarily a countable set. We can rewrite the equations in continuous form to apply to any set as follows:
$$A_c = \frac{\int_I E[O \mid \hat{d}, \vartheta]\, \mathrm{d}P_{\hat{D}, \theta}}{\int_I \mathrm{d}P_{\hat{D}, \theta}} = c.$$
Here $O$ is a random variable that is 1 if the choice is correct and 0 otherwise (outcome, see above):
$$E[O \mid \hat{d}, \vartheta] = P(H_1 \mid \hat{d}, \vartheta) = c \quad \text{for all } (\hat{d}, \vartheta) \in I.$$

Note that these considerations about confidence do not depend on a particular theory of perception, that is, the function mapping from the external variable (stimulus) onto the internal percept: $\hat{D} = \hat{D}(D)$. Furthermore, the derivation also does not depend on a particular theory of decision, that is, the function between the percept and the choice: $\theta = \theta(\hat{D})$. This includes both deterministic and stochastic decision models, the latter referring to models where a percept does not uniquely determine a choice. In the case of deterministic decision models, the percept unequivocally determines the choice; thus, in the equations, we could drop the choice from the inverse image of confidence, taking only the percept into account: $\xi^{-1}(c) \subseteq \hat{\mathcal{D}}$ instead of $\xi^{-1}(c) \subseteq \hat{\mathcal{D}} \times \Theta$. However, as this simplified version would not include stochastic decision models, we chose to adhere to the general formalization.

Another notable aspect of this derivation is that there is no need for an order relation to be defined on the percept space. However, if the choice is fixed (or determined by the percept, as in deterministic decision models), confidence defines a natural relation on percepts: $\hat{d}_1 \preceq \hat{d}_2$ if and only if $\xi(\hat{d}_1, \vartheta) \leq \xi(\hat{d}_2, \vartheta)$. More precisely, the order relation on confidence values can be pulled back to the percept space by taking $\xi^{-1}$ and restricting it to a particular choice.

Therefore, we can define the relative terms low-confidence percept and high-confidence percept based on the relation of the confidence values the percepts map onto by the belief function; we will use this concept while proving theorem 2. Note that this relation always refers to fixed choices.

2.3  Confidence Increases with Increasing Evidence Discriminability for Correct Choices and Decreases for Incorrect Choices

Psychophysical studies require that decision performance is measured at varying levels of decision difficulty. This necessitates the quantification of the decision difficulty axis, along which the proportion of correct choices can then be measured. Such interrelations, termed psychometric functions, provide a good handle on behavioral performance, allowing the detection of subtle changes in behavior. However, there is no single way of grading choice difficulty, resulting in a broad variety of such measures, complicating the theoretical treatment of psychometric functions. Therefore, we define evidence discriminability by its property of measuring difficulty as a class of functions in order to provide a general treatment of the interrelations of choice difficulty and confidence.

Definition 5.
Define evidence discriminability as a (deterministic) function of the evidence distribution:
$$\Delta = \Delta(P_D). \tag{2.6}$$
The evidence discriminability function has to fulfill the following property:
$$\Delta\big(P_D^{(1)}\big) > \Delta\big(P_D^{(2)}\big) \iff E\big[O \mid P_D^{(1)}\big] > E\big[O \mid P_D^{(2)}\big], \tag{2.7}$$
that is, higher discriminability should be equivalent to greater expected outcome (higher probability of correct choices). Any monotonically increasing function of expected outcome satisfies this criterion and can serve as evidence discriminability.
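
As an illustration (a sketch under assumed parameters, not part of the definition), in the gaussian sign-choice model above, the absolute evidence strength $|d|$ qualifies as an evidence discriminability function, since the expected outcome $\Phi(|d|/\sigma)$ increases monotonically with it. A quick Monte Carlo check:

```python
import numpy as np

SIGMA = 1.0
rng = np.random.default_rng(0)

def expected_outcome(strength, n=200_000):
    """Monte Carlo estimate of E[O] when the evidence is +/- strength."""
    signs = rng.choice([-1.0, 1.0], n)         # which hypothesis is true
    percept = signs * strength + rng.normal(0.0, SIGMA, n)
    return np.mean(np.sign(percept) == signs)  # sign choice correct?

# Expected outcome rises monotonically with |d|, so property 2.7 holds.
for strength in [0.0, 0.5, 1.0, 2.0]:
    print(strength, expected_outcome(strength))
```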

Having defined evidence discriminability, we can now examine how confidence changes with evidence discriminability separately for correct and incorrect choices. We show below that while confidence increases with increasing evidence discriminability for correct choices, it counterintuitively decreases for incorrect choices.

Theorem 2.

Let us assume that:

  • Belief independence: the belief function ($\xi$) is independent of evidence discriminability.

  • Percept monotonicity: for any given confidence $c$, the relative frequency of percepts mapping to $c$ by $\xi$ changes monotonically with evidence discriminability for any fixed choice.

Under these assumptions, confidence increases for correct choices and decreases for incorrect choices with increasing evidence discriminability.

Proof.

We begin with the somewhat counterintuitive claim regarding the incorrect choices. Let us first examine the two assumptions in more detail.

The first assumption postulates that the function from percept-choice pairs to confidence does not change with evidence discriminability. Thus, whenever we calculate the expected value of confidence over a percept distribution, only the percept distributions will depend on evidence discriminability.

For incorrect choices, the second assumption means that with increasing evidence discriminability, the relative frequency of low-confidence percepts increases while the relative frequency of high-confidence percepts decreases in the percept distribution. Note that low-confidence and high-confidence percepts are defined here through the relation imposed by $\xi$ on the percepts (see our comment at the end of the previous section). As a trivial consequence of this definition, confidence changes monotonically along low- and high-confidence percepts.

Let us consider two different levels of evidence discriminability ($\Delta_1 < \Delta_2$), with corresponding distributions of percept restricted to incorrect choices $P$ ($\Delta_1$, low evidence discriminability, i.e., “difficult choice”) and $Q$ ($\Delta_2$, high evidence discriminability, i.e., “easy choice”). It is sufficient to show that the expected value of confidence is larger for $\Delta_1$ than for $\Delta_2$:
$$\int_0^1 c\, p(c)\, \mathrm{d}c > \int_0^1 c\, q(c)\, \mathrm{d}c,$$
where $p$ and $q$ denote the probability density functions corresponding to $P$ and $Q$, respectively. Note that $p(c)$ can be thought of as the probability of the image of $c$ by $\xi^{-1}$ restricted to incorrect choices in the percept space.
Equivalently,
$$\int_0^1 c\, \big(p(c) - q(c)\big)\, \mathrm{d}c > 0.$$
Let $I_1$ denote the interval where $p \geq q$ and $I_0$ the complementary interval where $p < q$. The existence of these intervals is the consequence of the monotonicity assumption. Thus, there is a critical confidence value (denoted here by $c_{\mathrm{crit}}$) for which $c \geq c_{\mathrm{crit}}$ on $I_1$ and $c < c_{\mathrm{crit}}$ on $I_0$. We then rewrite confidence as $c = c_{\mathrm{crit}} + c_1$ if $c \in I_1$ and $c = c_{\mathrm{crit}} - c_0$ if $c \in I_0$; thus, $c_1, c_0 \geq 0$ for both cases. Applying these notations,
$$\begin{aligned}
\int_0^1 c\, \big(p(c) - q(c)\big)\, \mathrm{d}c
&= \int_{I_1} (c_{\mathrm{crit}} + c_1)\big(p(c) - q(c)\big)\, \mathrm{d}c + \int_{I_0} (c_{\mathrm{crit}} - c_0)\big(p(c) - q(c)\big)\, \mathrm{d}c \\
&= c_{\mathrm{crit}} \int_0^1 \big(p(c) - q(c)\big)\, \mathrm{d}c + \int_{I_1} c_1 \big(p(c) - q(c)\big)\, \mathrm{d}c - \int_{I_0} c_0 \big(p(c) - q(c)\big)\, \mathrm{d}c.
\end{aligned}$$
In the last step, the first term is 0, since
$$\int_0^1 p(c)\, \mathrm{d}c = \int_0^1 q(c)\, \mathrm{d}c = 1.$$
The second term is positive, since $c_1$ is positive on $I_1$ and the probability density functions are evaluated on $I_1$, where $p \geq q$. Finally, the third term is also positive, because $c_0$ is positive on $I_0$ and the probability density functions are evaluated on $I_0$, where $p < q$. As a consequence, the sum is positive, which completes the proof for incorrect choices.

For correct choices, high-confidence percepts are increasingly more likely with increasing evidence discriminability; thus, they present an opposite pattern compared to incorrect choices. Therefore, a symmetric derivation proves the increase of confidence with increasing evidence discriminability for correct choices.

The assumption that $\xi$ is independent of evidence discriminability is necessary for this derivation. In this framework, confidence is defined through the true distributions of correct and incorrect choices, which are kept fixed; therefore this assumption is met. However, if confidence values are updated based on distributions reflecting varying values of evidence discriminability, then the belief function will differ according to evidence discriminability, and the above proof does not apply. Indeed, in that case the expected value of confidence cannot decrease with increasing evidence discriminability for incorrect choices: for the lowest levels of discriminability, when the outcome is at chance level, confidence falls to its lowest possible value, reflecting equal probabilities of the null and alternative hypotheses regardless of the percept. This represents situations in which the decision maker is provided with information about evidence discriminability, for example, by grouping decisions of similar discriminabilities (as in a block experimental design), providing an opportunity to learn about evidence discriminability and update the distributions underlying confidence accordingly. Thus, theorem 2 applies only when updating confidence based on knowledge of evidence discriminability is prevented, for example, by randomizing the order of choices with different discriminability levels in an interleaved design.
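
The block-design case is easy to illustrate numerically. The sketch below is our own construction (two point hypotheses at $\pm d$ with gaussian noise of assumed unit standard deviation), in which the belief function is conditioned on the known discriminability; mean error confidence then starts at its lowest possible value of 0.5 and rises, rather than falls, with discriminability:

```python
import numpy as np

rng = np.random.default_rng(3)
SIGMA = 1.0

# Block design: the decision maker knows d, so the belief function itself
# depends on d. With two point hypotheses at +d and -d and gaussian noise,
# confidence in the sign-based choice is a logistic function of |percept|:
# P(chosen hypothesis | percept) = 1 / (1 + exp(-2 * d * |percept| / sigma^2)).
for d in [0.0, 0.5, 1.0, 2.0]:
    percept = d + rng.normal(0.0, SIGMA, 500_000)   # true hypothesis is +d
    error = percept < 0                             # sign choice is wrong
    conf = 1.0 / (1.0 + np.exp(-2.0 * d * np.abs(percept) / SIGMA**2))
    print(d, round(conf[error].mean(), 3))          # 0.5, then increasing
```

Consistent with the argument above, the counterintuitive decrease of error confidence with discriminability is a signature of the interleaved design only.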

2.4  Confidence Predicts Outcome Beyond Evidence Discriminability

Psychometric functions reveal accuracy for any given level of evidence discriminability. While confidence also changes with evidence discriminability, it is not obvious whether it carries additional information allowing better prediction of outcome for a given level of evidence discriminability. Below we show that it does.

Theorem 3.

For any given evidence discriminability, accuracy for low-confidence choices is not larger than that of high-confidence choices (splitting the confidence distribution at any particular value). A strict inequality holds in all cases when accuracy is dependent on the percept.

Proof.

Let us take the set of low-confidence percept-choice pairs corresponding to the low-confidence choices by $\xi^{-1}$ and, similarly, the set of high-confidence percept-choice pairs corresponding to the high-confidence choices. By the definition of confidence (see definition 2 in section 2.1), low-confidence percept-choice pairs cannot have higher accuracy than the high-confidence percept-choice pairs. If all percepts are associated with the same accuracy (either when the percept does not carry information about the hypotheses of choice or when the percept determines the correct choice with a probability of one), the two accuracies are equal. Otherwise, the two accuracies necessarily differ, in which case the strict inequality holds.

Thus, even for the same level of difficulty, the internal noise (e.g., imperfect perception) can result in different percepts, some being “easier” and others “harder.” While the decision maker has access to this internal variable, the experimenter does not. However, the confidence report contains at least part of the relevant information, providing additional information to the experimenter and thereby improving the experimenter's estimate of accuracy.

2.5  The Average Confidence in Neutral Evidence

Next, we examine the average confidence at neutral evidence, that is, evidence carrying no information about the correct choice, for one-dimensional variables.

Theorem 4.

Assuming

  • The percept is determined by a symmetric distribution centered on the evidence (“symmetric noise model”),

  • The evidence is distributed uniformly over the evidence space,

  • The choice is deterministic,

the average confidence for neutral evidence is precisely 0.75.

Proof.

We first prove the following lemma.

Lemma 1.
Integrating the product of the probability density function and the distribution function of any probability distribution symmetric to zero over the positive half-line results in 3/8:
$$\int_0^\infty f(x)\, F(x)\, \mathrm{d}x = \frac{3}{8}. \tag{2.8}$$
Proof.
For all $x$,
$$F(x) = \int_{-\infty}^{x} f(t)\, \mathrm{d}t = \frac{1}{2} + \int_{0}^{x} f(t)\, \mathrm{d}t,$$
so
$$\int_0^\infty f(x) F(x)\, \mathrm{d}x = \frac{1}{2} \int_0^\infty f(x)\, \mathrm{d}x + \int_0^\infty \int_0^x f(x) f(t)\, \mathrm{d}t\, \mathrm{d}x = \frac{1}{4} + \int_0^\infty \int_0^x f(x) f(t)\, \mathrm{d}t\, \mathrm{d}x,$$
using that
$$\int_0^\infty \int_0^x f(x) f(t)\, \mathrm{d}t\, \mathrm{d}x + \int_0^\infty \int_x^\infty f(x) f(t)\, \mathrm{d}t\, \mathrm{d}x = \left( \int_0^\infty f(x)\, \mathrm{d}x \right)^{2} = \frac{1}{4}$$
for changing the integral boundaries and then swapping $x$ and $t$ in the second integral term, which shows that the two double integrals on the left are equal, so each equals $\frac{1}{8}$. Applying the above equation, we can write
$$\int_0^\infty f(x) F(x)\, \mathrm{d}x = \frac{1}{4} + \frac{1}{8} = \frac{3}{8}.$$
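
Lemma 1 is also easy to verify numerically. The following sketch (our check; the particular distributions are arbitrary choices) integrates $f(x)F(x)$ over the positive half-line for several zero-symmetric distributions:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Each integral of pdf * cdf over (0, inf) should equal 3/8 = 0.375,
# regardless of which zero-symmetric distribution is used.
for dist in [stats.norm(), stats.laplace(), stats.logistic(), stats.t(df=3)]:
    value, _ = quad(lambda x: dist.pdf(x) * dist.cdf(x), 0, np.inf)
    print(f"{value:.6f}")  # prints 0.375000 for every distribution
```
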
Proof of Theorem 4.
Confidence for neutral evidence is determined by the percept corresponding to neutral evidence and the probability of being correct provided the percept and the choice. Thus, the average confidence for neutral evidence can be calculated by integrating over the distribution of percepts provided neutral evidence (indicated here by $d = 0$):
$$E[C \mid D = 0] = \int P(H_1 \mid \hat{d}, \vartheta)\, f_{\hat{D} \mid D = 0}(\hat{d})\, \mathrm{d}\hat{d}.$$
Since we assumed deterministic choice (the third assumption), confidence is determined by the percept. Therefore, we can drop $\vartheta$ from the equation:
$$E[C \mid D = 0] = \int P(H_1 \mid \hat{d})\, f_{\hat{D} \mid D = 0}(\hat{d})\, \mathrm{d}\hat{d}.$$
Based on our first assumption, the percept is determined by a symmetric distribution around the evidence. Denote the density function of this symmetric (“noise”) distribution by $f$ and its distribution function by $F$. Since the percept distribution is symmetric,
$$f_{\hat{D} \mid D = 0}(\hat{d}) = f(\hat{d}).$$
As a consequence of the second assumption of uniform evidence distribution, for $\hat{d} > 0$:
$$P(H_1 \mid \hat{d}) = P(D > 0 \mid \hat{d}) = \frac{\int_0^\infty f(\hat{d} - d)\, f_D(d)\, \mathrm{d}d}{\int_{-\infty}^{\infty} f(\hat{d} - d)\, f_D(d)\, \mathrm{d}d} = F(\hat{d}),$$
using the theorem of total probability. In the last step, we use that $f_D$ is constant because of the uniformity assumption. (Note that we restrict the support of $\hat{D}$ to that of $D$.) By symmetry, confidence for $\hat{d} < 0$ equals $F(-\hat{d})$. Thus,
$$E[C \mid D = 0] = 2 \int_0^\infty F(\hat{d})\, f(\hat{d})\, \mathrm{d}\hat{d}.$$
By applying the lemma,
$$E[C \mid D = 0] = 2 \cdot \frac{3}{8} = \frac{3}{4}.$$

Note that one of the critical assumptions we made is that the evidence is distributed uniformly over the evidence space. While real-life scenarios will often have nonuniform evidence distributions, this uniformity property holds approximately true for many psychophysics experiments using interleaved evidence strengths. Therefore, theorem 4 provides a quantitatively testable prediction about confidence reports in psychophysics experiments. More generally, this proof points to apparent overconfidence in percepts with neutral evidence in situations when the difficulty of the decisions cannot be determined. The degree of overconfidence will depend on the actual integrals involved.

2.6  Monte Carlo Simulations Illustrating the Signatures of Decision Confidence

To illustrate our theory, we created a Monte Carlo simulation of the normative definition of confidence. For the simulation, we assumed that gaussian noise of fixed standard deviation corrupts the external evidence to generate an internal percept. We used a deterministic decision rule based on the sign of the percept. Thus, outcomes were correct if the signs of the evidence and percept matched. Using these definitions to determine percept and outcome for each trial, we generated a large number of trials, each producing a choice and a percept. We then discretized the percepts of all trials into 200 bins and determined the fraction of correct decisions in each bin. Confidence for each trial was assigned by matching the trial's percept to the corresponding confidence value, equal to the fraction of correct trials for that trial's percept bin. This enabled us to explore the predicted interrelationships between confidence, evidence discriminability, and choice. Figure 2A shows that confidence predicts the mean choice accuracy (see theorem 1). Figure 2B demonstrates that mean confidence for a given level of evidence discriminability increases for correct and decreases for incorrect choices (see theorem 2), and that the mean confidence for neutral evidence is 0.75 (see theorem 4). Figure 2C illustrates that for each given level of evidence discriminability, accuracy for high-confidence choices is greater than for low-confidence choices (see theorem 3). Note that while accuracy across simulation trials with low confidence falls to chance (see Figure 2A), conversely, the average confidence for neutral evidence is at midrange (see Figure 2B). This seemingly contradictory result is a consequence of grouping decisions in different ways. In Figure 2A, accuracy is conditioned on a group of choices with a given confidence level. In Figure 2B, individual percepts corresponding to neutral evidence will often be far away from neutral, leading to an apparent “overconfidence.” Nevertheless, the neutral-evidence group of choices as a whole, by definition, will lead to chance-level accuracy. Taken together, these plots illustrate four signatures of decision confidence in terms of externally quantifiable variables that can be experimentally examined.
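
A compact version of this simulation can be written in a few lines. The sketch below is our re-implementation with illustrative assumptions (far fewer trials than the figure, evidence uniform on $[-1, 1]$, and noise standard deviation 0.3, kept small relative to the evidence range so that the uniformity assumption of theorem 4 approximately holds); it reproduces all four signatures:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1_000_000      # trials (the figure uses 10 billion; this suffices here)
SIGMA = 0.3        # noise std, small relative to the evidence range

evidence = rng.uniform(-1.0, 1.0, N)                  # interleaved difficulties
percept = evidence + rng.normal(0.0, SIGMA, N)        # internal percept
correct = np.sign(percept) == np.sign(evidence)       # sign-based choice

# Assign confidence empirically: fraction correct within each of 200
# percept bins, mirroring the binning procedure described above.
edges = np.quantile(percept, np.linspace(0.0, 1.0, 201))
idx = np.clip(np.digitize(percept, edges) - 1, 0, 199)
bin_acc = np.array([correct[idx == b].mean() for b in range(200)])
confidence = bin_acc[idx]

lo, hi = confidence < 0.75, confidence >= 0.75
easy = np.abs(evidence) > 0.5

# Signature 1 (theorem 1): confidence predicts accuracy.
print(confidence[lo].mean(), correct[lo].mean())      # approximately equal
print(confidence[hi].mean(), correct[hi].mean())      # approximately equal

# Signature 2 (theorem 2): mean confidence rises with discriminability
# for correct choices and falls for errors.
print(confidence[correct & ~easy].mean(), confidence[correct & easy].mean())
print(confidence[~correct & ~easy].mean(), confidence[~correct & easy].mean())

# Signature 3 (theorem 4): mean confidence for near-neutral evidence ~0.75.
print(confidence[np.abs(evidence) < 0.01].mean())

# Signature 4 (theorem 3): at fixed discriminability, high-confidence
# trials are more accurate than low-confidence trials.
fixed = np.abs(np.abs(evidence) - 0.25) < 0.05
print(correct[fixed & lo].mean(), correct[fixed & hi].mean())
```

With these parameters the printed values track theorems 1 to 4; widening the noise toward the evidence range reintroduces boundary effects, and the neutral-evidence mean drifts below 0.75, as noted after theorem 4.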

Figure 2:

The normative model of confidence predicts specific interrelationships between evidence, outcome, and confidence. Monte Carlo simulations of the normative model (10 billion trials). Bins with fewer than 100 simulation data points were omitted. (A) Confidence equals accuracy. (B) Average confidence increases with evidence discriminability from 0.75 for correct choices (green) and decreases for errors (red). (C) Conditioning on high (light grey) or low (dark grey) confidence (split at 0.75) segregates psychometric performance.


2.7  Relating p-Values to Confidence

We derived properties of statistical confidence based on the definition of decision confidence as a Bayesian posterior probability. We next sought to demonstrate the generality of these predictions by determining whether they apply to confidence values produced by other statistical approaches. Therefore, we constructed a simulation to test the properties of p-values produced by a common statistical test for evaluating a choice between two hypotheses. First, we examined the one-sided, two-sample Student t-test (see Figures 3A–3C). Samples of 20 measurements were drawn from two gaussian distributions on each simulation trial, where the simulated task was to identify which underlying distribution had the larger mean. To create graded discriminability, we varied the distance between the means from −0.5 to 0.5 with uniform probability. A simulation trial was designated as correct if the mean of the 20 samples drawn from the distribution with the higher mean was higher than the mean of the 20 samples drawn from the distribution with the lower mean. We computed the p-value for each trial using a one-sided two-sample t-test to provide a measure of statistical confidence ($1 - p$) in the chosen response. Thus, each simulation trial yielded an outcome (correct or error) and a measure of statistical confidence. Second, we performed simulations of a bootstrap test (Efron & Tibshirani, 1993), which does not depend on a gaussian assumption about the underlying distributions (see Figures 3D–3F). Exponential sample distributions were used. Offsets for the population means were uniform, ranging between 0 and 1, and the bootstrap sample size was 1000. As shown in Figure 3, the p-values derived from a t-test and a two-sample bootstrap test for a difference between means reveal the same pattern of interrelationships we derived from the Bayesian confidence definition. Thus, the predictions we derived for statistical decision confidence are valid across different statistical approaches: Bayesian, frequentist, and bootstrap statistics.
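
A minimal sketch of the t-test simulation follows (our re-implementation, with far fewer trials and assumed unit-variance gaussians; scipy's ttest_ind supplies the one-sided p-value). It shows the same qualitative pattern:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
N = 20_000   # simulation trials (the text uses 10 million; this is a quick check)
n = 20       # measurements per distribution, as in the text

conf = np.empty(N)
correct = np.empty(N, dtype=bool)
disc = np.empty(N)
for i in range(N):
    d = rng.uniform(-0.5, 0.5)            # distance between the two means
    a = rng.normal(+d / 2, 1.0, n)        # unit variance is our assumption
    b = rng.normal(-d / 2, 1.0, n)
    correct[i] = (a.mean() > b.mean()) == (d > 0)
    hi, lo = (a, b) if a.mean() > b.mean() else (b, a)
    # One-sided two-sample t-test in the chosen direction:
    p = stats.ttest_ind(hi, lo, alternative='greater').pvalue
    conf[i] = 1.0 - p                     # statistical confidence, as in Figure 3
    disc[i] = abs(d)

easy = disc > 0.25
print(correct[conf > 0.9].mean(), correct[conf <= 0.9].mean())      # tracks accuracy
print(conf[correct & easy].mean(), conf[correct & ~easy].mean())    # rises if correct
print(conf[~correct & easy].mean(), conf[~correct & ~easy].mean())  # falls if wrong
```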

Figure 3:

Two statistical tests reproduce patterns predicted by the normative model. (A–C) A simulation of 10 million trials evaluated the one-sided, two-sample Student t-test p-value with respect to accuracy and evidence discriminability. Since p indicates uncertainty, axes show $1 - p$ to indicate confidence. (A) $1 - p$ is positively correlated with accuracy. (B) $1 - p$ increases monotonically with evidence discriminability for correct trials (green) and decreases for error trials (red). (C) P-values contain information about outcome even at fixed evidence discriminability. High-confidence trials are shown in light grey; low-confidence trials are shown in dark grey. (D–F) A simulation of the p-value in a one-sided bootstrap test for an ordinal relationship between two means, using exponential distributions.


3  Discussion

We presented a normative statistical framework that enables the comparison of statistical decision confidence with confidence measures in other domains. Unlike signal detection theory and other algorithmic frameworks that simulate confidence judgments based on assumptions about the underlying evidence distributions, we show that a strict analytical treatment is possible in a distribution-free manner.

We analytically derived a set of properties of confidence defined as the Bayesian posterior probability of a chosen hypothesis being correct (Pouget, Drugowitsch, & Kepecs, 2016). First, confidence predicts accuracy: the level of confidence predicts the expected fraction of correct choices. This property corresponds most directly to the intuitive notion of confidence as a graded forecast about accuracy. Second, mean confidence for a given level of external evidence is larger for correct than incorrect choices and in fact varies with an opposite sign with evidence discriminability for correct versus incorrect choices. Specifically, mean confidence levels increase with the ease of discriminability for correct choices, but counterintuitively, confidence decreases with increasing evidence discriminability for incorrect choices. This surprising dissociation is a consequence of the differences in the distributions of conditional percepts between correct and incorrect choices. Third, and perhaps most surprising, when presented with equal amounts of evidence supporting each hypothesis—in other words, an indiscriminable choice that will lead to chance accuracy—the mean decision confidence is much greater than chance: precisely 0.75. Fourth, while the psychometric function defines the average choice accuracy for a given level of external evidence, knowledge of confidence provides further information improving the prediction of accuracy for any given level of discriminability.

These four properties are signatures of confidence in terms of observable variables and thus useful for interpreting both behavioral and physiological experiments on decision confidence. Behaviorally, our framework makes clear that even statistically optimal confidence reports can appear to show systematic miscalibration. This mismatch between confidence reports and accuracy is most dramatically illustrated by the mean confidence of 0.75 for neutral evidence that produces chance accuracy behaviorally. As our framework makes clear, this apparent miscalibration does not imply imperfect prediction of accuracy; rather, it is a straightforward consequence of conditioning confidence reports on external variables of the task (e.g., stimulus difficulty) that are not available to the decision maker (see also Drugowitsch et al., 2014). This property of statistical confidence carries important implications for the interpretation of studies demonstrating overconfidence in low-discriminability and underconfidence in high-discriminability conditions, a controversial phenomenon termed the hard-easy effect (Drugowitsch et al., 2014; Ferrell, 1995; Harvey, 1997; Juslin, Winman, & Olsson, 2000; Merkle, 2009; Moore & Healy, 2008). More generally, one has to be careful when analyzing behavior or neural activity by conditioning on external variables not available to the decision maker. When internal representations are examined as a function of external variables, a computational theory is needed to understand how observables conditioned on the external variables are linked to the internal representations. Therefore, rather than revealing miscalibration, conditioning on external variables can be used to test the signatures of decision confidence we derived (see Figure 2) and will be valuable in interpreting putative confidence-related neural activity as well (Kepecs et al., 2008; Komura et al., 2013; Kiani & Shadlen, 2009).

The framework we presented can be interpreted as a prescriptive model, describing how the computation of confidence ought to be performed. In this sense, it is useful for describing what a neural representation of confidence or its behavioral report should look like. Beyond this, we expect that our mathematical framework will serve as a departure point for quantitatively studying the contribution of confidence to different behaviors and identifying confidence variables in other domains.

References

Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen, J. D. (2006). The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113(4), 700–765.
Cox, D. R. (2006). Principles of statistical inference. Cambridge: Cambridge University Press.
Drugowitsch, J., Moreno-Bote, R., & Pouget, A. (2014). Relation between belief and performance in perceptual decision making. PLoS One, 9(5), e96511.
Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. Boca Raton, FL: CRC Press.
Ferrell, W. R. (1995). A model for realism of confidence judgments: Implications for underconfidence in sensory discrimination. Perception and Psychophysics, 57(2), 246–254.
Ferrell, W. R., & McGoey, P. J. (1980). A model of calibration for subjective probabilities. Organizational Behavior and Human Performance, 26(1), 32–53.
Fleming, S. M., & Dolan, R. J. (2010). Effects of loss aversion on post-decision wagering: Implications for measures of awareness. Consciousness and Cognition, 19(1), 352–363.
Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. Cognitive Psychology, 24(3), 411–435.
Harvey, N. (1997). Confidence in judgment. Trends in Cognitive Sciences, 1(2), 78–82.
Higham, P. A., & Arnold, M. M. (2007). Beyond reliability and validity: The role of metacognition in psychological testing. In R. A. DeGregorio (Ed.), New developments in psychological testing (pp. 139–162). Hauppauge, NY: Nova Science.
Higham, P. A., Perfect, T. J., & Bruno, D. (2009). Investigating strength and frequency effects in recognition memory using type-2 signal detection theory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(1), 57–80.
Insabato, A., Pannunzi, M., Rolls, E. T., & Deco, G. (2010). Confidence-related decision making. Journal of Neurophysiology, 104(1), 539–547.
Juslin, P., Winman, A., & Olsson, H. (2000). Naive empiricism and dogmatism in confidence research: A critical examination of the hard-easy effect. Psychological Review, 107(2), 384–396.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3(3), 430–454.
Kepecs, A., Uchida, N., Zariwala, H. A., & Mainen, Z. F. (2008). Neural correlates, computation and behavioural impact of decision confidence. Nature, 455(7210), 227–231.
Kiani, R., & Shadlen, M. N. (2009). Representation of confidence associated with a decision by neurons in the parietal cortex. Science, 324(5928), 759–764.
Komura, Y., Nikkuni, A., Hirashima, N., Uetake, T., & Miyamoto, A. (2013). Responses of pulvinar neurons reflect a subject's confidence in visual categorization. Nature Neuroscience, 16(6), 749–755.
Kunimoto, C., Miller, J., & Pashler, H. (2001). Confidence and accuracy of near-threshold discrimination responses. Consciousness and Cognition, 10(3), 294–340.
Lachman, J. L., Lachman, R., & Thronesbery, C. (1979). Metamemory through the adult life span. Developmental Psychology, 15(5), 543–551.
Lak, A., Costa, G. M., Romberg, E., Koulakov, A. A., Mainen, Z. F., & Kepecs, A. (2014). Orbitofrontal cortex is required for optimal waiting based on decision confidence. Neuron, 84(1), 190–201.
Ma, W. J. (2010). Signal detection theory, uncertainty, and Poisson-like population codes. Vision Research, 50(22), 2308–2319.
Maniscalco, B., & Lau, H. (2012). A signal detection theoretic approach for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition, 21(1), 422–430.
Merkle, E. C. (2009). The disutility of the hard-easy effect in choice confidence. Psychonomic Bulletin and Review, 16(1), 204–213.
Moore, D. A., & Healy, P. J. (2008). The trouble with overconfidence. Psychological Review, 115(2), 502–517.
Moreno-Bote, R. (2010). Decision confidence and uncertainty in diffusion models with partially correlated neuronal integrators. Neural Computation, 22(7), 1786–1811.
Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95(1), 109–133.
Pleskac, T. J., & Busemeyer, J. R. (2010). Two-stage dynamic signal detection: A theory of choice, decision time, and confidence. Psychological Review, 117(3), 864–901.
Pouget, A., Drugowitsch, J., & Kepecs, A. (2016). Confidence and certainty: Distinct probabilistic quantities for different goals. Nature Neuroscience, 19, 366–374.
Ratcliff, R., & Starns, J. J. (2009). Modeling confidence and response time in recognition memory. Psychological Review, 116(1), 59–83.
Rolls, E. T., Grabenhorst, F., & Deco, G. (2010). Decision-making, errors, and confidence in the brain. Journal of Neurophysiology, 104(5), 2359–2374.
Schustek, P., & Moreno-Bote, R. (2014). A theory of decision-making using diffusion-to-bound models: Choice, reaction-time and confidence. BMC Neuroscience, 15(Suppl. 1), P88.
Vickers, D. (1979). Decision processes in visual perception. New York: Academic Press.
Wei, Z., & Wang, X.-J. (2015). Confidence estimation as a stochastic process in a neurodynamical system of decision making. Journal of Neurophysiology, 114(1), 99–113.
Zylberberg, A., Barttfeld, P., & Sigman, M. (2012). The construction of confidence in a perceptual decision. Frontiers in Integrative Neuroscience, 6.