The value of confidence: Confidence prediction errors drive value-based learning in the absence of external feedback

Lena Esther Ptasczynski; Isa Steinecker; Philipp Sterzer; Matthias Guggenmos

doi:10.1371/journal.pcbi.1010580

What is it about?

We all make choices every day. It is well established that the value of choice options is flexibly and continuously updated by external reinforcement, such as social kudos or money. But does learning stagnate when feedback is unavailable, for example when practicing an instrument at home? Our work builds on recent empirical evidence indicating that the subjective feeling of confidence acts as an internally generated reinforcement learning signal when external feedback is unavailable. A tentative conclusion from these earlier studies is that internal feedback relies on neural machinery and computational logic similar to those at work when decision-contingent external reward or feedback is available.

In the present study, we tested the generalizability of such confidence-based learning signals by investigating them in a key domain of reward-based learning: instrumental conditioning. We reasoned that if choice confidence acts as a reinforcement signal in the absence of feedback, it should also affect the subjective value of chosen options. Indeed, the idea that the mere act of choosing changes the subjective values of choice options has been considered before. The most prominent example is Leon Festinger’s cognitive dissonance theory, which posits that the act of choosing influences subjective values as a form of post-hoc rationalization.

To experimentally probe such value-based learning in the absence of external feedback, and the role of confidence therein, we designed a value-based decision-making task in which participants had to learn the value of initially neutral stimuli in phases with and without external reward feedback, while reporting their subjective confidence after each choice. In agreement with our hypothesis, we found signatures of confidence-based learning, including an increase in subjective confidence and in choice consistency during phases without feedback, both pointing to a self-reinforcement of choice options.

To better understand the mechanisms of value learning in the absence of feedback, we devised a family of computational models in which learning is based on confidence prediction errors (analogous to reward prediction errors). A statistical model comparison demonstrated that these confidence-based learning models outperformed classical reinforcement learning models (which would predict either no change in subjective values or devaluation over time). Intriguingly, an analysis of computational parameters showed that individuals with more volatile reward-based learning also showed more volatile confidence-based learning, suggesting a common underlying trait.
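
The idea of a confidence prediction error can be illustrated with a minimal sketch. This summary does not give the model equations, so everything below is an assumption made for illustration and should not be read as the authors' model: the class name `Learner`, the parameters `alpha_r`, `alpha_c`, `gamma` and `beta`, the logistic read-out of confidence from the value difference, and the running estimate of expected confidence are all hypothetical. With feedback, the value of the chosen option moves toward the reward; without feedback, it moves by the difference between the current confidence and the expected confidence.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Learner:
    """Toy learner; all parameters and update rules are illustrative assumptions."""
    alpha_r: float = 0.3        # learning rate for reward prediction errors
    alpha_c: float = 0.1        # learning rate for confidence prediction errors
    gamma: float = 0.2          # update rate of the expected-confidence estimate
    beta: float = 5.0           # sensitivity of confidence to the value difference
    values: dict = field(default_factory=dict)  # stimulus name -> learned value
    expected_conf: float = 0.5  # running estimate of confidence

    def confidence(self, chosen, other):
        # Assumed read-out: logistic function of the chosen-minus-unchosen value.
        dv = self.values.get(chosen, 0.0) - self.values.get(other, 0.0)
        return 1.0 / (1.0 + math.exp(-self.beta * dv))

    def update_with_reward(self, chosen, reward):
        # Classical delta rule: move the chosen value toward the observed reward.
        v = self.values.get(chosen, 0.0)
        rpe = reward - v
        self.values[chosen] = v + self.alpha_r * rpe
        return rpe

    def update_without_feedback(self, chosen, other):
        # Confidence prediction error: current confidence minus expected confidence.
        cpe = self.confidence(chosen, other) - self.expected_conf
        self.values[chosen] = self.values.get(chosen, 0.0) + self.alpha_c * cpe
        self.expected_conf += self.gamma * cpe
        return cpe


# Usage: one rewarded trial, then one trial without feedback.
learner = Learner()
rpe = learner.update_with_reward("A", reward=1.0)
value_after_reward = learner.values["A"]
cpe = learner.update_without_feedback("A", "B")
value_after_confidence = learner.values["A"]
```

In this sketch, a choice made with higher-than-expected confidence raises the value of the chosen option even though no reward is shown, which is one way to obtain the self-reinforcement of choices described above. A classical reward-only learner would leave the value unchanged on such a trial.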

Why is it important?

Together, our findings provide evidence for a fundamental parallel between external reward-based and internal confidence-based feedback in human instrumental conditioning.

This page is a summary of: The value of confidence: Confidence prediction errors drive value-based learning in the absence of external feedback, PLoS Computational Biology, October 2022, PLOS,
DOI: 10.1371/journal.pcbi.1010580.
Contributors

The following have contributed to this page

Lena Esther Ptasczynski
Charité - Universitätsmedizin Berlin

Confidence is key: learned reward values are updated by choice confidence
