What is it about?

When a robot has to make decisions in the real world (say, finding its way across a hazardous radioactive environment), it has to balance two things: time spent thinking about what to do, and time spent actually doing it. Thinking too little leads to bad choices, but thinking too much wastes time and energy or, in our radioactive example, exposes the robot to more radiation while it deliberates. Usually a human designer fixes in advance how much thinking happens. We instead teach the robot to manage its own reasoning: to learn when to pause and think, where it is safest or cheapest to do so, and how hard to think in each situation. We show that a robot that controls its own thinking in this way makes better, cheaper decisions than approaches that cannot adapt their reasoning to their surroundings.

Why is it important?

Most existing work on getting an agent to "think about its own thinking" only decides how long to deliberate before committing to a plan. Such methods cannot interleave thinking and acting as they go, account for deliberation being riskier or costlier in some places than others, or cope with the uncertainty of real environments. To our knowledge, ours is the first method to handle the *when*, *where*, and *how* of an agent's reasoning together in a single framework, for the kind of unpredictable problems real robots actually face.

Perspectives

Humans rarely think for a fixed amount of time. We sense when a decision deserves real deliberation and when it doesn't, and we adjust on the fly. Most AI agents lack that instinct entirely. What I like about this work is that it gives an agent a way to develop this instinct: to learn for itself how much reasoning each moment is worth, rather than having that judgement hard-coded by a designer in advance.

Matthew Budd
Stateful Robotics

Read the Original

This page is a summary of: Think Fast! Learning to Control Online Reasoning in Stochastic Environments, International Foundation for Autonomous Agents and Multiagent Systems,
DOI: 10.65109/mxnx1909.