AI becomes curiouser and curiouser, but not too curious

AI trained on Mario Kart and other video games using a new algorithm to optimise curiosity is a step towards making AI agents as smart as kids, say experts

Researchers in the United States have created an algorithm designed to prevent artificial intelligence from becoming “too curious” and are training AI agents to use it with video games.

Experts working at MIT’s Improbable AI Laboratory and Computer Science and Artificial Intelligence Laboratory (CSAIL) say their algorithm automatically increases curiosity when it's required and then suppresses it if the agent has enough supervision to know what to do.

Existing systems rely on “reinforcement learning”, in which an AI agent learns iteratively by being rewarded for good behaviour and punished for bad. These agents can struggle to balance time spent discovering better actions against time spent taking actions that led to high rewards in the past. Too much curiosity can distract the agent from making good decisions, say the researchers, while too little means the agent will never discover good decisions.
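To make the idea of a curiosity reward concrete, here is a minimal Python sketch of a count-based novelty bonus, a textbook technique in which a state earns a smaller bonus each time it is revisited. It is an illustration only and is not the MIT team's algorithm. The class name, the `beta` weight and the state label `"room_1"` are assumptions made for this example.

```python
import math
from collections import Counter


class CountBasedCuriosity:
    """Adds a novelty bonus that shrinks as a state is visited more often."""

    def __init__(self, beta: float) -> None:
        self.beta = beta         # hypothetical weight on curiosity
        self.visits = Counter()  # number of visits per state

    def shaped_reward(self, state, extrinsic: float) -> float:
        """Return the game's own reward plus the weighted curiosity bonus."""
        self.visits[state] += 1
        bonus = 1.0 / math.sqrt(self.visits[state])
        return extrinsic + self.beta * bonus


curiosity = CountBasedCuriosity(beta=0.5)

# First visit to a state that pays no game reward: the bonus is at its largest.
first_visit = curiosity.shaped_reward("room_1", extrinsic=0.0)

# Three more visits to the same state: the bonus has halved by the fourth.
fourth_visit = first_visit
for _ in range(3):
    fourth_visit = curiosity.shaped_reward("room_1", extrinsic=0.0)
```

A large `beta` keeps the agent chasing novelty even when the game's own reward already points the way, while `beta = 0` removes curiosity altogether. That is the balance the article describes.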

MIT’s new algorithm was tested on more than 60 video games and succeeded at both hard and easy exploration tasks. Previous algorithms have been able to tackle only a hard or an easy domain, not both, which means the new method requires less data.

“If you master the exploration-exploitation trade-off well, you can learn the right decision-making rules faster — and anything less will require lots of data, which could mean suboptimal medical treatments, lesser profits for websites, and robots that don't learn to do the right thing,” says Pulkit Agrawal, an Assistant Professor of Electrical Engineering and Computer Science (EECS) at MIT, Director of the Improbable AI Lab, and CSAIL affiliate who supervised the research. 

“Imagine a website trying to figure out the design or layout of its content that will maximise sales,” he says. “If one doesn’t perform exploration-exploitation well, converging to the right website design or the right website layout will take a long time, which means profit loss.”
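The website scenario can be sketched as a simple multi-armed bandit. The Python below uses epsilon-greedy, a textbook exploration strategy and not the MIT algorithm, to choose between layouts. The conversion rates, the function name and the `epsilon` values are all invented for illustration.

```python
import random


def run_epsilon_greedy(conversion_rates, epsilon, steps, seed=0):
    """Simulate showing layouts to visitors and return (shows, sales) per layout."""
    rng = random.Random(seed)
    n = len(conversion_rates)
    shows = [0] * n  # how often each layout was displayed
    sales = [0] * n  # how many sales each layout produced

    def observed_rate(i):
        return sales[i] / shows[i] if shows[i] else 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            layout = rng.randrange(n)              # explore: try a random layout
        else:
            layout = max(range(n), key=observed_rate)  # exploit: best so far
        shows[layout] += 1
        if rng.random() < conversion_rates[layout]:
            sales[layout] += 1
    return shows, sales


# Hypothetical true conversion rates; the third layout is the best one.
rates = [0.02, 0.05, 0.10]

# No curiosity: the site never tries anything except the first layout.
greedy_shows, greedy_sales = run_epsilon_greedy(rates, epsilon=0.0, steps=1000)

# A little curiosity: every layout can be sampled over time.
mixed_shows, mixed_sales = run_epsilon_greedy(rates, epsilon=0.1, steps=1000)
```

With `epsilon=0.0` the simulated site stays on the first layout for every visitor and never learns that a better one exists. Setting `epsilon` too high has the opposite cost, as visitors keep being shown layouts already known to sell poorly.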

New algorithm reduces a week of work to a few hours

In experiments, the researchers divided games like Mario Kart and Montezuma’s Revenge into two categories: “hard” exploration games, where supervision was sparse and the agent had less guidance, and “easy” exploration games, where supervision was denser. The team’s algorithm consistently performed well in both kinds of games.

“Getting consistent good performance on a novel problem is extremely challenging, so by improving exploration algorithms, we can save your effort on tuning an algorithm for your problems of interest,” says Zhang-Wei Hong, an EECS PhD student, CSAIL affiliate, and co-lead author along with Eric Chen on a new paper about the work. “We need curiosity to solve extremely challenging problems, but on some problems, it can hurt performance. Previously what took, for instance, a week to successfully solve the problem, with this new algorithm, we can get satisfactory results in a few hours.”

One of the greatest challenges for current AI and cognitive science is balancing exploration and exploitation, something children do seamlessly but which is hard to reproduce in computers, says Alison Gopnik, Professor of Psychology and Affiliate Professor of Philosophy at the University of California at Berkeley. “This paper uses impressive new techniques to accomplish this automatically, designing an agent that can systematically balance curiosity about the world and the desire for reward, [thus taking] another step towards making AI agents (almost) as smart as children.”
