AI learns to “exploit” and “abuse” the environment in a new experiment

What’s the deal: A new experiment by OpenAI, the research company co-founded by Elon Musk, has revealed some interesting insights into how intelligent behavior can emerge in machines. In a multi-agent intelligence experiment built on the simple rules of hide-and-seek, the company reports that AI agents discovered ways to “exploit” and “abuse” the virtual environment of the simulation they were placed in.

What precisely: The researchers’ objective was to find out if and how simple competitive rules in a multi-agent virtual environment could lead to intelligent behavior.

The agents played millions of rounds of hide-and-seek against each other and against older versions of themselves (self-play). They were able to learn sophisticated tool use, discover various strategies and “find a way to exploit the environment […] in an unintended way.”

How: The AI was trained using reinforcement learning, without being incentivized for any particular skill or technique.
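
For intuition only, here is a minimal, hypothetical sketch of that training recipe: two tabular agents play a toy hide-and-seek guessing game, receive nothing but a win/lose reward, and improve with a REINFORCE-style policy-gradient update while periodically training against frozen older opponents (self-play). The environment, names and hyperparameters below are illustrative assumptions, not OpenAI’s actual code or setup.

```python
import numpy as np

# Toy stand-in for competitive hide-and-seek (illustrative, NOT OpenAI's code):
# a hider picks one of N_SPOTS hiding places, a seeker gets K_GUESSES searches.
# The only signal either side ever sees is win/lose -- no skill-specific reward.

N_SPOTS, K_GUESSES, EPISODES, LR = 8, 2, 20_000, 0.05
rng = np.random.default_rng(0)

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

hider_logits = np.zeros(N_SPOTS)        # hider's preferences over spots
seeker_logits = np.zeros(N_SPOTS)       # seeker's preferences over spots
seeker_pool = [seeker_logits.copy()]    # frozen older seekers for self-play

for ep in range(EPISODES):
    # The hider plays against a randomly sampled older seeker (self-play).
    frozen = seeker_pool[rng.integers(len(seeker_pool))]
    hide_p = softmax(hider_logits)
    spot = rng.choice(N_SPOTS, p=hide_p)
    guesses = rng.choice(N_SPOTS, size=K_GUESSES, replace=False, p=softmax(frozen))
    hider_reward = 0.0 if spot in guesses else 1.0

    # REINFORCE: nudge the hider toward spots that went unfound.
    grad = -hide_p
    grad[spot] += 1.0
    hider_logits += LR * hider_reward * grad

    # The seeker plays the current hider and learns the mirror-image objective.
    seek_p = softmax(seeker_logits)
    guesses = rng.choice(N_SPOTS, size=K_GUESSES, replace=False, p=seek_p)
    seeker_reward = 1.0 if spot in guesses else 0.0
    grad = np.zeros(N_SPOTS)
    for g in guesses:
        step = -seek_p
        step[g] += 1.0
        grad += step
    seeker_logits += LR * seeker_reward * grad / K_GUESSES

    # Every so often, freeze a snapshot of the seeker into the opponent pool.
    if ep % 1_000 == 0:
        seeker_pool.append(seeker_logits.copy())

print("hider policy: ", np.round(softmax(hider_logits), 2))
print("seeker policy:", np.round(softmax(seeker_logits), 2))
```

Run long enough, the two policies chase each other: the hider drifts toward spots the frozen seekers neglect, and the seeker then adapts in turn, a miniature version of the escalating strategies the researchers report at far larger scale.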

Researchers said the AI agents exhibited surprising behaviors, such as building their own shelters from scratch by arranging multiple objects and structures.

What next: “We hope that with a much larger and more diverse environment, truly complex and intelligent agents will one day emerge,” the researchers say.

As one viewer put it, how long until the AI discovers it’s inside a simulation?

Check out the detailed video here

Further reading:
Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research