Artificial Intelligence (AI) has worked its way into video game development at many levels. However, an AI that can play video games the way humans do is a different matter. The Google DeepMind team might have cracked the code with the creation of the Scalable Instructable Multiworld Agent (SIMA).

Say Hello To SIMA From Google DeepMind, An AI That Can Play Video Games Like Humans

According to the SIMA team, this AI can “follow natural-language instructions to carry out tasks in a variety of video game settings”. Google DeepMind partnered with eight video game studios to train SIMA on nine different video games. The games that SIMA learned from are:

  • Eco (Strange Loop Games)
  • Goat Simulator 3 (Coffee Stain)
  • Hydroneer (Foulball Hangover)
  • No Man’s Sky (Hello Games)
  • Satisfactory (Coffee Stain)
  • Space Engineers (Keen Software House)
  • Teardown (Tuxedo Labs & Saber Interactive)
  • Valheim (Coffee Stain)
  • Wobbly Life (RubberBandGames)

These partnerships exposed SIMA to diverse video game environments. The Google DeepMind team also built a new environment in Unity called the Construction Lab, where agents need to build sculptures from building blocks. The task tests their object manipulation and their intuitive understanding of the physical world.

“By learning from different gaming worlds, SIMA captures how language ties in with game-play behavior. Our first approach was to record pairs of human players across the games in our portfolio, with one player watching and instructing the other. We also had players play freely, then rewatch what they did and record instructions that would have led to their game actions.”

While Google DeepMind has previously worked on Atari games and created the AlphaStar system, which could play StarCraft II at human-grandmaster level, the team called SIMA a new milestone that shifts the focus from a single game to “a general, instructable game-playing AI agent”.

“This research marks the first time an agent has demonstrated it can understand a broad range of gaming worlds, and follow natural-language instructions to carry out tasks within them, as a human might,” the team said.

How does SIMA work?

Google DeepMind described SIMA as “an AI agent that can perceive and understand a variety of environments, then take actions to achieve an instructed goal”. SIMA has two main components: a video model that predicts what will happen next on-screen, and a model designed for precise image-language mapping.

According to the developers, the model doesn’t need access to the game’s source code or bespoke APIs. Instead, it uses just two inputs: the natural-language instruction provided by the user and the images on the screen.

“SIMA uses keyboard and mouse outputs to control the games’ central character to carry out these instructions. This simple interface is what humans use, meaning SIMA can potentially interact with any virtual environment.”
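The interface described above (screen images and a text instruction in, keyboard and mouse actions out) can be pictured as a simple loop. The Python sketch below is only an illustration of that shape, not DeepMind's implementation: the names `Action`, `KeywordPolicy`, and `run_episode` are hypothetical, and the toy policy matches keywords in the instruction instead of running learned models.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence

# Hypothetical stand-in for a screen capture: rows of RGB pixels.
Frame = list[list[tuple[int, int, int]]]


@dataclass(frozen=True)
class Action:
    """One step of keyboard-and-mouse output (illustrative structure only)."""
    keys: frozenset[str] = frozenset()
    mouse_dx: int = 0
    mouse_dy: int = 0
    click: bool = False


class Policy(Protocol):
    """Anything that maps recent frames plus an instruction to an action."""
    def act(self, frames: Sequence[Frame], instruction: str) -> Action: ...


class KeywordPolicy:
    """Toy policy: maps words in the instruction to keys and ignores the pixels.

    A real agent would use learned vision and language models here.
    """
    KEYMAP = {"forward": "w", "left": "a", "back": "s", "right": "d", "jump": "space"}

    def act(self, frames: Sequence[Frame], instruction: str) -> Action:
        words = instruction.lower().split()
        keys = frozenset(self.KEYMAP[w] for w in words if w in self.KEYMAP)
        return Action(keys=keys)


def run_episode(
    policy: Policy,
    get_frame: Callable[[], Frame],
    send_action: Callable[[Action], None],
    instruction: str,
    max_steps: int = 10,
) -> list[Action]:
    """Observe the screen, choose an action, send it; repeat for max_steps."""
    history: list[Frame] = []
    actions: list[Action] = []
    for _ in range(max_steps):
        history.append(get_frame())           # input 1: what is on screen
        action = policy.act(history, instruction)  # input 2: the instruction
        send_action(action)                   # output: keyboard and mouse
        actions.append(action)
    return actions


if __name__ == "__main__":
    blank: Frame = [[(0, 0, 0)]]
    for a in run_episode(KeywordPolicy(), lambda: blank, lambda a: None,
                         "go forward and jump", max_steps=2):
        print(sorted(a.keys))
```

Because the loop only touches `get_frame` and `send_action`, nothing in it depends on a particular game, which is the point the team makes about using the same interface humans use.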

SIMA is an ongoing project, and the current version has around 600 basic skills, including navigation and object interaction. According to the team behind the project, it currently performs simple tasks that can be completed within about 10 seconds.

“We want our future agents to tackle tasks that require high-level strategic planning and multiple sub-tasks to complete, such as ‘Find resources and build a camp’.”

According to the team, the results achieved with SIMA so far show that a new wave of AI can spring from generalist, language-driven agents. The team hopes to train the agent on more environments and to incorporate more capable models.

The team expects the model to become more versatile and generalizable as they expose it to more training worlds. Consequently, they “hope to improve SIMA’s understanding and ability to act on higher-level language instructions to achieve more complex goals”. If you are wondering what the ultimate goal of SIMA is, here is what the team said:

“Ultimately, our research is building towards more general AI systems and agents that can understand and safely carry out a wide range of tasks in a way that is helpful to people online and in the real world.”

To learn more about the SIMA project, read the team’s technical report.

