What you need to know
- Google has introduced SIMA, its latest AI and a “generalist agent” designed to help users complete tasks in video games.
- SIMA is language-driven. That is, it needs what the user sees on the screen plus instructions for completing a task, such as gathering a resource.
- DeepMind says it has used nine games to train SIMA so far, but there is still a lot of work left to do before it can handle complex tasks and instructions.
Google DeepMind has announced its latest project, a generalist AI agent that aims to help users perform tasks while playing games.
According to DeepMind, the new AI is called the Scalable Instructable Multiworld Agent, or SIMA. Google's AI-focused division states that SIMA can “perceive and understand a variety of environments and take action to achieve directed goals.”
DeepMind added that for SIMA to work, it needs two things: the on-screen image the user is looking at and “natural language” instructions provided by the user. The AI is said to use standard keyboard and mouse inputs to control the user's character within the game world. What's more, SIMA “can interact with any virtual environment,” the post adds.
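The loop described above (screen image and instruction in, keyboard and mouse actions out) can be pictured with a small Python sketch. This is not DeepMind's code: every name here (`Observation`, `Action`, `ToyAgent`, `run_episode`) is hypothetical, the key bindings are invented for illustration, and a simple keyword lookup stands in for the learned model, which would also use the image.

```python
from dataclasses import dataclass
from typing import List, Sequence

# All names below are hypothetical and for illustration only;
# none of them come from DeepMind's announcement.


@dataclass(frozen=True)
class Observation:
    pixels: Sequence[Sequence[int]]  # stand-in for the on-screen image
    instruction: str                 # natural-language instruction from the user


@dataclass(frozen=True)
class Action:
    device: str  # "keyboard" or "mouse"
    value: str   # e.g. a key name or a mouse movement


class ToyAgent:
    """Keyword lookup standing in for a learned policy.

    A real agent would also use obs.pixels; this toy version ignores them.
    """

    # Assumed key bindings, chosen only for this example.
    _RULES = {
        "turn left": [Action("mouse", "move_left")],
        "open menu": [Action("keyboard", "esc")],
        "climb": [Action("keyboard", "w")],
    }

    def act(self, obs: Observation) -> List[Action]:
        text = obs.instruction.lower()
        for phrase, actions in self._RULES.items():
            if phrase in text:
                return list(actions)
        return []  # no matching instruction: take no action


def run_episode(agent: ToyAgent, frames, instruction: str) -> List[Action]:
    """Feed each frame plus the instruction to the agent and collect its actions."""
    taken: List[Action] = []
    for frame in frames:
        taken.extend(agent.act(Observation(frame, instruction)))
    return taken
```

Calling `run_episode(ToyAgent(), frames, "Turn left at the ladder")` with two dummy frames would yield one mouse action per frame, while an instruction the toy agent does not recognize yields no actions at all.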
SIMA has been evaluated on 600 basic skills, including turning left, climbing ladders, and opening the game's pause menu for settings. Google says SIMA can perform “simple tasks” in about 10 seconds. Examples of these tasks include directing SIMA to drive a car in Goat Simulator 3 and to walk to a spaceship in No Man's Sky.
To get SIMA to its current state, Google says it has partnered with game developers such as Hello Games, which created No Man's Sky, and Tuxedo Labs, which created Teardown. SIMA was also trained in an environment built on the Unity engine called the “Construction Lab.” Through this, SIMA learned how to manipulate objects and gained a deeper understanding of the physical world within video games.
Google also turned to real gamers as a first step in teaching SIMA how to play. Researchers observed them in pairs, with one player playing the game and the other giving instructions on how to complete the task.
In total, Google DeepMind trained SIMA using nine different games spanning a range of genres. During the course of its research, the team found that AI agents trained on multiple games produced better results than systems trained on a single game. Further investigation revealed that language input was the most important factor in SIMA's performance.
Without language input, SIMA is said to operate in an “appropriate but purposeless manner.” DeepMind observed that SIMA explores and collects resources, but will not pursue game objectives unless explicitly instructed to do so.
SIMA is still in its early stages, but Google seems confident that its “language-driven” capabilities can help gamers. Even so, further research is reportedly needed, as Google wants SIMA to understand “higher-level linguistic instructions to accomplish more complex goals.”