Google DeepMind Launches Robotics AI Capable of “Thinking” Before Acting
MOUNTAIN VIEW, CA - Google DeepMind has announced Gemini Robotics 1.5, a new artificial intelligence system designed to imbue robots with the ability to reason and plan before executing tasks - a capability researchers describe as "thinking." Alongside it, the company is releasing an "embodied reasoning" (ER) model, now available in Google AI Studio, that lets developers generate instructions for robots.
This advancement marks a notable leap toward more versatile and adaptable robots. Previously, AI models required bespoke programming for each robotic platform. Gemini Robotics 1.5, built upon the Gemini foundation models and fine-tuned for physical environments, can transfer skills learned on one robot – such as DeepMind's two-armed ALOHA 2 – to another, like the humanoid Apollo, without specialized adjustments. This "agentic capability" promises to streamline robotics development and unlock more complex, multi-stage tasks.
The system operates through a collaborative process. Gemini Robotics 1.5, the "action model," receives instructions from the ER model and uses visual input to guide its movements. Crucially, it also independently reasons about the best approach to each step. "There are all these kinds of intuitive thoughts that help [a person] guide this task, but robots don't have this intuition," explained Kanishka Rao of DeepMind. "One of the major advancements that we've made with 1.5 in the VLA [vision-language-action model] is its ability to think before it acts."
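To make that division of labor concrete, the sketch below illustrates the two-model loop in Python. Every name in it (ERPlanner, ActionModel, plan, execute) is a hypothetical stand-in for illustration only; DeepMind has not published this interface, and the real models run behind Google's infrastructure.

```python
# Conceptual sketch of the ER-model / action-model loop described above.
# All class and method names are hypothetical illustrations, not DeepMind's API.

from dataclasses import dataclass


@dataclass
class Step:
    description: str  # a natural-language instruction, e.g. "pick up the mug"


class ERPlanner:
    """Stands in for the embodied reasoning (ER) model: turns a high-level
    goal plus a scene description into an ordered list of steps."""

    def plan(self, goal: str, scene: str) -> list[Step]:
        # In reality this would be a call to the ER model; stubbed here.
        return [Step(f"first step toward '{goal}' given scene: {scene}")]


class ActionModel:
    """Stands in for Gemini Robotics 1.5: receives one step at a time,
    'thinks' about how to approach it, then drives the robot."""

    def execute(self, step: Step, camera_frame: bytes) -> bool:
        # The real model reasons over the visual input before acting.
        print(f"executing: {step.description} ({len(camera_frame)} bytes of vision)")
        return True  # report success so the planner can move on


def run_task(goal: str, scene: str, get_frame) -> None:
    planner, actor = ERPlanner(), ActionModel()
    for step in planner.plan(goal, scene):
        if not actor.execute(step, get_frame()):
            break  # on failure, a real system would presumably re-plan


run_task(
    "sort the laundry by color",
    "a table with mixed clothes",
    get_frame=lambda: b"\x00" * 1024,  # stand-in for a camera capture
)
```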
While Gemini Robotics 1.5, the robot control model, remains available only to trusted testers, the release of the ER model in Google AI Studio opens the door for developers to experiment with robotic instruction generation. This development signals a move toward more capable robotic systems able to handle increasingly complex real-world scenarios, though widespread consumer applications like automated household chores remain a future goal.
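For developers who want to try that instruction generation, access runs through Google AI Studio and the Gemini API. The minimal sketch below uses the official google-genai Python SDK; the model identifier and the sample image path are assumptions for illustration, so check AI Studio for the exact names before running it.

```python
# Minimal sketch: asking the ER model to break a task into robot instructions.
# Requires: pip install google-genai, plus an API key from Google AI Studio.
# The model id below is an assumption; confirm the exact name in AI Studio.

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # key from aistudio.google.com

# A photo of the robot's workspace (hypothetical file for this example).
with open("workbench.jpg", "rb") as f:
    image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed identifier
    contents=[
        image,
        "List, step by step, the instructions a two-armed robot would "
        "need to clear this workbench and sort the parts by size.",
    ],
)
print(response.text)  # the generated, ordered instruction list
```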