Askdroid

AI Category: Robot Foundation Models
AI Tags: chain-of-thought, generalist robot policy, Google DeepMind, PaLI-X, PaLM-E, Transformer, VLA
RT-2 (Robotics Transformer 2) is a Vision-Language-Action (VLA) model published by Google DeepMind in 2023 that pioneered the modern VLA paradigm. It builds on Google’s pretrained vision-language models PaLI-X (up to 55B parameters) and PaLM-E, and co-fine-tunes them on robot trajectory data alongside their original web-scale visual question answering and image captioning data. The key insight is to express robot actions as just another sequence of text tokens, so the same transformer that learned to describe a kitchen can also output low-level robot actions: each action is discretised and written out as a short string of integers. Across 6,000 evaluation trials, RT-2 demonstrated significant improvements over RT-1 in generalising to novel objects, interpreting commands not present in the robot training data (such as ‘place the banana on the number 2’), and performing rudimentary semantic reasoning (‘pick up the smallest object’). With chain-of-thought prompting, it can carry out multi-step semantic plans, for example choosing a rock as an improvised hammer or an energy drink for a tired person. RT-2 itself was not publicly released, but its design directly inspired successors including the open-source OpenVLA and π0, as well as OpenAI’s robotics work, making it the architectural template for the current VLA generation.
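The actions-as-text idea can be sketched in a few lines of Python. This is a minimal illustration, not DeepMind's code: the RT-2 paper describes discretising each continuous action dimension into 256 uniform bins and emitting a terminate flag followed by end-effector position, rotation and gripper values as integers, but the bounds `LOW` and `HIGH` and the function names `action_to_text` and `text_to_action` below are assumptions made for the example.

```python
import numpy as np

NUM_BINS = 256  # uniform bins per continuous dimension

# Hypothetical per-dimension bounds (x, y, z, roll, pitch, yaw, gripper).
# The real normalisation ranges are not given in this profile.
LOW = np.array([-0.1, -0.1, -0.1, -0.5, -0.5, -0.5, 0.0])
HIGH = np.array([0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 1.0])


def action_to_text(action, terminate=False):
    """Map a 7-D continuous action to a space-separated string of integers."""
    action = np.clip(np.asarray(action, dtype=float), LOW, HIGH)
    scaled = (action - LOW) / (HIGH - LOW)  # each value now lies in [0, 1]
    # The upper bound would land in bin 256, so cap it at the last bin.
    bins = np.minimum((scaled * NUM_BINS).astype(int), NUM_BINS - 1)
    return " ".join(str(v) for v in [int(terminate), *bins.tolist()])


def text_to_action(text):
    """Invert action_to_text: return (terminate flag, bin-centre action)."""
    values = [int(tok) for tok in text.split()]
    terminate, bins = bool(values[0]), np.array(values[1:])
    centres = (bins + 0.5) / NUM_BINS  # decode to the middle of each bin
    return terminate, LOW + centres * (HIGH - LOW)


text = action_to_text([0.0, 0.05, -0.1, 0.25, -0.5, 0.5, 1.0])
terminate, decoded = text_to_action(text)
print(text)
print(decoded)
```

Because the string is ordinary text, a language model can be trained to produce it with the same next-token objective it uses for captions. Decoding loses at most half a bin width per dimension.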

Google RT-2
Website URL

Vision-language-action model from Google DeepMind that co-fine-tunes large web-scale VLMs (PaLI-X, PaLM-E) on robot trajectory data alongside their original web data. Robot actions are encoded as text tokens, allowing the model to inherit chain-of-thought reasoning and generalise to novel objects and instructions that it encountered only in web data.
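Co-fine-tuning amounts to drawing each training batch from both data sources rather than from robot data alone. The sketch below is a toy sampler under stated assumptions: the function name `mixed_batches`, the `robot_fraction` value, and the example prompts and targets are illustrative and are not taken from the RT-2 training setup.

```python
import random


def mixed_batches(web_examples, robot_examples, batch_size, robot_fraction, seed=0):
    """Yield batches that draw each item from robot data with probability robot_fraction."""
    rng = random.Random(seed)
    while True:
        batch = []
        for _ in range(batch_size):
            use_robot = rng.random() < robot_fraction
            source = robot_examples if use_robot else web_examples
            batch.append(rng.choice(source))
        yield batch


# Both sources share one (prompt, target text) format, so a single model
# and a single loss can be used for every item in the batch.
web = [("Q: What is in the image?", "a banana on a counter")]
robot = [
    # Illustrative target: a string of integer action tokens.
    ("Q: What action should the robot take to pick up the banana?",
     "0 128 192 0 192 0 255 255"),
]

batch = next(mixed_batches(web, robot, batch_size=8, robot_fraction=0.5))
for prompt, target in batch:
    print(prompt, "->", target)
```

Keeping web examples in the mixture is what lets the model retain its vision-language knowledge while it learns to emit actions. The right `robot_fraction` is a tuning choice, and 0.5 here is only a placeholder.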

Tags: chain-of-thought, generalist robot policy, Google DeepMind, PaLI-X, PaLM-E, Transformer, VLA
Company Name: Google DeepMind
Category: Robot Foundation Models
Country: United States
License: Research-only
Stage: Research Prototype
Model Size: Up to 55B (PaLI-X variant)
Hardware Requirement: Cloud-only
API: N/A
Documentation: Documentation URL
Paper / Publication: Brohan et al., 'RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control' (2023), arXiv:2307.15818
Robots Using: Demonstrated on Google's mobile manipulator (Everyday Robots platform); architectural template for OpenVLA and π0

Copyright © 2026, Askdroid. All Rights Reserved