Gemini 2.5 Pro is Google DeepMind’s flagship multimodal model, originally released in March 2025 as a ‘thinking’ model that reasons
GPT-4o (‘Omni’) allows robots to see, hear, and speak in real time. It acts as the high-level reasoning layer for mission
ORB-SLAM3 is the third-generation visual SLAM system from the SLAMLab at the University of Zaragoza, released in 2020 under GPLv3.
Piper is a fast, local neural text-to-speech (TTS) system originally developed by the Rhasspy / Open Home Foundation community. Active
Whisper is an open-source automatic speech recognition (ASR) and speech translation model released by OpenAI in September 2022 under the
Hailo Technologies is an Israeli edge-AI chip company founded in 2017 that designs purpose-built neural processing units for embedded vision
NVIDIA Jetson is a family of embedded AI computing modules purpose-built for running deep-learning inference at the edge. The current
Genesis is a comprehensive open-source physics simulation platform announced in December 2024 by Carnegie Mellon University in collaboration with more
Helix is Figure’s proprietary on-board Vision-Language-Action model, first announced in February 2025 and significantly upgraded as Helix 02 in January
RT-2 (Robotics Transformer 2) is a Vision-Language-Action model published by Google DeepMind in 2023 that pioneered the modern VLA paradigm.
MuJoCo (Multi-Joint dynamics with Contact) is a general-purpose physics engine designed for fast, accurate simulation of articulated systems — particularly
NVIDIA Isaac Sim is a reference application built on the NVIDIA Omniverse platform that lets developers design, simulate, test, and












