ALOHA (A Low-cost Open-source Hardware system for bimanual teleoperation) and its mobile successor, Mobile ALOHA, are open-source data-collection platforms developed by Tony Z. Zhao, Zipeng Fu, and Chelsea Finn at Stanford University; Mobile ALOHA was published at CoRL 2024. The original ALOHA is a static bimanual puppeteering rig: a human operator drives two follower arms by physically manipulating two leader arms, while the system records 14-dimensional joint trajectories at 50 Hz. Mobile ALOHA mounts ALOHA on a wheeled base and tethers the operator at the waist, capturing 16-dimensional whole-body actions (14 arm joints plus 2 base velocities) suitable for kitchen and household tasks. The total bill of materials, including onboard power, computing, and a consumer-grade laptop GPU, is roughly $32,000, about an order of magnitude cheaper than commercial platforms such as the PR2 or TIAGo. Policies are trained with straightforward supervised behaviour cloning and co-trained on existing static ALOHA datasets. With 50 demonstrations per task, co-training improves success rates by up to 90% on complex long-horizon tasks (sautéing shrimp, opening two-door cabinets, calling elevators, wiping spilled wine), which has made the system a widely used low-cost platform for imitation-learning research.
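The 16-dimensional whole-body action described above can be pictured as a flat vector. The following is a minimal sketch, not the project's actual data format: the class name `WholeBodyAction`, the per-arm split of 7 values (assumed here to be 6 joints plus 1 gripper), and the reading of the 2 base velocities as linear and angular are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Assumed layout (hypothetical, for illustration only):
# 7 values per arm (6 joints + 1 gripper) x 2 arms = 14 arm dimensions,
# followed by 2 base velocities (linear, angular) = 16 dimensions in total.
ARM_DIMS = 7
NUM_ARMS = 2
BASE_DIMS = 2
ACTION_DIMS = ARM_DIMS * NUM_ARMS + BASE_DIMS  # 16


@dataclass
class WholeBodyAction:
    left_arm: List[float]
    right_arm: List[float]
    base_linear: float
    base_angular: float

    def to_vector(self) -> List[float]:
        """Flatten into [left arm, right arm, base linear, base angular]."""
        if len(self.left_arm) != ARM_DIMS or len(self.right_arm) != ARM_DIMS:
            raise ValueError(f"each arm needs exactly {ARM_DIMS} values")
        return [*self.left_arm, *self.right_arm,
                self.base_linear, self.base_angular]

    @classmethod
    def from_vector(cls, vec: Sequence[float]) -> "WholeBodyAction":
        """Rebuild a structured action from a flat 16-dimensional vector."""
        if len(vec) != ACTION_DIMS:
            raise ValueError(f"expected {ACTION_DIMS} values, got {len(vec)}")
        return cls(
            left_arm=list(vec[:ARM_DIMS]),
            right_arm=list(vec[ARM_DIMS:2 * ARM_DIMS]),
            base_linear=vec[2 * ARM_DIMS],
            base_angular=vec[2 * ARM_DIMS + 1],
        )
```

Keeping the arm values first and the base velocities last means that a static ALOHA trajectory occupies exactly the first 14 slots of the same layout.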
Stanford's low-cost ($32k) bimanual mobile teleoperation platform for collecting high-quality demonstration data. It supports whole-body teleoperation with two puppeteered arms (14 DoF in total) plus a wheeled base. With 50 demonstrations per task, co-training on static ALOHA data improves success rates by up to 90% on tasks such as sautéing shrimp or wiping spilled wine.
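One way to picture co-training on static ALOHA data is as a batch sampler that mixes the two datasets. The sketch below is illustrative only and makes several assumptions that are not stated in the text above: static 14-dimensional actions are zero-padded with two base velocities to reach 16 dimensions, each sample is drawn from the static set with probability `static_prob` (0.5 here), and the function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence

Action = List[float]


def pad_static_action(arm_action: Sequence[float], base_dims: int = 2) -> Action:
    """Append zero base velocities so a static arm-only action matches the mobile layout."""
    return list(arm_action) + [0.0] * base_dims


def sample_cotraining_batch(
    mobile_actions: Sequence[Sequence[float]],
    static_actions: Sequence[Sequence[float]],
    batch_size: int,
    static_prob: float = 0.5,  # assumed mixing ratio, a tunable parameter
    rng: Optional[random.Random] = None,
) -> List[Action]:
    """Draw a batch in which each item comes from the static set with probability static_prob."""
    if rng is None:
        rng = random.Random()
    batch: List[Action] = []
    for _ in range(batch_size):
        if rng.random() < static_prob:
            # Static demonstrations have no base motion, so pad with zeros.
            batch.append(pad_static_action(rng.choice(static_actions)))
        else:
            batch.append(list(rng.choice(mobile_actions)))
    return batch
```

A behaviour-cloning loss would then be computed on such mixed batches in the usual supervised way. In a real pipeline the observations would be sampled alongside the actions, which this sketch omits.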
