MolmoAct2: The First Fully Open Robot Controller That Beats Closed-Source Giants

A robot that can fold laundry, pack medication, and pour tea, all controlled by a single model, sounds like science fiction. But it’s exactly what’s needed for real deployment. The problem? The best robot controllers are either closed-source (π0.5), too slow (reasoning models that generate hundreds of tokens before moving), or tied to hardware most labs can’t afford. MolmoAct2 (Fang, Duan et al., Allen AI / UW / Stanford / NVIDIA / MIT, May 2026) tackles all of these problems at once: it’s fully open (weights, code, data), runs at 55.79 Hz, deploys on platforms costing under $6,000, and achieves 97.2% success on LIBERO, beating every open and closed baseline. The secret? Let the robot’s action generator peek into the language model’s brain at every layer, not just the final output. ...

May 10, 2026

Utonia: One Encoder For All Point Clouds

A LiDAR on a self-driving car, a depth camera in a home robot, a satellite scanner, and a CAD model from a 3D printer: each produces a point cloud (a set of 3D points (x, y, z) representing the shape of an object or scene, where each point can carry additional attributes such as color, normal, and intensity), but with radically different density, scale, and geometry. Until now, each domain required its own model. The paper “Utonia: Toward One Encoder for All Point Clouds” breaks this pattern: one encoder, 137M parameters, five domains, and emergent behaviors nobody expected. ...

March 7, 2026

Green-VLA: One AI Brain for All Robots

The quest for a universal robot, one that can seamlessly switch between tasks, platforms, and environments, has long been the holy grail of robotics research. The paper “Green-VLA: Staged Vision-Language-Action Model for Generalist Robots” brings us closer to that vision with a revolutionary five-stage training framework that enables a single policy to control humanoids, mobile manipulators, and fixed-base robotic arms alike. The Problem: One Robot, Many Bodies. Today’s robotic systems are typically specialists. A robotic arm in a factory excels at assembly but cannot navigate a warehouse. A mobile robot can move around but lacks fine manipulation skills. Training a separate AI for each type of robot is expensive, time-consuming, and fundamentally limits scalability. ...

February 8, 2026

ASkDAgger: How Artificial Intelligence Learns More Effectively by Asking Questions

In a world where robots and AI systems increasingly learn through observation and interaction with humans, the efficiency of this process remains a key challenge. Traditional Imitation Learning methods often require a human teacher to constantly supervise and correct errors, which is time-consuming and costly. A team of researchers led by Jelle Luijkx proposes a groundbreaking solution in their latest paper, “ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning.” ...

August 8, 2025