Video-Action Models: Are video model backbones the future of VLAs?

This blog post is about mimic-video, our latest mimic release, in which we instantiate a new class of Video-Action Models (VAMs) that ground robotic policies in pretrained video models. We argue that video model backbones can be a much more natural choice than VLM backbones for pre-training robotics foundation models. Do you want to work with us on this and more? We just raised $16M and are actively hiring! Preface: Vision-Language-Action Models (VLAs) have taken the robotics world by storm over the past two years....

January 6, 2026 · 11 min · Elvis Nava

Robotics in the era of the Scaling Hypothesis

This piece is my attempt to collect my thoughts on the current landscape in AI and robotics, how it informs our approach at mimic in our quest to solve general-purpose robotic dexterity, and what I think the future of the field will look like. Are you excited about this as well? We are hiring. Preface: It was mid-2021 and I had just begun my PhD at ETH Zurich when I came across a blog post titled “The Scaling Hypothesis”....

October 29, 2024 · 12 min · Elvis Nava