Modern Non-LLM AI

Robotics and Embodied AI

The unique challenges of teaching AI to act in the physical world

What it is

Robotics is uniquely hard for AI for four reasons: data is scarce (there is no internet-scale corpus of robot demonstrations); the real world is continuous and high-dimensional, unlike the discrete tokens language models work with; simulation-to-reality transfer is imperfect; and physical actions can have irreversible consequences.

Integrating LLMs has improved generalization substantially. Models such as Google DeepMind's RT-2 and Google's PaLM-E use pre-trained vision-language models as the "brain" of the robot, so that knowledge from web-scale training carries over to robot control. This enables generalization to objects and instructions that did not appear in the robot's own training data.
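To make the link between continuous robot control and discrete tokens concrete, here is a minimal sketch of action tokenization: each dimension of a continuous action is mapped to an integer bin, so that a language-model-style network can emit actions as a short sequence of tokens. The 256-bin figure follows what is reported for RT-2; the normalized range of -1 to 1 and all function names are assumptions made for illustration, not any model's actual code.

```python
from typing import List, Sequence

NUM_BINS = 256          # bins per action dimension (figure reported for RT-2; assumed here)
LOW, HIGH = -1.0, 1.0   # hypothetical normalized action range


def action_to_tokens(action: Sequence[float]) -> List[int]:
    """Map each continuous action value to a bin index in [0, NUM_BINS - 1]."""
    tokens = []
    for value in action:
        clipped = min(max(value, LOW), HIGH)           # keep the value inside the range
        fraction = (clipped - LOW) / (HIGH - LOW)      # rescale to [0, 1]
        tokens.append(min(int(fraction * NUM_BINS), NUM_BINS - 1))
    return tokens


def tokens_to_action(tokens: Sequence[int]) -> List[float]:
    """Map bin indices back to continuous values at the center of each bin."""
    width = (HIGH - LOW) / NUM_BINS
    return [LOW + (t + 0.5) * width for t in tokens]


def tokens_to_text(tokens: Sequence[int]) -> str:
    """Render bin indices as a space-separated string a text model could emit."""
    return " ".join(str(t) for t in tokens)


if __name__ == "__main__":
    # Hypothetical 3-dimensional action, e.g. an end-effector displacement.
    tokens = action_to_tokens([-1.0, 0.0, 1.0])
    print(tokens_to_text(tokens))
    print(tokens_to_action(tokens))
```

The round trip is lossy: a decoded value can differ from the original by up to half a bin width. That is the price of squeezing a continuous quantity into a discrete vocabulary.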

Humanoid robots from companies such as Figure, 1X, and Unitree are receiving significant investment because human-form robots can operate in environments designed for humans and can learn more directly from demonstration data collected from humans.

Why it matters

Robotics is one of the highest-impact potential applications of AI: physically capable, general-purpose robots would transform labor markets, manufacturing, and elder care. The technical challenges and the integration with LLMs are directly relevant to understanding the current AI landscape and where investment is flowing.

Resources

RT-2: New Model Translates Vision and Language into Action
deepmind.google · The definitive first-party explainer of RT-2 (vision-language-action models). Covers how web-scale training transfers to robot control, chain-of-thought reasoning for robots, and what this means for general-purpose robotics. Essential.
10 min
Precision Home Robots Learn with Real-to-Sim-to-Real
news.mit.edu · Accessible MIT press release explaining sim-to-real transfer through the RialTo framework. Covers digital twins, reinforcement learning in simulation, and real-world deployment. Good concrete example of the sim-to-real concept.
8 min
What is RT-2? Google DeepMind's Vision-Language-Action Model for Robotics
blog.google · More accessible consumer-facing version of the RT-2 announcement. Good for recruits who want the "so what" without deep technical details.
6 min