Pepijn eba3ab3741 annotate: address review feedback — bug fixes, docs/code drift, naming, cleanup
Bugs
  * validator: don't re-raise on unknown style. The second column_for_style
    lookup (used to route persistent vs event) now sits in try/except so an
    unknown style is recorded by _check_column_routing and skipped instead
    of crashing the whole validation pass.
  * general_vqa._target_cameras: when restrict_to_default_camera is set but
    the configured camera_key isn't one the provider exposes, warn and fall
    back to all cameras instead of returning a phantom key that KeyErrors
    deep in frame decode.
  * interjections: clamp interjection timestamps to frame_timestamps[0]
    rather than a hardcoded 0.0 (datasets can start at non-zero t).

Docs / code drift
  * annotation_pipeline.mdx: drop the phantom 'vocabulary discovery / phase
    0 / --vocabulary.* / canonical_vocabulary.json' section (none of it
    exists); describe the real describe->segment + coverage-stitch flow.
    Soften the src/lerobot/tools/ + TOOL_REGISTRY reference to 'not part of
    this PR' (matches tools.mdx, which already marks the runtime layer as
    not-yet-implemented). Fix the --push_to_hub/--new_repo_id wording. Note
    the default is now a single h200. Add a 'Contributing new modules'
    section inviting module / prompt / quality contributions.
  * executor docstring: six phases, no phantom phase 0.

run_hf_job.py
  * add the Apache 2.0 license header (was flagged repeatedly).
  * default to a single GPU: flavor=h200, parallel_servers=1, num_gpus=1
    (scale to h200x4 noted in the docstring).
  * pin the install to @main instead of the feature branch (won't break
    after merge).

Naming / cleanup
  * rename dest_repo_id -> new_repo_id across config / script / example /
    test to match the LeRobot dataset edit tools.
  * rename prompt templates module_N_*.txt -> descriptive (plan_*,
    interjections_*, vqa.txt) and update every load_prompt() call.
  * remove dead _messages_to_prompt (used only by the removed in-process
    backends).
  * declare _warned_decode_fail (frames) and _warned_no_camera (vqa) as
    real init=False dataclass fields instead of getattr monkey-patches.
  * scope bandit B607 to the two ffmpeg subprocess.run sites via
    '# nosec B607' and drop it from the global skip list.

Tests
  * fix stale canned-VLM markers ('ONE realistic interruption' ->
    'compact interjection', 'Update the memory' -> 'compressed semantic
    memory') and drop the dead 'concise hierarchical PLAN' plan responders
    (plan generation is deterministic now) in run_e2e_smoke,
    test_pipeline_recipe_render, test_modules.
  * run_e2e_smoke now asserts interjection + speech rows are produced so a
    stale marker can't silently pass again.
  * drop remaining 'PR 1' / 'PR 2' references from test comments / names.

Verified: tests/annotations + tests/datasets/test_language +
tests/scripts/test_lerobot_annotate (31 passed); make-style E2E smoke
(interjections=1 speech_atoms=2); pre-commit (ruff, mypy, bandit,
prettier) clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 18:30:46 +02:00
2025-03-13 14:05:55 +01:00
2025-06-05 17:48:43 +02:00
2026-04-06 12:23:37 +02:00
2026-02-28 14:41:28 +01:00
2024-03-25 12:28:07 +01:00
2026-01-16 14:38:42 +01:00

LeRobot, Hugging Face Robotics Library

Tests Tests Python versions License Status Version Contributor Covenant Discord

LeRobot aims to provide models, datasets, and tools for real-world robotics in PyTorch. The goal is to lower the barrier to entry so that everyone can contribute to and benefit from shared datasets and pretrained models.

🤗 A hardware-agnostic, Python-native interface that standardizes control across diverse platforms, from low-cost arms (SO-100) to humanoids.

🤗 A standardized, scalable LeRobotDataset format (Parquet + MP4 or images) hosted on the Hugging Face Hub, enabling efficient storage, streaming and visualization of massive robotic datasets.

🤗 State-of-the-art policies that have been shown to transfer to the real-world ready for training and deployment.

🤗 Comprehensive support for the open-source ecosystem to democratize physical AI.

Quick Start

LeRobot can be installed directly from PyPI.

pip install lerobot
lerobot-info

Important

For detailed installation guide, please see the Installation Documentation.

Robots & Control

Reachy 2 Demo

LeRobot provides a unified Robot class interface that decouples control logic from hardware specifics. It supports a wide range of robots and teleoperation devices.

from lerobot.robots.myrobot import MyRobot

# Connect to a robot
robot = MyRobot(config=...)
robot.connect()

# Read observation and send action
obs = robot.get_observation()
action = model.select_action(obs)
robot.send_action(action)

Supported Hardware: SO100, LeKiwi, Koch, HopeJR, OMX, EarthRover, Reachy2, Gamepads, Keyboards, Phones, OpenARM, Unitree G1.

While these devices are natively integrated into the LeRobot codebase, the library is designed to be extensible. You can easily implement the Robot interface to utilize LeRobot's data collection, training, and visualization tools for your own custom robot.

For detailed hardware setup guides, see the Hardware Documentation.

LeRobot Dataset

To solve the data fragmentation problem in robotics, we utilize the LeRobotDataset format.

  • Structure: Synchronized MP4 videos (or images) for vision and Parquet files for state/action data.
  • HF Hub Integration: Explore thousands of robotics datasets on the Hugging Face Hub.
  • Tools: Seamlessly delete episodes, split by indices/fractions, add/remove features, and merge multiple datasets.
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Load a dataset from the Hub
dataset = LeRobotDataset("lerobot/aloha_mobile_cabinet")

# Access data (automatically handles video decoding)
episode_index=0
print(f"{dataset[episode_index]['action'].shape=}\n")

Learn more about it in the LeRobotDataset Documentation

SoTA Models

LeRobot implements state-of-the-art policies in pure PyTorch, covering Imitation Learning, Reinforcement Learning, and Vision-Language-Action (VLA) models, with more coming soon. It also provides you with the tools to instrument and inspect your training process.

Gr00t Architecture

Training a policy is as simple as running a script configuration:

lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/aloha_mobile_cabinet
Category Models
Imitation Learning ACT, Diffusion, VQ-BeT, Multitask DiT Policy
Reinforcement Learning HIL-SERL, TDMPC & QC-FQL (coming soon)
VLAs Models Pi0Fast, Pi0.5, GR00T N1.5, SmolVLA, XVLA

Similarly to the hardware, you can easily implement your own policy & leverage LeRobot's data collection, training, and visualization tools, and share your model to the HF Hub

For detailed policy setup guides, see the Policy Documentation. For GPU/RAM requirements and expected training time per policy, see the Compute Hardware Guide.

Inference & Evaluation

Evaluate your policies in simulation or on real hardware using the unified evaluation script. LeRobot supports standard benchmarks like LIBERO, MetaWorld and more to come.

# Evaluate a policy on the LIBERO benchmark
lerobot-eval \
  --policy.path=lerobot/pi0_libero_finetuned \
  --env.type=libero \
  --env.task=libero_object \
  --eval.n_episodes=10

Learn how to implement your own simulation environment or benchmark and distribute it from the HF Hub by following the EnvHub Documentation

Resources

Citation

If you use LeRobot in your project, please cite the GitHub repository to acknowledge the ongoing development and contributors:

@misc{cadene2024lerobot,
    author = {Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Palma, Steven and Kooijmans, Pepijn and Aractingi, Michel and Shukor, Mustafa and Aubakirova, Dana and Russi, Martino and Capuano, Francesco and Pascal, Caroline and Choghari, Jade and Moss, Jess and Wolf, Thomas},
    title = {LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch},
    howpublished = "\url{https://github.com/huggingface/lerobot}",
    year = {2024}
}

If you are referencing our research or the academic paper, please also cite our ICLR publication:

ICLR 2026 Paper
@inproceedings{cadenelerobot,
  title={LeRobot: An Open-Source Library for End-to-End Robot Learning},
  author={Cadene, Remi and Alibert, Simon and Capuano, Francesco and Aractingi, Michel and Zouitine, Adil and Kooijmans, Pepijn and Choghari, Jade and Russi, Martino and Pascal, Caroline and Palma, Steven and Shukor, Mustafa and Moss, Jess and Soare, Alexander and Aubakirova, Dana and Lhoest, Quentin and Gallou\'edec, Quentin and Wolf, Thomas},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://arxiv.org/abs/2602.22818}
}

Contribute

We welcome contributions from everyone in the community! To get started, please read our CONTRIBUTING.md guide. Whether you're adding a new feature, improving documentation, or fixing a bug, your help and feedback are invaluable. We're incredibly excited about the future of open-source robotics and can't wait to work with you on what's next—thank you for your support!

SO101 Video

Built by the LeRobot team at Hugging Face with ❤️
Description
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Readme Multiple Licenses 7.1 GiB
Languages
Python 99.9%