Stores OpenAI-style function schemas at ``meta/info.json["tools"]`` so
datasets can declare which tools are available (today: just ``say``;
tomorrow: per-dataset extensions). The ``DEFAULT_TOOLS`` constant
fills in for unannotated datasets so chat-template consumers don't
have to special-case anything.
Three pieces:
- ``language.py``: ``SAY_TOOL_SCHEMA`` and ``DEFAULT_TOOLS``
constants. Single source of truth — PR 2's writer and PR 3's
runtime tool registry will both import from here instead of
duplicating the dict.
- ``dataset_metadata.py``: ``LeRobotDatasetMetadata.tools`` property
reads ``info.json["tools"]`` and falls back to ``DEFAULT_TOOLS``.
Returns deep-copied dicts so callers can mutate the result safely.
- ``docs/source/tools.mdx``: spec page covering the catalog, per-row
invocations, and the three-step "how to add a new tool" workflow
(declare schema, implement, register). Linked from the docs
toctree under the Datasets section.
This lays the groundwork for PR 2's pipeline writing the catalog out
during annotation, and PR 3's ``src/lerobot/tools/`` package shipping
runnable implementations (one file per tool — first up:
``say.py`` wrapping Kyutai's pocket-tts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds task-prompt diversity (Xiao 2022 / CAST) without touching
``meta/tasks.parquet`` or forcing recipes to opt in. The plan reserved
``task_aug`` as a future style; this lands it now.
- ``language.py``: add ``task_aug`` to ``CORE_STYLES`` and
``PERSISTENT_STYLES``. ``column_for_style("task_aug")`` returns
``language_persistent`` so PR 2 writers route it correctly.
- ``language_render.py``: ``_resolve_task`` now consults the persistent
slice for rows of ``style="task_aug", role="user"``. When any exist
it picks one deterministically by ``sample_idx`` (blake2b-keyed, not
Python's randomized hash) so an epoch sees every rephrasing of every
episode while the same sample still resolves identically across
reruns. Falls back to the canonical ``meta/tasks.parquet`` task when
no rephrasings are present, so existing datasets and unannotated runs
keep their behaviour. Explicit ``task=`` overrides still win.
- Tests: rephrasing coverage across samples, determinism on repeat
``sample_idx``, fallback when persistent has no ``task_aug`` rows,
and explicit override priority.
Recipes get this for free: any ``${task}`` placeholder rotates through
the available rephrasings. Recipes that want the literal canonical task
can override the binding.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Motion primitives are described in robot-frame (joint / Cartesian) terms,
not pixel space, so they are camera-agnostic. Only `vqa` (event) and
`trace` (event, pixel-trajectory) are view-dependent.
The `camera` field stays on PERSISTENT_ROW_FIELDS for schema symmetry —
the validator, resolver, and HF feature mapping behave identically across
the two columns regardless of which styles populate `camera` today —
but persistent rows now always have `camera=None` in practice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a nullable `camera` field to the language row struct (both persistent
and event variants) so view-dependent styles like `vqa` can carry which
`observation.images.*` view they were grounded against. Without this,
multi-camera datasets ended up with multiple `(vqa, role)` rows at the
same timestamp that the resolver could not disambiguate.
- `language.py`: add `camera` to PERSISTENT_ROW_FIELDS / EVENT_ROW_FIELDS,
to both Arrow struct types and the HF datasets feature mappings;
introduce VIEW_DEPENDENT_STYLES = {vqa, motion, trace} plus
`is_view_dependent_style` and `validate_camera_field` helpers (camera
required iff style is view-dependent).
- `language_render.py`: thread an optional `camera=` kwarg through every
resolver (`active_at`, `emitted_at`, `nth_prev`, `nth_next`) and through
`_matching_rows` / `_select_*`, so recipes can disambiguate per-camera
VQA with `emitted_at(t, style=vqa, role=assistant, camera=...)`.
Without a `camera` filter, multi-row matches keep raising the existing
ambiguity error — which is the desired behaviour on multi-camera data.
- `recipes/pi05_hirobot.yaml`: replace the single `ask_vqa` branch with
`ask_vqa_top` and `ask_vqa_wrist` per-camera sub-recipes (each carrying
the matching image block), keeping the original 0.20 budget and
documenting the customization point for datasets with different cameras.
- Tests: schema test asserts the new field order; new tests cover
`is_view_dependent_style`, `validate_camera_field` (both required and
forbidden directions), per-camera `emitted_at` filtering, and the
ambiguity error when two cameras emit `(vqa, assistant)` at the same
timestamp without a `camera=` filter. RenderMessagesStep + dataset
passthrough fixtures updated to include the new field.
- `docs/source/language_and_recipes.mdx`: document the `camera` field,
the per-camera resolver pattern, and the canonical recipe convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promote the previously-reserved motion/trace styles to first-class core
styles. motion routes to language_persistent (it tracks robot state over
time); trace routes to language_events (single-moment annotations).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers private helpers in recipe.py, language.py, language_render.py,
and render_messages_processor.py. Also reverts uv.lock to main (it was
re-generated by `uv run` during local checks).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(one shot load): adding metadata loading when reading from a dataset after writing
* refactor(one shot load): move metadata reload to ensure_readable() on LeRobotDatasetMetadata
Move the metadata reload from DatasetReader.load_and_activate() to a new
public ensure_readable() method on LeRobotDatasetMetadata, called from
LeRobotDataset._ensure_reader(). This places lifecycle management in the
right layer: metadata owns its readiness check, the dataset orchestrates
the write-to-read transition, and the reader stays clean.
Also adds a regression test using delta_timestamps to exercise the
meta.episodes access path in the create -> write -> finalize -> read flow.
Co-authored-by: Steven Palma <imstevenpmwork@users.noreply.github.com>
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@users.noreply.github.com>
* add: a flexible transformation registry
* fix: image transforms can be set both at init and after
* add: tests
* fix: take in review
* feat(datasets): add image transform setters
* fix: pre-commit
* fix: CI
---------
Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
* Add option for pi family models to train with relative actions (relative to state)
* formatting
* add recomputation of stats and option to compute delta stats
* normalzie after delta conversion
* only recompute state for stats
* calulate chunk based stats
* sample 100k
* load from parquet
* sample 1m
* stats per chunck
* fix
* use quantiles
* stats for entire dataset
* fix
* max 1m frames
* compute before dist
* fix multi gpu processor bug
* Fix RTC with delta actions and OpenArms motor_type wiring
* feat: align pi0_fast delta actions with pi0/pi05 and add RTC integration tests
- Add delta_exclude_joints and action_feature_names to PI0FastConfig
- Move to_absolute_actions from modeling to processor pipeline for pi0_fast
- Add delta action detection and logging to eval_with_real_robot.py
- Add delta actions documentation to pi0 and pi05 READMEs
- Fix ruff lint issues in test_delta_actions.py
- Add test_rtc_delta_actions.py (24 tests) covering:
- ActionQueue with delta vs absolute actions
- RTC denoise step with delta leftovers
- Full pipeline roundtrip (delta → RTC → absolute)
- State rebasing approximation bounds
- Non-delta policy compatibility
- Multi-chunk consistency
* chore: clean up test comments, add OpenPI attribution, remove debug logging
- Replace decorative comment separators in test files with plain section headers
- Add attribution comments for 1e-6 epsilon in normalize_processor.py (from OpenPI)
- Remove debug logging blocks from lerobot_train.py
* refactor: extract compute_delta_action_stats into compute_stats.py
Move the ~70-line inline delta action stats block from lerobot_train.py
into a dedicated function in compute_stats.py, where all other stats
computation already lives. The training script now calls it in 6 lines.
* refactor: remove unused get_processed_left_over from ActionQueue
This method was never called outside of tests. Leftover actions for RTC
guidance are always retrieved via get_left_over() (delta/original space).
* revert: remove logging-only changes from eval_with_real_robot.py
The delta actions detection helper and log message added no functional
value — the script already handles delta policies correctly via the
processor pipeline.
* refactor: use ACTION/OBS_STATE constants instead of hardcoded strings
Replace hardcoded "action" and "observation.state" with ACTION and
OBS_STATE from utils.constants in compute_stats.py, dataset_tools.py,
and lerobot_train.py.
* style: remove stray blank lines in training loop
* refactor: move delta action stats to preprocessing step, remove on-the-fly computation
- Remove on-the-fly compute_delta_action_stats from lerobot_train.py
- Rewrite recompute_stats to delegate action stats to compute_delta_action_stats
(chunk-based sampling matching what the model sees during training)
- Add chunk_size parameter to recompute_stats for delta action computation
- Add delta actions documentation to pi0.mdx and pi05.mdx
* feat: add recompute_stats CLI operation to lerobot-edit-dataset
* fix(tests): relax quantile normalization test tolerance for 1e-6 epsilon
* chore: remove agents_memory/pr_details.md from repo
* refactor: rename delta actions to relative actions throughout
What OpenPI calls "DeltaActions" is actually UMI's "relative trajectory"
representation: each action in the chunk is an offset from the current
state, not from the previous action. This avoids error accumulation.
Renamed across all source, tests, docs, and CLI:
- DeltaActionsProcessorStep → RelativeActionsProcessorStep
- to_delta_actions → to_relative_actions
- use_delta_actions → use_relative_actions
- delta_exclude_joints → relative_exclude_joints
- compute_delta_action_stats → compute_relative_action_stats
- delta_action_processor.py → relative_action_processor.py
- test_delta_actions.py → test_relative_actions.py
Kept as-is: AbsoluteActionsProcessorStep (converts TO absolute),
registry ID "delta_actions_processor" (backward compat), and unrelated
delta references (IK pipeline, Robosuite, RA-BC metrics, gym envs).
* docs: add Action Representations guide
Dedicated page explaining absolute, relative, and delta actions with
numerical examples, joint vs EE space, and how to use kinematics
pipelines and the relative action processor. References UMI paper
(Chi et al., 2024) for the terminology.
* docs: remove redundant OpenPI naming note from action representations
* docs: remove opinionated OpenPI reference from delta actions section
* docs: replace ASCII diagram with UMI paper figure
* docs: remove OpenPI reference from action representations
* docs: use HF-hosted image instead of local asset
* docs: clarify figure attribution
* revert: restore original normalization epsilon behavior
The 1e-6 unconditional epsilon change perturbed all normalized values,
breaking backward compatibility tests. The original approach (1e-8 eps
for MEAN_STD, conditional torch.where for QUANTILES) already handles
division by zero correctly without affecting non-degenerate cases.
* fix: restore delta_action_processor.py used by phone/RL teleop
The rename commit incorrectly deleted delta_action_processor.py and
duplicated its classes into relative_action_processor.py. Restore the
original file and import from it instead.
* fix(processor): address PR #2970 review comments
- Remove shebang from relative_action_processor.py (library module, not script)
- Add device alignment in to_relative_actions/to_absolute_actions so _last_state
on CPU doesn't cause cross-device errors when actions are on CUDA
- Rename delta_step → relative_step in AbsoluteActionsProcessorStep for naming
consistency; update factory.py, all processor files, and tests
- Expand _reconnect_relative_absolute_steps docstring to explain why post-hoc
rewiring is needed after deserialization
- Fix off-by-one in compute_stats.py: sample_upper_bound = total_frames - chunk_size + 1
so last valid start index is included and total_frames == chunk_size is not rejected
- Remove redundant NOTE comment in processor_pi05.py (duplicated two lines below)
- Fix pi0_fast processor ordering: move relative_step before NormalizerProcessorStep
so normalizer sees delta actions (matching pi0/pi05); flip postprocessor to
unnormalize → absolute accordingly. Relative stats are now required for all pi models
- Revert use_relative_joint_actions_aloha → use_delta_joint_actions_aloha in
configuration_smolvla.py (preserve existing public API)
- Update action_representations.mdx: add missing joint to 6-DOF example, fix
'based on a figure', clarify pi family ordering, add RTC compatibility section
* update rtc link
* feat: compute relative action stats over full dataset with optional parallelism
Remove the 100k sample cap from compute_relative_action_stats and process
all valid chunks. Vectorize with numpy (pre-load actions/states, fancy
indexing + broadcasting) for a large speedup over the per-index HF dataset
loop. Add num_workers param for thread-based parallelism (numpy releases
the GIL). Update docs to show --push_to_hub for recompute_stats.
* style: apply ruff formatting to compute_stats.py
* testing on real robot
* style: fix ruff format and remove redundant .keys() calls
* fix(datasets): remove unreachable timestamp branch in add_frame and document caller contract
- Remove dead code: frame.pop("timestamp") branch in add_frame() could never
execute because validate_frame() raises ValueError for any DEFAULT_FEATURES
key (including timestamp) before we reach that line.
- Expand add_frame() docstring: explicitly document that timestamp and
frame_index must NOT be passed by the caller.
- Add explanatory comment in validate_frame(): clarifies why DEFAULT_FEATURES
are excluded from expected_features, preventing future re-introduction of
the dead branch.
The dead branch originated in #1200, which fixed a shape-(1,) mismatch for a
code path that was subsequently made unreachable by a refactor of validate_frame.
* chore(datasets): narrow PR scope
* fix(datasets): move add_frame timestamp cleanup to dataset_writer
* refactor(dataset): enhance dataset root directory handling and introduce hub cache support
- Updated DatasetConfig and LeRobotDatasetMetadata to clarify root directory behavior and introduce a dedicated hub cache for downloads.
- Refactored LeRobotDataset and StreamingLeRobotDataset to utilize the new hub cache and improve directory management.
- Added tests to ensure correct behavior when using the hub cache and handling different revisions without a specified root directory.
* refactor(dataset): improve root directory handling in LeRobotDataset
- Updated LeRobotDataset to store the requested root path separately from the actual root path.
- Adjusted metadata loading to use the requested root, enhancing clarity and consistency in directory management.
* refactor(dataset): minor improvements for hub cache support
* chore(datasets): guard in resume + assertion test
---------
Co-authored-by: AdilZouitine <adilzouitinegm@gmail.com>
Co-authored-by: mickaelChen <mickael.chen.levinson@gmail.com>
* refactor(dataset): split reader and writer
* chore(dataset): remove proxys
* refactor(dataset): better reader & writer encapsulation
* refactor(datasets): clean API + reduce leaky implementations
* refactor(dataset): API cleaning for writer, reader and meta
* refactor(dataset): expose writer & reader + other minor improvements
* refactor(dataset): improve teardown routine
* refactor(dataset): add hf_dataset property at the facade level
* chore(dataset): add init for datasset module
* docs(dataset): add docstrings for public API of the dataset classes
* tests(dataset): add tests for new classes
* fix(dataset): remove circular dependecy
* chore(docstrings): updating v2.1-v3.0 conversion script docstrings to match the new task label
* chore(task): renamming the default index label in the tasks DataFrame to task
* Revert "chore(docstrings): updating v2.1-v3.0 conversion script docstrings to match the new task label"
This reverts commit f55de3255278f23f18b5d955565f6768d094951d.
* chore(docstrings): updating docstrings to match dataset v3.0 architecture
* chore(format): formatting code
* Fixing metadata indexing when writing new Parquet file
Summary:
- addressing this issue: https://github.com/huggingface/lerobot/issues/2401
- vibe-coded bugfix by Claude Sonnet 4.5
* Backing out changes to convert_videos_of_camera
* Addressing Ruff pre-commit complaint
Summary:
- addressing "SIM113 Use `enumerate()` for index variable `ep_idx` in `for` loop"
---------
Co-authored-by: Paul <238953601+pac-robotics@users.noreply.github.com>
* fix(root): adding proper support for the root and new_root arguments
* feat(roots): adding a roots agrument for the merge operation
* chore(clean): cleaning up code
* chore(doctrings): updating doctrings with new features
* fix(repo_id): setting repo_id to None when not needed
* fix(roots/repo_ids): making mypy happy by using repo_ids and roots for merge operation
* fix(path): fixing path related issues
* fix(repo_id): fixing issues related to repo_id
* chore(doctrings): updating docstrings + fix typo
* chore(clean): cleaning code
* fix(split new_repo_id): reverting new_repo_id addition for split operation
* docs(dosctrings): completing docstrings
* fix(repo_ids/roots): improving checks for repo_ids/roots lengths
* fix(repo_ids): making repo_ids optional in MergeConfig but raise if not given
* fix(docstrings): fixing docstrings for split operation
* fix(hints): updating get_output_path hints to accept paths as strings too
* fix(y/N prompts): removing y/N prompts in lerobot_edit_dataset
* fix(merge repo_id): fixing merge operation to use new_repo_id instead of repo_id
* fix(typo): fixing typo in doctrings
Replaced assert statements with FrameTimestampError exceptions in
decode_video_frames_torchvision and decode_video_frames_torchcodec.
Assertions are unsuitable for runtime validation because they can be
silently disabled with python -O, and they produce unhelpful
AssertionError tracebacks. The codebase already defines
FrameTimestampError for this exact purpose but it was only used
in one of the three validation sites.
Also removed AssertionError from the except clause in
LeRobotDataset.__init__, which was masking video timestamp errors
by silently triggering a dataset re-download instead of surfacing
the actual problem.
* fix(dataset): Reindex videos based on frame and not on time
Sometimes during split operations the frame timestamp floating
precision leads to frame ending up in the wrong split.
This changes fixes the issues by directly working with frame indices
instead.
* Fix formatting
* feat(datasets): add modify_tasks function for in-place task editing
Add a new utility function to modify tasks in LeRobotDataset in-place.
This allows users to:
- Set a single task for all episodes
- Set specific tasks for individual episodes
- Combine a default task with per-episode overrides
* feat(edit-dataset): add CLI support for modify_tasks operation
Integrate the modify_tasks function into lerobot_edit_dataset CLI.
Users can now modify dataset tasks via command line:
Supports setting a default task, per-episode tasks, or both combined.
* test(datasets): add tests for modify_tasks function
Add comprehensive test coverage for the modify_tasks utility:
- Single task for all episodes
- Episode-specific task assignment
- Default task with per-episode overrides
- Error handling for missing/invalid arguments
- Verification of task_index correctness
- In-place modification behavior
- Metadata preservation
* respond to copilot review
* Fix aggeregation of datasets when subdatasets are already a result of a previous merge
* docstring
* respond to copilot review + add regression test
* Remove unnecessary int conversion for indicies
* improve image2video
* add episodes video encoding
* fix mypy failing
* iterate on review
* nit
* remove max, and let it be optional
* iterate more
* update docs
* fix test
---------
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
* fix: use features when aggregating image based datasets
* add: test asserting for data type
* add: features param to writing dataset
---------
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>