* docs(benchmarks): add benchmark integration guide and standardize benchmark docs
Add a comprehensive guide for adding new benchmarks to LeRobot, and
refactor the existing LIBERO and Meta-World docs to follow the new
standardized template.
* refactor(envs): move dispatch logic from factory into EnvConfig subclasses
Replace hardcoded if/elif chains in factory.py with create_envs() and
get_env_processors() methods on EnvConfig. New benchmarks now only need
to register a config subclass — no factory.py edits required.
Net -23 lines: factory.py shrinks from ~200 to ~70 lines of logic.
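A minimal sketch of the dispatch pattern described above, not the actual LeRobot code. Only the method names `create_envs()` and `get_env_processors()` come from this commit; the registry, `register_env`, `ToyEnvConfig`, `make_env` and the return values are hypothetical stand-ins.

```python
from dataclasses import dataclass

# Hypothetical registry mapping an env type name to its config class.
ENV_REGISTRY: dict = {}


def register_env(name: str):
    """Class decorator that records a config subclass under `name`."""

    def decorator(cls):
        ENV_REGISTRY[name] = cls
        return cls

    return decorator


@dataclass
class EnvConfig:
    task: str = ""

    def create_envs(self, n_envs: int) -> list:
        raise NotImplementedError

    def get_env_processors(self):
        # Default in this sketch: no pre/post-processing.
        return (None, None)


@register_env("toy")
@dataclass
class ToyEnvConfig(EnvConfig):
    task: str = "reach"

    def create_envs(self, n_envs: int) -> list:
        # Stand-in for building real gym environments.
        return [f"toy/{self.task}#{i}" for i in range(n_envs)]


def make_env(env_type: str, n_envs: int = 1, **kwargs) -> list:
    """Factory without per-benchmark branches: the config does the work."""
    cfg = ENV_REGISTRY[env_type](**kwargs)
    return cfg.create_envs(n_envs)
```

Adding a benchmark in this sketch means adding one decorated subclass; `make_env` stays untouched.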
* docs(benchmarks): clean up adding-benchmarks guide for clarity
Rewrite for simpler language, better structure, and easier navigation.
Move quick-reference table to the top, fold eval explanation into
architecture section, condense the doc template to a bulleted outline.
* fix link
* fix task count
* fix(tests): fix 3 failing dispatch tests
- test_registry_all_types: skip non-EnvConfig stubs (e.g. TestPluginConfig)
- test_processors_delegation: use None instead of abstract PreTrainedConfig
- test_custom_get_env_processors_override: use DataProcessorPipeline for isinstance check (PolicyProcessorPipeline is a subscripted generic)
* fix: enable SmolVLA eval on LIBERO with custom camera mappings
- Thread camera_name_mapping from LiberoEnv config through to gym envs
- Sync features_map with camera_name_mapping in LiberoEnv.__post_init__
- Fix render() to use first available camera instead of hardcoded "image"
- Handle non-dict final_info in rollout by falling back to info["is_success"]
- Add use_peft legacy field to SmolVLAConfig for checkpoint compat
- Add defaults to GR00TN15Config init=False fields for transformers 5.3
Made-with: Cursor
* fix: use direct AutoresetMode import for gymnasium compat
Made-with: Cursor
* fix: handle gymnasium < 1.0 without AutoresetMode
Made-with: Cursor
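A sketch of the kind of guarded import this fix implies, under assumptions: the helper `vector_env_kwargs` is hypothetical, and the choice of `SAME_STEP` is illustrative rather than the mode the eval code actually uses.

```python
try:
    # Newer gymnasium releases expose the autoreset enum here.
    from gymnasium.vector import AutoresetMode
except ImportError:
    # Older gymnasium (or none installed): there is no enum to pass.
    AutoresetMode = None


def vector_env_kwargs() -> dict:
    """Extra kwargs for a vector env constructor; empty on older gymnasium."""
    if AutoresetMode is None:
        return {}
    # Assumption: the mode chosen here is only an example.
    return {"autoreset_mode": AutoresetMode.SAME_STEP}
```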
* refactor: revert policy changes, keep env-only camera mapping fixes
- Revert GR00T N1.5 default_factory/default changes (transformers compat)
- Revert SmolVLA use_peft legacy field
- Apply ruff formatting fixes
- camera_name_mapping stays entirely in env/eval layer (no policy changes)
Made-with: Cursor
* Update docs/source/env_processor.mdx
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
* fix(eval): raise RuntimeError for unsupported final_info format (Gymnasium < 1.0)
Made-with: Cursor
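Roughly, the check could look like the sketch below. The function name, the `n_envs` parameter and the "no episode ended" default are assumptions for illustration; only the `final_info` and `is_success` keys and the `RuntimeError` come from the commits above.

```python
def successes_from_info(info: dict, n_envs: int) -> list:
    """Read per-env success flags from a vector env `info` dict."""
    final_info = info.get("final_info")
    if final_info is None:
        # Assumption: no episode finished on this step.
        return [False] * n_envs
    if not isinstance(final_info, dict):
        # Older Gymnasium returns a sequence of per-env dicts instead.
        raise RuntimeError(
            f"Unsupported final_info format ({type(final_info).__name__}); "
            "expected a dict."
        )
    return [bool(s) for s in final_info["is_success"]]
```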
* style: fix markdown code fences in env_processor.mdx
Made-with: Cursor
* docs: remove duplicate code blocks in env_processor.mdx
Made-with: Cursor
* style: revert quadruple backticks to triple (prettier compat)
* docs(env_processor): add EnvConfig subclass step and policy_cfg examples
- Add missing '### 2. Update Your EnvConfig Subclass' section with
get_env_processors() snippet
- Update factory usage example to show policy_cfg parameter and
keyword-argument style for both SmolVLA and ACT cases
* docs(env_processor): rename step 2 and fix policy_cfg examples
- Rename '### 2. Update the Factory' → '### 2. Update Your EnvConfig Subclass'
- Update factory usage examples to use keyword-argument style with
policy_cfg parameter for both SmolVLA and ACT cases
---------
Signed-off-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Khalil Meftah <khalil.meftah@huggingface.co>
* Add basic support for PEFT adapter methods
This change adds support for training policies with far fewer trainable
parameters by applying adapter methods such as LoRA to specific parts of the
policies, which may in turn allow higher learning rates / batch sizes.
To make this as accessible as possible I thought it useful to provide
defaults for `target_modules` and `modules_to_save`. Currently only SmolVLA
has such defaults, but once we agree that this change is useful I will set
out to generate more of them. While the user can override these
settings, they are expected to only change the peft_method, rank and init_type
parameters.
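To illustrate the intent (per-policy defaults, with the user normally touching only method, rank and init type), here is a small sketch. Everything in it is hypothetical: the `PEFT_DEFAULTS` table, the placeholder module names and `resolve_peft_settings` are not the values or API added by this change.

```python
# Hypothetical per-policy defaults; the module names are placeholders.
PEFT_DEFAULTS = {
    "smolvla": {
        "target_modules": ["q_proj", "v_proj"],
        "modules_to_save": ["state_proj"],
    },
}


def resolve_peft_settings(policy_type, peft_method="lora", rank=16, **overrides):
    """Merge policy defaults with the few knobs a user usually changes."""
    if policy_type not in PEFT_DEFAULTS and "target_modules" not in overrides:
        raise ValueError(
            f"No PEFT defaults for '{policy_type}'; pass target_modules explicitly."
        )
    settings = dict(PEFT_DEFAULTS.get(policy_type, {}))
    settings.update(peft_method=peft_method, rank=rank)
    # Explicit overrides win over the defaults.
    settings.update(overrides)
    return settings
```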
* Implement loading of PEFT adapters
Loading a PEFT adapter is currently done by initializing a policy with the default config
and then applying the adapter to the resulting model. This has the obvious drawback
that any configuration changes made during training are not applied to the adapted model.
Currently the `use_peft` attribute of `PreTrainedConfig` is only set during loading
to signal to downstream code that it has to deal with a PEFT adapter. However,
we could imagine a scenario where this is already set at training time and stored
alongside the adapter.
* Store policy config alongside PEFT checkpoint
Before this change the PEFT-wrapped policy did not save the policy's config
alongside the adapter config / weights which prevented us from changing the
policy config. Now the policy config is saved both in full training and PEFT
training.
This change makes loading the PEFT policy adapter much easier as well.
* Add default config for ACT
* Support targets like `all-linear`
* Formatting
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix failing tests
* Remove PEFT compatibility changes in config
We'll wait for the PEFT release that fixes this for good.
* Remove `use_peft` parameter from training script
Instead we make the PEFT config optional which has the same effect.
* Log adapter config to WandB
* Better documentation for CLI arguments
* Don't unload & merge the PEFT model
Merging can make things hard when using quantized layers: a user may expect quantized base
layers with unquantized adapters, for example, but merging upcasts the layers by default,
leading to higher memory use.
* Correct way of identifying when to save config
* Add CLI end-to-end tests
Currently there doesn't seem to be any way to test the CLI commands.
Since this change mostly happens in those I thought it best to add
a way to test these commands end-to-end.
More integrated commands like `lerobot-record` need patching but
standalone commands like training seem to work fine.
* Update default targets
Removed ACT since it doesn't make sense to fine-tune ACT without having it pretrained beforehand.
SmolVLA and Pi0/0.5 are much more sensible targets.
* Clean up loading code
- Centralized instantiation of the PEFT wrapper in `make_policy` for inference
(e.g. in `lerobot-record`)
- Training a PEFT policy also sets `cfg.use_peft` so that all inference code loading
the policy can rely on that attribute to identify if PEFT loading is needed
- Modified RTC example to also include PEFT policies. Mostly because this is an example
I'm currently exploring.
* Make sure push_to_hub works
Since PEFT only wraps `push_to_hub` and not `push_model_to_hub`, the reference
to `self` in `policy.push_model_to_hub` is the unwrapped policy which, of course,
doesn't know anything about PEFT.
To make the upload process aware of PEFT, we pass the unwrapped policy down to
`push_model_to_hub` as a kwarg. This is not ideal but I think it is the best way
for now.
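The `self` problem can be reproduced in a few lines. This is a sketch under the assumption that the wrapper forwards unwrapped methods via attribute delegation; `AdapterWrapper` and the `wrapper` kwarg name are hypothetical, not the real classes or signature.

```python
class Policy:
    def push_model_to_hub(self, wrapper=None):
        # `self` is the inner policy even when called through the wrapper.
        # The optional `wrapper` kwarg (hypothetical name) lets the caller
        # hand over the adapter-aware object explicitly.
        uploader = wrapper if wrapper is not None else self
        return type(uploader).__name__


class AdapterWrapper:
    """Stand-in for an adapter wrapper that forwards unknown attributes."""

    def __init__(self, inner):
        self.inner = inner

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself.
        return getattr(self.inner, name)


wrapped = AdapterWrapper(Policy())
print(wrapped.push_model_to_hub())                 # Policy
print(wrapped.push_model_to_hub(wrapper=wrapped))  # AdapterWrapper
```

The forwarded method is bound to the inner object, which is why the upload path only sees the adapter when it is passed in explicitly.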
* formatting
* Warn when encountering from-scratch-training
* Revamp pretrained model loading
Several things had convinced me that the existing code was able to
load pretrained models from the PEFT adapter config, but in fact
that did not work.
This commit fixes the following things:
- policies wrapped in PEFT will now have a `name_or_path` attribute
containing the name or path of the pretrained model we're fine-tuning
- we further assume that SmolVLA without `pretrained_path` and
`load_vlm_weights==False` must be a user-side error
- we assume that using PEFT on from-scratch policies must be
a user-side error
* Make it possible to unset policy features
This is necessary to train pre-trained policies on new datasets so that the
features are inferred from the new dataset and not from the pretrained
policy.
* Use correct loading for PEFT in RTC example
* Make it possible to use PeftModels in eval
* Add test checking that PEFT actually reduces params
* Adapt state/action projections instead of full-finetuning
There doesn't seem to be a benefit to fully fine-tuning these layers
over just adapting them, so we adapt them instead.
* Disallow PEFT training on non-pretrained policies
At first I thought it would make sense to have this feature
in case you want to fine-tune a pre-trained section, but in the
end it is more trouble than it's worth.
It's still possible to allow this in the future when a concrete
need arises.
* Add basic documentation
* Formatting
* Add peft as extra dependency, mark tests
Fast tests currently fail because of the missing dependency.
* Fix pre-commit issues
* Add walx <> peft conflict for uv
* Exclude peft from pi install for now
---------
Co-authored-by: nemo <git@ningu.net>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
* feat: Register external policies
* ruff fix
* move policy util functions to policy factory
* refactor register_third_party_devices -> register_third_party_plugins
* feat: Update docs with bring your own policies
* Improve docs for new policies
* fix: Inconsistent quotation marks
* fix: Remove print statement
* fix: wrong base class name in documentation
* fix: Handle better how the models are parsed
* fix: precommit passing
* Update docs/source/bring_your_own_policies.mdx
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Signed-off-by: Daniel San José Pro <42489409+danielsanjosepro@users.noreply.github.com>
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Signed-off-by: Daniel San José Pro <42489409+danielsanjosepro@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
* chore: replace hard-coded 'action' values with constants throughout all the source code
* chore(tests): replace hard-coded action values with constants throughout all the test code
* chore: replace hard-coded OBS values with constants throughout all the source code
* chore(tests): replace hard-coded OBS values with constants throughout all the test code
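For illustration, the shape of such a replacement might look like this. The constant names and values below are assumptions, not necessarily the ones defined in the codebase.

```python
# Hypothetical constants module; names and values are illustrative.
ACTION = "action"
OBS_PREFIX = "observation"
OBS_STATE = f"{OBS_PREFIX}.state"


def split_batch(batch: dict) -> tuple:
    """Separate observations from the action using the shared constants."""
    obs = {k: v for k, v in batch.items() if k.startswith(OBS_PREFIX + ".")}
    return obs, batch[ACTION]
```

Call sites then read `batch[ACTION]` instead of `batch["action"]`, so a key rename touches one line.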