Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
#!/usr/bin/env python
|
|
|
|
|
|
|
|
|
|
# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
|
|
|
|
|
#
|
|
|
|
|
# Licensed under the Apache License, Version 2.0 (the "License");
|
|
|
|
|
# you may not use this file except in compliance with the License.
|
|
|
|
|
# You may obtain a copy of the License at
|
|
|
|
|
#
|
|
|
|
|
# http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
|
#
|
|
|
|
|
# Unless required by applicable law or agreed to in writing, software
|
|
|
|
|
# distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
|
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
|
|
# See the License for the specific language governing permissions and
|
|
|
|
|
# limitations under the License.
|
|
|
|
|
|
|
|
|
|
"""
|
|
|
|
|
This script augments existing LeRobot datasets with quantile statistics.
|
|
|
|
|
|
|
|
|
|
Most datasets created before the quantile feature was added do not contain
|
|
|
|
|
quantile statistics (q01, q10, q50, q90, q99) in their metadata. This script:
|
|
|
|
|
|
|
|
|
|
1. Loads an existing LeRobot dataset in v3.0 format
|
|
|
|
|
2. Checks if it already contains quantile statistics
|
|
|
|
|
3. If missing, computes quantile statistics for all features
|
|
|
|
|
4. Updates the dataset metadata with the new quantile statistics
|
|
|
|
|
|
|
|
|
|
Usage:
|
|
|
|
|
|
|
|
|
|
```bash
|
2026-03-15 20:42:15 -07:00
|
|
|
python src/lerobot/scripts/augment_dataset_quantile_stats.py \
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
--repo-id=lerobot/pusht \
|
|
|
|
|
```
|
|
|
|
|
"""
|
|
|
|
|
|
|
|
|
|
import argparse
|
|
|
|
|
import concurrent.futures
|
|
|
|
|
import logging
|
|
|
|
|
from pathlib import Path
|
|
|
|
|
|
|
|
|
|
import numpy as np
|
|
|
|
|
import torch
|
2025-10-02 18:28:44 +02:00
|
|
|
from huggingface_hub import HfApi
|
|
|
|
|
from requests import HTTPError
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
from tqdm import tqdm
|
|
|
|
|
|
2026-04-12 20:03:04 +02:00
|
|
|
from lerobot.datasets import (
|
|
|
|
|
CODEBASE_VERSION,
|
|
|
|
|
DEFAULT_QUANTILES,
|
|
|
|
|
LeRobotDataset,
|
|
|
|
|
aggregate_stats,
|
|
|
|
|
get_feature_stats,
|
|
|
|
|
write_stats,
|
|
|
|
|
)
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
from lerobot.utils.utils import init_logging
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def has_quantile_stats(stats: dict[str, dict] | None, quantile_list_keys: list[str] | None = None) -> bool:
|
|
|
|
|
"""Check if dataset statistics already contain quantile information.
|
|
|
|
|
|
|
|
|
|
Args:
|
|
|
|
|
stats: Dataset statistics dictionary
|
|
|
|
|
|
|
|
|
|
Returns:
|
|
|
|
|
True if quantile statistics are present, False otherwise
|
|
|
|
|
"""
|
|
|
|
|
if quantile_list_keys is None:
|
|
|
|
|
quantile_list_keys = [f"q{int(q * 100):02d}" for q in DEFAULT_QUANTILES]
|
|
|
|
|
|
|
|
|
|
if stats is None:
|
|
|
|
|
return False
|
|
|
|
|
|
|
|
|
|
for feature_stats in stats.values():
|
|
|
|
|
if any(q_key in feature_stats for q_key in quantile_list_keys):
|
|
|
|
|
return True
|
|
|
|
|
|
|
|
|
|
return False
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def process_single_episode(dataset: LeRobotDataset, episode_idx: int) -> dict:
|
|
|
|
|
"""Process a single episode and return its statistics.
|
|
|
|
|
|
|
|
|
|
Args:
|
|
|
|
|
dataset: The LeRobot dataset
|
|
|
|
|
episode_idx: Index of the episode to process
|
|
|
|
|
|
|
|
|
|
Returns:
|
|
|
|
|
Dictionary containing episode statistics
|
|
|
|
|
"""
|
|
|
|
|
logging.info(f"Computing stats for episode {episode_idx}")
|
|
|
|
|
|
|
|
|
|
start_idx = dataset.meta.episodes[episode_idx]["dataset_from_index"]
|
|
|
|
|
end_idx = dataset.meta.episodes[episode_idx]["dataset_to_index"]
|
|
|
|
|
|
2025-10-02 18:28:44 +02:00
|
|
|
collected_data: dict[str, list] = {}
|
|
|
|
|
for idx in range(start_idx, end_idx):
|
|
|
|
|
item = dataset[idx]
|
|
|
|
|
for key, value in item.items():
|
|
|
|
|
if key not in dataset.features:
|
|
|
|
|
continue
|
|
|
|
|
|
|
|
|
|
if key not in collected_data:
|
|
|
|
|
collected_data[key] = []
|
|
|
|
|
collected_data[key].append(value)
|
|
|
|
|
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
ep_stats = {}
|
2025-10-02 18:28:44 +02:00
|
|
|
for key, data_list in collected_data.items():
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
if dataset.features[key]["dtype"] == "string":
|
|
|
|
|
continue
|
|
|
|
|
|
2025-10-02 18:28:44 +02:00
|
|
|
data = torch.stack(data_list).cpu().numpy()
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
if dataset.features[key]["dtype"] in ["image", "video"]:
|
2025-10-02 18:28:44 +02:00
|
|
|
if data.dtype == np.uint8:
|
|
|
|
|
data = data.astype(np.float32) / 255.0
|
|
|
|
|
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
axes_to_reduce = (0, 2, 3)
|
|
|
|
|
keepdims = True
|
|
|
|
|
else:
|
|
|
|
|
axes_to_reduce = 0
|
|
|
|
|
keepdims = data.ndim == 1
|
|
|
|
|
|
|
|
|
|
ep_stats[key] = get_feature_stats(
|
|
|
|
|
data, axis=axes_to_reduce, keepdims=keepdims, quantile_list=DEFAULT_QUANTILES
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
if dataset.features[key]["dtype"] in ["image", "video"]:
|
2025-10-02 18:28:44 +02:00
|
|
|
ep_stats[key] = {
|
|
|
|
|
k: v if k == "count" else np.squeeze(v, axis=0) for k, v in ep_stats[key].items()
|
|
|
|
|
}
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
|
|
|
|
|
return ep_stats
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def compute_quantile_stats_for_dataset(dataset: LeRobotDataset) -> dict[str, dict]:
|
|
|
|
|
"""Compute quantile statistics for all episodes in the dataset.
|
|
|
|
|
|
|
|
|
|
Args:
|
|
|
|
|
dataset: The LeRobot dataset to compute statistics for
|
|
|
|
|
|
|
|
|
|
Returns:
|
|
|
|
|
Dictionary containing aggregated statistics with quantiles
|
2025-10-02 18:28:44 +02:00
|
|
|
|
|
|
|
|
Note:
|
|
|
|
|
Video decoding operations are not thread-safe, so we process episodes sequentially
|
|
|
|
|
when video keys are present. For datasets without videos, we use parallel processing
|
|
|
|
|
with ThreadPoolExecutor for better performance.
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
"""
|
|
|
|
|
logging.info(f"Computing quantile statistics for dataset with {dataset.num_episodes} episodes")
|
|
|
|
|
|
|
|
|
|
episode_stats_list = []
|
2025-10-02 18:28:44 +02:00
|
|
|
has_videos = len(dataset.meta.video_keys) > 0
|
|
|
|
|
|
|
|
|
|
if has_videos:
|
|
|
|
|
logging.info("Dataset contains video keys - using sequential processing for thread safety")
|
|
|
|
|
for episode_idx in tqdm(range(dataset.num_episodes), desc="Processing episodes"):
|
|
|
|
|
ep_stats = process_single_episode(dataset, episode_idx)
|
|
|
|
|
episode_stats_list.append(ep_stats)
|
|
|
|
|
else:
|
|
|
|
|
logging.info("Dataset has no video keys - using parallel processing for better performance")
|
|
|
|
|
max_workers = min(dataset.num_episodes, 16)
|
|
|
|
|
|
|
|
|
|
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
|
|
|
|
|
future_to_episode = {
|
|
|
|
|
executor.submit(process_single_episode, dataset, episode_idx): episode_idx
|
|
|
|
|
for episode_idx in range(dataset.num_episodes)
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
episode_results = {}
|
|
|
|
|
with tqdm(total=dataset.num_episodes, desc="Processing episodes") as pbar:
|
|
|
|
|
for future in concurrent.futures.as_completed(future_to_episode):
|
|
|
|
|
episode_idx = future_to_episode[future]
|
|
|
|
|
ep_stats = future.result()
|
|
|
|
|
episode_results[episode_idx] = ep_stats
|
|
|
|
|
pbar.update(1)
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
|
|
|
|
|
for episode_idx in range(dataset.num_episodes):
|
|
|
|
|
if episode_idx in episode_results:
|
|
|
|
|
episode_stats_list.append(episode_results[episode_idx])
|
|
|
|
|
|
|
|
|
|
if not episode_stats_list:
|
|
|
|
|
raise ValueError("No episode data found for computing statistics")
|
|
|
|
|
|
|
|
|
|
logging.info(f"Aggregating statistics from {len(episode_stats_list)} episodes")
|
|
|
|
|
return aggregate_stats(episode_stats_list)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def augment_dataset_with_quantile_stats(
|
|
|
|
|
repo_id: str,
|
|
|
|
|
root: str | Path | None = None,
|
|
|
|
|
overwrite: bool = False,
|
|
|
|
|
) -> None:
|
|
|
|
|
"""Augment a dataset with quantile statistics if they are missing.
|
|
|
|
|
|
|
|
|
|
Args:
|
|
|
|
|
repo_id: Repository ID of the dataset
|
|
|
|
|
root: Local root directory for the dataset
|
|
|
|
|
overwrite: Overwrite existing quantile statistics if they already exist
|
|
|
|
|
"""
|
|
|
|
|
logging.info(f"Loading dataset: {repo_id}")
|
|
|
|
|
dataset = LeRobotDataset(
|
|
|
|
|
repo_id=repo_id,
|
|
|
|
|
root=root,
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
if not overwrite and has_quantile_stats(dataset.meta.stats):
|
|
|
|
|
logging.info("Dataset already contains quantile statistics. No action needed.")
|
|
|
|
|
return
|
|
|
|
|
|
|
|
|
|
logging.info("Dataset does not contain quantile statistics. Computing them now...")
|
|
|
|
|
|
|
|
|
|
new_stats = compute_quantile_stats_for_dataset(dataset)
|
|
|
|
|
|
|
|
|
|
logging.info("Updating dataset metadata with new quantile statistics")
|
|
|
|
|
dataset.meta.stats = new_stats
|
|
|
|
|
|
|
|
|
|
write_stats(new_stats, dataset.meta.root)
|
|
|
|
|
|
|
|
|
|
logging.info("Successfully updated dataset with quantile statistics")
|
|
|
|
|
dataset.push_to_hub()
|
|
|
|
|
|
2025-10-02 18:28:44 +02:00
|
|
|
hub_api = HfApi()
|
|
|
|
|
try:
|
|
|
|
|
hub_api.delete_tag(repo_id, tag=CODEBASE_VERSION, repo_type="dataset")
|
|
|
|
|
except HTTPError as e:
|
|
|
|
|
logging.info(f"tag={CODEBASE_VERSION} probably doesn't exist. Skipping exception ({e})")
|
|
|
|
|
pass
|
|
|
|
|
hub_api.create_tag(repo_id, tag=CODEBASE_VERSION, revision=None, repo_type="dataset")
|
|
|
|
|
|
Add OpenPi, Pi0 and Pi0.5 (#1910)
* initial commit
* change device in test
* do detailed import
* adhere to python 3.11 syntax
* fix autodocstring
* additionally
* do same in other files
* add model. prefix to all keys in state dict
* use dummy stats
* add pi05
* also shorten action_steps
* fix test
* all test pass! and fix tokenizer max length between 05 and 0
* remove test
* fix transformer dependency
* fix test
* split pi0 and pi05 policy in seperate files
* fix test
* fix push to hub test
* add some comments, license and readme
* remove warning in config
* add pi05 to factory
* remove check
* rename action_horizon to chunk_size
* clean up padding of state and action (more in line with lerobot pi0)
* add openpi image transforms for training and add more flexibility to _preprocess_images similar to lerobot pi0
* fix key match from pytorch state dict (similar keys to openpi implementation now)
* also for pi05
* update to python 3.11
* revert to openpi transformer replace python 3.11
* fix(modeling pi0): nit warning message
* use safeauto_docstring
* fix: remove unused param
* fix from pretrained
* add preprocess tests
* also compile forward method
* Do not add model prefix to normalization
* use same name for action and state dim as lerobot pi0 and remove fixed image keys
* load from pretrained_path
* temp: hardcode base model
* fix override self.pretrained_path = None overwrite
* rename to loss
* remove additional image augmentations, lerobot dataset already does this
* Add docs
* put tests in test folder
* Add test to instatiate all base models
* go back to python 3.10
* update docs
* adapt docs pi05
* change docs: finetune base model options
* minor docs fixes and dependencies
* remove todo
* cast float64 to float32 for mps
* skip if no transformers
* fix tests
* add new models to modelcard
* add back init
* fix circular input
* feat: only run pi test on GPU
* remove require_nightly_gpu
* replace decorator test_pi0_openpi
* rename action_dim, state_dim to max_action_dim, max_state_dim
* fix doc and constants
* cleanup tests
* fix from pretrained
* fix tests
* add comment pi0 pi05 tests, add image features to pi0 pi05 hub tests
* fix, state is included in language not in flow head
* Move test to specific folder
* and paligemma task with newline
* remove add_special_tokens, not needed
* feedback pr
* Remove previous pi0 and rename pi0_openpi and pi05_openpi
* Add Quantile stats to LeRobotDataset (#1985)
* - Add RunningQuantileStats class for efficient histogram-based quantile computation
- Integrate quantile parameters (compute_quantiles, quantiles) into LeRobotDataset
- Support quantile computation during episode collection and aggregation
- Add comprehensive function-based test suite (24 tests) for quantile functionality
- Maintain full backward compatibility with existing stats computation
- Enable configurable quantiles (default: [0.01, 0.99]) for robust normalization
* style fixes, make quantiles computation by default to new datasets
* fix tests
* - Added DEFAULT_QUANTILES=[0.01, 0.10, 0.50, 0.90, 0.99] to be computed for each features instead of being chosen by the user
- Fortified tests.
* - add helper functions to reshape stats
- add missing test for quantiles
* - Add QUANTILE normalization mode to normalize the data with the 1st and 99th percentiles.
- Add QUANTILE10 normalization mode to normalize the data with the 10th and 90th percentiles.
* style fixes
* Added missing lisence
* Simplify compute_stats
* - added script `augment_dataset_quantile_stats.py` so that we can add quantile stats to existing v3 datasets that dont have quatniles
- modified quantile computation instead of using the edge for the value, interpolate the values in the bin
* rename pi0/pi05 files
* Remove open pi patch and use custom transformer branch for now
* renaming
* fix
* Revert "fix"
This reverts commit 1ea65730ac2cbca6e5869df734fbd4392561b3c6.
* fix naming
* feet(pi0/pi0.5): add pipeline (#2009)
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* feat(processor): convert openpi model with processor
* TODO: Make test works
* fix(modeling_pi0openpi): update attention mask value and time scaling; improve task handling in tests
- Changed the attention mask value from `self.config.attention_mask_value` to a fixed value of `-2.3819763e38`.
- Updated time scaling in the `sample_noise` method to use a constant factor of `0.999` and an offset of `0.001`.
- Enhanced task handling in tests to ensure proper formatting and batch size consistency.
- Cleaned up commented-out test code for clarity.
* refactor(pi0): rename PI0OpenPIConfig and PI0OpenPIPolicy to PI0Config and PI0Policy
- Updated imports and references throughout the codebase to reflect the new naming convention.
- Introduced a new processor file for PI0 to handle pre-processing and post-processing steps.
- Adjusted tests to utilize the renamed classes, ensuring consistency and functionality.
- Enhanced clarity and maintainability by removing outdated naming conventions.
* refactor(pi05): rename PI0OpenPIPolicy to PI0Policy and update configuration
- Renamed `PI0OpenPIPolicy` to `PI0Policy` for consistency with naming conventions.
- Updated the `PI05OpenPIConfig` to include a new `tokenizer_max_length` attribute and changed the normalization mode for state from `MEAN_STD` to `QUANTILES`.
- Simplified model initialization in `PI05OpenPIPolicy` by removing unused `dataset_stats` parameter.
- Added a new processor class for `Pi05PrepareStateTokenizerProcessorStep` with `@dataclass` for improved readability.
- Introduced a test script to compare the integration of the PI0OpenPI policy with the original implementation, ensuring local testing compatibility.
* refactor(pi05): update imports and rename configuration classes
- Changed imports to reflect the new naming convention for PI05 configuration and policy classes.
- Renamed `PI05OpenPIConfig` to `PI05Config` and `PI05OpenPIPolicy` to `PI05Policy` for consistency.
- Introduced a new processor file for PI05, implementing pre-processing and post-processing steps.
- Updated tests to utilize the renamed classes, ensuring functionality and consistency across the codebase.
* update(pi05): increase tokenizer_max_length for improved processing
- Changed the `tokenizer_max_length` from 48 to 200 to enhance the model's capability in handling longer sequences.
- This adjustment aims to improve the overall performance and flexibility of the PI05 configuration.
* add default for state (max_state_dim)
* correct naming
* fix import
* cleanup code
* remove unused test
* us quantiles for action
* move to device
* remove discrete state assert
* fix pi05 test
* move pi05 to device
* use base models in comparison tests
* small renames for tests
* change number of tokens pi05 test
* fix openpi tokenization in test
* fix hub test
* fix test
* assert lerobot vs openpi tests
---------
Co-authored-by: Pepijn <pepijn@huggingface.co>
* add headers
* add back previously removed imports
* update if statement load processor with dataset stats
* remove to avoid circular import
* inject dataset stats for pretrained models
* check normalization before applying
* add link to quantile augument script
* fix(policies): transformers import for ci in PI0 & PI05 (#2039)
* fix(policies): transformers import for ci in PI0
* fix(policies): transformers import for ci in PI05
* test(processor): fix expected raise when normalization types are missing (#2040)
* switch normalization order pipeline for pi05
* Fix/quantiles script (#2064)
* refactor augment stats with quantiles script
add parallelization for faster processing
shift the quantile normalization between -1 1
* fix replay buffer tests
* fix comment
* overwrite the pipeline normalization features with the policy features
* remove double normalization overwrite
* cleanup from pretrained
* remove typo
* also set norm_map
* fix(augment_quantiles) images incorrectly divided by 255
* clamp quantiles
* link to lerobot base models
* rename tests
* encorperate PR feedback
* update docstring for RunningQuantileStats
* update doc links
* Revert "clamp quantiles"
This reverts commit 172207471c8f2cb62958e9a9e6a0535ba3ff67d4.
* fix self.paligemma
* fix tests related to quantiles that were scaled to [0,1], the new range is [-1, 1]
* fix libero doc and use different transformer branch
* use fix branch instead of feat
* update results libero
* add new line
* fix formatting
* precommit
* update results libero
* update libero doc
* update title
* final changes
* add quantiles to test
* run pre commit
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-10-02 13:14:45 +02:00
|
|
|
|
|
|
|
|
def main():
|
|
|
|
|
"""Main function to run the augmentation script."""
|
|
|
|
|
parser = argparse.ArgumentParser(description="Augment LeRobot dataset with quantile statistics")
|
|
|
|
|
|
|
|
|
|
parser.add_argument(
|
|
|
|
|
"--repo-id",
|
|
|
|
|
type=str,
|
|
|
|
|
required=True,
|
|
|
|
|
help="Repository ID of the dataset (e.g., 'lerobot/pusht')",
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
parser.add_argument(
|
|
|
|
|
"--root",
|
|
|
|
|
type=str,
|
|
|
|
|
help="Local root directory for the dataset",
|
|
|
|
|
)
|
|
|
|
|
parser.add_argument(
|
|
|
|
|
"--overwrite",
|
|
|
|
|
action="store_true",
|
|
|
|
|
help="Overwrite existing quantile statistics if they already exist",
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
args = parser.parse_args()
|
|
|
|
|
root = Path(args.root) if args.root else None
|
|
|
|
|
|
|
|
|
|
init_logging()
|
|
|
|
|
|
|
|
|
|
augment_dataset_with_quantile_stats(
|
|
|
|
|
repo_id=args.repo_id,
|
|
|
|
|
root=root,
|
|
|
|
|
overwrite=args.overwrite,
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
if __name__ == "__main__":
|
|
|
|
|
main()
|