Commit Graph

23 Commits

Author SHA1 Message Date
Steven Palma
d893bf1e30 chore(processor): rename specialized processor -> XYZProcessorStep (#1852) 2025-09-03 17:30:47 +02:00
Steven Palma
8c796b39f5 chore(processor): rename RobotProcessor -> DataProcessorPipeline (#1850) 2025-09-03 17:13:16 +02:00
Steven Palma
b052843f08 refactor(processors): unify import statements by consolidating pipeline imports into the main processor module (#1845) 2025-09-02 18:26:59 +02:00
Adil Zouitine
645c87e3a9 refactor(converters): gather converters and refactor the logic (#1833)
* refactor(converters): move batch transition functions to converters module

- Moved `_default_batch_to_transition` and `_default_transition_to_batch` functions from `pipeline.py` to `converters.py` for better organization and separation of concerns.
- Updated references in `RobotProcessor` to use the new location of these functions.
- Added tests to ensure correct functionality of the transition functions, including handling of index and task_index fields.
- Removed redundant tests from `pipeline.py` to streamline the test suite.

* refactor(processor): reorganize EnvTransition and TransitionKey definitions

- Moved `EnvTransition` and `TransitionKey` classes from `pipeline.py` to a new `core.py` module for better structure and maintainability.
- Updated import statements across relevant modules to reflect the new location of these definitions, ensuring consistent access throughout the codebase.

* refactor(converters): rename and update dataset frame conversion functions

- Replaced `to_dataset_frame` with `transition_to_dataset_frame` for clarity and consistency in naming.
- Updated references in `record.py`, `pipeline.py`, and tests to use the new function name.
- Introduced `merge_transitions` to streamline the merging of transitions, enhancing readability and maintainability.
- Adjusted related tests to ensure correct functionality with the new naming conventions.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(processor): solve conflict artefacts

* refactor(converters): remove unused identity function and update type hints for merge_transitions

* refactor(processor): remove unused identity import and clean up gym_manipulator.py

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Steven Palma <steven.palma@huggingface.co>
2025-09-02 15:33:38 +02:00
Adil Zouitine
d32b76cc66 refactor(processor): improve processor pipeline typing with generic type (#1810)
* refactor(processor): introduce generic type for to_output

- Always return `TOutput`
- Remove `_prepare_transition`, so `__call__` now always returns `TOutput`
- Update tests accordingly
- This refactor paves the way for adding settings for `to_transition` and `to_output` in `make_processor` and the post-processor

* refactor(processor): consolidate ProcessorKwargs usage across policies

- Removed the ProcessorTypes module and integrated ProcessorKwargs directly into the processor pipeline.
- Updated multiple policy files to utilize the new ProcessorKwargs structure for preprocessor and postprocessor arguments.
- Simplified the handling of processor kwargs by initializing them to empty dictionaries when not provided.
2025-09-02 12:57:14 +02:00
Steven Palma
ce665160ae feat(processor): multiple improvements to the pipeline porting (#1749)
* [Port codebase pipeline] General fixes for RL and scripts (#1748)

* Refactor dataset configuration in documentation and codebase

- Updated dataset configuration keys from `dataset_root` to `root` and `num_episodes` to `num_episodes_to_record` for consistency.
- Adjusted replay episode handling by renaming `episode` to `replay_episode`.
- Enhanced documentation
- added specific processor to transform from policy actions to delta actions

* Added Robot action to tensor processor
Added new processor script for dealing with gym specific action processing

* removed RobotAction2Tensor processor; imrpoved choosing observations in actor

* nit in delta action

* added missing reset functions to kinematics

* Adapt teleoperate and replay to pipeline similar to record

* refactor(processors): move to inheritance (#1750)

* fix(teleoperator): improvements phone implementation (#1752)

* fix(teleoperator): protect shared state in phone implementation

* refactor(teleop): separate classes in phone

* fix: solve breaking changes (#1753)

* refactor(policies): multiple improvements (#1754)

* refactor(processor): simpler logic in device processor (#1755)

* refactor(processor): euclidean distance in delta action processor (#1757)

* refactor(processor): improvements to joint observations processor migration (#1758)

* refactor(processor): improvements to tokenizer migration (#1759)

* refactor(processor): improvements to tokenizer migration

* fix(tests): tokenizer tests regression from #1750

* fix(processors): fix float comparison and config in hil processors (#1760)

* chore(teleop): remove unnecessary callbacks in KeyboardEndEffectorTeleop (#1761)

* refactor(processor): improvements normalize pipeline migration (#1756)

* refactor(processor): several improvements normalize processor step

* refactor(processor): more improvements normalize processor

* refactor(processor): more changes to normalizer

* refactor(processor): take a different approach to DRY

* refactor(processor): final design

* chore(record): revert comment and continue deleted (#1764)

* refactor(examples): pipeline phone examples (#1769)

* refactor(examples): phone teleop + teleop script

* refactor(examples): phone replay + replay

* chore(examples): rename phone example files & folders

* feat(processor): fix improvements to the pipeline porting (#1796)

* refactor(processor): enhance tensor device handling in normalization process (#1795)

* refactor(tests): remove unsupported device detection test for complementary data (#1797)

* chore(tests): update ToBatchProcessor test (#1798)

* refactor(tests): remove in-place mutation tests for actions and complementary data in batch processor

* test(tests): add tests for action and task processing in batch processor

* add names for android and ios phone (#1799)

* use _tensor_stats in normalize processor (#1800)

* fix(normalize_processor): correct device reference for tensor epsilon handling (#1801)

* add point 5 add missing feature contracts (#1806)

* Fix PR comments 1452 (#1807)

* use key to determine image

* Address rest of PR comments

* use PolicyFeatures in transform_features

---------

Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>

---------

Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Adil Zouitine <adilzouitinegm@gmail.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2025-08-31 20:38:52 +02:00
Michel Aractingi
0053defa2e Refactorgym_manipulator.py using the universal pipeline (#1650)
* Migrate gym_manipulator to use the pipeline
Added get_teleop_events function to capture relevant events from teleop devices unrelated to actions

* Added the capability to record a dataset

* Added the replay functionality with the pipeline

* Refactored `actor.py` to use the pipeline

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* RL works at this commit - fixed actor.py and bugs in gym_manipulator

* change folder structure to reduce the size of gym_manip

* Refactored hilserl config

* Remove dataset and mode from HilSerlEnvConfig to a GymManipulatorConfig to reduce verbose of configs during training

* format docs

* removed get_teleop_events from abc

* Refactor environment configuration and processing pipeline for GymHIL support. Removed device attribute from HILSerlRobotEnvConfig, added DummyTeleopDevice for simulation, and updated processor creation to accommodate GymHIL environments.

* Improved typing for HILRobotEnv config and GymManipulator config

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Migrated `gym_manipulator` to use a more modular structure similar to phone teleop

* Refactor gripper handling and transition processing in HIL and robot kinematic processors

- Updated gripper position handling to use a consistent key format across processors
- Improved the EEReferenceAndDelta class to handle reference joint positions.
- Added support for discrete gripper actions in the GripperVelocityToJoint processor.
- Refactored the gym manipulator to improve modularity and clarity in processing steps.

* Added delta_action_processor mapping wrapper

* Added missing file delta_action_processor and improved imports in `gym_manipulator`

* nit

* Added missing file joint_observation_processor

* Enhance processing architecture with new teleoperation processors

- Introduced `AddTeleopActionAsComplimentaryData` and `AddTeleopEventsAsInfo` for integrating teleoperator actions and events into transitions.
- Added `Torch2NumpyActionProcessor` and `Numpy2TorchActionProcessor` for seamless conversion between PyTorch tensors and NumPy arrays.
- Updated `__init__.py` to include new processors in module exports, improving modularity and clarity in the processing pipeline.
- GymHIL is now fully supported with HIL using the pipeline

* Refactor configuration structure for gym_hil integration

- Renamed sections for better readability, such as changing "Gym Wrappers Configuration" to "Processor Configuration."
- Enhanced documentation with clear examples for dataset collection and policy evaluation configurations.

* Enhance reset configuration and teleoperation event handling

- Added `terminate_on_success` parameter to `ResetConfig` and `InterventionActionProcessor` for controlling episode termination behavior upon success detection.
- Updated documentation to clarify the impact of `terminate_on_success` on data collection for reward classifier training.
- Refactored teleoperation event handling to use `TeleopEvents` constants for improved readability and maintainability across various modules.

* fix(keyboard teleop), delta action keys

* Added transform features and feature contract

* Added transform features for image crop

* Enum for TeleopEvents

* Update tranform_features delta action proc

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-08-11 11:07:55 +02:00
Adil Zouitine
5326ffe77e feature(pipeline): port tokenizer pipeline for VLA (#1645)
* feat(tokenizer): Introduce TokenizerProcessor for text tokenization

- Added TokenizerProcessor class to handle tokenization of task strings using Hugging Face's AutoTokenizer.
- Supports both string and list inputs, with customizable parameters for task key, output key, and tokenization settings.
- Implemented comprehensive unit tests to validate functionality, including handling of various input scenarios and integration with RobotProcessor.
- Updated types.py to include LANGUAGE feature type and modified __init__.py to register the new processor.

* feat(language): Enhance language processing in TokenizerProcessor

- Added OBS_LANGUAGE constant to define the observation language key.
- Updated TokenizerProcessor to store tokenized task data in the observation dictionary, ensuring compatibility with the new language feature.
- Introduced Pi0NewLineProcessor to append newlines to tasks for proper tokenization.
- Modified tests to validate the integration of language tokens and attention masks in the observation structure.

* feat(tokenizer): Add padding configuration to TokenizerProcessor

- Introduced `padding_side` parameter to the TokenizerProcessor for customizable padding direction.
- Updated the `make_pi0_processor` function to include the new padding configuration.
- Enhanced unit tests to validate the functionality of the `padding_side` parameter in various scenarios.

* feat(processor): Add state management methods to Pi0NewLineProcessor

* feat(normalization): Track normalization and unnormalization info in complementary data

- Updated NormalizerProcessor and UnnormalizerProcessor to accept additional parameters for tracking normalization modes.
- Enhanced the __call__ methods to store normalization and unnormalization information in the complementary data of transitions.
- Added unit tests to verify the correct tracking of normalization info, including scenarios with missing stats and selective normalization keys.

* feat(factory): Add preprocessor and postprocessor overrides to ProcessorConfigKwargs

- Updated ProcessorConfigKwargs to include optional overrides for preprocessor and postprocessor configurations.
- Enhanced the make_processor function to utilize the new overrides, allowing for more flexible processor initialization.

* feat(processors): Integrate RenameProcessor into various processor configurations

- Added RenameProcessor to the input steps of multiple processor functions, including make_act_processor, make_diffusion_processor, make_pi0_processor, make_sac_processor, make_tdmpc_processor, make_vqbet_processor, and make_smolvla_processor.
- Consolidated normalization features from input and output into a single NormalizerProcessor for improved efficiency.
- Updated the input steps to ensure compatibility with the new RenameProcessor integration.

* feat(smolvla): Refactor language processing and introduce new line processor (#1658)

- Removed the prepare_language method and directly accessed language tokens and masks from the batch using the OBS_LANGUAGE constant.
- Added SmolVLANewLineProcessor to ensure tasks end with a newline, enhancing tokenization compatibility.
- Updated the make_smolvla_processor function to include the new line processor and tokenizer processor for improved input handling.

* feture(policies): add device processor (#1659)

* feat(processors): Integrate DeviceProcessor into multiple processor configurations

- Added DeviceProcessor to the input and output steps of various processor functions, including make_act_processor, make_diffusion_processor, make_pi0_processor, make_pi0fast_processor, make_sac_processor, make_tdmpc_processor, make_vqbet_processor, and make_smolvla_processor.
- Enhanced the DeviceProcessor class with state management methods and ensured compatibility with existing processor pipelines.
- Introduced unit tests for DeviceProcessor to validate functionality across different scenarios, including CPU and CUDA operations.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor(pipeline): Remove to() method for device management

- Eliminated the to() method from RobotProcessor, which was responsible for moving tensor states to specified devices.
- Removed associated unit tests that validated the functionality of the to() method across various scenarios.
- Streamlined the pipeline code by focusing on other device management strategies.

* feat(processor): Enhance DeviceProcessor with float dtype conversion

- Added support for optional float dtype conversion in DeviceProcessor, allowing tensors to be converted to specified floating-point types while preserving non-float types.
- Implemented validation for float dtype input and updated the processor's configuration methods to include float dtype.
- Refactored tensor processing logic to streamline device movement and dtype conversion.
- Introduced comprehensive unit tests to validate the new float dtype functionality across various scenarios.

* feat(policies): Add new line processors and update module exports

* feat(processor): Enhance batch and device processors to handle index and task_index fields

- Added logic to ToBatchProcessor for unsqueezing 0D tensors for index and task_index fields, ensuring they are processed as 1D tensors.
- Updated DeviceProcessor to process index and task_index fields in complementary data, preserving their tensor types and ensuring non-tensor fields remain unchanged.
- Enhanced unit tests to validate the correct handling of index and task_index fields across various scenarios, including device compatibility and dtype preservation.
2025-08-06 17:21:13 +02:00
Adil Zouitine
8ff95be04c refactor(normalization): Remove unused state dict transformation methods and streamline imports
- Eliminated the _transform_state_dict_keys and _load_as_safetensor methods from PI0Policy, simplifying the model loading process.
- Cleaned up imports in modeling_pi0.py by removing log_model_loading_keys and init_logging.
- Updated TDMPCPolicy and VQBeTPolicy to handle action removal from batches during offline evaluation.
- Introduced hotswap_stats function in normalize_processor.py to update normalization statistics dynamically, with corresponding tests to ensure functionality.
2025-08-06 17:21:07 +02:00
Adil Zouitine
f5c6b03b61 feat(processor): Introduce ToBatchProcessor for handling observation batching
- Added ToBatchProcessor to ensure observations have proper batch dimensions for model processing.
- Implemented functionality to add batch dimensions to state and image observations as needed.
- Created comprehensive unit tests to validate the processor's behavior with various tensor dimensions and types.
- Ensured compatibility with existing transition keys and maintained the integrity of non-observation data.
2025-08-06 17:20:51 +02:00
Steven Palma
fd4ae3466b refactor(pipeline): minor improvements (#1684)
* chore(pipeline): remove unused features + device torch + envtransition keys

* refactor(pipeline): ImageProcessor & StateProcessor are both implemented directly in VanillaObservationPRocessor

* refactor(pipeline): RenameProcessor now inherits from ObservationProcessor + remove unused code

* test(pipeline): fix broken test after refactors

* docs(pipeline): update docstrings VanillaObservationProcessor

* chore(pipeline): move None check to base pipeline classes
2025-08-06 14:00:13 +02:00
Adil Zouitine
f2b79656eb refactor(pipeline): Transition from tuple to dictionary format for EnvTransition
- Updated the EnvTransition structure to use a dictionary format instead of a tuple, enhancing readability and maintainability.
- Replaced instances of TransitionIndex with TransitionKey for accessing transition components.
- Adjusted related processing functions and tests to accommodate the new dictionary format, ensuring consistent handling of transitions across the codebase.
2025-08-01 08:41:53 +02:00
Adil Zouitine
1e0d667a22 Apply suggestions from code review
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Signed-off-by: Adil Zouitine <adilzouitinegm@gmail.com>
2025-08-01 08:41:53 +02:00
Adil Zouitine
b08149a113 chore (batch handling): Enhance processing components with batch conversion utilities 2025-08-01 08:41:53 +02:00
Adil Zouitine
c227107f60 feat (device processor): Implement device processor 2025-08-01 08:41:53 +02:00
Adil Zouitine
6830ca7645 Refactor normalization components and update tests
- Renamed `ObservationNormalizer` to `NormalizerProcessor` and `ActionUnnormalizer` to `UnnormalizerProcessor` for clarity.
- Consolidated normalization logic for both observations and actions into `NormalizerProcessor` and `UnnormalizerProcessor`.
- Updated tests to reflect the new class names and ensure proper functionality of normalization and unnormalization processes.
- Enhanced handling of missing statistics in normalization processes.
2025-08-01 08:41:52 +02:00
Adil Zouitine
9f33791b19 chore (docs): add docstring for processor 2025-08-01 08:41:52 +02:00
Adil Zouitine
453e0a995f Enhance processing architecture with new components
- Added `RenameProcessor` to facilitate key renaming in observations, improving data handling flexibility.
- Updated `__init__.py` to include `RenameProcessor` in module exports.
- Refactored `NormalizationProcessor` and `ObservationNormalizer` to use `rsplit` for better key handling.
- Introduced comprehensive tests for `NormalizationProcessor` and `RenameProcessor` to ensure functionality and robustness.
2025-08-01 08:41:52 +02:00
pre-commit-ci[bot]
8ebf79c494 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-01 08:41:52 +02:00
Adil Zouitine
8774aec304 Add normalization processor and related components
- Introduced `NormalizationProcessor` to handle both observation normalization and action unnormalization.
- Added `ObservationNormalizer` and `ActionUnnormalizer` classes for specific normalization tasks.
- Updated `__init__.py` to include the new `NormalizationProcessor` in the module exports.
- Enhanced `ObservationProcessor` with registration in the `ProcessorStepRegistry` for better modularity.
- Created `RenameProcessor` for renaming keys in observations, improving flexibility in data processing.
2025-08-01 08:41:52 +02:00
Adil Zouitine
9aa632968f Refactor processing architecture to use RobotProcessor
- Replaced instances of RobotPipeline with RobotProcessor across the codebase for improved modularity and clarity.
- Introduced ProcessorStepRegistry for better management of processing steps.
- Updated relevant documentation and tests to reflect the new processing structure.
- Enhanced the save/load functionality to support the new processor design.
- Added a model card template for RobotProcessor to facilitate sharing and documentation.
2025-08-01 08:41:52 +02:00
pre-commit-ci[bot]
769f531603 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-08-01 08:41:51 +02:00
Adil Zouitine
f6c7287ae7 Refactor observation preprocessing to use a modular pipeline system
- Introduced `RobotPipeline` and `ObservationProcessor` for handling observation transformations.
- Updated `preprocess_observation` to maintain backward compatibility while leveraging the new pipeline.
- Added tests for the new processing components and ensured they match the original functionality.
- Removed hardcoded logic in favor of a more flexible, composable architecture.
2025-08-01 08:41:51 +02:00