Adil Zouitine
ad132c9c39
[HIL SERL] Env management and add gym-hil (#1077)
Co-authored-by: Michel Aractingi <michel.aractingi@gmail.com>
2025-05-07 09:39:21 +02:00
Adil Zouitine
70d55c77e9
Merge branch 'main' into user/adil-zouitine/2025-1-7-port-hil-serl-new
2025-05-06 16:43:44 +02:00
Michel Aractingi
5998203a33
[Port HIL-SERL] Final fixes for reward classifier (#1067)
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-05 11:33:09 +02:00
omahs
8cfab38824
Fix typos (#1070)
2025-05-05 10:35:32 +02:00
AdilZouitine
fb7c288c94
Update torch.load calls in network_utils.py to include weights_only=False, to ensure no regression with torch 2.6 update
2025-04-29 18:23:51 +02:00
AdilZouitine
4257fe5045
rename reward classifier
2025-04-25 18:38:52 +02:00
Michel Aractingi
bd4db8d747
[Port HIl-Serl] Refactor gym-manipulator (#1034)
2025-04-25 16:34:54 +02:00
AdilZouitine
a8da4a347e
Clean the code
2025-04-24 17:22:54 +02:00
AdilZouitine
b8c2b0bb93
Clean the code and remove todo
2025-04-24 16:10:56 +02:00
Adil Zouitine
c58b504a9e
[HIL-SERL]Remove overstrict pre-commit modifications (#1028)
2025-04-24 13:48:52 +02:00
Adil Zouitine
299effe0f1
[HIL-SERL] Update CI to allow installation of prerelease versions for lerobot (#1018)
Co-authored-by: imstevenpmwork <steven.palma@huggingface.co>
2025-04-24 10:18:03 +02:00
AdilZouitine
b77cee7cc6
Ignore spellcheck for ik variable
2025-04-22 13:19:59 +00:00
AdilZouitine
6230840397
Fix linter issue part 2
2025-04-22 10:56:23 +02:00
AdilZouitine
c5845ee203
Fix linter issue
2025-04-22 10:37:08 +02:00
Eugene Mironov
0030ff3f74
[HIL-SERl PORT] Unit tests for Replay Buffer (#966)
2025-04-22 09:35:57 +02:00
Michel Aractingi
dc726cb9a3
Refactor crop_dataset_roi
2025-04-22 09:31:35 +02:00
AdilZouitine
a7a51cfc9c
Refactor SACPolicy and configuration to replace 'grasp_critic' terminology with 'discrete_critic'. Update related methods and comments for clarity and consistency in handling discrete actions.
2025-04-18 14:57:03 +00:00
pre-commit-ci[bot]
0d70f0b85c
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 14:22:11 +00:00
Michel Aractingi
c1ee25d9f7
nits in configuration classifier and control_robot
2025-04-18 16:18:13 +02:00
Michel Aractingi
9886520d33
Added option to add current readings to the state of the policy
2025-04-18 16:18:13 +02:00
Michel Aractingi
3b24ad3c84
Fixes for the reward classifier
2025-04-18 16:18:13 +02:00
pre-commit-ci[bot]
fb92935601
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 13:33:37 +00:00
AdilZouitine
2f7339b410
Handle caching
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine
8122721f6d
match target entropy hil serl
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
AdilZouitine
9386892f8e
Refactor modeling_sac and parameter handling for clarity and reusability.
Co-authored-by: s1lent4gnt <kmeftah.khalil@gmail.com>
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]
28b595c651
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
Michel Aractingi
9fd4c21d4d
General fixes in code, removed delta action, fixed grasp penalty, added logic to put gripper reward in info
2025-04-18 15:10:22 +02:00
AdilZouitine
e18274bc9a
fix caching and dataset stats is optional
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]
a3ada81816
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine
78c640b6d8
Refactor complementary_info handling in ReplayBuffer
2025-04-18 15:10:22 +02:00
AdilZouitine
d5a87f67cf
Handle gripper penalty
2025-04-18 15:10:22 +02:00
AdilZouitine
8bcf41761d
fix caching
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]
1efaf02df9
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine
cf58890bb0
fix indentation issue
2025-04-18 15:10:22 +02:00
AdilZouitine
7c2c67fc3c
Enhance SAC configuration and replay buffer with asynchronous prefetching support
- Added async_prefetch parameter to SACConfig for improved buffer management.
- Implemented get_iterator method in ReplayBuffer to support asynchronous prefetching of batches.
- Updated learner_server to utilize the new iterator for online and offline sampling, enhancing training efficiency.
2025-04-18 15:10:22 +02:00
AdilZouitine
6167886472
Enhance SACPolicy and learner server for improved grasp critic integration
- Updated SACPolicy to conditionally compute grasp critic losses based on the presence of discrete actions.
- Refactored the forward method to handle grasp critic model selection and loss computation more clearly.
- Adjusted learner server to utilize optimized parameters for grasp critic during training.
- Improved action handling in the ManiskillMockGripperWrapper to accommodate both tuple and single action inputs.
2025-04-18 15:10:22 +02:00
AdilZouitine
f9fb9d4594
Refactor SACPolicy for improved readability and action dimension handling
- Cleaned up code formatting for better readability, including consistent spacing and removal of unnecessary blank lines.
- Consolidated continuous action dimension calculation to enhance clarity and maintainability.
- Simplified loss return statements in the forward method to improve code structure.
- Ensured grasp critic parameters are included conditionally based on configuration settings.
2025-04-18 15:10:22 +02:00
AdilZouitine
d86d29fe21
Add mock gripper support and enhance SAC policy action handling
- Introduced mock_gripper parameter in ManiskillEnvConfig to enable gripper simulation.
- Added ManiskillMockGripperWrapper to adjust action space for environments with discrete actions.
- Updated SACPolicy to compute continuous action dimensions correctly, ensuring compatibility with the new gripper setup.
- Refactored action handling in the training loop to accommodate the changes in action dimensions.
2025-04-18 15:10:22 +02:00
AdilZouitine
f83d215e7a
Refactor SAC policy and training loop to enhance discrete action support
- Updated SACPolicy to conditionally compute losses for grasp critic based on num_discrete_actions.
- Simplified forward method to return loss outputs as a dictionary for better clarity.
- Adjusted learner_server to handle both main and grasp critic losses during training.
- Ensured optimizers are created conditionally for grasp critic based on configuration settings.
2025-04-18 15:10:22 +02:00
Michel Aractingi
0cce2fe0fa
Added Gripper quantization wrapper and grasp penalty
removed complementary info from buffer and learner server
removed get_gripper_action function
added gripper parameters to `common/envs/configs.py`
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]
88d26ae976
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
s1lent4gnt
3a2308d86f
Add grasp critic to the training loop
- Integrated the grasp critic gradient update to the training loop in learner_server
- Added Adam optimizer and configured grasp critic learning rate in configuration_sac
- Added target critics networks update after the critics gradient step
2025-04-18 15:10:22 +02:00
s1lent4gnt
fdd04efdb7
Add get_gripper_action method to GamepadController
2025-04-18 15:10:22 +02:00
s1lent4gnt
ff18be18ad
Add gripper penalty wrapper
2025-04-18 15:10:22 +02:00
s1lent4gnt
427720426b
Add complementary info in the replay buffer
- Added complementary info in the add method
- Added complementary info in the sample method
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]
334cf8143e
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine
5b49601072
Fix convergence of sac, multiple torch compile on the same model caused divergence
2025-04-18 15:10:22 +02:00
AdilZouitine
0185a0b6fd
Fix cuda graph break
2025-04-18 15:10:22 +02:00
s1lent4gnt
70d418935d
Fix: Prevent Invalid next_state References When optimize_memory=True (#918)
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot]
eb44a06a9b
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00