Commit Graph

1520 Commits

Author SHA1 Message Date
Maximellerbach
90c15db1a7 Revert "trying to isolate pi0 side-effect"
This reverts commit ec42dd9973.
2026-06-01 16:47:54 +02:00
Maximellerbach
ec42dd9973 trying to isolate pi0 side-effect 2026-05-31 22:34:49 +02:00
Maximellerbach
6079d125f0 fixing symlink 2026-05-31 22:34:49 +02:00
Maximellerbach
6f82adddc1 adding configuration gripper index and threshold 2026-05-31 22:34:49 +02:00
Maximellerbach
8e82b2bd1a removing swish in favor of silu 2026-05-31 22:34:49 +02:00
Maximellerbach
9b0742172e cleanup 2026-05-31 22:34:49 +02:00
Maximellerbach
2c14d24750 removing useless pre-processor 2026-05-31 22:34:49 +02:00
Maximellerbach
5db440cc0e adding .mdx docs and shortening polivy_vla_jepa_README.md 2026-05-31 22:34:49 +02:00
Maximellerbach
5524239288 adding licences 2026-05-31 22:34:49 +02:00
Maximellerbach
d26d2da138 removing conversion script 2026-05-31 22:34:49 +02:00
Maximellerbach
c2088fd760 fixing misconception about multiview / singleview handling 2026-05-31 22:34:49 +02:00
Maximellerbach
5fc4285419 smol fix to avoid having default CPU device when training 2026-05-31 22:34:49 +02:00
Maximellerbach
9fe7a542e4 adding instructions for different embodiment + fixing some tests 2026-05-31 22:34:49 +02:00
Maximellerbach
baa22e6acb fix qwen norm layer output; libero eval is now as expected 2026-05-31 22:34:49 +02:00
Maxime Ellerbach
cef14cab93 trying to close success rate gap 2026-05-31 22:34:49 +02:00
Maximellerbach
b382398fad fixing training and eval examples 2026-05-31 22:34:48 +02:00
Maximellerbach
e8ea7b9fa0 adding guard for diffusers 2026-05-31 22:34:48 +02:00
Maximellerbach
5c10d0011b addressing dtype zeros issue 2026-05-31 22:34:48 +02:00
Maxime Ellerbach
3de7d2359a fixing doc defaults args
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Maxime Ellerbach <maxime@ellerbach.net>
2026-05-31 22:34:48 +02:00
Maximellerbach
1066d037d5 pre-commit cleanup 2026-05-31 22:34:48 +02:00
Maxime Ellerbach
553c217ee2 refactoring into using pre and post processor 2026-05-31 22:34:48 +02:00
Maxime Ellerbach
999cc625d6 lots of changes to make existing weights work, need to massively refactor the pre and post processing 2026-05-31 22:34:48 +02:00
Maximellerbach
c6bf11b2d5 removing misleading future_action_window_size to just use chunk_size 2026-05-31 22:34:48 +02:00
Maximellerbach
a61361b9be allow different state dim and action dim 2026-05-31 22:34:48 +02:00
Maximellerbach
4a34c81ed5 trying to re-init the action head to avoid pretraining dimension mismatch 2026-05-31 22:34:48 +02:00
Maximellerbach
23831f5cb5 make default params more aligned with paper and pretrained models
- adding possibility of freezing qwen backbone and world model
- added tests for weight loading
2026-05-31 22:34:48 +02:00
Maximellerbach
967534a51f add one-shot script to convert ginwind/VLA-JEPA checkpoints to safetensors (will remove once migrated) 2026-05-31 22:34:48 +02:00
Maximellerbach
60a4e02bc0 add VLA-JEPA documentation
Covers architecture overview, pretrained checkpoints, config reference,
training/eval commands for LIBERO-10, and guidance on fine-tuning for
single-camera datasets.
2026-05-31 22:34:48 +02:00
Maximellerbach
972b32fc99 update VLA-JEPA tests for arch changes and action_is_pad
- Switch conftest to use `action_model_type="DiT-test"` now that
  `action_num_heads` / `action_attention_head_dim` have been removed.
- Add action_head tests covering fully-padded loss (zero) and equivalence
  of action_is_pad=None vs all-zeros mask.
- Remove obsolete `test_native_to_lerobot_wm_only` test.
2026-05-31 22:34:48 +02:00
Maximellerbach
a597ba3837 propagate action_is_pad masking through VLA-JEPA policy pipeline
Pass the `action_is_pad` tensor from the batch through to the action head
so padded timesteps are excluded from the flow-matching loss.
2026-05-31 22:34:48 +02:00
Maximellerbach
785997b1d2 align VLA-JEPA architecture with original checkpoint
- Remove stale `action_num_heads` / `action_attention_head_dim` config fields;
  DiT head dimensions are now always derived from the preset (DiT-B/L/test).
- Add `num_target_vision_tokens` and `action_max_seq_len` config fields required
  by the action head's future-token embedding and positional embedding tables.
- Fix default `qwen_model_name` to 2B (matches all released checkpoints).
- Rename `ActionEncoder` attrs w1/w2/w3 → layer1/layer2/layer3 to match
  checkpoint key names; replace `nn.Sequential` decoder/state-encoder with
  `_MLP2` (layer1/layer2 naming).
- Fix `VLAJEPAActionHead` to size ActionEncoder and StateEncoder at `inner_dim`
  (DiT input width) rather than `action_hidden_size` (DiT output width).
- Rename `DiT.blocks` → `transformer_blocks` and `attn` → `attn1` to match
  checkpoint; add alternating cross/self attention (even blocks cross-attend to
  Qwen context, odd blocks self-attend).
- Add `DiT-test` preset for unit tests.
- Rewrite `ActionConditionedVideoPredictor` with explicit ViT-style blocks
  (`_PredictorBlock` with fused qkv) to match checkpoint structure; rename
  `encoder`/`norm`/`proj` → `predictor_blocks`/`predictor_norm`/`predictor_proj`.
2026-05-31 22:34:48 +02:00
Maximellerbach
a4f59cd5be adding more tests to ensure good coverage 2026-05-31 22:34:48 +02:00
Maximellerbach
af36b42e90 some more fixes to be closer to the original implem 2026-05-31 22:34:48 +02:00
Maxime Ellerbach
be9147b131 adjusting obs steps, tubelet size to match original implementation 2026-05-31 22:34:47 +02:00
Maxime Ellerbach
921b823fb4 fixing wm_loss not propagating 2026-05-31 22:34:47 +02:00
Maxime Ellerbach
6657be734c fix warnings with qwen processor kwargs 2026-05-31 22:34:47 +02:00
Maxime Ellerbach
5317cbb97a fixing action and state dim 2026-05-31 22:34:47 +02:00
Maximellerbach
b370da1813 adding guards to avoid needing transformers and diffusers for type checking and basic tests 2026-05-31 22:34:47 +02:00
Maximellerbach
208bcab3bd updating uv lock 2026-05-31 22:34:47 +02:00
Maximellerbach
6aa75984cc adding deps to pyproject.toml 2026-05-31 22:31:42 +02:00
Maximellerbach
a72b3bd2f4 linting 2026-05-31 22:31:42 +02:00
ginwind
3536d17cfb (feat)policies: add VLA-JEPA 2026-05-31 22:31:42 +02:00
ginwind
8543ed756d support vla_jepa 2026-05-31 22:31:42 +02:00
ginwind
64c726ac2b feat(policies): add VLA-JEPA 2026-05-31 22:31:42 +02:00
ginwind
010ffc84ab feat(policies): add VLA-JEPA 2026-05-31 22:31:42 +02:00
ginwind
03c9f878be first commit 2026-05-31 22:31:42 +02:00
Khalil Meftah
b8ad81bf39 feat(rewards): add ROBOMETER reward model (#3627)
* feat/add ROBOMETER reward model

* feat(rewards): add Robometer offline progress labeling script

* fix(rewards/robometer): add missing input keys mm_token_type_ids

* chore(rewards/robometer): default to lerobot/Robometer-4b model

* doc(rewards/robometer): update citation and original github link

* feat(rewards/robometer): add image key argument to compute Robometer progress
2026-05-29 21:45:39 +02:00
Haoquan Fang
24017e960c Add MolmoAct2 policy (#3604)
* add molmoact2 policy

* add apache headers to molmoact2 files

* simplify molmoact2 package imports

* align molmoact2 feature validation with eo pattern

* remove molmoact2 processor override from factory

* guard molmoact2 transformers imports

* guard molmoact2 processor transformers import

* add scipy dependency to molmoact2 extra

* use a single molmoact2 action queue

* move molmoact2 config logic into config

* fix molmoact2 hf image key resolution

* load molmoact2 without remote code

* lazy import molmoact2 scipy

* format molmoact2 files

* skip molmoact2 tests without optional deps

* fix molmoact2 pre-commit checks

* validate molmoact2 gripper range
2026-05-27 18:58:37 +02:00
Khalil Meftah
e86f5af5bf feat(rewards): add TOPReward reward model (#3629)
* feat(rewards): add TOPReward reward model

* refactor(rewards): clean up TOPReward processor/model

* fix(rewards/topreward): add missing input keys mm_token_type_ids

* fix(rewards/topreward): fix pyproject extra typo and simplify processor (#3653)

Add lerobot[topreward] extra to all in
pyproject.toml, drop the redundant labels arg in scoring, and
collapse the dead-branch shape check in the encoder processor.

* optimize topreward input processing (#3660)

---------

Co-authored-by: Cole <91766445+jcoleharrison@users.noreply.github.com>
Co-authored-by: Haoming Song <haomingsong24@gmail.com>
2026-05-27 14:24:31 +02:00
Haoming Song
5c98e80430 fix(gr00t): fix Eagle25VL model and processor crash in transformers>=5.4.0, <5.6.0 (#3652)
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2026-05-26 14:04:22 +02:00