lerobot-clone/src/lerobot at 3174e14bc02ec8c1fa5a3dbe899774355fe88197 - lerobot-clone - allenyuan's gitea

ydy0615/lerobot-clone

Mirror of https://github.com/huggingface/lerobot.git, synced 2026-06-04 12:51:27 +00:00

Pepijn 3174e14bc0 fix(smolvla2): feed all cameras to VQA generation, not just the chosen one

handle_vqa_query filtered the observation down to the single chosen
camera before calling the VLM. But training feeds every camera: the
ask_vqa_* recipes' image blocks are stripped before tokenization and
the frames reach the model via OBS_IMAGES_*, where embed_prefix
consumes all config.image_features regardless of the per-camera recipe
tag. Filtering to one camera changed the image-token count in the
prefix (the dropped camera zero-padded with mask=0) — a prefix shape
the model never saw at training.

Now the full observation is passed to select_message; the chosen
camera is used only to pick which frame the bbox/point overlay is
drawn on.
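
The mismatch described above can be illustrated with a minimal sketch. It does not reproduce the real handle_vqa_query, select_message, or embed_prefix code, which the commit message does not show; the helper names, the camera keys, and the tokens-per-image constant are all hypothetical, and a plain list of 0/1 values stands in for the prefix mask.

```python
from typing import Any, Dict, List

TOKENS_PER_IMAGE = 4  # hypothetical; the real count depends on the vision encoder
IMAGE_PREFIX = "observation.images."  # illustrative camera key prefix


def image_prefix_mask(observation: Dict[str, Any], image_features: List[str]) -> List[int]:
    """Stand-in for the prefix builder: one mask entry per image token.

    Every configured camera gets a slot; a camera missing from the
    observation is padded with mask=0 instead of being skipped.
    """
    mask: List[int] = []
    for key in image_features:
        valid = 1 if key in observation else 0
        mask.extend([valid] * TOKENS_PER_IMAGE)
    return mask


def filter_to_camera(observation: Dict[str, Any], chosen: str) -> Dict[str, Any]:
    """Old behaviour (assumed shape): keep non-image keys and only the chosen camera."""
    return {
        k: v
        for k, v in observation.items()
        if not k.startswith(IMAGE_PREFIX) or k == chosen
    }


image_features = [IMAGE_PREFIX + "front", IMAGE_PREFIX + "wrist"]
observation = {
    IMAGE_PREFIX + "front": "front_frame",
    IMAGE_PREFIX + "wrist": "wrist_frame",
    "observation.state": [0.0, 0.0],
}
chosen = IMAGE_PREFIX + "wrist"

# Training: every camera is present, so every image token is valid.
training_mask = image_prefix_mask(observation, image_features)

# Before the fix: the dropped camera is zero-padded, a prefix unseen in training.
old_mask = image_prefix_mask(filter_to_camera(observation, chosen), image_features)

# After the fix: the full observation is passed on; the chosen camera
# only selects which frame the bbox/point overlay is drawn on.
new_mask = image_prefix_mask(observation, image_features)
overlay_frame = observation[chosen]

print(sum(training_mask), sum(old_mask), sum(new_mask))  # prints: 8 4 8
```

In this toy setup the filtered observation yields 4 valid image tokens where training always saw 8, while passing the full observation restores the training-time prefix and still leaves the chosen frame available for the overlay.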

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-18 14:46:38 +02:00

annotate: telegraphic subtasks — ≤4 words, verb+object, consistent nouns

2026-05-15 14:14:42 +02:00

async_inference

feat(dependencies): minimal default tag install (#3362)

2026-04-12 20:03:04 +02:00

refactor(imports): enforce guard pattern (#3382)

2026-04-14 22:54:05 +02:00

refactor(imports): enforce guard pattern (#3382)

2026-04-14 22:54:05 +02:00

feat(recipes): add hirobot_memory — hirobot + memory + spoken tool-call replies

2026-05-18 14:21:41 +02:00

data_processing

feat(dependencies): minimal default tag install (#3362)

2026-04-12 20:03:04 +02:00

fix(datasets): render flow-only low_level recipes instead of dropping them

2026-05-17 13:20:39 +02:00

feat(sim): VLABench benchmark integration (#3396)

2026-04-21 17:54:11 +02:00

refactor(imports): enforce guard pattern (#3382)

2026-04-14 22:54:05 +02:00

refactor(imports): enforce guard pattern (#3382)

2026-04-14 22:54:05 +02:00

refactor(imports): enforce guard pattern (#3382)

2026-04-14 22:54:05 +02:00

fix(smolvla2): feed all cameras to VQA generation, not just the chosen one

2026-05-18 14:46:38 +02:00

fix(smolvla2): train on rendered language batches

2026-05-05 08:55:56 +00:00

fix(rl): swap dict merge order to preserve teleop intervention flag (#3273)

2026-04-14 16:20:54 +02:00

feat(smolvla2-runtime): overfit/memorisation diagnostics on the panel

2026-05-12 17:31:04 +02:00

feat(smolvla2): VQA example prompts in the panel; drop quotes from hints

2026-05-18 14:42:32 +02:00

refactor(imports): enforce guard pattern (#3382)

2026-04-14 22:54:05 +02:00

chore(policies): deprecate pi0fast (#2203)

2025-10-14 16:00:42 +02:00

feat(tools): src/lerobot/tools/ — runnable tool registry + SayTool

2026-04-30 18:58:04 +02:00

feat(dependencies): minimal default tag install (#3362)

2026-04-12 20:03:04 +02:00

feat(dependencies): minimal default tag install (#3362)

2026-04-12 20:03:04 +02:00

fix(smolvla2): train on rendered language batches

2026-05-05 08:55:56 +00:00

__init__.py

feat(dependencies): minimal default tag install (#3362)

2026-04-12 20:03:04 +02:00

__version__.py

Package folder structure (#1417)

2025-07-01 16:34:46 +02:00

types.py

chore(dependecies): untangle dependecies across internal modules (#3149)

2026-03-15 20:26:06 -07:00