Files
lerobot-clone/examples/annotations
Pepijn c5042a6850 fix(annotate): stop action records + augmentation from corrupting RoboCasa labels
Three compounding bugs made RoboCasa annotation produce off-task
subtasks ('move stove to stove with left arm') and drifting
augmentations ('wander around the kitchen' for 'Navigate to the stove').

1. action_records.replace_subtask_text now defaults False.
   Overwriting the VLM's subtask text with a reconstruction of
   hallucinated {verb,object,arm,grasp,dest} fields is high-risk:
   navigation / non-manipulation tasks don't fit the schema and render
   to nonsense. Records are now additive by default (emit_record_row),
   never silently replacing subtask text. Flip replace_subtask_text on
   only for manipulation datasets verified to render cleanly.

2. _render_action_record_to_subtask_text drops a degenerate
   destination that just echoes the object (verb=move object=stove
   destination=stove -> 'move stove' instead of 'move stove to stove').
   Also routes 'navigate' through the 'to <dest>' preposition family.

3. module_1_task_aug_axes.txt hardened: variants MUST preserve the
   goal/destination. Explicitly forbids 'Navigate to the stove' ->
   'wander around the kitchen'. Only wording / arm / orientation /
   grasp may vary; verb meaning, object, and destination are fixed.

examples/annotations/run_hf_job.py — corrected for RoboCasa:
  * derive_task_from_video=off (was =always). The dataset task string
    is authoritative and is what eval conditions on; =always threw it
    away, re-derived a hallucinated task from the video, and poisoned
    every downstream subtask/plan row. THIS was the dominant cause.
  * n_task_rephrasings=0 + task_aug_axes left off — RoboCasa eval uses
    exact task strings, so augmentation is unused/harmful.
  * action_records left off — manipulation schema doesn't fit atomic /
    navigation tasks.
  * plan_max_steps=6 to keep atomic-task decomposition tight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 14:34:48 +02:00
..