annotate: address review feedback — bug fixes, docs/code drift, naming, cleanup

Bugs
* validator: don't re-raise on unknown style. The second column_for_style
  lookup (used to route persistent vs event) now sits in try/except so an
  unknown style is recorded by _check_column_routing and skipped instead of
  crashing the whole validation pass.
* general_vqa._target_cameras: when restrict_to_default_camera is set but
  the configured camera_key isn't one the provider exposes, warn and fall
  back to all cameras instead of returning a phantom key that KeyErrors deep
  in frame decode.
* interjections: clamp interjection timestamps to frame_timestamps[0] rather
  than a hardcoded 0.0 (datasets can start at non-zero t).

Docs / code drift
* annotation_pipeline.mdx: drop the phantom 'vocabulary discovery / phase 0 /
  --vocabulary.* / canonical_vocabulary.json' section (none of it exists);
  describe the real describe->segment + coverage-stitch flow. Soften the
  src/lerobot/tools/ + TOOL_REGISTRY reference to 'not part of this PR'
  (matches tools.mdx, which already marks the runtime layer as
  not-yet-implemented). Fix the --push_to_hub/--new_repo_id wording. Note the
  default is now a single h200. Add a 'Contributing new modules' section
  inviting module / prompt / quality contributions.
* executor docstring: six phases, no phantom phase 0.

run_hf_job.py
* add the Apache 2.0 license header (was flagged repeatedly).
* default to a single GPU: flavor=h200, parallel_servers=1, num_gpus=1
  (scale to h200x4 noted in the docstring).
* pin the install to @main instead of the feature branch (won't break after
  merge).

Naming / cleanup
* rename dest_repo_id -> new_repo_id across config / script / example / test
  to match the LeRobot dataset edit tools.
* rename prompt templates module_N_*.txt -> descriptive (plan_*,
  interjections_*, vqa.txt) and update every load_prompt() call.
* remove dead _messages_to_prompt (used only by the removed in-process
  backends).
* declare _warned_decode_fail (frames) and _warned_no_camera (vqa) as real
  init=False dataclass fields instead of getattr monkey-patches.
* scope bandit B607 to the two ffmpeg subprocess.run sites via
  '# nosec B607' and drop it from the global skip list.

Tests
* fix stale canned-VLM markers ('ONE realistic interruption' -> 'compact
  interjection', 'Update the memory' -> 'compressed semantic memory') and
  drop the dead 'concise hierarchical PLAN' plan responders (plan generation
  is deterministic now) in run_e2e_smoke, test_pipeline_recipe_render,
  test_modules.
* run_e2e_smoke now asserts interjection + speech rows are produced so a
  stale marker can't silently pass again.
* drop remaining 'PR 1' / 'PR 2' references from test comments / names.

Verified: tests/annotations + tests/datasets/test_language +
tests/scripts/test_lerobot_annotate (31 passed); make-style E2E smoke
(interjections=1 speech_atoms=2); pre-commit (ruff, mypy, bandit, prettier)
clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
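
For readers unfamiliar with the cleanup item about `_warned_decode_fail` and `_warned_no_camera`, the following is a minimal sketch of the general pattern: a warn-once flag declared as a dataclass field with `init=False`, rather than attached lazily through `getattr`. The class name `FrameProvider`, its `camera_key` field, and the `decode` method are hypothetical stand-ins for illustration and are not taken from the LeRobot code.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class FrameProvider:  # hypothetical class, for illustration only
    camera_key: str
    # A real field, but excluded from __init__ and repr: callers cannot pass
    # it, and every instance starts with the flag unset.
    _warned_decode_fail: bool = field(default=False, init=False, repr=False)

    def decode(self, ok: bool) -> bool:
        """Pretend to decode a frame; warn only on the first failure."""
        if not ok and not self._warned_decode_fail:
            warnings.warn(f"frame decode failed for {self.camera_key}", stacklevel=2)
            self._warned_decode_fail = True
        return ok
```

Compared with `getattr(self, "_warned_decode_fail", False)`, the attribute is visible to type checkers and always exists on the instance, so no default has to be repeated at each read site.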
2026-06-04 04:41:24 +00:00 · 2026-06-03 18:30:46 +02:00
parent 3a24e426df
commit eba3ab3741
27 changed files with 148 additions and 104 deletions
--- a/tests/annotations/test_modules.py
+++ b/tests/annotations/test_modules.py
@@ -151,7 +151,7 @@ def test_module2_mid_episode_emits_paired_interjection_and_speech(
        {
            "acknowledgement the robot": {"text": "OK."},
            # Marker matches the distinctive line of
-            # ``module_2_interjection.txt`` ("Write ONE compact
+            # ``interjections_interjection.txt`` ("Write ONE compact
            # interjection ..."). Keep this in sync with that prompt's
            # wording — the canned responder matches on substring.
            "Write ONE compact interjection": {
@@ -245,7 +245,6 @@ def test_module1_attaches_video_block_to_subtask_prompt(fixture_dataset_root: Pa
            {"text": "wipe the counter", "start": 0.5, "end": 1.1},
        ]
    }
-    plan_payload = {"plan": "1. grasp\n2. wipe"}
    memory_payload = {"memory": "wiped once"}

    def responder(messages):
@@ -255,9 +254,7 @@ def test_module1_attaches_video_block_to_subtask_prompt(fixture_dataset_root: Pa
            for block in m.get("content", []):
                if isinstance(block, dict) and block.get("type") == "text":
                    text = block.get("text", "")
-        if "concise hierarchical PLAN" in text:
-            return plan_payload
-        if "Update the memory" in text:
+        if "compressed semantic memory" in text:
            return memory_payload
        return payload
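
The responder in the hunk above picks a canned payload by substring match on the prompt text, which is why a reworded prompt can leave a marker stale. A minimal sketch of that mechanism follows, assuming the same message shape as the test (a list of dicts whose `content` is a list of typed blocks). The helper name `make_canned_responder` and its `default` parameter are hypothetical and do not appear in the test suite.

```python
from typing import Any, Callable

Message = dict[str, Any]


def make_canned_responder(
    markers: dict[str, Any], default: Any = None
) -> Callable[[list[Message]], Any]:
    """Build a responder that returns the payload of the first marker found
    as a substring of the last text block in the messages."""

    def responder(messages: list[Message]) -> Any:
        text = ""
        for m in messages:
            for block in m.get("content", []):
                # Later text blocks overwrite earlier ones, as in the test.
                if isinstance(block, dict) and block.get("type") == "text":
                    text = block.get("text", "")
        for marker, payload in markers.items():
            if marker in text:
                return payload
        # No marker matched: fall through to the default payload.
        return default

    return responder
```

Because an unmatched prompt falls through to the default without raising, a stale marker does not fail on its own; the test only notices if it also asserts on the rows the matched payload should have produced.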