annotate: dedup task_aug + row-normalization; docs module on/off table

Two behavior-preserving simplifications:

* plan_subtasks_memory.run_episode: the task_aug 'axes' and free-form
  branches built identical deduped rows via copy-pasted seen/append
  loops. Collapse to one branch that picks the variant source, then a
  shared _task_aug_rows() helper does the dedup + row build (-~25 LOC).
* writer: _normalize_persistent_row / _normalize_event_row shared the
  same camera-validate + struct construction. Extract _normalize_row(),
  keeping the exact key order (the parquet struct schema is inferred
  from insertion order, so timestamp must stay between style and
  camera).

docs: 'Which modules run' is now a table giving each module's on/off
flag (--plan.enabled / --interjections.enabled / --vqa.enabled) and
what it turns off.

Verified: 40 tests pass (incl. test_writer struct round-trip);
pre-commit clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
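
For readers without the code diff at hand, the dedup-then-build pattern and the key-order constraint described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the function name `task_aug_rows_sketch`, its parameters, and the `content` key are hypothetical, and the only detail taken from the message is that `timestamp` sits between `style` and `camera`.

```python
from typing import Any, Iterable, Optional


def task_aug_rows_sketch(
    variants: Iterable[str],
    timestamp: float = 0.0,
    camera: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Hypothetical stand-in for a shared dedup + row-build helper."""
    seen: set[str] = set()
    rows: list[dict[str, Any]] = []
    for text in variants:
        # Order-preserving dedup: keep only the first occurrence.
        if text in seen:
            continue
        seen.add(text)
        # Python dicts preserve insertion order, so the key order below is
        # the order a schema inferred from the dict would see.
        rows.append(
            {
                "content": text,  # hypothetical key
                "style": "task_aug",
                "timestamp": timestamp,  # stays between style and camera
                "camera": camera,
            }
        )
    return rows


rows = task_aug_rows_sketch(["pick up the cup", "pick up the cup", "grab the mug"])
print([r["content"] for r in rows])
```

Both call sites would then only need to choose which list of variants to pass in, which is the shape of the simplification the first bullet describes.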
2026-06-04 21:01:26 +00:00 · 2026-06-04 14:18:36 +02:00
parent 7471a6b1ed
commit 973318ef65
3 changed files with 56 additions and 66 deletions
--- a/docs/source/annotation_pipeline.mdx
+++ b/docs/source/annotation_pipeline.mdx
@@ -136,9 +136,14 @@ short manipulation episodes.

 ### Which modules run

-Each module can be turned off independently to iterate on one at a time:
-`--plan.enabled`, `--interjections.enabled`, `--vqa.enabled` (all
-`true` by default).
+Every module is on by default and can be toggled independently (set to
+`false` to skip it, e.g. to iterate on one module at a time):
+
+| Flag                      | Default | Turns off                           |
+| ------------------------- | ------- | ----------------------------------- |
+| `--plan.enabled`          | `true`  | subtasks + plan + memory + task_aug |
+| `--interjections.enabled` | `true`  | interjections + speech atoms        |
+| `--vqa.enabled`           | `true`  | the VQA pairs                       |

 ### The VLM (`--vlm.*`)