From 1e7c0d6aa18c44503ae9e0e3af2bc6a16b897ab8 Mon Sep 17 00:00:00 2001
From: pepijn
Date: Tue, 26 May 2026 05:14:30 +0000
Subject: [PATCH] annotate(plan): force composite-action subtasks; ban ultra-fine splits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Tighten ``module_1_subtasks.txt`` so the VLM emits one composite atomic
action per subtask instead of decomposing every pick into
``move to X`` / ``grasp X`` / ``lift X``:

- Lock the verb vocabulary to the composite set the low-level policy
  actually learns end-to-end: ``pick up`` (approach + grasp + lift),
  ``put``/``place`` (transport + release), ``push``, ``pull``,
  ``turn``, ``press``, ``open``, ``close``, ``pour``, ``insert``.
  ``go to`` is allowed only as a pure relocation between phases.
- Add an explicit ``Forbidden ultra-fine splits`` block enumerating
  the patterns the VLM was tempted to emit (``move to X``,
  ``reach for X``, ``grasp X``, ``lift X``, ``release X``) and
  instructing it to fold each into its parent composite.
- Rewrite the Good/Bad examples to match the composite contract; the
  previous ``"move to blue cube" / "grasp blue cube" / "lift blue
  cube"`` Good list was actively encouraging the over-segmentation
  pattern this prompt is supposed to prevent.
- Tighten the duration rule: candidates shorter than
  ``min_subtask_seconds`` must be merged into a neighbour rather than
  emitted. Pairs with bumping the runtime floor to 3 s so composites
  have room to land.

Pure prompt change: no code or schema change. Existing
canonical-vocabulary retry path is unaffected (the new verb whitelist
lives in prose, not in the validator).

Co-authored-by: Cursor
---
 .../prompts/module_1_subtasks.txt | 58 +++++++++++++++----
 1 file changed, 46 insertions(+), 12 deletions(-)

diff --git a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
index 12bbcfba2..9314282be 100644
--- a/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
+++ b/src/lerobot/annotations/steerable_pipeline/prompts/module_1_subtasks.txt
@@ -8,14 +8,42 @@ the robot performs.
 {vocabulary_block}Authoring rules — Hi Robot atom granularity, pi0.7-style short prompts:
-- Each subtask = one atomic skill the low-level policy can execute.
-- Write each subtask as an IMPERATIVE COMMAND, starting with a verb:
-  move, reach, pick up, grasp, place, put, push, pull, open, close,
-  turn, press, lift, insert, pour...
+- Each subtask = one COMPOSITE atomic skill the low-level policy can
+  execute end-to-end. A "skill" bundles its own approach motion with
+  its terminal action — do NOT split the approach off as its own
+  subtask. The whole-arm policy already learns to reach as part of
+  every manipulation primitive.
+- Write each subtask as an IMPERATIVE COMMAND, starting with one of
+  these verbs (extend only when none fits):
+    pick up      — approach + grasp + lift in one subtask
+    put on/in    — transport + release in one subtask
+    place on/in  — synonym of "put"; pick one and stay consistent
+    push         — contact + linear shove
+    pull         — contact + linear retract
+    turn         — rotary actuation
+    press