chore(rollout): nice colored cli

feat(policies): add EO-1 model (#3403)
* feat(policies): add EO-1 model
* chore(eo1): adjust policy_eo1_README.md to avoid duplication with eo1.mdx
* chore(eo1): remove policy_eo1_README.md, link eo1.mdx in policy folder

---------

Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
2026-05-31 19:01:28 +00:00 · 2026-05-07 11:12:02 +02:00 · 2026-05-06 18:01:16 +02:00 · 2026-05-06 17:03:09 +02:00
51 changed files with 2359 additions and 2414 deletions
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -31,10 +31,8 @@
    title: Porting Large Datasets
  - local: using_dataset_tools
    title: Using the Dataset Tools
-  - local: language_and_recipes
-    title: Language Columns and Recipes
-  - local: tools
-    title: Tools
+  - local: dataset_subtask
+    title: Using Subtasks in the Dataset
  - local: streaming_video_encoding
    title: Streaming Video Encoding
  title: "Datasets"
@@ -49,6 +47,8 @@
    title: π₀-FAST (Pi0Fast)
  - local: pi05
    title: π₀.₅ (Pi05)
+  - local: eo1
+    title: EO-1
  - local: groot
    title: NVIDIA GR00T N1.5
  - local: xvla
--- a/docs/source/dataset_subtask.mdx
+++ b/docs/source/dataset_subtask.mdx
@@ -0,0 +1,277 @@
+# Using Subtasks in LeRobot Datasets
+
+Subtask annotations in robotics datasets can improve robot reasoning and task understanding. Subtasks are particularly useful for:
+
+- **Hierarchical policies**: Building policies that include subtask predictions to visualize robot reasoning in real time
+- **Reward modeling**: Helping reward models understand task progression (e.g., SARM-style stage-aware reward models)
+- **Task decomposition**: Breaking down complex manipulation tasks into atomic, interpretable steps
+
+LeRobotDataset now supports subtasks as part of its dataset structure, alongside tasks.
+
+## What are Subtasks?
+
+While a **task** describes the overall goal (e.g., "Pick up the apple and place it in the basket"), **subtasks** break down the execution into finer-grained steps:
+
+1. "Approach the apple"
+2. "Grasp the apple"
+3. "Lift the apple"
+4. "Move to basket"
+5. "Release the apple"
+
+Each frame in the dataset can be annotated with its corresponding subtask, enabling models to learn and predict these intermediate stages.
+
+<img
+  src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobot/subtask-asset.png"
+  alt="An overview of subtask annotation showing how frames are labeled with intermediate subtask stages"
+  width="80%"
+/>
+
+<p>
+  <em>Figure: Overview of subtask annotation.</em>
+</p>
+
+**Reference:** _Subtask-learning based for robot self-assembly in flexible collaborative assembly in manufacturing_, Original Article, Published: 19 April 2022.
+
+## Dataset Structure
+
+Subtask information is stored in the dataset metadata:
+
+```
+my-dataset/
+├── data/
+│   └── ...
+├── meta/
+│   ├── info.json
+│   ├── stats.json
+│   ├── tasks.parquet
+│   ├── subtasks.parquet      # Subtask index → subtask string mapping
+│   └── episodes/
+│       └── ...
+└── videos/
+    └── ...
+```
+
+### Subtasks Parquet File
+
+The `meta/subtasks.parquet` file maps subtask indices to their natural language descriptions:
+
+| subtask_index | subtask (index column) |
+| ------------- | ---------------------- |
+| 0             | "Approach the apple"   |
+| 1             | "Grasp the apple"      |
+| 2             | "Lift the apple"       |
+| ...           | ...                    |
+
+### Frame-Level Annotations
+
+Each frame in the dataset can include a `subtask_index` field that references the subtasks parquet file:
+
+```python
+# Example frame data in the parquet file
+{
+    "index": 42,
+    "timestamp": 1.4,
+    "episode_index": 0,
+    "task_index": 0,
+    "subtask_index": 2,  # References "Lift the apple"
+    "observation.state": [...],
+    "action": [...],
+}
+```
+
+## Annotating Datasets with Subtasks
+
+We provide a Hugging Face Space for easily annotating any LeRobotDataset with subtasks:
+
+**[https://huggingface.co/spaces/lerobot/annotate](https://huggingface.co/spaces/lerobot/annotate)**
+
+After completing your annotation:
+
+1. Click "Push to Hub" to upload your annotated dataset
+2. You can also run the annotation space locally by following the instructions at [github.com/huggingface/lerobot-annotate](https://github.com/huggingface/lerobot-annotate)
+
+## Loading Datasets with Subtasks
+
+When you load a dataset with subtask annotations, the subtask information is automatically available:
+
+```python
+from lerobot.datasets import LeRobotDataset
+
+# Load a dataset with subtask annotations
+dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
+
+# Access a sample
+sample = dataset[100]
+
+# The sample includes both task and subtask information
+print(sample["task"])        # "Collect the fruit"
+print(sample["subtask"])     # "Grasp the apple"
+print(sample["task_index"])  # tensor(0)
+print(sample["subtask_index"])  # tensor(2)
+```
+
+### Checking for Subtask Support
+
+You can check if a dataset has subtask annotations:
+
+```python
+# Check if subtasks are available
+has_subtasks = (
+    "subtask_index" in dataset.features
+    and dataset.meta.subtasks is not None
+)
+
+if has_subtasks:
+    print(f"Dataset has {len(dataset.meta.subtasks)} unique subtasks")
+    print("Subtasks:", list(dataset.meta.subtasks.index))
+```
+
+## Using Subtasks for Training
+
+### With the Tokenizer Processor
+
+The `TokenizerProcessorStep` automatically handles subtask tokenization for Vision-Language-Action (VLA) models:
+
+```python
+from lerobot.processor import TokenizerProcessorStep
+
+# Create a tokenizer processor step
+tokenizer_processor = TokenizerProcessorStep(
+    tokenizer_name_or_path="google/paligemma-3b-pt-224",
+    padding="max_length",
+    max_length=64,
+)
+
+# The processor will automatically tokenize subtasks if present in the batch
+# and add them to the observation under:
+# - "observation.subtask.tokens"
+# - "observation.subtask.attention_mask"
+```
+
+When subtasks are available in the batch, the tokenizer processor adds:
+
+- `observation.subtask.tokens`: Tokenized subtask text
+- `observation.subtask.attention_mask`: Attention mask for the subtask tokens
+
+### DataLoader with Subtasks
+
+```python
+import torch
+from lerobot.datasets import LeRobotDataset
+
+dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
+
+dataloader = torch.utils.data.DataLoader(
+    dataset,
+    batch_size=16,
+    shuffle=True,
+)
+
+for batch in dataloader:
+    # Access subtask information in the batch
+    subtasks = batch["subtask"]  # List of subtask strings
+    subtask_indices = batch["subtask_index"]  # Tensor of subtask indices
+
+    # Use for training hierarchical policies or reward models
+    print(f"Batch subtasks: {set(subtasks)}")
+```
+
+## Example Datasets with Subtask Annotations
+
+Try loading a dataset with subtask annotations:
+
+```python
+from lerobot.datasets import LeRobotDataset
+
+# Example dataset with subtask annotations
+dataset = LeRobotDataset("jadechoghari/collect-fruit-annotated")
+
+# Explore the subtasks
+print("Available subtasks:")
+for subtask_name in dataset.meta.subtasks.index:
+    print(f"  - {subtask_name}")
+
+# Get subtask distribution
+subtask_counts = {}
+for i in range(len(dataset)):
+    sample = dataset[i]
+    subtask = sample["subtask"]
+    subtask_counts[subtask] = subtask_counts.get(subtask, 0) + 1
+
+print("\nSubtask distribution:")
+for subtask, count in sorted(subtask_counts.items(), key=lambda x: -x[1]):
+    print(f"  {subtask}: {count} frames")
+```
+
+## Use Cases
+
+### 1. Hierarchical Policy Training
+
+Train policies that predict both actions and current subtask:
+
+```python
+import torch.nn as nn
+
+
+class HierarchicalPolicy(nn.Module):
+    def __init__(self, encoder, hidden_dim, action_dim, num_subtasks):
+        super().__init__()
+        # `encoder` is any module mapping observations to (batch, hidden_dim) features
+        self.encoder = encoder
+        self.action_head = nn.Linear(hidden_dim, action_dim)
+        self.subtask_head = nn.Linear(hidden_dim, num_subtasks)
+
+    def forward(self, observations):
+        features = self.encoder(observations)
+        actions = self.action_head(features)
+        subtask_logits = self.subtask_head(features)
+        return actions, subtask_logits
+```
+
+### 2. Stage-Aware Reward Modeling (SARM)
+
+Build reward models that understand task progression:
+
+```python
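+# SARM predicts:
+# - Stage: Which subtask is being executed (discrete)
+# - Progress: How far along the subtask (continuous 0-1)
+import torch.nn as nn
+
+
+class SARMRewardModel(nn.Module):
+    # Illustrative sketch; layer choices are assumptions, not the SARM reference implementation
+    def __init__(self, encoder, hidden_dim, num_stages):
+        super().__init__()
+        self.encoder = encoder  # maps observations to (batch, hidden_dim) features
+        self.stage_classifier = nn.Linear(hidden_dim, num_stages)
+        self.progress_regressor = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())
+
+    def forward(self, observations):
+        features = self.encoder(observations)
+        stage_logits = self.stage_classifier(features)
+        progress = self.progress_regressor(features)
+        return stage_logits, progress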
+```
+
+### 3. Progress Visualization
+
+Monitor robot execution by tracking subtask progression:
+
+```python
+def visualize_execution(model, observations, subtask_names):
+    # subtask_names: list of subtask strings ordered by subtask_index
+    for t, obs in enumerate(observations):
+        action, subtask_logits = model(obs)
+        predicted_subtask = subtask_names[subtask_logits.argmax().item()]
+        print(f"t={t}: Executing '{predicted_subtask}'")
+```
+
+## API Reference
+
+### LeRobotDataset Properties
+
+| Property                    | Type                   | Description                                |
+| --------------------------- | ---------------------- | ------------------------------------------ |
+| `meta.subtasks`             | `pd.DataFrame \| None` | DataFrame mapping subtask names to indices |
+| `features["subtask_index"]` | `dict`                 | Feature spec for subtask_index if present  |
+
+### Sample Keys
+
+When subtasks are available, each sample includes:
+
+| Key             | Type           | Description                          |
+| --------------- | -------------- | ------------------------------------ |
+| `subtask_index` | `torch.Tensor` | Integer index of the current subtask |
+| `subtask`       | `str`          | Natural language subtask description |
+
+## Related Resources
+
+- [SARM Paper](https://arxiv.org/pdf/2509.25358) - Stage-Aware Reward Modeling for Long Horizon Robot Manipulation
+- [LeRobot Annotate Space](https://huggingface.co/spaces/lerobot/annotate) - Interactive annotation tool
+- [LeRobotDataset v3.0](./lerobot-dataset-v3) - Dataset format documentation
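
The index-to-string mapping that `meta/subtasks.parquet` provides in the page above can be illustrated without loading a dataset. This is a minimal sketch, assuming a plain dict stands in for the parquet table and a list of integers stands in for the per-frame `subtask_index` column. `SUBTASKS`, `frame_subtask_indices`, `subtask_distribution`, and `subtask_segments` are hypothetical names for illustration, not part of the LeRobot API.

```python
from collections import Counter
from itertools import groupby

# Hypothetical stand-in for meta/subtasks.parquet: subtask_index -> subtask string.
SUBTASKS = {
    0: "Approach the apple",
    1: "Grasp the apple",
    2: "Lift the apple",
}

# Hypothetical per-frame subtask_index column for one episode.
frame_subtask_indices = [0, 0, 0, 1, 1, 2, 2, 2, 2]


def subtask_distribution(indices, table):
    """Count frames per subtask string using only the integer column."""
    counts = Counter(indices)
    return {table[idx]: n for idx, n in counts.items()}


def subtask_segments(indices, table):
    """Collapse consecutive equal indices into (subtask, first_frame, last_frame) runs."""
    segments = []
    start = 0
    for idx, run in groupby(indices):
        length = len(list(run))
        segments.append((table[idx], start, start + length - 1))
        start += length
    return segments


distribution = subtask_distribution(frame_subtask_indices, SUBTASKS)
segments = subtask_segments(frame_subtask_indices, SUBTASKS)
print(distribution)
print(segments)
```

Working on the integer column and resolving strings once at the end may be cheaper than iterating full samples, since no observation data has to be loaded to count labels.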
--- a/docs/source/eo1.mdx
+++ b/docs/source/eo1.mdx
@@ -0,0 +1,168 @@
+# EO-1
+
+EO-1 is a **Vision-Language-Action policy for robot control**. The LeRobot implementation integrates EO-1 with the standard LeRobot training, evaluation, and processor interfaces.
+
+## Model Overview
+
+EO-1 uses a Qwen2.5-VL backbone for vision-language understanding and adds a continuous flow-matching action head for robot control. The policy formats each robot-control sample as a multimodal conversation: camera images are passed to Qwen2.5-VL, the robot state is represented with EO-1 state tokens, and the future action chunk is represented with EO-1 action tokens.
+
+<img
+  src="https://huggingface.co/datasets/HaomingSong/lerobot-documentation-images/resolve/main/lerobot/eo_pipeline.png"
+  alt="An overview of EO-1"
+  width="85%"
+/>
+
+During training, EO-1 learns to denoise continuous action chunks at the action-token positions. During inference, it samples an action chunk, returns continuous actions, and executes `n_action_steps` actions from the chunk before sampling again.
+
+### What the LeRobot Integration Covers
+
+- Standard `policy.type=eo1` configuration through LeRobot
+- Qwen2.5-VL image and text preprocessing through policy processors
+- Continuous flow-matching action prediction
+- Checkpoint save/load through LeRobot policy APIs
+- Training with `lerobot-train` and evaluation with `lerobot-eval`
+
+The broader EO-1 project also includes interleaved vision-text-action pretraining and multimodal reasoning workflows. This page focuses on the LeRobot robot-control policy path.
+
+## Installation Requirements
+
+1. Install LeRobot by following the [Installation Guide](./installation).
+2. Install EO-1 dependencies by running:
+
+   ```bash
+   pip install -e ".[eo1]"
+   ```
+
+3. If you want to train or evaluate on LIBERO, install the LIBERO dependencies too:
+
+   ```bash
+   pip install -e ".[eo1,libero]"
+   ```
+
+EO-1 can use the standard PyTorch scaled-dot-product attention backend through `policy.attn_implementation=sdpa`. If your environment has a compatible `flash_attn` installation, you can request `policy.attn_implementation=flash_attention_2`.
+
+## Data Requirements
+
+EO-1 expects a LeRobot dataset with:
+
+- At least one visual observation, for example `observation.images.image`
+- `observation.state`
+- `action`
+- A language task instruction through the dataset `task` field
+
+If your dataset uses different observation names, use `rename_map` to align them with the names expected by your training or evaluation setup.
+
+## Usage
+
+To use EO-1 in a LeRobot configuration, specify the policy type as:
+
+```text
+policy.type=eo1
+```
+
+By default, a new EO-1 policy initializes its backbone from:
+
+```text
+policy.vlm_base=Qwen/Qwen2.5-VL-3B-Instruct
+```
+
+Once a LeRobot-format EO-1 checkpoint is available, load it with:
+
+```text
+policy.path=your-org/your-eo1-checkpoint
+```
+
+## Training
+
+### Training Command Example
+
+```bash
+lerobot-train \
+  --dataset.repo_id=your_org/your_dataset \
+  --policy.type=eo1 \
+  --policy.vlm_base=Qwen/Qwen2.5-VL-3B-Instruct \
+  --policy.dtype=bfloat16 \
+  --policy.attn_implementation=sdpa \
+  --policy.gradient_checkpointing=false \
+  --output_dir=./outputs/eo1_training \
+  --job_name=eo1_training \
+  --steps=300000 \
+  --batch_size=16 \
+  --policy.device=cuda
+```
+
+### Key Training Parameters
+
+| Parameter                              | Default                       | Description                                                             |
+| -------------------------------------- | ----------------------------- | ----------------------------------------------------------------------- |
+| `policy.vlm_base`                      | `Qwen/Qwen2.5-VL-3B-Instruct` | Qwen2.5-VL checkpoint used to initialize a new policy                   |
+| `policy.dtype`                         | `auto`                        | Backbone dtype request: `auto`, `bfloat16`, or `float32`                |
+| `policy.attn_implementation`           | `None`                        | Optional Qwen attention backend, such as `sdpa`                         |
+| `policy.gradient_checkpointing`        | `false`                       | Enables gradient checkpointing to reduce memory usage during training   |
+| `policy.chunk_size`                    | `8`                           | Number of future actions predicted per chunk                            |
+| `policy.n_action_steps`                | `8`                           | Number of actions consumed from a sampled chunk                         |
+| `policy.num_denoise_steps`             | `10`                          | Number of flow-matching denoising steps used during sampling            |
+| `policy.max_state_dim`                 | `32`                          | State padding dimension                                                 |
+| `policy.max_action_dim`                | `32`                          | Action padding dimension                                                |
+| `policy.force_fp32_autocast`           | `true`                        | Keeps the flow head in fp32 even when the backbone uses mixed precision |
+| `policy.supervise_padding_action_dims` | `true`                        | Controls whether padded action dimensions are supervised                |
+| `policy.supervise_padding_actions`     | `true`                        | Controls whether padded future action rows are supervised               |
+
+## Evaluation
+
+EO-1 can be evaluated through `lerobot-eval` once you have a LeRobot-format checkpoint:
+
+```bash
+lerobot-eval \
+  --policy.path=your-org/your-eo1-checkpoint \
+  --env.type=libero \
+  --env.task=libero_object \
+  --eval.batch_size=1 \
+  --eval.n_episodes=20
+```
+
+For datasets or environments whose camera names differ from the checkpoint configuration, pass a `rename_map`:
+
+```bash
+lerobot-eval \
+  --policy.path=your-org/your-eo1-checkpoint \
+  --env.type=libero \
+  --env.task=libero_object \
+  --rename_map='{"observation.images.image2":"observation.images.wrist_image"}'
+```
+
+## Configuration Notes
+
+### Image Processing
+
+EO-1 uses the Qwen2.5-VL processor. The `policy.image_min_pixels` and `policy.image_max_pixels` settings control the image resizing bounds before the visual tokens are passed into the backbone.
+
+### State and Action Dimensions
+
+The policy pads state and action vectors to `policy.max_state_dim` and `policy.max_action_dim` before the EO-1 flow head. Predictions are cropped back to the original action dimension before being returned by the policy.
+
+### Attention Backend
+
+Use `policy.attn_implementation=sdpa` for a portable setup. Use `flash_attention_2` only when `flash_attn` is installed and compatible with your environment.
+
+## References
+
+- [EO-1 project](https://github.com/EO-Robotics/EO1)
+- [EO-1 paper](https://arxiv.org/abs/2508.21112)
+- [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
+
+## Citation
+
+```bibtex
+@article{eo1,
+  title={EO-1: Interleaved Vision-Text-Action Pretraining for General Robot Control},
+  author={Delin Qu and Haoming Song and Qizhi Chen and Zhaoqing Chen and Xianqiang Gao and Xinyi Ye and Qi Lv and Modi Shi and Guanghui Ren and Cheng Ruan and Maoqing Yao and Haoran Yang and Jiacheng Bao and Bin Zhao and Dong Wang},
+  journal={arXiv preprint},
+  year={2025},
+  url={https://arxiv.org/abs/2508.21112}
+}
+```
+
+## License
+
+This LeRobot integration follows the **Apache 2.0 License** used by LeRobot. Check the upstream EO-1 model and dataset pages for the licenses of released EO-1 checkpoints and data.
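
The chunked execution and the state/action padding described in eo1.mdx above can be sketched in plain Python. This is an illustration only, assuming zero right-padding and a simple queue. `ChunkedExecutor`, `pad_vector`, `crop_vector`, and `fake_sampler` are hypothetical names, and the real policy may pad, crop, and queue actions differently.

```python
from collections import deque

MAX_ACTION_DIM = 32  # mirrors the documented default of policy.max_action_dim
CHUNK_SIZE = 8       # mirrors the documented default of policy.chunk_size


def pad_vector(vec, target_dim):
    """Right-pad a vector with zeros up to target_dim (assumed padding scheme)."""
    if len(vec) > target_dim:
        raise ValueError(f"vector of length {len(vec)} exceeds {target_dim}")
    return list(vec) + [0.0] * (target_dim - len(vec))


def crop_vector(vec, original_dim):
    """Drop the padded tail so the caller sees the original dimension."""
    return list(vec[:original_dim])


class ChunkedExecutor:
    """Hypothetical helper: serves queued actions and resamples when the queue is empty."""

    def __init__(self, sample_chunk, action_dim, n_action_steps):
        self.sample_chunk = sample_chunk  # callable returning a list of padded actions
        self.action_dim = action_dim
        self.n_action_steps = n_action_steps
        self.queue = deque()
        self.num_samples = 0

    def select_action(self):
        if not self.queue:
            chunk = self.sample_chunk()
            self.num_samples += 1
            # Only the first n_action_steps actions of the chunk are executed.
            for padded in chunk[: self.n_action_steps]:
                self.queue.append(crop_vector(padded, self.action_dim))
        return self.queue.popleft()


def fake_sampler():
    """Stand-in for the flow-matching sampler: CHUNK_SIZE padded 7-dim actions."""
    return [pad_vector([float(t)] * 7, MAX_ACTION_DIM) for t in range(CHUNK_SIZE)]


executor = ChunkedExecutor(fake_sampler, action_dim=7, n_action_steps=4)
actions = [executor.select_action() for _ in range(6)]
print(executor.num_samples, actions[3], actions[4])
```

With `n_action_steps=4`, the fifth call finds the queue empty and triggers a second sample, which is the trade-off between reacting to new observations and amortizing the cost of sampling.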
--- a/docs/source/language_and_recipes.mdx
+++ b/docs/source/language_and_recipes.mdx
@@ -1,147 +0,0 @@
-# Language columns and recipes
-
-Most LeRobot datasets ship with a single `task` string per episode — fine for
-short, single-instruction skills, but not enough for the longer-horizon,
-multi-modal robot policies the field is moving toward (high-level planning,
-memory, interjections, VQA, tool use). To support those policies without
-forking the dataset format, LeRobot extends `LeRobotDataset` with two optional
-language columns and a small recipe layer that turns those rows into
-chat-style training samples on the fly.
-
-The design splits cleanly into three layers:
-
-1. **Data in the dataset** — language annotations stored next to frames in
-   `data/chunk-*/file-*.parquet` as two optional columns (`language_persistent`
-   and `language_events`). Datasets without these columns keep their existing
-   behavior.
-2. **Recipe** — a YAML file that declares which annotation rows to bind and
-   how to lay them out as chat turns (`role`, `content`, optional images,
-   optional tool calls). Recipes are pure config; no Python required to add a
-   new one.
-3. **Training format** — at sample time, `RenderMessagesStep` resolves the
-   recipe against the per-frame annotations and emits HF-style `messages` plus
-   LeRobot-specific sidecars (`message_streams`, `target_message_indices`)
-   that policy processors consume.
-
-This page describes each layer in turn.
-
-## Layer 1 — language columns in the dataset
-
-The two optional columns live next to frame data in
-`data/chunk-*/file-*.parquet`:
-
-- `language_persistent`: a list of rows broadcast across every frame in an episode for state that remains active, such as `subtask`, `plan`, and `memory`.
-- `language_events`: a list of rows only on the exact frame where an event was emitted, such as `interjection`, `vqa`, and speech tool calls.
-
-Both columns share the same row shape (event rows omit `timestamp` because the
-frame the row sits on already provides it):
-
-```text
-role: string
-content: string | null
-style: string | null
-timestamp: float64        # persistent rows only
-camera: string | null     # observation.images.* feature key, view-dependent rows only
-tool_calls: list[Json] | null
-```
-
-The `camera` field tags rows whose `content` is grounded in a specific camera
-view. Rows of view-dependent styles (`vqa` and `trace`) MUST set `camera` to
-the matching `observation.images.*` feature key. Rows of every other style —
-including `motion`, which describes robot-frame primitives in joint / Cartesian
-terms — MUST leave `camera` as `null`. Pipeline writers and the validator
-enforce this via `validate_camera_field(style, camera)`.
-
-`meta/tasks.parquet` remains the canonical source for the task. The special `${task}` recipe binding always reads that task string and does not depend on language annotations.
-
-### Architecture
-
-The language stack itself has three internal modules backing layer 1:
-
-1. `lerobot.datasets.language` defines the schema, style registry, and `column_for_style`.
-2. `lerobot.datasets.language_render` resolves rows and renders messages.
-3. `RenderMessagesStep` turns dataset samples into `messages`, `message_streams`, and `target_message_indices`.
-
-`LeRobotDataset` stays recipe-agnostic. It passes `language_persistent` and `language_events` through when present, and unannotated datasets keep their existing behavior.
-
-### Temporal semantics
-
-Persistent styles are active after emission until replaced:
-
-- `active_at(t, style=subtask)`
-- `nth_prev(style=memory, offset=1)`
-- `nth_next(style=subtask, offset=1)`
-
-Event styles only exist on their exact timestamp:
-
-- `emitted_at(t, style=interjection)`
-- `emitted_at(t, style=vqa, role=user, camera=observation.images.top)`
-- `emitted_at(t, role=assistant, tool_name=say)`
-
-Exact event matching has no tolerance window, so writers must stamp event rows with frame timestamps from the parquet data.
-
-### View-dependent resolution
-
-For view-dependent styles (`vqa` and `trace`), the resolver gains a
-`camera=` filter parallel to `role=` and `tool_name=`. Datasets with multiple
-cameras typically emit one (`vqa`, `user`) + (`vqa`, `assistant`) pair per
-camera at the same timestamp; without `camera=`, those resolvers see two
-matches and raise an ambiguity error. Recipes consume each camera through its
-own binding plus a matching image block, e.g.
-
-```yaml
-ask_vqa_top:
-  bindings:
-    vqa_query: "emitted_at(t, style=vqa, role=user, camera=observation.images.top)"
-    vqa: "emitted_at(t, style=vqa, role=assistant, camera=observation.images.top)"
-  messages:
-    - role: user
-      stream: high_level
-      if_present: vqa_query
-      content:
-        - { type: image, feature: observation.images.top }
-        - { type: text, text: "${vqa_query}" }
-    - {
-        role: assistant,
-        content: "${vqa}",
-        stream: high_level,
-        target: true,
-        if_present: vqa,
-      }
-```
-
-Add one such sub-recipe per camera the dataset records.
-
-## Layer 2 — recipe anatomy
-
-Recipes are YAML files backed by `TrainingRecipe` and `MessageTurn`. They
-declare which annotation rows to pull (via `bindings`) and how to compose them
-into chat turns (`messages`).
-
-```yaml
-messages:
-  - { role: user, content: "${task}", stream: high_level }
-  - { role: assistant, content: "${subtask}", stream: low_level, target: true }
-```
-
-A recipe can also branch into a weighted **blend** of sub-recipes. At sample
-time, exactly one branch is selected deterministically from the sample index,
-so different frames train different objectives (e.g. memory updates vs.
-low-level execution vs. VQA) without any Python wiring.
-
-## Layer 3 — training format
-
-Rendered samples use HF-style chat messages plus LeRobot sidecars:
-
-```python
-sample["messages"]
-sample["message_streams"]
-sample["target_message_indices"]
-```
-
-The renderer does not apply a tokenizer chat template. Policy processors decide how to serialize the messages for their backbone, which keeps the same dataset usable across SmolVLA, Pi0.5, and any future VLM that expects OpenAI-style chat messages.
-
-## Graceful absence
-
-If both language columns are missing, `None`, or empty, `RenderMessagesStep` is a no-op.
-If an event-scoped branch is selected on a frame without the required event row, rendering returns `None`, allowing a loader to retry another sample.
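
The temporal semantics described in the removed page above (persistent rows stay active until replaced, event rows match only their exact frame) can be sketched as follows. This is a simplified illustration, not the code from `lerobot.datasets.language_render`. The sample rows and the function bodies are assumptions that reuse the resolver names from the text.

```python
# Hypothetical persistent rows for one episode.
PERSISTENT = [
    {"style": "subtask", "content": "Approach the apple", "timestamp": 0.0},
    {"style": "subtask", "content": "Grasp the apple", "timestamp": 1.5},
    {"style": "memory", "content": "Apple is on the left", "timestamp": 0.5},
]

# Hypothetical event rows, keyed by the exact frame timestamp they sit on.
EVENTS = {
    2.0: [{"style": "interjection", "role": "user", "content": "Be gentle"}],
}


def active_at(t, style, rows=PERSISTENT):
    """Return the most recent row of `style` emitted at or before t, else None."""
    candidates = [r for r in rows if r["style"] == style and r["timestamp"] <= t]
    return max(candidates, key=lambda r: r["timestamp"], default=None)


def emitted_at(t, style, events=EVENTS):
    """Return the event row of `style` stamped exactly at t, else None.

    There is no tolerance window: t must equal a frame timestamp.
    """
    matches = [r for r in events.get(t, []) if r["style"] == style]
    if len(matches) > 1:
        raise ValueError(f"ambiguous match for style={style!r} at t={t}")
    return matches[0] if matches else None


print(active_at(1.0, "subtask"))
print(emitted_at(2.0, "interjection"))
print(emitted_at(2.01, "interjection"))
```

The exact-match lookup shows why writers would need to stamp event rows with frame timestamps taken from the parquet data: a timestamp that is off by even a small amount resolves to `None`.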
--- a/docs/source/tools.mdx
+++ b/docs/source/tools.mdx
@@ -1,200 +0,0 @@
-# Tools
-
-LeRobot v3.1 supports **tool calls** in policies — assistant messages can
-emit structured invocations like `say(text="OK, starting now")` that the
-runtime dispatches to a real implementation (TTS, controller, logger, …).
-
-This page covers:
-
-1. Where the tool catalog lives.
-2. How the annotation pipeline produces tool-call atoms.
-3. How to add your own tool.
-
-## Where tools are declared
-
-Two layers.
-
-**The catalog** — a list of OpenAI-style function schemas — lives at
-`meta/info.json["tools"]` on each dataset. Example:
-
-```json
-{
-  "features": { "...": "..." },
-  "tools": [
-    {
-      "type": "function",
-      "function": {
-        "name": "say",
-        "description": "Speak a short utterance to the user via the TTS executor.",
-        "parameters": {
-          "type": "object",
-          "properties": {
-            "text": {
-              "type": "string",
-              "description": "The verbatim text to speak."
-            }
-          },
-          "required": ["text"]
-        }
-      }
-    }
-  ]
-}
-```
-
-Read it via the dataset metadata accessor:
-
-```python
-from lerobot.datasets.dataset_metadata import LeRobotDatasetMetadata
-
-meta = LeRobotDatasetMetadata(repo_id="pepijn/super_poulain_final_annotations")
-tools = meta.tools     # list[dict] — OpenAI tool schemas
-```
-
-If the dataset's `info.json` doesn't declare any tools, `meta.tools`
-returns `DEFAULT_TOOLS` from `lerobot.datasets.language` — currently a
-single-entry list with the canonical `say` schema. So unannotated
-datasets and chat-template consumers keep working without any
-configuration:
-
-```python
-prompt_str = tokenizer.apply_chat_template(
-    sample["messages"],
-    tools=meta.tools,                 # works either way
-    add_generation_prompt=False,
-    tokenize=False,
-)
-```
-
-**The implementations** — runnable Python — live under
-`src/lerobot/tools/`, one file per tool. The canonical `say`
-implementation wraps Kyutai's pocket-tts model.
-
-## Per-row tool _invocations_
-
-The catalog above describes _what can be called_. The actual _call_ — the
-function name plus the argument values — is stored per-row, on the
-assistant atoms in `language_events`:
-
-```python
-{
-  "role": "assistant",
-  "content": null,
-  "style": null,
-  "timestamp": 12.4,
-  "camera": null,
-  "tool_calls": [
-    { "type": "function",
-      "function": { "name": "say", "arguments": { "text": "On it." } } }
-  ]
-}
-```
-
-Recipes splice these into rendered messages via `tool_calls_from`:
-
-```yaml
-user_interjection_response:
-  bindings:
-    speech: "emitted_at(t, role=assistant, tool_name=say)"
-  messages:
-    - { role: user, content: "${task}", stream: high_level }
-    - {
-        role: assistant,
-        content: "${current_plan}",
-        stream: high_level,
-        target: true,
-        tool_calls_from: speech,
-      }
-```
-
-The model's training target is one assistant turn that carries both the
-plan text _and_ the `say` tool call. At inference, the runtime parses
-the generated text back into structured `tool_calls` and dispatches to
-the matching implementation.
-
-## How to add your own tool
-
-Three steps. Concrete example: a `record_observation` tool the policy
-can call to capture an extra observation outside the regular control
-loop.
-
-### Step 1 — declare the schema
-
-Add an entry under `meta/info.json["tools"]`. Either edit the file
-directly on disk _before_ running the annotation pipeline (it'll be
-preserved) or hand it to `lerobot-annotate` via a config flag.
-
-```json
-{
-  "tools": [
-    { "type": "function", "function": { "name": "say", "...": "..." } },
-    {
-      "type": "function",
-      "function": {
-        "name": "record_observation",
-        "description": "Capture a high-resolution still image for the user.",
-        "parameters": {
-          "type": "object",
-          "properties": {
-            "label": {
-              "type": "string",
-              "description": "Short label for the saved image."
-            }
-          },
-          "required": ["label"]
-        }
-      }
-    }
-  ]
-}
-```
-
-The schema follows OpenAI's function-calling convention exactly, so the
-chat template can render it natively.
-
-### Step 2 — implement the call
-
-Create `src/lerobot/tools/record_observation.py`:
-
-```python
-from .base import Tool
-from typing import Any
-
-RECORD_OBSERVATION_SCHEMA: dict[str, Any] = { "...": "..." }   # mirrors the JSON above
-
-
-class RecordObservationTool:
-    name = "record_observation"
-    schema = RECORD_OBSERVATION_SCHEMA
-
-    def __init__(self, schema: dict | None = None, output_dir: str = "."):
-        self.output_dir = output_dir
-
-    def call(self, arguments: dict) -> str:
-        label = arguments["label"]
-        # ... save the latest camera frame to <output_dir>/<label>.png ...
-        return f"saved {label}.png"
-```
-
-One file per tool keeps dependencies isolated — `record_observation`
-might pull `pillow`, while `say` pulls `pocket-tts`. Users installing
-only the tools they need avoid heavy transitive deps.
-
-### Step 3 — register it
-
-Add to `src/lerobot/tools/registry.py`:
-
-```python
-from .record_observation import RecordObservationTool
-
-TOOL_REGISTRY["record_observation"] = RecordObservationTool
-```
-
-That's it. At runtime `get_tools(meta)` looks up each schema in
-`meta.tools`, instantiates the matching registered class, and returns
-a name → instance dict the dispatcher can route into.
-
-If you want to use a tool _without_ writing an implementation (e.g. for
-training-time chat-template formatting only), step 1 alone is enough —
-the model still learns to _generate_ the call. Steps 2 and 3 are only
-needed to actually _execute_ it at inference.
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -95,7 +95,7 @@ dependencies = [

 # ── Feature-scoped extras ──────────────────────────────────
 dataset = [
-    "datasets>=4.7.0,<5.0.0",
+    "datasets>=4.0.0,<5.0.0",
    "pandas>=2.0.0,<3.0.0", # NOTE: Transitive dependency of datasets
    "pyarrow>=21.0.0,<30.0.0", # NOTE: Transitive dependency of datasets
    "lerobot[av-dep]",
@@ -194,6 +194,7 @@ groot = [
 ]
 sarm = ["lerobot[transformers-dep]", "pydantic>=2.0.0,<3.0.0", "faker>=33.0.0,<35.0.0", "lerobot[matplotlib-dep]", "lerobot[qwen-vl-utils-dep]"]
 xvla = ["lerobot[transformers-dep]"]
+eo1 = ["lerobot[transformers-dep]", "lerobot[qwen-vl-utils-dep]"]
 hilserl = ["lerobot[transformers-dep]", "gym-hil>=0.1.13,<0.2.0", "lerobot[grpcio-dep]", "lerobot[placo-dep]"]

 # Features
--- a/src/lerobot/configs/__init__.py
+++ b/src/lerobot/configs/__init__.py
@@ -24,7 +24,6 @@ Import them directly: ``from lerobot.configs.train import TrainPipelineConfig``
 from .dataset import DatasetRecordConfig
 from .default import DatasetConfig, EvalConfig, PeftConfig, WandBConfig
 from .policies import PreTrainedConfig
-from .recipe import MessageTurn, TrainingRecipe, load_recipe
 from .types import (
    FeatureType,
    NormalizationMode,
@@ -44,10 +43,7 @@ __all__ = [
    "DatasetRecordConfig",
    "DatasetConfig",
    "EvalConfig",
-    "MessageTurn",
    "PeftConfig",
    "PreTrainedConfig",
-    "TrainingRecipe",
    "WandBConfig",
-    "load_recipe",
 ]
--- a/src/lerobot/configs/recipe.py
+++ b/src/lerobot/configs/recipe.py
@@ -1,193 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import annotations
-
-import re
-from dataclasses import dataclass
-from pathlib import Path
-from typing import Any, Literal, get_args
-
-MessageRole = Literal["user", "assistant", "system", "tool"]
-MessageStream = Literal["high_level", "low_level"]
-
-DEFAULT_BINDINGS = {
-    "subtask": "active_at(t, style=subtask)",
-    "memory": "active_at(t, style=memory)",
-    "plan": "active_at(t, style=plan)",
-    "speech": "emitted_at(t, role=assistant, tool_name=say)",
-    "interjection": "emitted_at(t, style=interjection)",
-    "vqa": "emitted_at(t, style=vqa, role=assistant)",
-    "vqa_query": "emitted_at(t, style=vqa, role=user)",
-}
-
-_PLACEHOLDER_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
-_VALID_ROLES = frozenset(get_args(MessageRole))
-_VALID_STREAMS = frozenset(get_args(MessageStream))
-
-
-@dataclass
-class MessageTurn:
-    """A single chat-style turn in a recipe template.
-
-    ``content`` may be a plain string, a list of HF-style multimodal blocks, or
-    ``None`` when ``tool_calls_from`` supplies tool-call payloads instead.
-    ``stream`` tags the turn for downstream filtering, ``target`` flags it as a
-    training target, and ``if_present`` skips the turn when the named binding
-    resolves to ``None``.
-    """
-
-    role: MessageRole
-    content: str | list[dict[str, Any]] | None = None
-    stream: MessageStream | None = None
-    target: bool = False
-    if_present: str | None = None
-    tool_calls_from: str | None = None
-
-    def __post_init__(self) -> None:
-        """Validate role, stream, and content after dataclass construction."""
-        if self.role not in _VALID_ROLES:
-            raise ValueError(f"Unsupported message role: {self.role!r}")
-        if self.stream is not None and self.stream not in _VALID_STREAMS:
-            raise ValueError(f"Unsupported message stream: {self.stream!r}")
-        if self.content is None and self.tool_calls_from is None:
-            raise ValueError("MessageTurn.content is required unless tool_calls_from is set.")
-        if self.content is not None and not isinstance(self.content, (str, list)):
-            raise TypeError("MessageTurn.content must be a string, a list of HF-style blocks, or None.")
-        if isinstance(self.content, list):
-            for block in self.content:
-                if not isinstance(block, dict) or "type" not in block:
-                    raise ValueError(
-                        "Multimodal content blocks must be HF-style dictionaries with a type key."
-                    )
-
-    @classmethod
-    def from_dict(cls, data: dict[str, Any]) -> MessageTurn:
-        """Construct a :class:`MessageTurn` from a plain dictionary."""
-        return cls(**data)
-
-
-@dataclass
-class TrainingRecipe:
-    """A recipe describing how to render training samples from language rows.
-
-    A recipe is either a *message recipe* (``messages`` plus optional
-    ``bindings``) or a *blend recipe* (``blend`` mapping names to weighted
-    sub-recipes). ``weight`` is only meaningful inside a blend.
-    """
-
-    messages: list[MessageTurn] | None = None
-    bindings: dict[str, str] | None = None
-    blend: dict[str, TrainingRecipe] | None = None
-    weight: float | None = None
-
-    def __post_init__(self) -> None:
-        """Validate that exactly one of ``messages`` or ``blend`` is set."""
-        if self.messages is not None and self.blend is not None:
-            raise ValueError("TrainingRecipe must set only one of messages or blend.")
-        if self.messages is None and self.blend is None:
-            raise ValueError("TrainingRecipe must set one of messages or blend.")
-
-        if self.messages is not None:
-            self._validate_message_recipe()
-        if self.blend is not None:
-            self._validate_blend_recipe()
-
-    @classmethod
-    def from_dict(cls, data: dict[str, Any]) -> TrainingRecipe:
-        """Construct a :class:`TrainingRecipe` from a nested dictionary."""
-        data = dict(data)
-        if data.get("messages") is not None:
-            data["messages"] = [
-                turn if isinstance(turn, MessageTurn) else MessageTurn.from_dict(turn)
-                for turn in data["messages"]
-            ]
-        if data.get("blend") is not None:
-            data["blend"] = {
-                name: recipe if isinstance(recipe, TrainingRecipe) else cls.from_dict(recipe)
-                for name, recipe in data["blend"].items()
-            }
-        return cls(**data)
-
-    @classmethod
-    def from_yaml(cls, path: str | Path) -> TrainingRecipe:
-        """Load a :class:`TrainingRecipe` from a YAML file at ``path``."""
-        import yaml  # type: ignore[import-untyped]
-
-        with open(path) as f:
-            data = yaml.safe_load(f)
-        if not isinstance(data, dict):
-            raise ValueError(f"Recipe YAML must contain a mapping at the top level: {path}")
-        return cls.from_dict(data)
-
-    def _validate_message_recipe(self) -> None:
-        """Ensure every templated binding is known and at least one turn is a target."""
-        assert self.messages is not None
-        known_bindings = set(DEFAULT_BINDINGS) | set(self.bindings or {}) | {"task"}
-
-        for turn in self.messages:
-            missing = self._referenced_bindings(turn) - known_bindings
-            if missing:
-                raise ValueError(f"MessageTurn references unknown binding(s): {sorted(missing)}")
-
-        if not any(turn.target for turn in self.messages):
-            raise ValueError("Message recipes must contain at least one target turn.")
-
-    def _validate_blend_recipe(self) -> None:
-        """Ensure each blend component is a non-empty, weighted message recipe."""
-        assert self.blend is not None
-        if not self.blend:
-            raise ValueError("Blend recipes must contain at least one component.")
-
-        for name, recipe in self.blend.items():
-            if recipe.blend is not None:
-                raise ValueError(f"Blend component {name!r} cannot itself define a blend.")
-            if recipe.messages is None:
-                raise ValueError(f"Blend component {name!r} must define messages.")
-            if recipe.weight is None:
-                raise ValueError(f"Blend component {name!r} must define weight.")
-            if recipe.weight <= 0:
-                raise ValueError(f"Blend component {name!r} must have a positive weight.")
-
-    def _referenced_bindings(self, turn: MessageTurn) -> set[str]:
-        """Return the binding names that ``turn`` references via placeholders or attributes."""
-        names: set[str] = set()
-        if turn.if_present is not None:
-            names.add(turn.if_present)
-        if turn.tool_calls_from is not None:
-            names.add(turn.tool_calls_from)
-        names.update(_placeholders_in_content(turn.content))
-        return names
-
-
-def _placeholders_in_content(content: str | list[dict[str, Any]] | None) -> set[str]:
-    """Return the set of ``${name}`` placeholders found anywhere in ``content``."""
-    if content is None:
-        return set()
-    if isinstance(content, str):
-        return set(_PLACEHOLDER_RE.findall(content))
-
-    names: set[str] = set()
-    for block in content:
-        for value in block.values():
-            if isinstance(value, str):
-                names.update(_PLACEHOLDER_RE.findall(value))
-    return names
-
-
-def load_recipe(path: str | Path) -> TrainingRecipe:
-    """Load a :class:`TrainingRecipe` from a YAML file at ``path``."""
-    return TrainingRecipe.from_yaml(path)
--- a/src/lerobot/datasets/__init__.py
+++ b/src/lerobot/datasets/__init__.py
@@ -37,14 +37,6 @@ from .dataset_tools import (
 from .factory import make_dataset, resolve_delta_timestamps
 from .image_writer import safe_stop_image_writer
 from .io_utils import load_episodes, write_stats
-from .language import (
-    EVENT_ONLY_STYLES,
-    LANGUAGE_EVENTS,
-    LANGUAGE_PERSISTENT,
-    PERSISTENT_STYLES,
-    STYLE_REGISTRY,
-    column_for_style,
-)
 from .lerobot_dataset import LeRobotDataset
 from .multi_dataset import MultiLeRobotDataset
 from .pipeline_features import aggregate_pipeline_dataset_features, create_initial_features
@@ -61,15 +53,10 @@ __all__ = [
    "CODEBASE_VERSION",
    "DEFAULT_EPISODES_PATH",
    "DEFAULT_QUANTILES",
-    "EVENT_ONLY_STYLES",
    "EpisodeAwareSampler",
-    "LANGUAGE_EVENTS",
-    "LANGUAGE_PERSISTENT",
    "LeRobotDataset",
    "LeRobotDatasetMetadata",
    "MultiLeRobotDataset",
-    "PERSISTENT_STYLES",
-    "STYLE_REGISTRY",
    "StreamingLeRobotDataset",
    "VideoEncodingManager",
    "add_features",
@@ -79,7 +66,6 @@ __all__ = [
    "convert_image_to_video_dataset",
    "create_initial_features",
    "create_lerobot_dataset_card",
-    "column_for_style",
    "delete_episodes",
    "get_feature_stats",
    "load_episodes",
--- a/src/lerobot/datasets/compute_stats.py
+++ b/src/lerobot/datasets/compute_stats.py
@@ -512,7 +512,7 @@ def compute_episode_stats(

    ep_stats = {}
    for key, data in episode_data.items():
-        if features[key]["dtype"] in {"string", "language"}:
+        if features[key]["dtype"] == "string":
            continue

        if features[key]["dtype"] in ["image", "video"]:
--- a/src/lerobot/datasets/dataset_metadata.py
+++ b/src/lerobot/datasets/dataset_metadata.py
@@ -34,6 +34,7 @@ from .io_utils import (
    load_episodes,
    load_info,
    load_stats,
+    load_subtasks,
    load_tasks,
    write_info,
    write_stats,
@@ -174,6 +175,7 @@ class LeRobotDatasetMetadata:
        self.info = load_info(self.root)
        check_version_compatibility(self.repo_id, self._version, CODEBASE_VERSION)
        self.tasks = load_tasks(self.root)
+        self.subtasks = load_subtasks(self.root)
        self.episodes = load_episodes(self.root)
        self.stats = load_stats(self.root)

@@ -316,39 +318,6 @@ class LeRobotDatasetMetadata:
        """Keys to access visual modalities (regardless of their storage method)."""
        return [key for key, ft in self.features.items() if ft["dtype"] in ["video", "image"]]

-    @property
-    def has_language_columns(self) -> bool:
-        """Return ``True`` if the dataset declares any language column.
-
-        Used to gate language-aware code paths (collate, render step) so
-        unannotated datasets keep PyTorch's default collate behavior.
-        """
-        from .language import LANGUAGE_COLUMNS  # noqa: PLC0415  (avoid circular import)
-
-        return any(col in self.features for col in LANGUAGE_COLUMNS)
-
-    @property
-    def tools(self) -> list[dict]:
-        """OpenAI-style tool schemas declared by this dataset.
-
-        Read from ``meta/info.json["tools"]``. Returns a copy, so callers
-        can mutate the result safely. Falls back to
-        :data:`lerobot.datasets.language.DEFAULT_TOOLS` (the canonical
-        ``say`` schema) when the dataset doesn't declare any — that way
-        unannotated datasets and chat-template consumers
-        (``apply_chat_template(messages, tools=meta.tools)``) keep
-        working out of the box.
-
-        Implementations live under :mod:`lerobot.tools` (one file per
-        tool); see ``docs/source/tools.mdx`` for the authoring guide.
-        """
-        from .language import DEFAULT_TOOLS  # noqa: PLC0415  (avoid circular import)
-
-        declared = self.info.tools
-        if declared:
-            return [dict(t) for t in declared]
-        return [dict(t) for t in DEFAULT_TOOLS]
-
    @property
    def names(self) -> dict[str, list | dict]:
        """Names of the various dimensions of vector modalities."""
@@ -664,6 +633,7 @@ class LeRobotDatasetMetadata:
        _validate_feature_names(features)

        obj.tasks = None
+        obj.subtasks = None
        obj.episodes = None
        obj.stats = None
        obj.info = create_empty_dataset_info(
--- a/src/lerobot/datasets/dataset_reader.py
+++ b/src/lerobot/datasets/dataset_reader.py
@@ -295,4 +295,9 @@ class DatasetReader:
        task_idx = item["task_index"].item()
        item["task"] = self._meta.tasks.iloc[task_idx].name

+        # add subtask information if available
+        if "subtask_index" in self._meta.features and self._meta.subtasks is not None:
+            subtask_idx = item["subtask_index"].item()
+            item["subtask"] = self._meta.subtasks.iloc[subtask_idx].name
+
        return item
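
Review note: the hunk above resolves a frame's `subtask_index` to a string with `self._meta.subtasks.iloc[subtask_idx].name`. That only works if the subtasks table is indexed by the subtask string and its rows are stored in `subtask_index` order. The following is a minimal sketch of that lookup under those assumptions. The table contents and the helper name `subtask_name` are hypothetical and are not part of this commit.

```python
from __future__ import annotations

import pandas as pd

# Hypothetical subtasks table: the index holds the subtask string and the
# single column holds its integer id, with rows sorted by that id.
subtasks = pd.DataFrame(
    {"subtask_index": [0, 1, 2]},
    index=pd.Index(
        ["reach for the apple", "grasp the apple", "place it in the basket"],
        name="subtask",
    ),
)


def subtask_name(table: pd.DataFrame | None, subtask_idx: int) -> str | None:
    """Return the subtask string for a positional index, or None without a table."""
    if table is None:
        return None
    # .iloc selects the row by position; .name on that row is its index label,
    # which in this layout is the subtask string.
    return table.iloc[subtask_idx].name


resolved = subtask_name(subtasks, 1)
```

Because `.iloc` is positional, a table whose rows are not sorted by `subtask_index` would silently return the wrong string. A label-based lookup on the `subtask_index` column would avoid that, at the cost of a slower access.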
--- a/src/lerobot/datasets/feature_utils.py
+++ b/src/lerobot/datasets/feature_utils.py
@@ -22,12 +22,6 @@ from PIL import Image as PILImage
 from lerobot.utils.constants import DEFAULT_FEATURES
 from lerobot.utils.utils import is_valid_numpy_dtype_string

-from .language import (
-    LANGUAGE_PERSISTENT,
-    is_language_column,
-    language_events_column_feature,
-    language_persistent_column_feature,
-)
 from .utils import (
    DEFAULT_CHUNK_SIZE,
    DEFAULT_DATA_FILE_SIZE_IN_MB,
@@ -52,13 +46,7 @@ def get_hf_features_from_features(features: dict) -> datasets.Features:
    """
    hf_features = {}
    for key, ft in features.items():
-        if is_language_column(key):
-            hf_features[key] = (
-                language_persistent_column_feature()
-                if key == LANGUAGE_PERSISTENT
-                else language_events_column_feature()
-            )
-        elif ft["dtype"] == "video":
+        if ft["dtype"] == "video":
            continue
        elif ft["dtype"] == "image":
            hf_features[key] = datasets.Image()
@@ -254,8 +242,6 @@ def validate_feature_dtype_and_shape(
        return validate_feature_image_or_video(name, expected_shape, value)
    elif expected_dtype == "string":
        return validate_feature_string(name, value)
-    elif expected_dtype == "language":
-        return ""
    else:
        raise NotImplementedError(f"The feature dtype '{expected_dtype}' is not implemented yet.")

--- a/src/lerobot/datasets/io_utils.py
+++ b/src/lerobot/datasets/io_utils.py
@@ -34,6 +34,7 @@ from lerobot.utils.utils import SuppressProgressBars, flatten_dict, unflatten_di
 from .utils import (
    DEFAULT_DATA_FILE_SIZE_IN_MB,
    DEFAULT_EPISODES_PATH,
+    DEFAULT_SUBTASKS_PATH,
    DEFAULT_TASKS_PATH,
    EPISODES_DIR,
    INFO_PATH,
@@ -185,6 +186,14 @@ def load_tasks(local_dir: Path) -> pandas.DataFrame:
    return tasks


+def load_subtasks(local_dir: Path) -> pandas.DataFrame | None:
+    """Load subtasks from subtasks.parquet if it exists."""
+    subtasks_path = local_dir / DEFAULT_SUBTASKS_PATH
+    if subtasks_path.exists():
+        return pd.read_parquet(subtasks_path)
+    return None
+
+
 def write_episodes(episodes: Dataset, local_dir: Path) -> None:
    """Write episode metadata to a parquet file in the LeRobot v3.0 format.
    This function writes episode-level metadata to a single parquet file.
@@ -256,13 +265,11 @@ def hf_transform_to_torch(items_dict: dict[str, list[Any]]) -> dict[str, list[to
        dict: The batch with items converted to torch tensors.
    """
    for key in items_dict:
-        if key in {"language_persistent", "language_events"}:
-            continue
        first_item = items_dict[key][0]
        if isinstance(first_item, PILImage.Image):
            to_tensor = transforms.ToTensor()
            items_dict[key] = [to_tensor(img) for img in items_dict[key]]
-        elif first_item is None or isinstance(first_item, dict):
+        elif first_item is None:
            pass
        else:
            items_dict[key] = [x if isinstance(x, str) else torch.tensor(x) for x in items_dict[key]]
@@ -298,11 +305,7 @@ def item_to_torch(item: dict) -> dict:
        dict: Dictionary with all tensor-like items converted to torch.Tensor.
    """
    for key, val in item.items():
-        if isinstance(val, (np.ndarray | list)) and key not in [
-            "task",
-            "language_persistent",
-            "language_events",
-        ]:
+        if isinstance(val, (np.ndarray | list)) and key not in ["task"]:
            # Convert numpy arrays and lists to torch tensors
            item[key] = torch.tensor(val)
    return item
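
Review note: `load_subtasks` in this file returns `None` when the parquet file is absent, so datasets recorded without subtasks keep loading unchanged. The sketch below reproduces that optional-file pattern in isolation. The constant `SUBTASKS_PATH` and the name `load_subtasks_sketch` are placeholders, and the path value is an assumption, since the real `DEFAULT_SUBTASKS_PATH` is defined in `utils.py` and is not shown in this diff.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

import pandas as pd

# Assumed relative location; the real value is DEFAULT_SUBTASKS_PATH in utils.py.
SUBTASKS_PATH = "meta/subtasks.parquet"


def load_subtasks_sketch(local_dir: Path) -> pd.DataFrame | None:
    """Read the subtasks table if present, otherwise signal absence with None."""
    path = local_dir / SUBTASKS_PATH
    if path.exists():
        return pd.read_parquet(path)
    return None


# A dataset root without a subtasks file yields None instead of raising.
with tempfile.TemporaryDirectory() as tmp:
    missing = load_subtasks_sketch(Path(tmp))
```

Callers then need an explicit `is not None` check before indexing, which is what the `dataset_reader.py` change does alongside its `"subtask_index" in self._meta.features` test.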
--- a/src/lerobot/datasets/language.py
+++ b/src/lerobot/datasets/language.py
@@ -1,234 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import annotations
-
-from typing import Literal
-
-import datasets
-import pyarrow as pa
-
-LANGUAGE_PERSISTENT = "language_persistent"
-LANGUAGE_EVENTS = "language_events"
-LANGUAGE_COLUMNS = (LANGUAGE_PERSISTENT, LANGUAGE_EVENTS)
-PERSISTENT_ROW_FIELDS = ("role", "content", "style", "timestamp", "camera", "tool_calls")
-EVENT_ROW_FIELDS = ("role", "content", "style", "camera", "tool_calls")
-
-CORE_STYLES = {
-    "subtask",
-    "plan",
-    "memory",
-    "motion",
-    "interjection",
-    "vqa",
-    "trace",
-    "task_aug",
-}
-EXTENDED_STYLES = set()
-STYLE_REGISTRY = CORE_STYLES | EXTENDED_STYLES
-
-PERSISTENT_STYLES = {"subtask", "plan", "memory", "motion", "task_aug"}
-EVENT_ONLY_STYLES = {"interjection", "vqa", "trace"}
-
-# Styles whose ``content`` is grounded in a specific camera view. Rows of these
-# styles MUST carry a non-null ``camera`` referencing an ``observation.images.*``
-# feature key. Rows of every other style MUST have ``camera=None``. ``motion``
-# is intentionally NOT in this set: motion primitives are described in
-# robot-frame (joint / Cartesian) terms, not pixel space, so they are
-# camera-agnostic. ``trace`` is the pixel-trajectory event style and IS
-# view-dependent. The ``camera`` field nevertheless lives on
-# ``PERSISTENT_ROW_FIELDS`` too so the schema, validator, and resolver
-# behave symmetrically across the two columns; persistent rows simply
-# always have ``camera=None`` in practice today.
-VIEW_DEPENDENT_STYLES = {"vqa", "trace"}
-
-LanguageColumn = Literal["language_persistent", "language_events"]
-
-
-def _json_arrow_type() -> pa.DataType:
-    """Return the Arrow JSON type, falling back to ``string`` on older pyarrow."""
-    return pa.json_() if hasattr(pa, "json_") else pa.string()
-
-
-def _json_feature() -> object:
-    """Return the HF ``datasets`` JSON feature, falling back to a string value."""
-    return datasets.Json() if hasattr(datasets, "Json") else datasets.Value("string")
-
-
-def language_persistent_row_arrow_type() -> pa.StructType:
-    """Return the Arrow struct type for a single persistent language row.
-
-    Persistent rows carry their own ``timestamp`` because they represent a state
-    that became active at a specific moment and remains active until superseded.
-    """
-    return pa.struct(
-        [
-            pa.field("role", pa.string(), nullable=False),
-            pa.field("content", pa.string(), nullable=True),
-            pa.field("style", pa.string(), nullable=True),
-            pa.field("timestamp", pa.float64(), nullable=False),
-            pa.field("camera", pa.string(), nullable=True),
-            pa.field("tool_calls", pa.list_(_json_arrow_type()), nullable=True),
-        ]
-    )
-
-
-def language_event_row_arrow_type() -> pa.StructType:
-    """Return the Arrow struct type for a single event language row.
-
-    Event rows have no ``timestamp`` field: each event is stored on the dataset
-    row whose frame timestamp is the event's firing time.
-    """
-    return pa.struct(
-        [
-            pa.field("role", pa.string(), nullable=False),
-            pa.field("content", pa.string(), nullable=True),
-            pa.field("style", pa.string(), nullable=True),
-            pa.field("camera", pa.string(), nullable=True),
-            pa.field("tool_calls", pa.list_(_json_arrow_type()), nullable=True),
-        ]
-    )
-
-
-def language_persistent_arrow_type() -> pa.ListType:
-    """Return the Arrow list type for the ``language_persistent`` column."""
-    return pa.list_(language_persistent_row_arrow_type())
-
-
-def language_events_arrow_type() -> pa.ListType:
-    """Return the Arrow list type for the ``language_events`` column."""
-    return pa.list_(language_event_row_arrow_type())
-
-
-def language_persistent_row_feature() -> dict[str, object]:
-    """Return the HF ``datasets`` feature mapping for a persistent language row."""
-    return {
-        "role": datasets.Value("string"),
-        "content": datasets.Value("string"),
-        "style": datasets.Value("string"),
-        "timestamp": datasets.Value("float64"),
-        "camera": datasets.Value("string"),
-        "tool_calls": datasets.List(_json_feature()),
-    }
-
-
-def language_event_row_feature() -> dict[str, object]:
-    """Return the HF ``datasets`` feature mapping for an event language row."""
-    return {
-        "role": datasets.Value("string"),
-        "content": datasets.Value("string"),
-        "style": datasets.Value("string"),
-        "camera": datasets.Value("string"),
-        "tool_calls": datasets.List(_json_feature()),
-    }
-
-
-def language_persistent_column_feature() -> datasets.List:
-    """Return the HF ``datasets`` feature for the ``language_persistent`` column."""
-    return datasets.List(language_persistent_row_feature())
-
-
-def language_events_column_feature() -> datasets.List:
-    """Return the HF ``datasets`` feature for the ``language_events`` column."""
-    return datasets.List(language_event_row_feature())
-
-
-def language_feature_info() -> dict[str, dict]:
-    """Return the ``info["features"]`` entries for both language columns."""
-    return {
-        LANGUAGE_PERSISTENT: {"dtype": "language", "shape": (1,), "names": None},
-        LANGUAGE_EVENTS: {"dtype": "language", "shape": (1,), "names": None},
-    }
-
-
-def is_language_column(key: str) -> bool:
-    """Return ``True`` if ``key`` is one of the dataset's language column names."""
-    return key in LANGUAGE_COLUMNS
-
-
-def is_view_dependent_style(style: str | None) -> bool:
-    """Return ``True`` if rows of ``style`` must be tagged with a ``camera`` key."""
-    return style in VIEW_DEPENDENT_STYLES
-
-
-def validate_camera_field(style: str | None, camera: str | None) -> None:
-    """Enforce the ``camera`` invariant: required iff ``style`` is view-dependent.
-
-    Raises ``ValueError`` if a view-dependent style is missing ``camera`` or if
-    a non-view-dependent style carries one. Pipeline writers and the validator
-    should call this on every emitted row.
-    """
-    if is_view_dependent_style(style):
-        if not camera:
-            raise ValueError(
-                f"Rows of view-dependent style {style!r} require a non-empty 'camera' "
-                f"field referencing an 'observation.images.*' feature key."
-            )
-    elif camera is not None:
-        raise ValueError(f"Rows of style {style!r} must have camera=None; got camera={camera!r}.")
-
-
-# --- Tool registry --------------------------------------------------------
-# Tools declared on a dataset live in ``meta/info.json["tools"]`` as a list
-# of OpenAI-style function schemas. The runtime / training stack reads them
-# through :class:`LeRobotDatasetMetadata.tools` (with these constants as
-# fallback when the dataset doesn't declare any). Implementations live
-# under :mod:`lerobot.tools` (one file per tool); see
-# ``docs/source/tools.mdx`` for the authoring guide.
-
-SAY_TOOL_SCHEMA: dict = {
-    "type": "function",
-    "function": {
-        "name": "say",
-        "description": "Speak a short utterance to the user via the TTS executor.",
-        "parameters": {
-            "type": "object",
-            "properties": {
-                "text": {
-                    "type": "string",
-                    "description": "The verbatim text to speak.",
-                }
-            },
-            "required": ["text"],
-        },
-    },
-}
-"""Canonical schema for the ``say`` tool emitted by the steerable
-annotation pipeline (PR 2 Module 2). Single source of truth — PR 2's
-writer, PR 3's runtime tool registry, and the dataset visualizer all
-import this constant rather than duplicating the dict."""
-
-DEFAULT_TOOLS: list[dict] = [SAY_TOOL_SCHEMA]
-"""Fallback tools list. Returned by ``LeRobotDatasetMetadata.tools``
-when ``meta/info.json["tools"]`` is unset, so unannotated datasets and
-chat-template consumers (``apply_chat_template(messages, tools=...)``)
-keep working out of the box."""
-
-
-def column_for_style(style: str | None) -> LanguageColumn:
-    """Map a language style to the column where rows of that style are stored.
-
-    Styles in :data:`PERSISTENT_STYLES` route to :data:`LANGUAGE_PERSISTENT`.
-    Styles in :data:`EVENT_ONLY_STYLES` and the implicit ``None`` style route
-    to :data:`LANGUAGE_EVENTS`.
-    """
-    if style is None:
-        return LANGUAGE_EVENTS
-    if style in PERSISTENT_STYLES:
-        return LANGUAGE_PERSISTENT
-    if style in EVENT_ONLY_STYLES:
-        return LANGUAGE_EVENTS
-    raise ValueError(f"Unknown language style: {style!r}")
--- a/src/lerobot/datasets/language_render.py
+++ b/src/lerobot/datasets/language_render.py
@@ -1,532 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import annotations
-
-import copy
-import hashlib
-import re
-from collections.abc import Sequence
-from typing import Any
-
-from lerobot.configs.recipe import DEFAULT_BINDINGS, TrainingRecipe
-
-from .language import LANGUAGE_PERSISTENT, column_for_style
-
-LanguageRow = dict[str, Any]
-RenderedMessages = dict[str, list[Any]]
-
-_RESOLVER_RE = re.compile(r"^(?P<name>[A-Za-z_][A-Za-z0-9_]*)\((?P<args>.*)\)$")
-_PLACEHOLDER_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
-
-
-def active_at(
-    t: float,
-    *,
-    persistent: Sequence[LanguageRow],
-    style: str | None = None,
-    role: str | None = None,
-    tool_name: str | None = None,
-    camera: str | None = None,
-) -> LanguageRow | None:
-    """Return the persistent row of ``style`` that is active at time ``t``.
-
-    A persistent row is "active" at ``t`` when its own ``timestamp`` is the
-    most recent one ``<= t`` for the given ``style``/``role``/``tool_name``/
-    ``camera`` selector. Only valid for persistent styles.
-    """
-    _validate_persistent_resolver("active_at", style)
-    matches = [
-        row
-        for row in _matching_rows(persistent, style=style, role=role, tool_name=tool_name, camera=camera)
-        if _timestamp(row) <= t
-    ]
-    if not matches:
-        return None
-    latest_ts = max(_timestamp(row) for row in matches)
-    return _select_one(
-        [row for row in matches if _timestamp(row) == latest_ts],
-        style=style,
-        role=role,
-        tool_name=tool_name,
-        camera=camera,
-    )
-
-
-def emitted_at(
-    t: float,
-    *,
-    persistent: Sequence[LanguageRow],
-    events: Sequence[LanguageRow],
-    style: str | None = None,
-    role: str | None = None,
-    tool_name: str | None = None,
-    camera: str | None = None,
-) -> LanguageRow | None:
-    """Return the row of ``style`` emitted at exactly time ``t``.
-
-    For persistent styles, this matches persistent rows whose own ``timestamp``
-    equals ``t``. For event styles, the ``events`` list is assumed to come from
-    the dataset row at frame ``t`` (event rows carry no timestamp of their own),
-    so all matching event rows are considered emitted at ``t``. ``camera``
-    filters by the row's ``camera`` field — required to disambiguate when
-    multiple view-dependent rows share ``(t, role)`` across cameras.
-    """
-    if column_for_style(style) == LANGUAGE_PERSISTENT:
-        matches = [
-            row
-            for row in _matching_rows(persistent, style=style, role=role, tool_name=tool_name, camera=camera)
-            if _timestamp(row) == t
-        ]
-    else:
-        matches = _matching_rows(events, style=style, role=role, tool_name=tool_name, camera=camera)
-    return _select_one(matches, style=style, role=role, tool_name=tool_name, camera=camera)
-
-
-def nth_prev(
-    t: float,
-    *,
-    persistent: Sequence[LanguageRow],
-    style: str | None = None,
-    offset: int = 1,
-    role: str | None = None,
-    tool_name: str | None = None,
-    camera: str | None = None,
-) -> LanguageRow | None:
-    """Return the persistent row that was active ``offset`` steps before ``t``.
-
-    Walks back through chronologically sorted persistent rows of ``style``
-    (filtered by optional ``role``/``tool_name``/``camera``) and returns the
-    one ``offset`` positions before the row active at ``t``. Only valid for
-    persistent styles.
-    """
-    return _nth_relative("nth_prev", t, persistent, style, -offset, role, tool_name, camera)
-
-
-def nth_next(
-    t: float,
-    *,
-    persistent: Sequence[LanguageRow],
-    style: str | None = None,
-    offset: int = 1,
-    role: str | None = None,
-    tool_name: str | None = None,
-    camera: str | None = None,
-) -> LanguageRow | None:
-    """Return the persistent row that becomes active ``offset`` steps after ``t``.
-
-    Walks forward through chronologically sorted persistent rows of ``style``
-    (filtered by optional ``role``/``tool_name``/``camera``) and returns the
-    one ``offset`` positions after the row active at ``t``. Only valid for
-    persistent styles.
-    """
-    return _nth_relative("nth_next", t, persistent, style, offset, role, tool_name, camera)
-
-
-def render_sample(
-    *,
-    recipe: TrainingRecipe,
-    persistent: Sequence[LanguageRow] | None,
-    events: Sequence[LanguageRow] | None,
-    t: float,
-    sample_idx: int,
-    task: str | None = None,
-    dataset_ctx: Any | None = None,
-) -> RenderedMessages | None:
-    """Render the chat-style messages for a single dataset sample.
-
-    Resolves the recipe's bindings against ``persistent`` and ``events`` rows
-    at frame timestamp ``t``, then expands the recipe's message templates.
-    Returns ``None`` if the resolved sample contains no target message.
-    """
-    persistent_rows = _normalize_rows(persistent or [])
-    event_rows = _normalize_rows(events or [])
-    selected_recipe = _select_recipe(recipe, sample_idx)
-    bindings = _resolve_bindings(
-        selected_recipe,
-        persistent=persistent_rows,
-        events=event_rows,
-        t=t,
-        sample_idx=sample_idx,
-        task=task,
-        dataset_ctx=dataset_ctx,
-    )
-    return _render_message_recipe(selected_recipe, bindings)
-
-
-def _select_recipe(recipe: TrainingRecipe, sample_idx: int) -> TrainingRecipe:
-    """Pick a deterministic blend component for ``sample_idx`` (or return ``recipe``)."""
-    if recipe.blend is None:
-        return recipe
-
-    total_weight = sum(component.weight or 0.0 for component in recipe.blend.values())
-    if total_weight <= 0:
-        raise ValueError("Blend weights must sum to a positive value.")
-
-    digest = hashlib.blake2b(str(sample_idx).encode(), digest_size=8).digest()
-    draw = int.from_bytes(digest, "big") / 2**64 * total_weight
-    cumulative = 0.0
-    last_component: TrainingRecipe | None = None
-    for component in recipe.blend.values():
-        last_component = component
-        cumulative += component.weight or 0.0
-        if draw < cumulative:
-            return component
-    assert last_component is not None
-    return last_component
-
-
-def _resolve_bindings(
-    recipe: TrainingRecipe,
-    *,
-    persistent: Sequence[LanguageRow],
-    events: Sequence[LanguageRow],
-    t: float,
-    sample_idx: int,
-    task: str | None,
-    dataset_ctx: Any | None,
-) -> dict[str, LanguageRow | str | None]:
-    """Resolve every binding in ``recipe`` (plus ``task``) at time ``t``."""
-    bindings: dict[str, LanguageRow | str | None] = {
-        "task": _resolve_task(task, dataset_ctx, persistent=persistent, sample_idx=sample_idx),
-    }
-    specs = {**DEFAULT_BINDINGS, **(recipe.bindings or {})}
-    for name, spec in specs.items():
-        bindings[name] = _resolve_spec(spec, persistent=persistent, events=events, t=t)
-    return bindings
-
-
-def _resolve_task(
-    task: str | None,
-    dataset_ctx: Any | None,
-    *,
-    persistent: Sequence[LanguageRow] = (),
-    sample_idx: int = 0,
-) -> str | None:
-    """Return the task string for ``sample_idx``.
-
-    Resolution order:
-
-    1. Explicit ``task`` override (caller-supplied) wins.
-    2. If ``persistent`` contains rows of style ``task_aug`` (role=user),
-       deterministically pick one by ``sample_idx`` so each frame of an
-       episode rotates through the available rephrasings across an epoch.
-       This realizes Xiao 2022 / CAST-style task-prompt diversity without
-       changing ``meta/tasks.parquet`` and without forcing recipes to opt
-       in: ``${task}`` automatically picks a rephrasing when one exists,
-       and falls back to the canonical task otherwise. Recipes that want
-       the literal canonical task can override the binding.
-    3. Otherwise read the canonical task from ``dataset_ctx`` (which is
-       backed by ``meta/tasks.parquet``).
-    """
-    if task is not None:
-        return task
-
-    aug_rows = [r for r in persistent if r.get("style") == "task_aug" and r.get("role") == "user"]
-    if aug_rows:
-        # Deterministic, blake2b-based pick keyed on sample_idx so the
-        # rotation is reproducible across runs (Python's built-in ``hash``
-        # is process-randomized).
-        digest = hashlib.blake2b(f"task_aug:{sample_idx}".encode(), digest_size=8).digest()
-        idx = int.from_bytes(digest, "big") % len(aug_rows)
-        chosen = aug_rows[idx].get("content")
-        if chosen:
-            return str(chosen)
-
-    if dataset_ctx is None:
-        return None
-    if isinstance(dataset_ctx, dict):
-        return dataset_ctx.get("task")
-    return getattr(dataset_ctx, "task", None)
-
-
-def _resolve_spec(
-    spec: str,
-    *,
-    persistent: Sequence[LanguageRow],
-    events: Sequence[LanguageRow],
-    t: float,
-) -> LanguageRow | None:
-    """Parse a single binding's resolver expression and dispatch to its function."""
-    match = _RESOLVER_RE.match(spec.strip())
-    if match is None:
-        raise ValueError(f"Invalid resolver expression: {spec!r}")
-    name = match.group("name")
-    kwargs = _parse_resolver_args(match.group("args"))
-    kwargs.pop("t_arg", None)
-
-    if name == "emitted_at":
-        return emitted_at(t, persistent=persistent, events=events, **kwargs)
-    if name == "active_at":
-        return active_at(t, persistent=persistent, **kwargs)
-    if name == "nth_prev":
-        return nth_prev(t, persistent=persistent, **kwargs)
-    if name == "nth_next":
-        return nth_next(t, persistent=persistent, **kwargs)
-    raise ValueError(f"Unknown language resolver: {name!r}")
-
-
-def _parse_resolver_args(args: str) -> dict[str, Any]:
-    """Parse a comma-separated resolver argument list into a kwargs dict."""
-    kwargs: dict[str, Any] = {}
-    if not args.strip():
-        return kwargs
-
-    parts = [part.strip() for part in args.split(",") if part.strip()]
-    for part in parts:
-        if part == "t":
-            kwargs["t_arg"] = True
-            continue
-        if "=" not in part:
-            raise ValueError(f"Invalid resolver argument: {part!r}")
-        key, value = (item.strip() for item in part.split("=", 1))
-        if key == "offset":
-            kwargs[key] = int(value)
-        else:
-            kwargs[key] = value.strip("\"'")
-    return kwargs
-
-
-def _render_message_recipe(
-    recipe: TrainingRecipe,
-    bindings: dict[str, LanguageRow | str | None],
-) -> RenderedMessages | None:
-    """Expand ``recipe.messages`` into rendered chat messages using ``bindings``."""
-    assert recipe.messages is not None
-    messages: list[dict[str, Any]] = []
-    streams: list[str | None] = []
-    target_indices: list[int] = []
-
-    for turn in recipe.messages:
-        if turn.if_present is not None and bindings.get(turn.if_present) is None:
-            continue
-
-        message = {"role": turn.role}
-        if turn.content is not None:
-            message["content"] = _render_content(turn.content, bindings)
-
-        if turn.tool_calls_from is not None:
-            row = bindings.get(turn.tool_calls_from)
-            tool_calls = row.get("tool_calls") if isinstance(row, dict) else None
-            if tool_calls:
-                message["tool_calls"] = copy.deepcopy(tool_calls)
-
-        message_idx = len(messages)
-        messages.append(message)
-        streams.append(turn.stream)
-        if turn.target:
-            target_indices.append(message_idx)
-
-    if not target_indices:
-        return None
-
-    rendered = {
-        "messages": messages,
-        "message_streams": streams,
-        "target_message_indices": target_indices,
-    }
-    _validate_rendered(rendered)
-    return rendered
-
-
-def _render_content(
-    content: str | list[dict[str, Any]],
-    bindings: dict[str, LanguageRow | str | None],
-) -> str | list[dict[str, Any]]:
-    """Substitute bindings into a string or each string field of multimodal blocks."""
-    if isinstance(content, str):
-        return _substitute(content, bindings)
-
-    rendered_blocks = []
-    for block in content:
-        rendered_block = copy.deepcopy(block)
-        for key, value in rendered_block.items():
-            if isinstance(value, str):
-                rendered_block[key] = _substitute(value, bindings)
-        rendered_blocks.append(rendered_block)
-    return rendered_blocks
-
-
-def _substitute(template: str, bindings: dict[str, LanguageRow | str | None]) -> str:
-    """Replace ``${name}`` placeholders in ``template`` with their bound values."""
-
-    def replace(match: re.Match[str]) -> str:
-        """Resolve a single ``${name}`` match to its bound string value."""
-        name = match.group(1)
-        if name not in bindings:
-            raise ValueError(f"Unknown template binding: {name!r}")
-        value = bindings[name]
-        if value is None:
-            return ""
-        if isinstance(value, dict):
-            content = value.get("content")
-            return "" if content is None else str(content)
-        return str(value)
-
-    return _PLACEHOLDER_RE.sub(replace, template)
-
-
-def _validate_rendered(rendered: RenderedMessages) -> None:
-    """Sanity-check the rendered output for stream/target alignment."""
-    messages = rendered["messages"]
-    streams = rendered["message_streams"]
-    target_indices = rendered["target_message_indices"]
-
-    if len(streams) != len(messages):
-        raise ValueError("message_streams must be aligned with messages.")
-    if not target_indices:
-        raise ValueError("Rendered samples must contain at least one target message.")
-    for idx in target_indices:
-        if idx < 0 or idx >= len(messages):
-            raise ValueError(f"Target message index {idx} is out of bounds.")
-    for idx, stream in enumerate(streams):
-        if stream is None:
-            raise ValueError(f"Rendered message {idx} has no stream.")
-
-
-def _nth_relative(
-    name: str,
-    t: float,
-    persistent: Sequence[LanguageRow],
-    style: str | None,
-    offset: int,
-    role: str | None,
-    tool_name: str | None,
-    camera: str | None,
-) -> LanguageRow | None:
-    """Shared body for ``nth_prev`` / ``nth_next`` with signed ``offset``."""
-    _validate_persistent_resolver(name, style)
-    if abs(offset) < 1:
-        raise ValueError(f"{name} offset must be non-zero.")
-
-    rows = sorted(
-        _matching_rows(persistent, style=style, role=role, tool_name=tool_name, camera=camera),
-        key=_row_sort_key,
-    )
-    if not rows:
-        return None
-
-    anchor_idx = None
-    for idx, row in enumerate(rows):
-        if _timestamp(row) <= t:
-            anchor_idx = idx
-        else:
-            break
-
-    target_idx = (offset - 1 if offset > 0 else None) if anchor_idx is None else anchor_idx + offset
-
-    if target_idx is None or target_idx < 0 or target_idx >= len(rows):
-        return None
-    return rows[target_idx]
-
-
-def _validate_persistent_resolver(name: str, style: str | None) -> None:
-    """Reject calls with missing or event-only ``style`` for persistent resolvers."""
-    if style is None:
-        raise ValueError(f"{name} requires a persistent style.")
-    if column_for_style(style) != LANGUAGE_PERSISTENT:
-        raise ValueError(f"{name} cannot be used with event-only style {style!r}.")
-
-
-def _matching_rows(
-    rows: Sequence[LanguageRow],
-    *,
-    style: str | None,
-    role: str | None,
-    tool_name: str | None,
-    camera: str | None,
-) -> list[LanguageRow]:
-    """Return ``rows`` filtered by optional ``style``/``role``/``tool_name``/``camera`` selectors."""
-    return [
-        row
-        for row in rows
-        if (style is None or row.get("style") == style)
-        and (role is None or row.get("role") == role)
-        and (tool_name is None or _row_has_tool_name(row, tool_name))
-        and (camera is None or row.get("camera") == camera)
-    ]
-
-
-def _select_one(
-    rows: Sequence[LanguageRow],
-    *,
-    style: str | None,
-    role: str | None,
-    tool_name: str | None,
-    camera: str | None,
-) -> LanguageRow | None:
-    """Return the single matching row, or raise if the resolver is ambiguous.
-
-    Multiple matches always raise — even when the caller already passed
-    some selectors — because remaining ambiguity means the data has
-    several rows that look identical to the resolver and the caller
-    needs to pin down a specific one (e.g. add ``camera=...`` for VQA
-    rows shared across cameras).
-    """
-    if not rows:
-        return None
-    if len(rows) > 1:
-        raise ValueError(
-            f"Ambiguous resolver for style={style!r} role={role!r} "
-            f"tool_name={tool_name!r} camera={camera!r}: {len(rows)} matching rows. "
-            f"Add a selector that distinguishes them."
-        )
-    return rows[0]
-
-
-def _row_sort_key(row: LanguageRow) -> tuple[float, str, str]:
-    """Stable sort key for both persistent and event rows.
-
-    Event rows lack ``timestamp`` (it is implicit in the frame), so default
-    to ``0.0`` — within a single frame all event rows share the same sort
-    bucket and are tiebroken by ``(style, role)``.
-    """
-    timestamp = row.get("timestamp")
-    ts = (
-        float(timestamp.item() if hasattr(timestamp, "item") else timestamp) if timestamp is not None else 0.0
-    )
-    return (ts, row.get("style") or "", row.get("role") or "")
-
-
-def _timestamp(row: LanguageRow) -> float:
-    """Extract a row's ``timestamp`` as a Python float (unwrapping numpy scalars)."""
-    value = row["timestamp"]
-    return float(value.item() if hasattr(value, "item") else value)
-
-
-def _row_has_tool_name(row: LanguageRow, tool_name: str) -> bool:
-    """Return ``True`` if any of the row's tool calls invokes ``tool_name``."""
-    for tool_call in row.get("tool_calls") or []:
-        if isinstance(tool_call, str):
-            continue
-        function = tool_call.get("function") if isinstance(tool_call, dict) else None
-        if isinstance(function, dict) and function.get("name") == tool_name:
-            return True
-    return False
-
-
-def _normalize_rows(rows: Sequence[Any]) -> list[LanguageRow]:
-    """Convert pyarrow scalars / mappings into a fresh list of plain dict rows."""
-    normalized = []
-    for row in rows:
-        if row is None:
-            continue
-        if hasattr(row, "as_py"):
-            row = row.as_py()
-        if not isinstance(row, dict):
-            raise TypeError(f"Language rows must be dictionaries, got {type(row).__name__}.")
-        normalized.append(dict(row))
-    return normalized
--- a/src/lerobot/datasets/utils.py
+++ b/src/lerobot/datasets/utils.py
@@ -88,6 +88,7 @@ VIDEO_DIR = "videos"

 CHUNK_FILE_PATTERN = "chunk-{chunk_index:03d}/file-{file_index:03d}"
 DEFAULT_TASKS_PATH = "meta/tasks.parquet"
+DEFAULT_SUBTASKS_PATH = "meta/subtasks.parquet"
 DEFAULT_EPISODES_PATH = EPISODES_DIR + "/" + CHUNK_FILE_PATTERN + ".parquet"
 DEFAULT_DATA_PATH = DATA_DIR + "/" + CHUNK_FILE_PATTERN + ".parquet"
 DEFAULT_VIDEO_PATH = VIDEO_DIR + "/{video_key}/" + CHUNK_FILE_PATTERN + ".mp4"
@@ -129,9 +130,6 @@ class DatasetInfo:
    # Optional metadata
    robot_type: str | None = None
    splits: dict[str, str] = field(default_factory=dict)
-    # OpenAI-style tool schemas declared by the dataset. ``None`` means the
-    # dataset doesn't declare any — readers fall back to ``DEFAULT_TOOLS``.
-    tools: list[dict] | None = None

    def __post_init__(self) -> None:
        # Coerce feature shapes from list to tuple — JSON deserialisation
@@ -153,15 +151,11 @@ class DatasetInfo:
        """Return a JSON-serialisable dict.

        Converts tuple shapes back to lists so ``json.dump`` can handle them.
-        Drops ``tools`` when unset so existing datasets keep a clean
-        ``info.json``.
        """
        d = dataclasses.asdict(self)
        for ft in d["features"].values():
            if isinstance(ft.get("shape"), tuple):
                ft["shape"] = list(ft["shape"])
-        if d.get("tools") is None:
-            d.pop("tools", None)
        return d

    @classmethod
--- a/src/lerobot/policies/__init__.py
+++ b/src/lerobot/policies/__init__.py
@@ -16,6 +16,7 @@ from lerobot.utils.action_interpolator import ActionInterpolator as ActionInterp

 from .act.configuration_act import ACTConfig as ACTConfig
 from .diffusion.configuration_diffusion import DiffusionConfig as DiffusionConfig
+from .eo1.configuration_eo1 import EO1Config as EO1Config
 from .factory import get_policy_class, make_policy, make_policy_config, make_pre_post_processors
 from .groot.configuration_groot import GrootConfig as GrootConfig
 from .multi_task_dit.configuration_multi_task_dit import MultiTaskDiTConfig as MultiTaskDiTConfig
@@ -41,6 +42,7 @@ __all__ = [
    "DiffusionConfig",
    "GrootConfig",
    "MultiTaskDiTConfig",
+    "EO1Config",
    "PI0Config",
    "PI0FastConfig",
    "PI05Config",
--- a/src/lerobot/policies/eo1/README.md
+++ b/src/lerobot/policies/eo1/README.md
@@ -0,0 +1 @@
+../../../../docs/source/eo1.mdx
--- a/src/lerobot/policies/eo1/__init__.py
+++ b/src/lerobot/policies/eo1/__init__.py
@@ -0,0 +1,7 @@
+#!/usr/bin/env python
+
+from .configuration_eo1 import EO1Config
+from .modeling_eo1 import EO1Policy
+from .processor_eo1 import make_eo1_pre_post_processors
+
+__all__ = ["EO1Config", "EO1Policy", "make_eo1_pre_post_processors"]
--- a/src/lerobot/policies/eo1/configuration_eo1.py
+++ b/src/lerobot/policies/eo1/configuration_eo1.py
@@ -0,0 +1,193 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from copy import deepcopy
+from dataclasses import dataclass, field
+from typing import TYPE_CHECKING
+
+from lerobot.configs.policies import PreTrainedConfig
+from lerobot.configs.types import FeatureType, NormalizationMode, PolicyFeature
+from lerobot.optim.optimizers import AdamWConfig
+from lerobot.optim.schedulers import CosineDecayWithWarmupSchedulerConfig
+from lerobot.utils.constants import ACTION, OBS_STATE
+from lerobot.utils.import_utils import _transformers_available, require_package
+
+if TYPE_CHECKING or _transformers_available:
+    from transformers.models.qwen2_5_vl.configuration_qwen2_5_vl import (
+        Qwen2_5_VLConfig,
+        Qwen2_5_VLTextConfig,
+        Qwen2_5_VLVisionConfig,
+    )
+else:
+    Qwen2_5_VLConfig = None
+    Qwen2_5_VLTextConfig = None
+    Qwen2_5_VLVisionConfig = None
+
+
+@PreTrainedConfig.register_subclass("eo1")
+@dataclass
+class EO1Config(PreTrainedConfig):
+    """Configuration for native EO1 policy integration in LeRobot."""
+
+    vlm_base: str = "Qwen/Qwen2.5-VL-3B-Instruct"
+    vlm_config: dict | None = None
+
+    # Vision processor settings.
+    image_min_pixels: int | None = 64 * 28 * 28
+    image_max_pixels: int | None = 128 * 28 * 28
+    use_fast_processor: bool = False
+
+    # Execution and action horizon.
+    n_obs_steps: int = 1
+    chunk_size: int = 8
+    n_action_steps: int = 8
+
+    # State/action padding to match EO1 flow head dimensionality.
+    max_state_dim: int = 32
+    max_action_dim: int = 32
+
+    # Flow matching sampling.
+    num_denoise_steps: int = 10
+    num_action_layers: int = 2
+    action_act: str = "linear"
+    time_sampling_beta_alpha: float = 1.5
+    time_sampling_beta_beta: float = 1.0
+    time_sampling_scale: float = 0.999
+    time_sampling_offset: float = 0.001
+    min_period: float = 4e-3
+    max_period: float = 4.0
+    supervise_padding_action_dims: bool = True
+    supervise_padding_actions: bool = True
+
+    # Policy-level dtype request for the Qwen backbone.
+    # - "auto": follow the backbone config/checkpoint default dtype. For Qwen2.5-VL this resolves to bf16.
+    #           The EO1 flow-matching head still keeps its own parameters in fp32.
+    # - "bfloat16": force the backbone to initialize/load in bf16 regardless of the saved config default.
+    # - "float32": force the backbone to initialize/load in fp32 for maximum numerical conservatism.
+    dtype: str = "auto"  # Options: "auto", "bfloat16", "float32"
+    force_fp32_autocast: bool = True
+
+    # Optional attention backend request passed through to the Qwen backbone.
+    # Common values: None, "eager", "sdpa", "flash_attention_2".
+    attn_implementation: str | None = None
+
+    # Training settings.
+    gradient_checkpointing: bool = False  # Enable gradient checkpointing for memory optimization
+
+    normalization_mapping: dict[str, NormalizationMode] = field(
+        default_factory=lambda: {
+            "VISUAL": NormalizationMode.IDENTITY,
+            "STATE": NormalizationMode.MEAN_STD,
+            "ACTION": NormalizationMode.MEAN_STD,
+        }
+    )
+
+    # Optimizer settings aligned with EO1/experiments/2_libero/train.sh and EO1 TrainPipelineConfig defaults.
+    optimizer_lr: float = 1e-4
+    optimizer_betas: tuple[float, float] = (0.9, 0.999)
+    optimizer_eps: float = 1e-8
+    optimizer_weight_decay: float = 0.1
+    optimizer_grad_clip_norm: float = 1.0
+
+    # Scheduler settings aligned with EO1 train.sh: cosine schedule with warmup_ratio=0.03.
+    # Note: these auto-scale if --steps < scheduler_decay_steps.
+    # For example, --steps=3000 scales warmup to 90 (900 * 3000 / 30_000) and decay to 3000.
+    scheduler_warmup_steps: int = 900  # 0.03 * 30_000 long-run steps
+    scheduler_decay_steps: int = 30_000
+    scheduler_decay_lr: float = 0.0
+
+    def __post_init__(self):
+        super().__post_init__()
+
+        if self.n_action_steps > self.chunk_size:
+            raise ValueError(
+                f"n_action_steps ({self.n_action_steps}) cannot be greater than chunk_size ({self.chunk_size})"
+            )
+
+        # Populate the serialized backbone config only when the caller did not provide one.
+        if self.vlm_config is None:
+            require_package("transformers", extra="eo1")
+            self.vlm_config = Qwen2_5_VLConfig.from_pretrained(self.vlm_base).to_dict()
+
+    @property
+    def vlm_backbone_config(self) -> Qwen2_5_VLConfig:
+        require_package("transformers", extra="eo1")
+        config_dict = deepcopy(self.vlm_config)
+        if self.attn_implementation is not None:
+            config_dict["attn_implementation"] = self.attn_implementation
+        return Qwen2_5_VLConfig(**config_dict)
+
+    @property
+    def text_config(self) -> Qwen2_5_VLTextConfig:
+        return self.vlm_backbone_config.text_config
+
+    @property
+    def vision_config(self) -> Qwen2_5_VLVisionConfig:
+        return self.vlm_backbone_config.vision_config
+
+    def validate_features(self) -> None:
+        """Validate and set up EO1 input and output features."""
+        image_features = [key for key, feat in self.input_features.items() if feat.type == FeatureType.VISUAL]
+        if not image_features:
+            raise ValueError(
+                "EO1 policy requires at least one visual input feature. "
+                "No features of type FeatureType.VISUAL found in input_features."
+            )
+
+        if OBS_STATE not in self.input_features:
+            state_feature = PolicyFeature(
+                type=FeatureType.STATE,
+                shape=(self.max_state_dim,),
+            )
+            self.input_features[OBS_STATE] = state_feature
+
+        if ACTION not in self.output_features:
+            action_feature = PolicyFeature(
+                type=FeatureType.ACTION,
+                shape=(self.max_action_dim,),
+            )
+            self.output_features[ACTION] = action_feature
+
+    def get_optimizer_preset(self) -> AdamWConfig:
+        return AdamWConfig(
+            lr=self.optimizer_lr,
+            betas=self.optimizer_betas,
+            eps=self.optimizer_eps,
+            weight_decay=self.optimizer_weight_decay,
+            grad_clip_norm=self.optimizer_grad_clip_norm,
+        )
+
+    def get_scheduler_preset(self):
+        return CosineDecayWithWarmupSchedulerConfig(
+            peak_lr=self.optimizer_lr,
+            decay_lr=self.scheduler_decay_lr,
+            num_warmup_steps=self.scheduler_warmup_steps,
+            num_decay_steps=self.scheduler_decay_steps,
+        )
+
+    @property
+    def observation_delta_indices(self) -> None:
+        return None
+
+    @property
+    def action_delta_indices(self) -> list[int]:
+        return list(range(self.chunk_size))
+
+    @property
+    def reward_delta_indices(self) -> None:
+        return None
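Review note on the scheduler defaults in `configuration_eo1.py`: the comment says warmup and decay auto-scale when `--steps` is smaller than `scheduler_decay_steps`. The arithmetic can be sketched as below. This is a minimal sketch that assumes the scaling is proportional to `steps / scheduler_decay_steps`; `scaled_schedule` is a hypothetical helper for illustration, not a function from the codebase.

```python
def scaled_schedule(
    num_training_steps: int,
    warmup_steps: int = 900,      # EO1Config.scheduler_warmup_steps default
    decay_steps: int = 30_000,    # EO1Config.scheduler_decay_steps default
) -> tuple[int, int]:
    """Return (warmup, decay) after proportional scaling (assumed behavior)."""
    if num_training_steps >= decay_steps:
        # Long runs keep the configured schedule unchanged.
        return warmup_steps, decay_steps
    # Integer arithmetic avoids float rounding; the real scheduler may round differently.
    scaled_warmup = warmup_steps * num_training_steps // decay_steps
    return scaled_warmup, num_training_steps


print(scaled_schedule(3_000))   # short run: warmup shrinks with the step budget
print(scaled_schedule(30_000))  # full-length run: defaults are kept
```

Under this assumption a 3000-step run keeps the 0.03 warmup ratio that the defaults encode (900 / 30_000).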
--- a/src/lerobot/policies/eo1/modeling_eo1.py
+++ b/src/lerobot/policies/eo1/modeling_eo1.py
@@ -0,0 +1,620 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+import contextlib
+import logging
+import math
+from collections import deque
+from typing import TYPE_CHECKING, Any
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F  # noqa: N812
+import torch.utils.checkpoint
+from torch import Tensor
+
+from lerobot.policies.eo1.configuration_eo1 import EO1Config
+from lerobot.policies.pretrained import PreTrainedPolicy
+from lerobot.utils.constants import ACTION, OBS_STATE
+from lerobot.utils.import_utils import _transformers_available, require_package
+
+if TYPE_CHECKING or _transformers_available:
+    from transformers.activations import ACT2FN
+    from transformers.models.qwen2_5_vl import Qwen2_5_VLForConditionalGeneration
+    from transformers.utils import torch_compilable_check
+else:
+    ACT2FN = None
+    Qwen2_5_VLForConditionalGeneration = None
+    torch_compilable_check = None
+
+logger = logging.getLogger(__name__)
+
+
+def pad_vector(vector, new_dim):
+    """Pad the last dimension of a vector to new_dim with zeros.
+
+    Accepts tensors of shape (batch_size, sequence_length, feature_dim) or
+    (batch_size, feature_dim). Inputs already at least new_dim wide are returned unchanged.
+    """
+    if vector.shape[-1] >= new_dim:
+        return vector
+    return F.pad(vector, (0, new_dim - vector.shape[-1]))
+
+
+class EO1Policy(PreTrainedPolicy):
+    """EO1 policy wrapper for LeRobot robot-only training/evaluation."""
+
+    config_class = EO1Config
+    name = "eo1"
+
+    def __init__(self, config: EO1Config, **kwargs):
+        require_package("transformers", extra="eo1")
+        super().__init__(config)
+        config.validate_features()
+        self.config = config
+
+        if config.pretrained_path is None:
+            # Initialize from pretrained VLM
+            vlm_backbone = Qwen2_5_VLForConditionalGeneration.from_pretrained(
+                config.vlm_base,
+                dtype=config.dtype,
+                attn_implementation=config.attn_implementation,
+            )
+        else:
+            vlm_backbone = Qwen2_5_VLForConditionalGeneration._from_config(
+                config.vlm_backbone_config,
+                dtype=config.vlm_backbone_config.dtype if config.dtype == "auto" else config.dtype,
+            )
+
+        self.model = EO1VisionFlowMatchingModel(config, vlm_backbone)
+        if config.gradient_checkpointing:
+            self.model.gradient_checkpointing_enable()
+
+        self.model.to(config.device)
+        self.reset()
+
+    def reset(self):
+        self._action_queue = deque(maxlen=self.config.n_action_steps)
+
+    @staticmethod
+    def _get_model_inputs(batch: dict[str, Tensor], excluded_keys: set[str]) -> dict[str, Tensor]:
+        return {key: value for key, value in batch.items() if key not in excluded_keys}
+
+    def forward(self, batch: dict[str, Tensor]) -> tuple[Tensor, dict]:
+        state = self.prepare_state(batch[OBS_STATE])
+        actions = self.prepare_action(batch[ACTION])
+        model_inputs = self._get_model_inputs(batch, {OBS_STATE, ACTION})
+        loss = self.model(states=state, action=actions, **model_inputs)
+
+        loss_dict = {"loss": loss.item()}
+        return loss, loss_dict
+
+    @torch.no_grad()
+    def predict_action_chunk(self, batch: dict[str, Tensor], **kwargs) -> Tensor:
+        self.eval()
+
+        states = self.prepare_state(batch[OBS_STATE])
+        model_inputs = self._get_model_inputs(batch, {OBS_STATE})
+        actions = self.model.sample_actions(states=states, **model_inputs).to(torch.float32)
+
+        original_action_dim = self.config.output_features[ACTION].shape[0]
+        return actions[:, :, :original_action_dim]
+
+    def prepare_state(self, state: Tensor) -> Tensor:
+        return pad_vector(state, self.config.max_state_dim)
+
+    def prepare_action(self, action: Tensor) -> Tensor:
+        return pad_vector(action, self.config.max_action_dim)
+
+    @torch.no_grad()
+    def select_action(self, batch: dict[str, Tensor]) -> Tensor:
+        self.eval()
+
+        if len(self._action_queue) == 0:
+            actions = self.predict_action_chunk(batch)[:, : self.config.n_action_steps]
+            self._action_queue.extend(actions.transpose(0, 1))
+
+        return self._action_queue.popleft()
+
+    def get_optim_params(self) -> dict:
+        return self.parameters()
+
+
+def get_safe_dtype(target_dtype, device_type):
+    """Get a safe dtype for the given device type."""
+    if device_type == "mps" and target_dtype == torch.float64:
+        return torch.float32
+    if device_type == "cpu":
+        # Fall back to float32 on CPU, where bfloat16 kernels are slow or only partially supported
+        if target_dtype == torch.bfloat16:
+            return torch.float32
+        if target_dtype == torch.float64:
+            return torch.float64
+    return target_dtype
+
+
+def create_sinusoidal_pos_embedding(  # see openpi `create_sinusoidal_pos_embedding` (exact copy)
+    time: torch.Tensor, dimension: int, min_period: float, max_period: float, device="cpu"
+) -> Tensor:
+    """Computes sine-cosine positional embedding vectors for scalar positions."""
+    if dimension % 2 != 0:
+        raise ValueError(f"dimension ({dimension}) must be divisible by 2")
+
+    if time.ndim != 1:
+        raise ValueError("The time tensor is expected to be of shape `(batch_size, )`.")
+
+    dtype = get_safe_dtype(torch.float64, torch.device(device).type)  # also accepts the str default "cpu"
+    fraction = torch.linspace(0.0, 1.0, dimension // 2, dtype=dtype, device=device)
+    period = min_period * (max_period / min_period) ** fraction
+
+    # Compute the outer product
+    scaling_factor = 1.0 / period * 2 * math.pi
+    sin_input = scaling_factor[None, :] * time[:, None]
+    return torch.cat([torch.sin(sin_input), torch.cos(sin_input)], dim=1)
+
+
+def sample_beta(alpha, beta, bsize, device):  # see openpi `sample_beta` (exact copy)
+    # Beta sampling uses _sample_dirichlet which isn't implemented for MPS, so sample on CPU
+    alpha_t = torch.tensor(alpha, dtype=torch.float32)
+    beta_t = torch.tensor(beta, dtype=torch.float32)
+    dist = torch.distributions.Beta(alpha_t, beta_t)
+    return dist.sample((bsize,)).to(device)
+
+
+class EO1VisionActionProjector(torch.nn.Sequential):
+    """MLP projector: ``num_layers`` linear layers with ``activation_layer`` applied between them."""
+
+    def __init__(
+        self,
+        in_channels: int,
+        out_channels: int,
+        num_layers: int = 2,
+        activation_layer: str = "linear",
+        bias: bool = True,
+        device: Any = None,
+        dtype: torch.dtype = torch.float32,
+    ):
+        layers = []
+        in_dim = in_channels
+        hidden_channels = [in_dim] * (num_layers - 1) + [out_channels]
+        for hidden_dim in hidden_channels[:-1]:
+            layers.append(torch.nn.Linear(in_dim, hidden_dim, bias=bias, dtype=dtype, device=device))
+            layers.append(ACT2FN[activation_layer])
+            in_dim = hidden_dim
+        layers.append(torch.nn.Linear(in_dim, hidden_channels[-1], bias=bias, dtype=dtype, device=device))
+        super().__init__(*layers)
+
+    @property
+    def dtype(self):
+        return self[0].weight.dtype
+
+
+class EO1VisionFlowMatchingModel(nn.Module):
+    def __init__(
+        self,
+        config: EO1Config,
+        vlm_backbone: Qwen2_5_VLForConditionalGeneration | None = None,
+    ):
+        require_package("transformers", extra="eo1")
+        super().__init__()
+
+        self.config = config
+        # Preserve the backbone dtype selected at construction time so Qwen's fp32 rotary buffers stay intact.
+        self.vlm_backbone = vlm_backbone
+        self.hidden_size = self.vlm_backbone.config.text_config.hidden_size
+        max_state_dim = config.max_state_dim
+        max_action_dim = config.max_action_dim
+        self.state_proj = nn.Linear(max_state_dim, self.hidden_size, dtype=torch.float32)
+        self.action_in_proj = nn.Linear(max_action_dim, self.hidden_size, dtype=torch.float32)
+        self.action_out_proj = EO1VisionActionProjector(
+            self.hidden_size,
+            max_action_dim,
+            config.num_action_layers,
+            config.action_act,
+            dtype=torch.float32,
+        )
+        self.action_time_mlp_in = nn.Linear(self.hidden_size * 2, self.hidden_size, dtype=torch.float32)
+        self.action_time_mlp_out = nn.Linear(self.hidden_size, self.hidden_size, dtype=torch.float32)
+        self.gradient_checkpointing_enabled = False
+
+    def get_input_embeddings(self):
+        return self.vlm_backbone.get_input_embeddings()
+
+    def flow_head_autocast_context(self):
+        if self.config.force_fp32_autocast:
+            return torch.autocast(
+                device_type=self.state_proj.weight.device.type,
+                enabled=False,
+            )
+        return contextlib.nullcontext()
+
+    def gradient_checkpointing_enable(self):
+        """Enable gradient checkpointing for the Qwen2.5-VL backbone."""
+        self.gradient_checkpointing_enabled = True
+        self.vlm_backbone.gradient_checkpointing_enable(
+            gradient_checkpointing_kwargs={"use_reentrant": False}
+        )
+        logger.info("Enabled gradient checkpointing for EO1VisionFlowMatchingModel")
+
+    def gradient_checkpointing_disable(self):
+        """Disable gradient checkpointing for the Qwen2.5-VL backbone."""
+        self.gradient_checkpointing_enabled = False
+        self.vlm_backbone.gradient_checkpointing_disable()
+        logger.info("Disabled gradient checkpointing for EO1VisionFlowMatchingModel")
+
+    def _apply_checkpoint(self, func, *args, **kwargs):
+        """Apply manual gradient checkpointing to EO1 flow-head computations when training."""
+        if self.gradient_checkpointing_enabled and self.training and torch.is_grad_enabled():
+            return torch.utils.checkpoint.checkpoint(
+                func, *args, use_reentrant=False, preserve_rng_state=False, **kwargs
+            )
+        return func(*args, **kwargs)
+
+    def sample_noise(self, shape, device):
+        noise = torch.normal(
+            mean=0.0,
+            std=1.0,
+            size=shape,
+            dtype=torch.float32,
+            device=device,
+        )
+        return noise
+
+    def sample_time(self, bsize, device):
+        time_beta = sample_beta(
+            self.config.time_sampling_beta_alpha, self.config.time_sampling_beta_beta, bsize, device
+        )
+        time = time_beta * self.config.time_sampling_scale + self.config.time_sampling_offset
+        return time.to(dtype=torch.float32, device=device)
+
+    def get_placeholder_mask(
+        self,
+        input_ids: torch.LongTensor | None,
+        inputs_embeds: torch.FloatTensor | None,
+        state_features: torch.FloatTensor | None = None,
+        action_features: torch.FloatTensor | None = None,
+        *,
+        state_token_id: int,
+        action_token_id: int,
+    ) -> tuple[torch.BoolTensor, torch.BoolTensor]:
+        """Return EO1 state/action placeholder masks, following Qwen's multimodal mask style."""
+        if input_ids is None:
+            special_state_mask = inputs_embeds == self.get_input_embeddings()(
+                torch.tensor(state_token_id, dtype=torch.long, device=inputs_embeds.device)
+            )
+            special_state_mask = special_state_mask.all(-1)
+            special_action_mask = inputs_embeds == self.get_input_embeddings()(
+                torch.tensor(action_token_id, dtype=torch.long, device=inputs_embeds.device)
+            )
+            special_action_mask = special_action_mask.all(-1)
+        else:
+            special_state_mask = input_ids == state_token_id
+            special_action_mask = input_ids == action_token_id
+
+        n_state_tokens = special_state_mask.sum()
+        special_state_mask = (
+            special_state_mask.unsqueeze(-1).expand_as(inputs_embeds).to(inputs_embeds.device)
+        )
+        if state_features is not None:
+            torch_compilable_check(
+                inputs_embeds[special_state_mask].numel() == state_features.numel(),
+                f"State features and state tokens do not match, tokens: {n_state_tokens}, features: {state_features.shape[0]}",
+            )
+
+        n_action_tokens = special_action_mask.sum()
+        special_action_mask = (
+            special_action_mask.unsqueeze(-1).expand_as(inputs_embeds).to(inputs_embeds.device)
+        )
+        if action_features is not None:
+            torch_compilable_check(
+                inputs_embeds[special_action_mask].numel() == action_features.numel(),
+                f"Action features and action tokens do not match, tokens: {n_action_tokens}, features: {action_features.shape[0]}",
+            )
+
+        return special_state_mask, special_action_mask
+
+    def embed_prefix(
+        self,
+        input_ids: torch.LongTensor,
+        states: torch.Tensor,
+        *,
+        state_token_id: int,
+        action_token_id: int,
+    ) -> torch.FloatTensor:
+        """Embed the EO1 prefix tokens before native Qwen injects multimodal features."""
+
+        # Get the input embeddings for the input IDs
+        def input_embed_func(input_ids: torch.LongTensor) -> torch.FloatTensor:
+            return self.get_input_embeddings()(input_ids)
+
+        inputs_embeds = self._apply_checkpoint(input_embed_func, input_ids)
+
+        # Project the states to the hidden size
+        def state_proj_func(states: torch.Tensor) -> torch.FloatTensor:
+            with self.flow_head_autocast_context():
+                states = states.to(dtype=self.state_proj.weight.dtype)
+                return self.state_proj(states)
+
+        state_embs = self._apply_checkpoint(state_proj_func, states)
+        state_mask, _ = self.get_placeholder_mask(
+            input_ids,
+            inputs_embeds,
+            state_features=state_embs,
+            state_token_id=state_token_id,
+            action_token_id=action_token_id,
+        )
+        state_embs = state_embs.to(inputs_embeds.device, inputs_embeds.dtype)
+        inputs_embeds = inputs_embeds.masked_scatter(state_mask, state_embs)
+        return inputs_embeds
+
+    def embed_suffix(
+        self,
+        timestep: torch.Tensor,
+        noisy_actions: torch.Tensor,
+    ) -> torch.FloatTensor:
+        """Embed the noisy actions and the flow timestep into action-token embeddings."""
+
+        def action_proj_func(noisy_actions: torch.Tensor) -> torch.FloatTensor:
+            with self.flow_head_autocast_context():
+                noisy_actions = noisy_actions.to(dtype=self.action_in_proj.weight.dtype)
+                return self.action_in_proj(noisy_actions)
+
+        action_embs = self._apply_checkpoint(action_proj_func, noisy_actions)
+        time_embs = create_sinusoidal_pos_embedding(
+            timestep,
+            self.hidden_size,
+            min_period=self.config.min_period,
+            max_period=self.config.max_period,
+            device=action_embs.device,
+        )
+        time_embs = time_embs.to(dtype=action_embs.dtype)
+        time_embs = time_embs[:, None, :].expand_as(action_embs)
+        action_time_embs = torch.cat([action_embs, time_embs], dim=2)
+
+        def mlp_func(action_time_embs: torch.Tensor) -> torch.FloatTensor:
+            with self.flow_head_autocast_context():
+                action_time_embs = action_time_embs.to(dtype=self.action_time_mlp_in.weight.dtype)
+                action_time_embs = self.action_time_mlp_in(action_time_embs)
+                action_time_embs = F.silu(action_time_embs)
+                return self.action_time_mlp_out(action_time_embs)
+
+        action_time_embs = self._apply_checkpoint(mlp_func, action_time_embs)
+        return action_time_embs
+
+    def forward(
+        self,
+        input_ids: torch.LongTensor | None = None,
+        attention_mask: torch.LongTensor | None = None,
+        pixel_values: torch.FloatTensor | None = None,
+        image_grid_thw: torch.LongTensor | None = None,
+        mm_token_type_ids: torch.IntTensor | None = None,
+        states: torch.FloatTensor | None = None,
+        action: torch.FloatTensor | None = None,
+        action_is_pad: torch.BoolTensor | None = None,
+        *,
+        state_token_id: int,
+        action_token_id: int,
+        **kwargs,
+    ) -> Tensor:
+        """Run the EO1 training forward pass and compute the flow-matching loss."""
+
+        # 1. Build the EO1 prefix with state placeholders resolved.
+        inputs_embeds = self.embed_prefix(
+            input_ids,
+            states=states,
+            state_token_id=state_token_id,
+            action_token_id=action_token_id,
+        )
+
+        # 2. Sample the flow-matching time and noise, then replace the action placeholders.
+        time = self.sample_time(action.shape[0], inputs_embeds.device)
+        noise = self.sample_noise(action.shape, inputs_embeds.device)
+
+        time_expanded = time[:, None, None]
+        x_t = time_expanded * noise + (1 - time_expanded) * action
+        u_t = noise - action
+        action_time_embs = self.embed_suffix(time, x_t)
+        _, action_mask = self.get_placeholder_mask(
+            input_ids,
+            inputs_embeds,
+            action_features=action_time_embs,
+            state_token_id=state_token_id,
+            action_token_id=action_token_id,
+        )
+        action_time_embs = action_time_embs.to(inputs_embeds.device, inputs_embeds.dtype)
+        inputs_embeds = inputs_embeds.masked_scatter(action_mask, action_time_embs)
+
+        # 3. Optionally drop padded action tokens from backbone attention.
+        if attention_mask is not None:
+            attention_mask = attention_mask.to(inputs_embeds.device)
+
+        if not self.config.supervise_padding_actions:
+            action_is_pad = action_is_pad.to(device=inputs_embeds.device, dtype=torch.bool)
+            action_token_mask = action_mask[..., 0]
+            action_padding_mask = torch.zeros_like(action_token_mask)
+            action_padding_mask = action_padding_mask.masked_scatter(
+                action_token_mask,
+                action_is_pad.reshape(-1),
+            )
+            attention_mask = attention_mask.masked_fill(action_padding_mask, 0)
+
+        # 4. Run the Qwen backbone on the fused EO1 sequence.
+        def vlm_forward_func(
+            input_ids: torch.LongTensor,
+            attention_mask: torch.Tensor | None,
+            inputs_embeds: torch.FloatTensor,
+            pixel_values: torch.Tensor | None,
+            image_grid_thw: torch.LongTensor | None,
+            mm_token_type_ids: torch.IntTensor | None,
+        ) -> torch.FloatTensor:
+            outputs = self.vlm_backbone.model(
+                input_ids=input_ids,
+                attention_mask=attention_mask,
+                inputs_embeds=inputs_embeds,
+                pixel_values=pixel_values,
+                image_grid_thw=image_grid_thw,
+                mm_token_type_ids=mm_token_type_ids,
+                use_cache=False,
+                output_hidden_states=False,
+                return_dict=True,
+            )
+            return outputs.last_hidden_state
+
+        hidden_states = self._apply_checkpoint(
+            vlm_forward_func,
+            input_ids,
+            attention_mask,
+            inputs_embeds,
+            pixel_values,
+            image_grid_thw,
+            mm_token_type_ids,
+        )
+        action_hidden_states = hidden_states[action_mask[..., 0]]
+
+        # 5. Project the action-token hidden states back to the flow target space.
+        def action_out_proj_func(action_hidden_states: torch.FloatTensor) -> torch.FloatTensor:
+            with self.flow_head_autocast_context():
+                action_hidden_states = action_hidden_states.to(dtype=self.action_out_proj.dtype)
+                return self.action_out_proj(action_hidden_states)
+
+        v_t = self._apply_checkpoint(action_out_proj_func, action_hidden_states)
+        v_t = v_t.reshape(u_t.shape).to(dtype=u_t.dtype)
+        losses = F.mse_loss(u_t, v_t, reduction="none")
+
+        # 6. Apply the configured supervision mask and reduce the loss.
+        if not self.config.supervise_padding_action_dims:
+            original_action_dim = self.config.output_features[ACTION].shape[0]
+            losses = losses[..., :original_action_dim]
+
+        if not self.config.supervise_padding_actions:
+            losses = losses[~action_is_pad]
+
+        return losses.mean()
+
+    @torch.no_grad()
+    def sample_actions(
+        self,
+        input_ids: torch.LongTensor | None = None,
+        attention_mask: torch.Tensor | None = None,
+        pixel_values: torch.Tensor | None = None,
+        image_grid_thw: torch.LongTensor | None = None,
+        mm_token_type_ids: torch.IntTensor | None = None,
+        states: torch.Tensor | None = None,
+        *,
+        state_token_id: int,
+        action_token_id: int,
+        **kwargs,
+    ) -> Tensor:
+        """Sample actions from the model."""
+        if states is None:
+            raise ValueError("states are required for EO1 action sampling.")
+        if mm_token_type_ids is None:
+            raise ValueError("mm_token_type_ids are required for EO1 action sampling.")
+
+        # 1. Resolve the left-padded rollout prompt and locate the action span.
+        chunk_size = self.config.chunk_size
+
+        inputs_embeds = self.embed_prefix(
+            input_ids,
+            states=states,
+            state_token_id=state_token_id,
+            action_token_id=action_token_id,
+        ).clone()
+        _, action_placeholder_mask = self.get_placeholder_mask(
+            input_ids,
+            inputs_embeds,
+            state_token_id=state_token_id,
+            action_token_id=action_token_id,
+        )
+        action_mask = action_placeholder_mask[..., 0]
+        token_counts = action_mask.sum(dim=1)
+        if not torch.all(token_counts == chunk_size):
+            raise ValueError(
+                f"Each sample must contain exactly {chunk_size} action tokens, got {token_counts.tolist()}."
+            )
+        if action_mask.ne(action_mask[:1]).any():
+            raise ValueError(
+                "Batch inference expects all samples to share the same action token mask after left padding."
+            )
+        act_start = int(action_mask[0].to(torch.int64).argmax().item())
+        act_end = act_start + self.config.chunk_size
+        if not torch.all(action_mask[:, act_start:act_end]):
+            raise ValueError("Action tokens must form a contiguous chunk of length chunk_size.")
+        act_slice = slice(act_start, act_end)
+
+        # 2. Encode the fixed prefix once and cache its KV state.
+        batch_size = input_ids.shape[0]
+        device = inputs_embeds.device
+        attention_mask = attention_mask.to(device)
+        mm_token_type_ids = mm_token_type_ids.to(device)
+        position_ids, _ = self.vlm_backbone.model.get_rope_index(
+            input_ids,
+            image_grid_thw=image_grid_thw,
+            attention_mask=attention_mask,
+            mm_token_type_ids=mm_token_type_ids,
+        )
+        position_ids = position_ids.to(device)
+
+        outputs = self.vlm_backbone.model(
+            input_ids=input_ids[:, :act_start],
+            attention_mask=attention_mask[:, :act_start],
+            position_ids=position_ids[..., :act_start],
+            inputs_embeds=inputs_embeds[:, :act_start],
+            pixel_values=pixel_values,
+            image_grid_thw=image_grid_thw,
+            mm_token_type_ids=mm_token_type_ids[:, :act_start],
+            use_cache=True,
+            return_dict=True,
+        )
+
+        x_t = self.sample_noise(
+            (batch_size, chunk_size, self.config.max_action_dim),
+            device,
+        ).to(dtype=self.action_in_proj.weight.dtype)
+        dt = -1.0 / self.config.num_denoise_steps
+        past_key_values = outputs.past_key_values
+
+        # 3. Denoise only the action chunk while keeping the prefix cache invariant.
+        for step in range(self.config.num_denoise_steps):
+            time = torch.full(
+                (batch_size,),
+                1.0 + step * dt,
+                device=device,
+                dtype=torch.float32,
+            )
+            action_time_embs = self.embed_suffix(time, x_t)
+            inputs_embeds[:, act_slice] = action_time_embs.to(inputs_embeds.dtype)
+
+            # Keep the prefix KV cache invariant across denoising steps.
+            past_key_values.crop(act_start)
+            outputs = self.vlm_backbone.model(
+                attention_mask=attention_mask[:, :act_end],
+                past_key_values=past_key_values,
+                inputs_embeds=inputs_embeds[:, act_slice],
+                position_ids=position_ids[..., act_slice],
+                use_cache=True,
+                return_dict=True,
+            )
+            with self.flow_head_autocast_context():
+                hidden_states = outputs.last_hidden_state[:, :chunk_size]
+                hidden_states = hidden_states.to(dtype=self.action_out_proj.dtype)
+                v_t = self.action_out_proj(hidden_states)
+
+            x_t += dt * v_t.reshape(x_t.shape)
+
+        return x_t
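The training interpolation (`x_t = t * noise + (1 - t) * action`, `u_t = noise - action`) and the sampling loop (`dt = -1 / num_denoise_steps`, starting at `t = 1`) in `modeling_eo1.py` above can be illustrated with a minimal, dependency-free sketch. This is not the EO1 implementation: `interpolate`, `target_velocity`, `euler_denoise`, and `oracle_velocity` are hypothetical names, and the oracle stands in for the trained network by returning the exact training target.

```python
import random


def interpolate(action, noise, t):
    """Point on the straight path: t=1 is pure noise, t=0 is the clean action."""
    return [t * n + (1.0 - t) * a for a, n in zip(action, noise)]


def target_velocity(action, noise):
    """Regression target for the velocity head: u_t = noise - action."""
    return [n - a for a, n in zip(action, noise)]


def euler_denoise(noise, velocity_fn, num_steps):
    """Integrate from t=1 down to t=0 with a negative step, as in the sampling loop."""
    dt = -1.0 / num_steps
    x = list(noise)
    for step in range(num_steps):
        t = 1.0 + step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
action = [0.25, -0.5, 1.0]
noise = [rng.gauss(0.0, 1.0) for _ in action]


def oracle_velocity(x, t):
    # Hypothetical stand-in for the trained model: returns the exact target.
    return target_velocity(action, noise)


recovered = euler_denoise(noise, oracle_velocity, num_steps=10)
```

With an exact velocity field the Euler steps sum to `noise - (noise - action)`, so `recovered` matches `action` up to floating-point error. A learned model only approximates this field, which is presumably why the number of denoising steps is configurable.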
--- a/src/lerobot/policies/eo1/processor_eo1.py
+++ b/src/lerobot/policies/eo1/processor_eo1.py
@@ -0,0 +1,282 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from typing import TYPE_CHECKING, Any
+
+import torch
+
+from lerobot.configs.types import FeatureType, PipelineFeatureType, PolicyFeature
+from lerobot.policies.eo1.configuration_eo1 import EO1Config
+from lerobot.processor import (
+    AddBatchDimensionProcessorStep,
+    ComplementaryDataProcessorStep,
+    DeviceProcessorStep,
+    NormalizerProcessorStep,
+    PolicyAction,
+    PolicyProcessorPipeline,
+    ProcessorStep,
+    ProcessorStepRegistry,
+    RenameObservationsProcessorStep,
+    UnnormalizerProcessorStep,
+)
+from lerobot.processor.converters import policy_action_to_transition, transition_to_policy_action
+from lerobot.types import TransitionKey
+from lerobot.utils.constants import (
+    OBS_STATE,
+    POLICY_POSTPROCESSOR_DEFAULT_NAME,
+    POLICY_PREPROCESSOR_DEFAULT_NAME,
+)
+from lerobot.utils.import_utils import _transformers_available, require_package
+
+if TYPE_CHECKING or _transformers_available:
+    from transformers.models.qwen2_5_vl import Qwen2_5_VLProcessor
+else:
+    Qwen2_5_VLProcessor = None
+
+SYSTEM_MESSAGE = "You are a helpful physical assistant."
+
+# EO-1 special tokens
+ACTION_START_TOKEN = "<|action_start|>"  # nosec B105
+DEFAULT_ACTION_TOKEN = "<|action_pad|>"  # nosec B105
+ACTION_END_TOKEN = "<|action_end|>"  # nosec B105
+STATE_START_TOKEN = "<|state_start|>"  # nosec B105
+DEFAULT_STATE_TOKEN = "<|state_pad|>"  # nosec B105
+STATE_END_TOKEN = "<|state_end|>"  # nosec B105
+TASK_VLA_TOKEN = "<|vla|>"  # nosec B105
+
+EO1_SPECIAL_TOKENS = [
+    ACTION_START_TOKEN,
+    DEFAULT_ACTION_TOKEN,
+    ACTION_END_TOKEN,
+    STATE_START_TOKEN,
+    DEFAULT_STATE_TOKEN,
+    STATE_END_TOKEN,
+    TASK_VLA_TOKEN,
+]
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="eo1_conversation_template_processor")
+class EO1ConversationTemplateStep(ComplementaryDataProcessorStep):
+    input_features: dict[str, PolicyFeature] | dict[str, dict[str, Any]]
+    chunk_size: int
+
+    _image_keys: list[str] = field(default_factory=list, init=False, repr=False)
+
+    def __post_init__(self):
+        # Robust JSON deserialization handling (guard empty maps).
+        if self.input_features:
+            first_val = next(iter(self.input_features.values()))
+            if isinstance(first_val, dict):
+                reconstructed = {}
+                for key, ft_dict in self.input_features.items():
+                    reconstructed[key] = PolicyFeature(
+                        type=FeatureType(ft_dict["type"]), shape=tuple(ft_dict["shape"])
+                    )
+                self.input_features = reconstructed
+
+        self._image_keys = [
+            key for key, value in self.input_features.items() if value.type == FeatureType.VISUAL
+        ]
+
+    def complementary_data(self, complementary_data):
+        tasks = complementary_data.get("task")
+        if tasks is None:
+            raise ValueError("Task is required for EO1ConversationTemplateStep.")
+
+        observation = self.transition.get(TransitionKey.OBSERVATION)
+        if observation is None:
+            raise ValueError("Observation is required for EO1ConversationTemplateStep.")
+
+        if OBS_STATE in observation and observation[OBS_STATE].shape[0] != len(tasks):
+            raise ValueError("Batch size mismatch between observation.state and task list.")
+
+        # LeRobot visual observations reach the processor as float32 tensors in [0, 1].
+        # Convert to uint8 in [0, 255] to meet the input requirement of Qwen2.5-VL-3B-Instruct.
+        images = {
+            key: observation[key].clamp(0, 1).mul(255.0).round().to(torch.uint8) for key in self._image_keys
+        }
+        messages = []
+        for i in range(len(tasks)):
+            content = [
+                *[{"type": "image", "image": images[key][i]} for key in self._image_keys],
+                {
+                    "type": "text",
+                    "text": (
+                        f"{STATE_START_TOKEN}{DEFAULT_STATE_TOKEN}{STATE_END_TOKEN}{tasks[i]}{TASK_VLA_TOKEN}"
+                    ),
+                },
+            ]
+            messages.append(
+                [
+                    {"role": "system", "content": [{"type": "text", "text": SYSTEM_MESSAGE}]},
+                    {"role": "user", "content": content},
+                    {
+                        "role": "assistant",
+                        "content": [
+                            {
+                                "type": "text",
+                                "text": f"{ACTION_START_TOKEN}{DEFAULT_ACTION_TOKEN * self.chunk_size}{ACTION_END_TOKEN}",
+                            }
+                        ],
+                    },
+                ]
+            )
+
+        complementary_data["messages"] = messages
+
+        return complementary_data
+
+    def transform_features(
+        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
+    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
+        """
+        This step only materializes EO1-specific message objects in complementary_data.
+        PipelineFeatureType tracks only ACTION and OBSERVATION, so there is no static
+        feature contract change to record here.
+        """
+        return features
+
+    def get_config(self) -> dict[str, Any]:
+        return {
+            "input_features": {
+                key: {"type": ft.type.value, "shape": ft.shape} for key, ft in self.input_features.items()
+            },
+            "chunk_size": self.chunk_size,
+        }
+
+
+@dataclass
+@ProcessorStepRegistry.register(name="eo1_qwen_processor")
+class EO1QwenProcessorStep(ComplementaryDataProcessorStep):
+    processor_name: str = "Qwen/Qwen2.5-VL-3B-Instruct"
+    image_min_pixels: int | None = 64 * 28 * 28
+    image_max_pixels: int | None = 128 * 28 * 28
+    use_fast_processor: bool = False
+
+    _processor: Qwen2_5_VLProcessor | None = field(default=None, init=False, repr=False)
+    _state_token_id: int | None = field(default=None, init=False, repr=False)
+    _action_token_id: int | None = field(default=None, init=False, repr=False)
+
+    def __post_init__(self):
+        require_package("transformers", extra="eo1")
+        self._processor = Qwen2_5_VLProcessor.from_pretrained(
+            self.processor_name,
+            use_fast=self.use_fast_processor,
+        )
+        self._processor.tokenizer.add_tokens(EO1_SPECIAL_TOKENS, special_tokens=True)
+        self._state_token_id = self._processor.tokenizer.convert_tokens_to_ids(DEFAULT_STATE_TOKEN)
+        self._action_token_id = self._processor.tokenizer.convert_tokens_to_ids(DEFAULT_ACTION_TOKEN)
+
+    def complementary_data(self, complementary_data):
+        messages = complementary_data.pop("messages", None)
+        if messages is None:
+            raise ValueError("Messages are required for EO1QwenProcessorStep.")
+
+        # Rollout batches use left padding so action spans stay aligned across samples.
+        # Supervised batches use right padding to match standard training collation.
+        padding_side = "right" if self.transition.get(TransitionKey.ACTION) is not None else "left"
+
+        inputs = self._processor.apply_chat_template(
+            messages,
+            tokenize=True,
+            padding=True,
+            padding_side=padding_side,
+            min_pixels=self.image_min_pixels,
+            max_pixels=self.image_max_pixels,
+            add_generation_prompt=False,
+            return_dict=True,
+            return_tensors="pt",
+        )
+
+        complementary_data["input_ids"] = inputs["input_ids"]
+        complementary_data["pixel_values"] = inputs["pixel_values"]
+        complementary_data["image_grid_thw"] = inputs["image_grid_thw"]
+        complementary_data["attention_mask"] = inputs["attention_mask"]
+        complementary_data["mm_token_type_ids"] = inputs["mm_token_type_ids"]
+        complementary_data["state_token_id"] = self._state_token_id
+        complementary_data["action_token_id"] = self._action_token_id
+
+        return complementary_data
+
+    def get_config(self) -> dict[str, Any]:
+        return {
+            "processor_name": self.processor_name,
+            "image_min_pixels": self.image_min_pixels,
+            "image_max_pixels": self.image_max_pixels,
+            "use_fast_processor": self.use_fast_processor,
+        }
+
+    def transform_features(
+        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
+    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
+        """
+        This step only converts the messages to the model input format.
+        """
+        return features
+
+
+def make_eo1_pre_post_processors(
+    config: EO1Config,
+    dataset_stats: dict[str, dict[str, torch.Tensor]] | None = None,
+) -> tuple[
+    PolicyProcessorPipeline[dict[str, Any], dict[str, Any]],
+    PolicyProcessorPipeline[PolicyAction, PolicyAction],
+]:
+    """Build pre/post processor pipelines for EO1."""
+
+    input_steps: list[ProcessorStep] = [
+        RenameObservationsProcessorStep(rename_map={}),
+        AddBatchDimensionProcessorStep(),
+        NormalizerProcessorStep(
+            features={**config.input_features, **config.output_features},
+            norm_map=config.normalization_mapping,
+            stats=dataset_stats,
+        ),
+        EO1ConversationTemplateStep(input_features=config.input_features, chunk_size=config.chunk_size),
+        EO1QwenProcessorStep(
+            processor_name=config.vlm_base,
+            image_min_pixels=config.image_min_pixels,
+            image_max_pixels=config.image_max_pixels,
+            use_fast_processor=config.use_fast_processor,
+        ),
+        DeviceProcessorStep(device=config.device),
+    ]
+
+    output_steps: list[ProcessorStep] = [
+        UnnormalizerProcessorStep(
+            features=config.output_features,
+            norm_map=config.normalization_mapping,
+            stats=dataset_stats,
+        ),
+        DeviceProcessorStep(device="cpu"),
+    ]
+
+    return (
+        PolicyProcessorPipeline[dict[str, Any], dict[str, Any]](
+            steps=input_steps,
+            name=POLICY_PREPROCESSOR_DEFAULT_NAME,
+        ),
+        PolicyProcessorPipeline[PolicyAction, PolicyAction](
+            steps=output_steps,
+            name=POLICY_POSTPROCESSOR_DEFAULT_NAME,
+            to_transition=policy_action_to_transition,
+            to_output=transition_to_policy_action,
+        ),
+    )
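The padding-side comment in `EO1QwenProcessorStep` and the shared-mask check in `sample_actions` rest on one observation: when every sample ends with the same action span, left padding puts that span at the same indices across the batch. A toy sketch, assuming word-level tokens instead of the real Qwen tokenizer; `toy_sequence`, `pad_batch`, and the `<pad>` token are hypothetical and for illustration only.

```python
ACTION_START = "<|action_start|>"
ACTION_PAD = "<|action_pad|>"
ACTION_END = "<|action_end|>"
PAD = "<pad>"  # hypothetical padding token, not the tokenizer's real one


def toy_sequence(task_words, chunk_size):
    """Variable-length prompt followed by a fixed-length action span."""
    return list(task_words) + [ACTION_START] + [ACTION_PAD] * chunk_size + [ACTION_END]


def pad_batch(seqs, side):
    """Pad every sequence to the longest one, on the left or on the right."""
    width = max(len(s) for s in seqs)
    if side == "left":
        return [[PAD] * (width - len(s)) + s for s in seqs]
    return [s + [PAD] * (width - len(s)) for s in seqs]


batch = [
    toy_sequence(["pick", "cup"], chunk_size=3),
    toy_sequence(["put", "the", "apple", "down"], chunk_size=3),
]

# Index of the first action placeholder in each padded sample.
left_starts = [s.index(ACTION_PAD) for s in pad_batch(batch, "left")]
right_starts = [s.index(ACTION_PAD) for s in pad_batch(batch, "right")]
```

Here `left_starts` holds one shared index while `right_starts` differs per sample, so a single `act_start` slice for the whole batch only works with left padding.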
--- a/src/lerobot/policies/factory.py
+++ b/src/lerobot/policies/factory.py
@@ -46,6 +46,7 @@ from lerobot.utils.feature_utils import dataset_to_policy_features

 from .act.configuration_act import ACTConfig
 from .diffusion.configuration_diffusion import DiffusionConfig
+from .eo1.configuration_eo1 import EO1Config
 from .groot.configuration_groot import GrootConfig
 from .multi_task_dit.configuration_multi_task_dit import MultiTaskDiTConfig
 from .pi0.configuration_pi0 import PI0Config
@@ -146,6 +147,10 @@ def get_policy_class(name: str) -> type[PreTrainedPolicy]:
        from .wall_x.modeling_wall_x import WallXPolicy

        return WallXPolicy
+    elif name == "eo1":
+        from .eo1.modeling_eo1 import EO1Policy
+
+        return EO1Policy
    else:
        try:
            return _get_policy_cls_from_policy_name(name=name)
@@ -196,6 +201,8 @@ def make_policy_config(policy_type: str, **kwargs) -> PreTrainedConfig:
        return XVLAConfig(**kwargs)
    elif policy_type == "wall_x":
        return WallXConfig(**kwargs)
+    elif policy_type == "eo1":
+        return EO1Config(**kwargs)
    else:
        try:
            config_cls = PreTrainedConfig.get_choice_class(policy_type)
@@ -399,6 +406,13 @@ def make_pre_post_processors(
            config=policy_cfg,
            dataset_stats=kwargs.get("dataset_stats"),
        )
+    elif isinstance(policy_cfg, EO1Config):
+        from .eo1.processor_eo1 import make_eo1_pre_post_processors
+
+        processors = make_eo1_pre_post_processors(
+            config=policy_cfg,
+            dataset_stats=kwargs.get("dataset_stats"),
+        )

    else:
        try:
--- a/src/lerobot/processor/__init__.py
+++ b/src/lerobot/processor/__init__.py
@@ -95,13 +95,6 @@ from .relative_action_processor import (
 from .rename_processor import RenameObservationsProcessorStep, rename_stats
 from .tokenizer_processor import ActionTokenizerProcessorStep, TokenizerProcessorStep

-# RenderMessagesStep is intentionally NOT re-exported here: it pulls in
-# `lerobot.datasets.language`, which requires the `[dataset]` extra
-# (`datasets`, `pyarrow`). Importing it from the processor package would
-# break every base-install consumer of `lerobot.processor`. Users that
-# need it import directly:
-#   from lerobot.processor.render_messages_processor import RenderMessagesStep
-
 __all__ = [
    "ActionProcessorStep",
    "AddTeleopActionAsComplimentaryDataStep",
--- a/src/lerobot/processor/batch_processor.py
+++ b/src/lerobot/processor/batch_processor.py
@@ -174,24 +174,6 @@ class AddBatchDimensionComplementaryDataStep(ComplementaryDataProcessorStep):
            task_index_value = complementary_data["task_index"]
            if isinstance(task_index_value, Tensor) and task_index_value.dim() == 0:
                complementary_data["task_index"] = task_index_value.unsqueeze(0)
-
-        complementary_data.pop("language_persistent", None)
-        complementary_data.pop("language_events", None)
-
-        if "messages" in complementary_data:
-            messages = complementary_data["messages"]
-            if isinstance(messages, list) and (not messages or isinstance(messages[0], dict)):
-                complementary_data["messages"] = [messages]
-
-        if "message_streams" in complementary_data:
-            streams = complementary_data["message_streams"]
-            if isinstance(streams, list) and (not streams or isinstance(streams[0], str)):
-                complementary_data["message_streams"] = [streams]
-
-        if "target_message_indices" in complementary_data:
-            indices = complementary_data["target_message_indices"]
-            if isinstance(indices, list) and (not indices or isinstance(indices[0], int)):
-                complementary_data["target_message_indices"] = [indices]
        return complementary_data

    def transform_features(
--- a/src/lerobot/processor/converters.py
+++ b/src/lerobot/processor/converters.py
@@ -167,35 +167,12 @@ def _extract_complementary_data(batch: dict[str, Any]) -> dict[str, Any]:
    """
    pad_keys = {k: v for k, v in batch.items() if "_is_pad" in k}
    task_key = {"task": batch["task"]} if "task" in batch else {}
+    subtask_key = {"subtask": batch["subtask"]} if "subtask" in batch else {}
    index_key = {"index": batch["index"]} if "index" in batch else {}
    task_index_key = {"task_index": batch["task_index"]} if "task_index" in batch else {}
    episode_index_key = {"episode_index": batch["episode_index"]} if "episode_index" in batch else {}
-    timestamp_key = {"timestamp": batch["timestamp"]} if "timestamp" in batch else {}
-    language_persistent_key = (
-        {"language_persistent": batch["language_persistent"]} if "language_persistent" in batch else {}
-    )
-    language_events_key = {"language_events": batch["language_events"]} if "language_events" in batch else {}
-    messages_key = {"messages": batch["messages"]} if "messages" in batch else {}
-    message_streams_key = {"message_streams": batch["message_streams"]} if "message_streams" in batch else {}
-    target_message_indices_key = (
-        {"target_message_indices": batch["target_message_indices"]}
-        if "target_message_indices" in batch
-        else {}
-    )

-    return {
-        **pad_keys,
-        **task_key,
-        **index_key,
-        **task_index_key,
-        **episode_index_key,
-        **timestamp_key,
-        **language_persistent_key,
-        **language_events_key,
-        **messages_key,
-        **message_streams_key,
-        **target_message_indices_key,
-    }
+    return {**pad_keys, **task_key, **subtask_key, **index_key, **task_index_key, **episode_index_key}


 def create_transition(
--- a/src/lerobot/processor/render_messages_processor.py
+++ b/src/lerobot/processor/render_messages_processor.py
@@ -1,92 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import annotations
-
-from dataclasses import dataclass
-from typing import Any
-
-from lerobot.configs import PipelineFeatureType, PolicyFeature
-from lerobot.configs.recipe import TrainingRecipe
-from lerobot.datasets.language import LANGUAGE_EVENTS, LANGUAGE_PERSISTENT
-from lerobot.datasets.language_render import render_sample
-from lerobot.types import EnvTransition, TransitionKey
-
-from .pipeline import ProcessorStep, ProcessorStepRegistry
-
-
-@dataclass
-@ProcessorStepRegistry.register(name="render_messages_processor")
-class RenderMessagesStep(ProcessorStep):
-    """Processor step that turns raw language columns into rendered chat messages.
-
-    Reads ``language_persistent`` and ``language_events`` from the transition's
-    complementary data, renders them through ``recipe`` at the sample timestamp,
-    and replaces the raw columns with the resulting ``messages`` /
-    ``message_streams`` / ``target_message_indices`` keys.
-    """
-
-    recipe: TrainingRecipe
-    dataset_ctx: Any | None = None
-
-    def __call__(self, transition: EnvTransition) -> EnvTransition | None:
-        """Render messages for a single transition; return ``None`` to drop it."""
-        complementary_data = transition.get(TransitionKey.COMPLEMENTARY_DATA) or {}
-        persistent = complementary_data.get(LANGUAGE_PERSISTENT) or []
-        events = complementary_data.get(LANGUAGE_EVENTS) or []
-
-        if not persistent and not events:
-            return transition
-
-        timestamp = complementary_data.get("timestamp")
-        if timestamp is None:
-            raise KeyError("RenderMessagesStep requires sample timestamp in complementary data.")
-
-        sample_idx = complementary_data.get("index", 0)
-        rendered = render_sample(
-            recipe=self.recipe,
-            persistent=persistent,
-            events=events,
-            t=_scalar(timestamp),
-            sample_idx=int(_scalar(sample_idx)),
-            task=complementary_data.get("task"),
-            dataset_ctx=self.dataset_ctx,
-        )
-        if rendered is None:
-            return None
-
-        new_transition = transition.copy()
-        new_complementary_data = dict(complementary_data)
-        new_complementary_data.pop(LANGUAGE_PERSISTENT, None)
-        new_complementary_data.pop(LANGUAGE_EVENTS, None)
-        new_complementary_data.update(rendered)
-        new_transition[TransitionKey.COMPLEMENTARY_DATA] = new_complementary_data
-        return new_transition
-
-    def transform_features(
-        self, features: dict[PipelineFeatureType, dict[str, PolicyFeature]]
-    ) -> dict[PipelineFeatureType, dict[str, PolicyFeature]]:
-        """Pass features through unchanged; rendering only touches complementary data."""
-        return features
-
-
-def _scalar(value: Any) -> float | int:
-    """Unwrap a tensor/array/single-element list into a Python scalar."""
-    if hasattr(value, "item"):
-        return value.item()
-    if isinstance(value, list) and len(value) == 1:
-        return _scalar(value[0])
-    return value
--- a/src/lerobot/robots/bi_openarm_follower/bi_openarm_follower.py
+++ b/src/lerobot/robots/bi_openarm_follower/bi_openarm_follower.py
@@ -54,6 +54,7 @@ class BiOpenArmFollower(Robot):
            calibration_dir=config.calibration_dir,
            port=config.left_arm_config.port,
            disable_torque_on_disconnect=config.left_arm_config.disable_torque_on_disconnect,
+            use_velocity_and_torque=config.left_arm_config.use_velocity_and_torque,
            max_relative_target=config.left_arm_config.max_relative_target,
            cameras=left_cameras,
            side=config.left_arm_config.side,
@@ -72,6 +73,7 @@ class BiOpenArmFollower(Robot):
            calibration_dir=config.calibration_dir,
            port=config.right_arm_config.port,
            disable_torque_on_disconnect=config.right_arm_config.disable_torque_on_disconnect,
+            use_velocity_and_torque=config.right_arm_config.use_velocity_and_torque,
            max_relative_target=config.right_arm_config.max_relative_target,
            cameras=right_cameras,
            side=config.right_arm_config.side,
--- a/src/lerobot/robots/openarm_follower/config_openarm_follower.py
+++ b/src/lerobot/robots/openarm_follower/config_openarm_follower.py
@@ -66,6 +66,10 @@ class OpenArmFollowerConfigBase:
    # Whether to disable torque when disconnecting
    disable_torque_on_disconnect: bool = True

+    # When True, expose `.vel` and `.torque` per motor in observation features.
+    # Default False for compatibility with the position-only openarm_mini teleoperator.
+    use_velocity_and_torque: bool = False
+
    # Safety limit for relative target positions
    # Set to a positive scalar for all motors, or a dict mapping motor names to limits
    max_relative_target: float | dict[str, float] | None = None
--- a/src/lerobot/robots/openarm_follower/openarm_follower.py
+++ b/src/lerobot/robots/openarm_follower/openarm_follower.py
@@ -93,8 +93,9 @@ class OpenArmFollower(Robot):
        features: dict[str, type] = {}
        for motor in self.bus.motors:
            features[f"{motor}.pos"] = float
-            features[f"{motor}.vel"] = float  # Add this
-            features[f"{motor}.torque"] = float  # Add this
+            if self.config.use_velocity_and_torque:
+                features[f"{motor}.vel"] = float
+                features[f"{motor}.torque"] = float
        return features

    @property
@@ -235,8 +236,9 @@ class OpenArmFollower(Robot):
        for motor in self.bus.motors:
            state = states.get(motor, {})
            obs_dict[f"{motor}.pos"] = state.get("position", 0.0)
-            obs_dict[f"{motor}.vel"] = state.get("velocity", 0.0)
-            obs_dict[f"{motor}.torque"] = state.get("torque", 0.0)
+            if self.config.use_velocity_and_torque:
+                obs_dict[f"{motor}.vel"] = state.get("velocity", 0.0)
+                obs_dict[f"{motor}.torque"] = state.get("torque", 0.0)

        # Capture images from cameras
        for cam_key, cam in self.cameras.items():
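Note on the `use_velocity_and_torque` change above: a minimal sketch of how the flag gates the per-motor feature keys, assuming the logic shown in the hunk. `motor_features` and the motor names are hypothetical stand-ins for illustration, not the robot class itself.

```python
def motor_features(motors: list[str], use_velocity_and_torque: bool = False) -> dict[str, type]:
    """Build the per-motor feature map (hypothetical helper mirroring the hunk above)."""
    features: dict[str, type] = {}
    for motor in motors:
        # Position is always exposed.
        features[f"{motor}.pos"] = float
        # Velocity and torque are opt-in, so position-only teleoperators keep matching.
        if use_velocity_and_torque:
            features[f"{motor}.vel"] = float
            features[f"{motor}.torque"] = float
    return features


default_keys = list(motor_features(["joint_1", "gripper"]))
full_keys = list(motor_features(["joint_1", "gripper"], use_velocity_and_torque=True))
```

With the default `False`, only the `.pos` keys are present, which is what keeps the feature set compatible with a position-only leader.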
--- a/src/lerobot/rollout/strategies/base.py
+++ b/src/lerobot/rollout/strategies/base.py
@@ -23,6 +23,7 @@ from lerobot.utils.robot_utils import precise_sleep

 from ..context import RolloutContext
 from .core import RolloutStrategy, send_next_action
+from .display import BaseDisplay

 logger = logging.getLogger(__name__)

@@ -38,6 +39,8 @@ class BaseStrategy(RolloutStrategy):
        """Initialise the inference engine."""
        self._init_engine(ctx)
        logger.info("Base strategy ready")
+        self._display = BaseDisplay(duration=ctx.runtime.cfg.duration)
+        self._display.show_banner()

    def run(self, ctx: RolloutContext) -> None:
        """Run the autonomous control loop until shutdown or duration expires."""
@@ -72,9 +75,7 @@ class BaseStrategy(RolloutStrategy):
            if (sleep_t := control_interval - dt) > 0:
                precise_sleep(sleep_t)
            else:
-                logger.warning(
-                    f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                )
+                self._warn_slow_loop(dt, control_interval, cfg.fps)

    def teardown(self, ctx: RolloutContext) -> None:
        """Disconnect hardware and stop inference."""
--- a/src/lerobot/rollout/strategies/core.py
+++ b/src/lerobot/rollout/strategies/core.py
@@ -33,6 +33,7 @@ from ..inference import InferenceEngine
 if TYPE_CHECKING:
    from ..configs import RolloutStrategyConfig
    from ..context import HardwareContext, ProcessorContext, RolloutContext, RuntimeContext
+    from .display import RolloutStatusDisplay

 logger = logging.getLogger(__name__)

@@ -51,6 +52,17 @@ class RolloutStrategy(abc.ABC):
        self._interpolator: ActionInterpolator | None = None
        self._warmup_flushed: bool = False
        self._cached_obs_processed: dict | None = None
+        self._display: RolloutStatusDisplay | None = None
+
+    def _warn_slow_loop(self, dt: float, control_interval: float, fps: float) -> None:
+        """Warn when the control loop runs slower than the target FPS."""
+        if dt > control_interval:
+            logger.warning(
+                "Control loop running slower (%.1f Hz) than target (%.0f Hz). "
+                "Possible causes: camera FPS not keeping up, slow policy inference, CPU starvation.",
+                1 / dt,
+                fps,
+            )

    def _init_engine(self, ctx: RolloutContext) -> None:
        """Attach the inference engine and action interpolator, then start the backend.
--- a/src/lerobot/rollout/strategies/dagger.py
+++ b/src/lerobot/rollout/strategies/dagger.py
@@ -71,6 +71,7 @@ from ..configs import DAggerKeyboardConfig, DAggerPedalConfig, DAggerStrategyCon
 from ..context import RolloutContext
 from ..robot_wrapper import ThreadSafeRobot
 from .core import RolloutStrategy, estimate_max_episode_seconds, safe_push_to_hub, send_next_action
+from .display import DAggerDisplay

 PYNPUT_AVAILABLE = _pynput_available
 keyboard = None
@@ -286,7 +287,7 @@ def _init_dagger_keyboard(events: DAggerEvents, cfg: DAggerKeyboardConfig):

    listener = keyboard.Listener(on_press=on_press)
    listener.start()
-    logger.info(
+    logger.debug(
        "DAgger keyboard listener started (pause_resume='%s', correction='%s', upload='%s', ESC=stop)",
        cfg.pause_resume,
        cfg.correction,
@@ -370,6 +371,28 @@ class DAggerStrategy(RolloutStrategy):
            self._episode_duration_s,
        )

+        if self.config.input_device == "keyboard":
+            kb = self.config.keyboard
+            pause_key, correction_key, upload_key = (
+                kb.pause_resume.upper(),
+                kb.correction.upper(),
+                kb.upload.upper(),
+            )
+        else:
+            pb = self.config.pedal
+            pause_key, correction_key, upload_key = pb.pause_resume, pb.correction, pb.upload
+
+        self._display = DAggerDisplay(
+            record_autonomous=self.config.record_autonomous,
+            num_episodes=self.config.num_episodes,
+            episode_duration_s=self._episode_duration_s,
+            input_device=self.config.input_device,
+            pause_key=pause_key,
+            correction_key=correction_key,
+            upload_key=upload_key,
+        )
+        self._display.show_banner()
+
    def run(self, ctx: RolloutContext) -> None:
        """Run DAgger episodes with human-in-the-loop intervention."""
        if self.config.record_autonomous:
@@ -442,6 +465,7 @@ class DAggerStrategy(RolloutStrategy):
        interpolator.reset()
        events.reset()
        engine.resume()
+        self._display.show_state(DAggerPhase.AUTONOMOUS)

        last_action: dict[str, Any] | None = None
        record_tick = 0
@@ -472,6 +496,7 @@ class DAggerStrategy(RolloutStrategy):
                            ctx,
                            last_action,
                        )
+                        self._display.show_state(new_phase)
                        if new_phase == DAggerPhase.AUTONOMOUS:
                            last_action = None

@@ -556,9 +581,7 @@ class DAggerStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("DAgger continuous control loop ended — pausing engine")
@@ -599,6 +622,7 @@ class DAggerStrategy(RolloutStrategy):
        interpolator.reset()
        events.reset()
        engine.resume()
+        self._display.show_state(DAggerPhase.AUTONOMOUS)

        last_action: dict[str, Any] | None = None
        start_time = time.perf_counter()
@@ -633,6 +657,7 @@ class DAggerStrategy(RolloutStrategy):
                            ctx,
                            last_action,
                        )
+                        self._display.show_state(new_phase)
                        if new_phase == DAggerPhase.AUTONOMOUS:
                            last_action = None

@@ -705,9 +730,7 @@ class DAggerStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("DAgger corrections-only loop ended — pausing engine")
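Note on the `_warn_slow_loop` refactor above: a standalone sketch of the fixed-rate loop pattern that every strategy shares, under the assumption that the loop sleeps away any remaining time budget and warns otherwise. `run_fixed_rate` and `warn_slow_loop` are hypothetical free functions, and `time.sleep` stands in for `precise_sleep`.

```python
import logging
import time

logger = logging.getLogger("rollout_sketch")


def warn_slow_loop(dt: float, control_interval: float, fps: float) -> bool:
    """Warn when one iteration exceeded its time budget; return True if it warned."""
    if dt > control_interval:
        # Lazy %-style arguments: the message is only formatted if the record is emitted.
        logger.warning(
            "Control loop running slower (%.1f Hz) than target (%.0f Hz).", 1 / dt, fps
        )
        return True
    return False


def run_fixed_rate(step, fps: float, n_iters: int) -> int:
    """Call step() n_iters times at roughly fps Hz; return the number of slow iterations."""
    control_interval = 1.0 / fps
    slow = 0
    for _ in range(n_iters):
        start = time.perf_counter()
        step()
        dt = time.perf_counter() - start
        if (sleep_t := control_interval - dt) > 0:
            time.sleep(sleep_t)  # stand-in for precise_sleep
        else:
            slow += warn_slow_loop(dt, control_interval, fps)
    return slow
```

Centralising the message in one helper means the wording and the log-formatting style only need to change in one place instead of in each strategy's loop.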
--- a/src/lerobot/rollout/strategies/display.py
+++ b/src/lerobot/rollout/strategies/display.py
@@ -0,0 +1,263 @@
+# Copyright 2025 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Console status display for rollout strategies.
+
+One subclass per strategy — static states/controls are declared as class
+constants; runtime-dependent values are passed to ``__init__``.
+
+In each strategy's ``setup()``:
+
+    self._display = DAggerDisplay(
+        record_autonomous=self.config.record_autonomous,
+        num_episodes=self.config.num_episodes,
+        episode_duration_s=self._episode_duration_s,
+        input_device=self.config.input_device,
+        pause_key="SPACE",
+        correction_key="TAB",
+        upload_key="ENTER",
+    )
+    self._display.show_banner()
+
+On each state transition:
+
+    self._display.show_state("correcting")
+"""
+
+from __future__ import annotations
+
+import enum
+import sys
+from dataclasses import dataclass
+
+
+def _supports_color() -> bool:
+    return hasattr(sys.stdout, "isatty") and sys.stdout.isatty()
+
+
+class _C:
+    """ANSI escape codes."""
+
+    RESET = "\033[0m"
+    BOLD = "\033[1m"
+    DIM = "\033[2m"
+    GREEN = "\033[1;92m"
+    YELLOW = "\033[1;93m"
+    RED = "\033[1;91m"
+    CYAN = "\033[1;96m"
+    WHITE = "\033[1;97m"
+    GRAY = "\033[2;37m"
+
+
+@dataclass
+class StateConfig:
+    """One named rollout state.
+
+    ``key`` must match the string (or enum value) passed to ``RolloutStatusDisplay.show_state()``.
+    """
+
+    key: str
+    emoji: str
+    label: str
+    description: str
+    color: str = _C.WHITE
+
+
+@dataclass
+class ControlConfig:
+    """One keyboard/pedal binding shown in the startup banner."""
+
+    key: str
+    description: str
+
+
+# ---------------------------------------------------------------------------
+# Base display class
+# ---------------------------------------------------------------------------
+
+
+class RolloutStatusDisplay:
+    """Unified console status display.  Subclass once per strategy."""
+
+    def __init__(
+        self,
+        strategy: str,
+        states: list[StateConfig],
+        controls: list[ControlConfig],
+        info: list[str] | None = None,
+    ) -> None:
+        self.strategy = strategy
+        self._states = {s.key: s for s in states}
+        self._controls = controls
+        self._info = info or []
+        self._use_color = _supports_color()
+
+    def _c(self, code: str, text: str) -> str:
+        if not self._use_color:
+            return text
+        return f"{code}{text}{_C.RESET}"
+
+    def show_banner(self) -> None:
+        """Print startup banner: strategy name, states, controls, config info."""
+        width = 62
+        sep = self._c(_C.BOLD, "═" * width)
+
+        print(f"\n{sep}")
+        print(self._c(_C.BOLD, f"  lerobot-rollout  │  {self.strategy}"))
+
+        if self._states:
+            print()
+            for state in self._states.values():
+                label = self._c(state.color, f"{state.label:<14}")
+                desc = self._c(_C.GRAY, state.description)
+                print(f"  {state.emoji}  {label}  {desc}")
+
+        if self._controls:
+            print()
+            key_width = max(len(c.key) for c in self._controls)
+            for ctrl in self._controls:
+                key_str = self._c(_C.CYAN, f"[{ctrl.key:<{key_width}}]")
+                print(f"  {key_str}  {ctrl.description}")
+
+        if self._info:
+            print()
+            for item in self._info:
+                print(f"  {item}")
+
+        print(f"{sep}\n")
+
+    def show_state(self, state_key: str | enum.Enum) -> None:
+        """Print the current state and available controls - call this on every transition."""
+        key = state_key.value if isinstance(state_key, enum.Enum) else state_key
+        state = self._states.get(key)
+        if state is None:
+            return
+        label = self._c(state.color, f"{state.label:<14}")
+        desc = self._c(_C.GRAY, state.description)
+        print(f"\n  {state.emoji}  {label}  {desc}\n")
+
+        if self._controls:
+            key_width = max(len(c.key) for c in self._controls)
+            for ctrl in self._controls:
+                key_str = self._c(_C.CYAN, f"[{ctrl.key:<{key_width}}]")
+                print(f"  {key_str}  {ctrl.description}")
+            print()
+
+
+# ---------------------------------------------------------------------------
+# One display subclass per strategy
+# ---------------------------------------------------------------------------
+
+
+class BaseDisplay(RolloutStatusDisplay):
+    """Status display for the base (eval-only, no recording) strategy."""
+
+    _STATES = [StateConfig("running", "🟢", "RUNNING", "autonomous rollout — no recording", _C.GREEN)]
+    _CONTROLS = [ControlConfig("Ctrl+C", "stop session")]
+
+    def __init__(self, duration: float = 0) -> None:
+        info = ["No recording — evaluation only."]
+        if duration > 0:
+            info.append(f"Duration: {duration:.0f}s")
+        super().__init__("base", self._STATES, self._CONTROLS, info)
+
+
+class SentryDisplay(RolloutStatusDisplay):
+    """Status display for the sentry (continuous autonomous recording) strategy."""
+
+    _STATES = [StateConfig("recording", "🟢", "RECORDING", "continuous autonomous recording", _C.GREEN)]
+    _CONTROLS = [ControlConfig("Ctrl+C", "stop session")]
+
+    def __init__(self, episode_duration_s: float, upload_every_n_episodes: int) -> None:
+        info = [
+            f"Episode rotation: ~{episode_duration_s:.0f}s  |  "
+            f"Upload every {upload_every_n_episodes} episodes",
+        ]
+        super().__init__("sentry", self._STATES, self._CONTROLS, info)
+
+
+class HighlightDisplay(RolloutStatusDisplay):
+    """Status display for the highlight (ring-buffer on-demand save) strategy."""
+
+    def __init__(self, ring_buffer_seconds: float, save_key: str, push_key: str) -> None:
+        states = [
+            StateConfig(
+                "buffering",
+                "⚪",
+                "BUFFERING",
+                f"ring buffer active — last {ring_buffer_seconds:.0f}s captured",
+                _C.WHITE,
+            ),
+            StateConfig("recording", "🔴", "RECORDING", f"live recording — press [{save_key}] to save episode", _C.RED),
+        ]
+        controls = [
+            ControlConfig(save_key, "BUFFERING ↔ RECORDING  start recording / save episode"),
+            ControlConfig(push_key, "push dataset to Hub (background)"),
+            ControlConfig("ESC", "stop session"),
+        ]
+        super().__init__("highlight", states, controls)
+
+
+class DAggerDisplay(RolloutStatusDisplay):
+    """Status display for the dagger (human-in-the-loop) strategy."""
+
+    _PAUSED_STATE = StateConfig("paused", "🟡", "PAUSED", "holding last position — awaiting input", _C.YELLOW)
+    _CORRECTING_STATE = StateConfig(
+        "correcting", "🔴", "CORRECTING", "human teleop active — recording correction", _C.RED
+    )
+
+    def __init__(
+        self,
+        record_autonomous: bool,
+        num_episodes: int,
+        episode_duration_s: float,
+        input_device: str,
+        pause_key: str,
+        correction_key: str,
+        upload_key: str,
+    ) -> None:
+        mode = "continuous recording" if record_autonomous else "corrections only"
+        auto_desc = "policy running — recording" if record_autonomous else "policy running — no recording"
+        states = [
+            StateConfig("autonomous", "🟢", "AUTONOMOUS", auto_desc, _C.GREEN),
+            self._PAUSED_STATE,
+            self._CORRECTING_STATE,
+        ]
+        controls = [
+            ControlConfig(pause_key, "AUTONOMOUS ↔ PAUSED    pause / resume policy"),
+            ControlConfig(correction_key, "PAUSED ↔ CORRECTING   start / stop correction"),
+            ControlConfig(upload_key, "push dataset to Hub"),
+            ControlConfig("ESC", "stop session"),
+        ]
+        info = [f"Target: {num_episodes} episodes  |  Input: {input_device}"]
+        if record_autonomous:
+            info.append(f"Episode rotation: ~{episode_duration_s:.0f}s")
+        super().__init__(f"dagger  [{mode}]", states, controls, info)
+
+
+if __name__ == "__main__":
+    dagger_display = DAggerDisplay(
+        record_autonomous=False,
+        num_episodes=20,
+        episode_duration_s=30,
+        input_device="keyboard",
+        pause_key="SPACE",
+        correction_key="TAB",
+        upload_key="ENTER",
+    )
+    dagger_display.show_banner()
+    dagger_display.show_state("paused")
+    dagger_display.show_state("correcting")
+    dagger_display.show_state("paused")
+    dagger_display.show_state("autonomous")
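Note on the colour handling in `display.py` above: a small sketch of the gating idea, assuming colour is applied only when the output stream is a terminal. `colorize` and `stream_supports_color` are hypothetical names; the escape codes are the standard ANSI sequences used in the file.

```python
import io
import sys

RESET = "\033[0m"
GREEN = "\033[1;92m"


def stream_supports_color(stream) -> bool:
    """Return True only for streams that report being an interactive terminal."""
    return hasattr(stream, "isatty") and stream.isatty()


def colorize(code: str, text: str, use_color: bool) -> str:
    """Wrap text in an ANSI code and reset, or return it unchanged when colour is off."""
    return f"{code}{text}{RESET}" if use_color else text


# An in-memory buffer is not a TTY, so text written to it stays free of escape codes.
buffer = io.StringIO()
plain = colorize(GREEN, "RUNNING", stream_supports_color(buffer))
colored = colorize(GREEN, "RUNNING", True)
```

Checking `isatty()` once at construction time keeps redirected logs (files, pipes) readable without any per-call overhead.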
--- a/src/lerobot/rollout/strategies/highlight.py
+++ b/src/lerobot/rollout/strategies/highlight.py
@@ -17,6 +17,7 @@
 from __future__ import annotations

 import contextlib
+import enum
 import logging
 import os
 import sys
@@ -36,6 +37,7 @@ from ..configs import HighlightStrategyConfig
 from ..context import RolloutContext
 from ..ring_buffer import RolloutRingBuffer
 from .core import RolloutStrategy, safe_push_to_hub, send_next_action
+from .display import HighlightDisplay

 PYNPUT_AVAILABLE = _pynput_available
 keyboard = None
@@ -53,6 +55,13 @@ if PYNPUT_AVAILABLE:
 logger = logging.getLogger(__name__)


+class HighlightPhase(enum.Enum):
+    """Observable phases of a Highlight session."""
+
+    BUFFERING = "buffering"  # Ring buffer accumulating frames, not recording
+    RECORDING = "recording"  # Live recording active
+
+
 class HighlightStrategy(RolloutStrategy):
    """Autonomous rollout with on-demand recording via ring buffer.

@@ -105,6 +114,13 @@ class HighlightStrategy(RolloutStrategy):
            self.config.save_key,
            self.config.push_key,
        )
+        self._display = HighlightDisplay(
+            ring_buffer_seconds=self.config.ring_buffer_seconds,
+            save_key=self.config.save_key,
+            push_key=self.config.push_key,
+        )
+        self._display.show_banner()
+        self._display.show_state(HighlightPhase.BUFFERING)

    def run(self, ctx: RolloutContext) -> None:
        """Run the autonomous loop, buffering frames and recording on demand."""
@@ -162,6 +178,7 @@ class HighlightStrategy(RolloutStrategy):
                                for buffered_frame in ring.drain():
                                    dataset.add_frame(buffered_frame)
                                self._recording_live.set()
+                                self._display.show_state(HighlightPhase.RECORDING)
                            else:
                                dataset.add_frame(frame)
                                with self._episode_lock:
@@ -172,6 +189,7 @@ class HighlightStrategy(RolloutStrategy):
                                    play_sounds,
                                )
                                self._recording_live.clear()
+                                self._display.show_state(HighlightPhase.BUFFERING)
                                continue  # frame already consumed — skip ring.append

                        if self._push_requested.is_set():
@@ -188,9 +206,7 @@ class HighlightStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("Highlight control loop ended")
@@ -255,7 +271,7 @@ class HighlightStrategy(RolloutStrategy):

            self._listener = keyboard.Listener(on_press=on_press)
            self._listener.start()
-            logger.info("Keyboard listener started (save='%s', push='%s', ESC=stop)", save_key, push_key)
+            logger.debug("Keyboard listener started (save='%s', push='%s', ESC=stop)", save_key, push_key)
        except ImportError:
            logger.warning("pynput not available — keyboard listener disabled")

--- a/src/lerobot/rollout/strategies/sentry.py
+++ b/src/lerobot/rollout/strategies/sentry.py
@@ -32,6 +32,7 @@ from lerobot.utils.utils import log_say
 from ..configs import SentryStrategyConfig
 from ..context import RolloutContext
 from .core import RolloutStrategy, estimate_max_episode_seconds, safe_push_to_hub, send_next_action
+from .display import SentryDisplay

 logger = logging.getLogger(__name__)

@@ -79,6 +80,11 @@ class SentryStrategy(RolloutStrategy):
            self._episode_duration_s,
            self.config.upload_every_n_episodes,
        )
+        self._display = SentryDisplay(
+            episode_duration_s=self._episode_duration_s,
+            upload_every_n_episodes=self.config.upload_every_n_episodes,
+        )
+        self._display.show_banner()

    def run(self, ctx: RolloutContext) -> None:
        """Run the continuous recording loop with automatic episode rotation."""
@@ -160,9 +166,7 @@ class SentryStrategy(RolloutStrategy):
                    if (sleep_t := control_interval - dt) > 0:
                        precise_sleep(sleep_t)
                    else:
-                        logger.warning(
-                            f"Record loop is running slower ({1 / dt:.1f} Hz) than the target FPS ({cfg.fps} Hz). Dataset frames might be dropped and robot control might be unstable. Common causes are: 1) Camera FPS not keeping up 2) Policy inference taking too long 3) CPU starvation"
-                        )
+                        self._warn_slow_loop(dt, control_interval, cfg.fps)

            finally:
                logger.info("Sentry control loop ended — saving final episode")
--- a/src/lerobot/scripts/lerobot_train.py
+++ b/src/lerobot/scripts/lerobot_train.py
@@ -48,7 +48,6 @@ from lerobot.envs import close_envs, make_env, make_env_pre_post_processors
 from lerobot.optim.factory import make_optimizer_and_scheduler
 from lerobot.policies import PreTrainedPolicy, make_policy, make_pre_post_processors
 from lerobot.rewards import make_reward_pre_post_processors
-from lerobot.utils.collate import lerobot_collate_fn
 from lerobot.utils.import_utils import register_third_party_plugins
 from lerobot.utils.logging_utils import AverageMeter, MetricsTracker
 from lerobot.utils.random_utils import set_seed
@@ -402,10 +401,6 @@ def train(cfg: TrainPipelineConfig, accelerator: "Accelerator | None" = None):
        shuffle = True
        sampler = None

-    # Only swap in the language-aware collate when the dataset actually
-    # declares language columns; otherwise stay on PyTorch's default
-    # collate so non-language training runs are unaffected.
-    collate_fn = lerobot_collate_fn if dataset.meta.has_language_columns else None
    dataloader = torch.utils.data.DataLoader(
        dataset,
        num_workers=cfg.num_workers,
@@ -414,7 +409,6 @@ def train(cfg: TrainPipelineConfig, accelerator: "Accelerator | None" = None):
        sampler=sampler,
        pin_memory=device.type == "cuda",
        drop_last=False,
-        collate_fn=collate_fn,
        prefetch_factor=cfg.prefetch_factor if cfg.num_workers > 0 else None,
        persistent_workers=cfg.persistent_workers and cfg.num_workers > 0,
    )
--- a/src/lerobot/teleoperators/bi_openarm_leader/bi_openarm_leader.py
+++ b/src/lerobot/teleoperators/bi_openarm_leader/bi_openarm_leader.py
@@ -49,6 +49,7 @@ class BiOpenArmLeader(Teleoperator):
            can_data_bitrate=config.left_arm_config.can_data_bitrate,
            motor_config=config.left_arm_config.motor_config,
            manual_control=config.left_arm_config.manual_control,
+            use_velocity_and_torque=config.left_arm_config.use_velocity_and_torque,
            position_kd=config.left_arm_config.position_kd,
            position_kp=config.left_arm_config.position_kp,
        )
@@ -63,6 +64,7 @@ class BiOpenArmLeader(Teleoperator):
            can_data_bitrate=config.right_arm_config.can_data_bitrate,
            motor_config=config.right_arm_config.motor_config,
            manual_control=config.right_arm_config.manual_control,
+            use_velocity_and_torque=config.right_arm_config.use_velocity_and_torque,
            position_kd=config.right_arm_config.position_kd,
            position_kp=config.right_arm_config.position_kp,
        )
--- a/src/lerobot/teleoperators/openarm_leader/config_openarm_leader.py
+++ b/src/lerobot/teleoperators/openarm_leader/config_openarm_leader.py
@@ -60,6 +60,10 @@ class OpenArmLeaderConfigBase:
    # When enabled, motors have torque disabled for manual movement
    manual_control: bool = True

+    # When True, expose `.vel` and `.torque` per motor in action features.
+    # Default False for compatibility with the position-only openarm_mini teleoperator.
+    use_velocity_and_torque: bool = False
+
    # TODO(Steven, Pepijn): Not used ... ?
    # MIT control parameters (used when manual_control=False for torque control)
    # List of 8 values: [joint_1, joint_2, joint_3, joint_4, joint_5, joint_6, joint_7, gripper]
--- a/src/lerobot/teleoperators/openarm_leader/openarm_leader.py
+++ b/src/lerobot/teleoperators/openarm_leader/openarm_leader.py
@@ -70,8 +70,9 @@ class OpenArmLeader(Teleoperator):
        features: dict[str, type] = {}
        for motor in self.bus.motors:
            features[f"{motor}.pos"] = float
-            features[f"{motor}.vel"] = float
-            features[f"{motor}.torque"] = float
+            if self.config.use_velocity_and_torque:
+                features[f"{motor}.vel"] = float
+                features[f"{motor}.torque"] = float
        return features

    @property
@@ -201,8 +202,9 @@ class OpenArmLeader(Teleoperator):
        for motor in self.bus.motors:
            state = states.get(motor, {})
            action_dict[f"{motor}.pos"] = state.get("position")
-            action_dict[f"{motor}.vel"] = state.get("velocity")
-            action_dict[f"{motor}.torque"] = state.get("torque")
+            if self.config.use_velocity_and_torque:
+                action_dict[f"{motor}.vel"] = state.get("velocity")
+                action_dict[f"{motor}.torque"] = state.get("torque")

        dt_ms = (time.perf_counter() - start) * 1e3
        logger.debug(f"{self} read state: {dt_ms:.1f}ms")
--- a/src/lerobot/utils/collate.py
+++ b/src/lerobot/utils/collate.py
@@ -1,54 +0,0 @@
-#!/usr/bin/env python
-
-# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import annotations
-
-from typing import Any
-
-from torch.utils.data._utils.collate import default_collate
-
-from lerobot.datasets.language import LANGUAGE_COLUMNS
-
-_PYTHON_LIST_KEYS = {"messages", "message_streams", "target_message_indices"}
-
-
-def lerobot_collate_fn(batch: list[dict[str, Any] | None]) -> dict[str, Any] | None:
-    """Collate function that preserves Python-list and language fields as lists.
-
-    Drops ``None`` samples (e.g. recipes that yielded no target message), keeps
-    rendered-message and language fields as plain Python lists, and delegates
-    every other key to PyTorch's ``default_collate``.
-    """
-    batch = [sample for sample in batch if sample is not None]
-    if not batch:
-        return None
-
-    preserved = {
-        key: [sample[key] for sample in batch if key in sample]
-        for key in _PYTHON_LIST_KEYS
-        if any(key in sample for sample in batch)
-    }
-    tensorizable = [
-        {
-            key: value
-            for key, value in sample.items()
-            if key not in _PYTHON_LIST_KEYS and key not in LANGUAGE_COLUMNS
-        }
-        for sample in batch
-    ]
-    collated = default_collate(tensorizable)
-    collated.update(preserved)
-    return collated
--- a/tests/configs/test_recipe.py
+++ b/tests/configs/test_recipe.py
@@ -1,15 +0,0 @@
-#!/usr/bin/env python
-
-import pytest
-
-from lerobot.configs.recipe import MessageTurn, TrainingRecipe
-
-
-def test_message_recipe_validates_unknown_binding():
-    with pytest.raises(ValueError, match="unknown binding"):
-        TrainingRecipe(
-            messages=[
-                MessageTurn(role="user", content="${missing}", stream="high_level"),
-                MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
-            ]
-        )
--- a/tests/datasets/test_dataset_metadata.py
+++ b/tests/datasets/test_dataset_metadata.py
@@ -385,84 +385,3 @@ def test_finalize_flushes_buffered_metadata(tmp_path):
    assert episodes_dir.exists()
    parquet_files = list(episodes_dir.rglob("*.parquet"))
    assert len(parquet_files) > 0
-
-
-# ── Tools accessor ───────────────────────────────────────────────────
-
-
-def test_tools_falls_back_to_default_when_info_has_no_tools_field(tmp_path):
-    """meta.tools returns DEFAULT_TOOLS when info.json doesn't declare any."""
-    from lerobot.datasets.language import DEFAULT_TOOLS
-
-    root = tmp_path / "no_tools"
-    meta = LeRobotDatasetMetadata.create(
-        repo_id="test/no_tools",
-        fps=DEFAULT_FPS,
-        features=SIMPLE_FEATURES,
-        root=root,
-        use_videos=False,
-    )
-
-    assert meta.tools == DEFAULT_TOOLS
-    # info.json on disk should NOT include a `tools` key for clean datasets
-    with open(root / INFO_PATH) as f:
-        info_on_disk = json.load(f)
-    assert "tools" not in info_on_disk
-
-
-def test_tools_reads_declared_tools_from_info_json(tmp_path):
-    """A `tools` list written into info.json survives load → meta.tools.
-
-    Regression test for the bug where ``DatasetInfo.from_dict`` silently
-    dropped the ``tools`` key (no matching dataclass field), so
-    ``meta.tools`` always returned ``DEFAULT_TOOLS`` regardless of
-    what was on disk.
-    """
-    from lerobot.datasets.io_utils import load_info
-
-    root = tmp_path / "with_tools"
-    meta = LeRobotDatasetMetadata.create(
-        repo_id="test/with_tools",
-        fps=DEFAULT_FPS,
-        features=SIMPLE_FEATURES,
-        root=root,
-        use_videos=False,
-    )
-
-    custom_tool = {
-        "type": "function",
-        "function": {
-            "name": "record_observation",
-            "description": "Capture a still image.",
-            "parameters": {
-                "type": "object",
-                "properties": {"label": {"type": "string"}},
-                "required": ["label"],
-            },
-        },
-    }
-    info_path = root / INFO_PATH
-    with open(info_path) as f:
-        raw = json.load(f)
-    raw["tools"] = [custom_tool]
-    with open(info_path, "w") as f:
-        json.dump(raw, f)
-
-    # Reload info from disk and rebind it on the metadata object
-    meta.info = load_info(root)
-    assert meta.tools == [custom_tool]
-
-
-def test_tools_round_trip_through_dataset_info(tmp_path):
-    """A `tools` list survives DatasetInfo.from_dict / to_dict."""
-    from lerobot.datasets.utils import DatasetInfo
-
-    raw = {
-        "codebase_version": "v3.1",
-        "fps": 30,
-        "features": SIMPLE_FEATURES,
-        "tools": [{"type": "function", "function": {"name": "say"}}],
-    }
-    info = DatasetInfo.from_dict(raw)
-    assert info.tools == raw["tools"]
-    assert info.to_dict()["tools"] == raw["tools"]
--- a/tests/datasets/test_language.py
+++ b/tests/datasets/test_language.py
@@ -1,156 +0,0 @@
-#!/usr/bin/env python
-
-import pytest
-
-pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
-pytest.importorskip("pandas", reason="pandas is required (install lerobot[dataset])")
-
-import numpy as np  # noqa: E402
-import pandas as pd  # noqa: E402
-import pyarrow as pa  # noqa: E402
-
-from lerobot.datasets import LeRobotDataset  # noqa: E402
-from lerobot.datasets.io_utils import write_info  # noqa: E402
-from lerobot.datasets.language import (  # noqa: E402
-    EVENT_ONLY_STYLES,
-    LANGUAGE_EVENTS,
-    LANGUAGE_PERSISTENT,
-    PERSISTENT_STYLES,
-    STYLE_REGISTRY,
-    VIEW_DEPENDENT_STYLES,
-    column_for_style,
-    is_view_dependent_style,
-    language_events_arrow_type,
-    language_feature_info,
-    language_persistent_arrow_type,
-    validate_camera_field,
-)
-from lerobot.datasets.utils import DEFAULT_DATA_PATH  # noqa: E402
-
-
-def test_language_arrow_schema_has_expected_fields():
-    persistent_row_type = language_persistent_arrow_type().value_type
-    event_row_type = language_events_arrow_type().value_type
-
-    assert isinstance(persistent_row_type, pa.StructType)
-    assert persistent_row_type.names == [
-        "role",
-        "content",
-        "style",
-        "timestamp",
-        "camera",
-        "tool_calls",
-    ]
-
-    assert isinstance(event_row_type, pa.StructType)
-    assert event_row_type.names == ["role", "content", "style", "camera", "tool_calls"]
-
-
-def test_style_registry_routes_columns():
-    assert {"subtask", "plan", "memory", "motion", "task_aug"} == PERSISTENT_STYLES
-    assert {"interjection", "vqa", "trace"} == EVENT_ONLY_STYLES
-    assert PERSISTENT_STYLES | EVENT_ONLY_STYLES <= STYLE_REGISTRY
-
-    assert column_for_style("subtask") == LANGUAGE_PERSISTENT
-    assert column_for_style("plan") == LANGUAGE_PERSISTENT
-    assert column_for_style("memory") == LANGUAGE_PERSISTENT
-    assert column_for_style("motion") == LANGUAGE_PERSISTENT
-    assert column_for_style("task_aug") == LANGUAGE_PERSISTENT
-    assert column_for_style("interjection") == LANGUAGE_EVENTS
-    assert column_for_style("vqa") == LANGUAGE_EVENTS
-    assert column_for_style("trace") == LANGUAGE_EVENTS
-    assert column_for_style(None) == LANGUAGE_EVENTS
-
-
-def test_view_dependent_styles():
-    # motion lives in PERSISTENT_STYLES and is described in robot-frame
-    # (joint / Cartesian) terms, so it is NOT view-dependent. Only vqa
-    # (event) and trace (event, pixel-trajectory) carry a camera tag.
-    assert {"vqa", "trace"} == VIEW_DEPENDENT_STYLES
-    assert is_view_dependent_style("vqa")
-    assert is_view_dependent_style("trace")
-    assert not is_view_dependent_style("motion")
-    assert not is_view_dependent_style("subtask")
-    assert not is_view_dependent_style("plan")
-    assert not is_view_dependent_style("interjection")
-    assert not is_view_dependent_style(None)
-
-
-def test_validate_camera_field_requires_camera_for_view_dependent_styles():
-    validate_camera_field("vqa", "observation.images.top")
-    validate_camera_field("trace", "observation.images.front")
-    with pytest.raises(ValueError, match="view-dependent"):
-        validate_camera_field("vqa", None)
-    with pytest.raises(ValueError, match="view-dependent"):
-        validate_camera_field("trace", "")
-
-
-def test_validate_camera_field_rejects_camera_on_non_view_dependent_styles():
-    validate_camera_field("subtask", None)
-    validate_camera_field("plan", None)
-    validate_camera_field("memory", None)
-    validate_camera_field("motion", None)
-    validate_camera_field("interjection", None)
-    validate_camera_field(None, None)
-    with pytest.raises(ValueError, match="must have camera=None"):
-        validate_camera_field("subtask", "observation.images.top")
-    with pytest.raises(ValueError, match="must have camera=None"):
-        validate_camera_field("motion", "observation.images.top")
-    with pytest.raises(ValueError, match="must have camera=None"):
-        validate_camera_field("interjection", "observation.images.top")
-    with pytest.raises(ValueError, match="must have camera=None"):
-        validate_camera_field(None, "observation.images.top")
-
-
-def test_unknown_style_rejected():
-    with pytest.raises(ValueError, match="Unknown language style"):
-        column_for_style("surprise")
-
-
-def test_lerobot_dataset_passes_language_columns_through(tmp_path, empty_lerobot_dataset_factory):
-    root = tmp_path / "language_dataset"
-    dataset = empty_lerobot_dataset_factory(
-        root=root,
-        features={"state": {"dtype": "float32", "shape": (2,), "names": None}},
-        use_videos=False,
-    )
-    dataset.add_frame({"state": np.array([0.0, 1.0], dtype=np.float32), "task": "tidy"})
-    dataset.add_frame({"state": np.array([1.0, 2.0], dtype=np.float32), "task": "tidy"})
-    dataset.save_episode()
-    dataset.finalize()
-
-    persistent = [
-        {
-            "role": "assistant",
-            "content": "reach for the cup",
-            "style": "subtask",
-            "timestamp": 0.0,
-            "camera": None,
-            "tool_calls": None,
-        }
-    ]
-    event = {
-        "role": "user",
-        "content": "what is visible?",
-        "style": "vqa",
-        "camera": "observation.images.top",
-        "tool_calls": None,
-    }
-    data_path = root / DEFAULT_DATA_PATH.format(chunk_index=0, file_index=0)
-    df = pd.read_parquet(data_path)
-    df[LANGUAGE_PERSISTENT] = [persistent, persistent]
-    df[LANGUAGE_EVENTS] = [[event], []]
-    df.to_parquet(data_path)
-
-    info = dataset.meta.info
-    info["features"].update(language_feature_info())
-    write_info(info, root)
-
-    reloaded = LeRobotDataset(repo_id=dataset.repo_id, root=root)
-
-    first = reloaded[0]
-    second = reloaded[1]
-    assert first[LANGUAGE_PERSISTENT] == persistent
-    assert first[LANGUAGE_EVENTS] == [event]
-    assert second[LANGUAGE_PERSISTENT] == persistent
-    assert second[LANGUAGE_EVENTS] == []
--- a/tests/datasets/test_language_render.py
+++ b/tests/datasets/test_language_render.py
@@ -1,378 +0,0 @@
-#!/usr/bin/env python
-
-import pytest
-
-pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
-
-from lerobot.configs.recipe import MessageTurn, TrainingRecipe  # noqa: E402
-from lerobot.datasets.language_render import (  # noqa: E402
-    active_at,
-    emitted_at,
-    nth_next,
-    nth_prev,
-    render_sample,
-)
-
-
-def persistent_row(role, content, style, timestamp, tool_calls=None, camera=None):
-    return {
-        "role": role,
-        "content": content,
-        "style": style,
-        "timestamp": timestamp,
-        "camera": camera,
-        "tool_calls": tool_calls,
-    }
-
-
-def event_row(role, content, style, tool_calls=None, camera=None):
-    return {
-        "role": role,
-        "content": content,
-        "style": style,
-        "camera": camera,
-        "tool_calls": tool_calls,
-    }
-
-
-PERSISTENT = [
-    persistent_row("assistant", "plan 0", "plan", 0.0),
-    persistent_row("assistant", "memory 0", "memory", 0.0),
-    persistent_row("assistant", "subtask 0", "subtask", 0.0),
-    persistent_row("assistant", "memory 1", "memory", 1.0),
-    persistent_row("assistant", "subtask 1", "subtask", 1.0),
-]
-EVENTS_AT_1 = [
-    event_row("user", "what is visible?", "vqa", camera="observation.images.top"),
-    event_row("assistant", '{"count": 2}', "vqa", camera="observation.images.top"),
-]
-EVENTS_AT_2 = [
-    event_row("user", "skip wiping", "interjection"),
-    event_row(
-        "assistant",
-        None,
-        None,
-        [{"type": "function", "function": {"name": "say", "arguments": {"text": "Skipping wiping."}}}],
-    ),
-]
-# Same emission tick, two cameras: triggers per-camera disambiguation in
-# resolvers, mirroring how Module 3 of the annotation pipeline writes one
-# (vqa, user) + (vqa, assistant) pair per camera.
-EVENTS_AT_3_TWO_CAMERAS = [
-    event_row("user", "how many cups (top)?", "vqa", camera="observation.images.top"),
-    event_row("assistant", '{"count": 3}', "vqa", camera="observation.images.top"),
-    event_row("user", "how many cups (wrist)?", "vqa", camera="observation.images.wrist"),
-    event_row("assistant", '{"count": 1}', "vqa", camera="observation.images.wrist"),
-]
-
-
-def test_resolver_temporal_semantics():
-    assert active_at(0.5, persistent=PERSISTENT, style="subtask")["content"] == "subtask 0"
-    assert active_at(1.0, persistent=PERSISTENT, style="subtask")["content"] == "subtask 1"
-    assert emitted_at(0.5, persistent=PERSISTENT, events=[], style="vqa", role="assistant") is None
-    assert (
-        emitted_at(1.0, persistent=PERSISTENT, events=EVENTS_AT_1, style="vqa", role="assistant")["content"]
-        == '{"count": 2}'
-    )
-
-
-def test_persistent_relative_resolvers_reject_event_styles():
-    with pytest.raises(ValueError, match="event-only"):
-        active_at(1.0, persistent=PERSISTENT, style="vqa")
-    with pytest.raises(ValueError, match="event-only"):
-        nth_prev(1.0, persistent=PERSISTENT, style="interjection")
-
-
-def test_nth_prev_and_next():
-    assert nth_prev(1.0, persistent=PERSISTENT, style="subtask", offset=1)["content"] == "subtask 0"
-    assert nth_next(0.0, persistent=PERSISTENT, style="subtask", offset=1)["content"] == "subtask 1"
-
-
-def test_substitution_if_present_multimodal_and_tool_calls():
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(
-                role="user",
-                content=[
-                    {"type": "image", "feature": "observation.images.top"},
-                    {"type": "text", "text": "${task}: ${interjection}"},
-                ],
-                stream="high_level",
-                if_present="interjection",
-            ),
-            MessageTurn(
-                role="assistant",
-                content="${plan}",
-                stream="high_level",
-                target=True,
-                tool_calls_from="speech",
-            ),
-        ],
-        bindings={"plan": "active_at(t, style=plan)"},
-    )
-
-    rendered = render_sample(
-        recipe=recipe,
-        persistent=PERSISTENT,
-        events=EVENTS_AT_2,
-        t=2.0,
-        sample_idx=0,
-        task="clean kitchen",
-    )
-
-    assert rendered["messages"][0]["content"][1]["text"] == "clean kitchen: skip wiping"
-    assert rendered["messages"][1]["content"] == "plan 0"
-    assert rendered["messages"][1]["tool_calls"][0]["function"]["name"] == "say"
-    assert rendered["message_streams"] == ["high_level", "high_level"]
-    assert rendered["target_message_indices"] == [1]
-
-
-def test_exact_event_miss_returns_none_when_target_skips():
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(role="user", content="${vqa_query}", stream="high_level", if_present="vqa_query"),
-            MessageTurn(
-                role="assistant",
-                content="${vqa}",
-                stream="high_level",
-                target=True,
-                if_present="vqa",
-            ),
-        ]
-    )
-
-    assert (
-        render_sample(recipe=recipe, persistent=PERSISTENT, events=EVENTS_AT_2, t=0.0, sample_idx=0) is None
-    )
-
-
-def test_deterministic_blend_sampling():
-    recipe = TrainingRecipe(
-        blend={
-            "a": TrainingRecipe(
-                weight=1.0,
-                messages=[
-                    MessageTurn(role="user", content="${task}", stream="high_level"),
-                    MessageTurn(role="assistant", content="a", stream="high_level", target=True),
-                ],
-            ),
-            "b": TrainingRecipe(
-                weight=1.0,
-                messages=[
-                    MessageTurn(role="user", content="${task}", stream="high_level"),
-                    MessageTurn(role="assistant", content="b", stream="high_level", target=True),
-                ],
-            ),
-        }
-    )
-
-    first = render_sample(
-        recipe=recipe, persistent=PERSISTENT, events=EVENTS_AT_2, t=0.0, sample_idx=123, task="x"
-    )
-    second = render_sample(
-        recipe=recipe, persistent=PERSISTENT, events=EVENTS_AT_2, t=0.0, sample_idx=123, task="x"
-    )
-    assert first == second
-
-
-def test_emitted_at_filters_vqa_by_camera():
-    top = emitted_at(
-        3.0,
-        persistent=PERSISTENT,
-        events=EVENTS_AT_3_TWO_CAMERAS,
-        style="vqa",
-        role="assistant",
-        camera="observation.images.top",
-    )
-    wrist = emitted_at(
-        3.0,
-        persistent=PERSISTENT,
-        events=EVENTS_AT_3_TWO_CAMERAS,
-        style="vqa",
-        role="assistant",
-        camera="observation.images.wrist",
-    )
-    assert top["content"] == '{"count": 3}'
-    assert wrist["content"] == '{"count": 1}'
-
-
-def test_emitted_at_raises_on_ambiguous_per_camera_vqa():
-    with pytest.raises(ValueError, match="Ambiguous resolver"):
-        emitted_at(
-            3.0,
-            persistent=PERSISTENT,
-            events=EVENTS_AT_3_TWO_CAMERAS,
-            style="vqa",
-            role="assistant",
-        )
-
-
-def _vqa_subrecipe(camera: str) -> TrainingRecipe:
-    return TrainingRecipe(
-        weight=1.0,
-        bindings={
-            "vqa_query": f"emitted_at(t, style=vqa, role=user, camera={camera})",
-            "vqa": f"emitted_at(t, style=vqa, role=assistant, camera={camera})",
-        },
-        messages=[
-            MessageTurn(
-                role="user",
-                content=[{"type": "image", "feature": camera}, {"type": "text", "text": "${vqa_query}"}],
-                stream="high_level",
-                if_present="vqa_query",
-            ),
-            MessageTurn(
-                role="assistant",
-                content="${vqa}",
-                stream="high_level",
-                target=True,
-                if_present="vqa",
-            ),
-        ],
-    )
-
-
-@pytest.mark.parametrize(
-    ("camera", "expected_query", "expected_answer"),
-    [
-        ("observation.images.top", "how many cups (top)?", '{"count": 3}'),
-        ("observation.images.wrist", "how many cups (wrist)?", '{"count": 1}'),
-    ],
-)
-def test_per_camera_blend_renders_both_views(camera, expected_query, expected_answer):
-    rendered = render_sample(
-        recipe=_vqa_subrecipe(camera),
-        persistent=PERSISTENT,
-        events=EVENTS_AT_3_TWO_CAMERAS,
-        t=3.0,
-        sample_idx=0,
-    )
-
-    assert rendered["messages"][0]["content"][0]["feature"] == camera
-    assert rendered["messages"][0]["content"][1]["text"] == expected_query
-    assert rendered["messages"][1]["content"] == expected_answer
-
-
-def test_resolve_task_picks_rephrasing_deterministically_per_sample():
-    rephrasings = [
-        persistent_row("user", "tidy the kitchen", "task_aug", 0.0),
-        persistent_row("user", "please clean up the kitchen", "task_aug", 0.0),
-        persistent_row("user", "kitchen needs tidying", "task_aug", 0.0),
-        persistent_row("user", "make the kitchen clean", "task_aug", 0.0),
-    ]
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(role="user", content="${task}", stream="high_level"),
-            MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
-        ]
-    )
-
-    # No explicit task override → resolver consults persistent rows.
-    seen: set[str] = set()
-    for sample_idx in range(64):
-        rendered = render_sample(
-            recipe=recipe,
-            persistent=rephrasings,
-            events=[],
-            t=0.0,
-            sample_idx=sample_idx,
-            dataset_ctx={"task": "canonical kitchen task"},
-        )
-        seen.add(rendered["messages"][0]["content"])
-    # Every rephrasing should be reachable across enough samples.
-    assert seen == {r["content"] for r in rephrasings}
-    # Same sample_idx → same pick (determinism).
-    a = render_sample(
-        recipe=recipe,
-        persistent=rephrasings,
-        events=[],
-        t=0.0,
-        sample_idx=42,
-        dataset_ctx={"task": "canonical"},
-    )
-    b = render_sample(
-        recipe=recipe,
-        persistent=rephrasings,
-        events=[],
-        t=0.0,
-        sample_idx=42,
-        dataset_ctx={"task": "canonical"},
-    )
-    assert a["messages"][0]["content"] == b["messages"][0]["content"]
-
-
-def test_resolve_task_falls_back_to_canonical_without_rephrasings():
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(role="user", content="${task}", stream="high_level"),
-            MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
-        ]
-    )
-    rendered = render_sample(
-        recipe=recipe,
-        persistent=PERSISTENT,  # no task_aug rows
-        events=[],
-        t=0.0,
-        sample_idx=0,
-        dataset_ctx={"task": "clean the kitchen"},
-    )
-    assert rendered["messages"][0]["content"] == "clean the kitchen"
-
-
-def test_resolve_task_explicit_override_beats_rephrasings():
-    rephrasings = [
-        persistent_row("user", "rephrased one", "task_aug", 0.0),
-        persistent_row("user", "rephrased two", "task_aug", 0.0),
-    ]
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(role="user", content="${task}", stream="high_level"),
-            MessageTurn(role="assistant", content="ok", stream="high_level", target=True),
-        ]
-    )
-    rendered = render_sample(
-        recipe=recipe,
-        persistent=rephrasings,
-        events=[],
-        t=0.0,
-        sample_idx=0,
-        task="explicit override wins",
-        dataset_ctx={"task": "canonical"},
-    )
-    assert rendered["messages"][0]["content"] == "explicit override wins"
-
-
-def test_low_level_branch_renders_active_subtask():
-    low_level = TrainingRecipe(
-        blend={
-            "low": TrainingRecipe(
-                weight=1.0,
-                messages=[
-                    MessageTurn(
-                        role="user",
-                        content="${task}\nPlan: ${plan}\nMemory: ${memory}",
-                        stream="high_level",
-                    ),
-                    MessageTurn(
-                        role="assistant",
-                        content="${subtask}",
-                        stream="low_level",
-                        target=True,
-                    ),
-                ],
-            )
-        }
-    )
-
-    rendered = render_sample(
-        recipe=low_level,
-        persistent=PERSISTENT,
-        events=[],
-        t=0.5,
-        sample_idx=0,
-        task="clean kitchen",
-    )
-
-    assert rendered["messages"][-1] == {"role": "assistant", "content": "subtask 0"}
-    assert rendered["message_streams"][-1] == "low_level"
-    assert rendered["target_message_indices"] == [1]
--- a/tests/datasets/test_subtask_dataset.py
+++ b/tests/datasets/test_subtask_dataset.py
@@ -0,0 +1,193 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Tests for subtask functionality in LeRobotDataset.
+
+These tests verify that:
+- Subtask information is correctly loaded from datasets that have subtask data
+- The __getitem__ method correctly adds subtask strings to returned items
+- Missing subtask data is handled gracefully
+"""
+
+import pytest
+
+pytest.importorskip("pandas", reason="pandas is required (install lerobot[dataset])")
+
+import pandas as pd  # noqa: E402
+import torch  # noqa: E402
+
+from lerobot.datasets.lerobot_dataset import LeRobotDataset  # noqa: E402
+
+
+class TestSubtaskDataset:
+    """Tests for subtask handling in LeRobotDataset."""
+
+    @pytest.fixture
+    def subtask_dataset(self):
+        """Load the test subtask dataset from the hub."""
+        # Use the lerobot/pusht-subtask dataset, restricted to episodes 1-11
+        return LeRobotDataset(
+            repo_id="lerobot/pusht-subtask",
+            episodes=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
+        )
+
+    def test_subtask_dataset_loads(self, subtask_dataset):
+        """Test that the subtask dataset loads successfully."""
+        assert subtask_dataset is not None
+        assert len(subtask_dataset) > 0
+
+    def test_subtask_metadata_loaded(self, subtask_dataset):
+        """Test that subtask metadata is loaded when present in dataset."""
+        # The dataset should have subtasks metadata loaded
+        assert subtask_dataset.meta.subtasks is not None
+        assert isinstance(subtask_dataset.meta.subtasks, pd.DataFrame)
+
+    def test_subtask_index_in_features(self, subtask_dataset):
+        """Test that subtask_index is a feature when dataset has subtasks."""
+        assert "subtask_index" in subtask_dataset.features
+
+    def test_getitem_returns_subtask_string(self, subtask_dataset):
+        """Test that __getitem__ correctly adds subtask string to returned item."""
+        item = subtask_dataset[0]
+
+        # Subtask should be present in the returned item
+        assert "subtask" in item
+        assert isinstance(item["subtask"], str)
+        assert len(item["subtask"]) > 0  # Should not be empty
+
+    def test_getitem_has_subtask_index(self, subtask_dataset):
+        """Test that __getitem__ includes subtask_index."""
+        item = subtask_dataset[0]
+
+        assert "subtask_index" in item
+        assert isinstance(item["subtask_index"], torch.Tensor)
+
+    def test_subtask_index_maps_to_valid_subtask(self, subtask_dataset):
+        """Test that subtask_index correctly maps to a subtask in metadata."""
+        item = subtask_dataset[0]
+
+        subtask_idx = item["subtask_index"].item()
+        subtask_from_metadata = subtask_dataset.meta.subtasks.iloc[subtask_idx].name
+
+        assert item["subtask"] == subtask_from_metadata
+
+    def test_all_items_have_subtask(self, subtask_dataset):
+        """Test that the first few items in the dataset have subtask information."""
+        for i in range(min(len(subtask_dataset), 5)):  # Spot-check the first 5 items only
+            item = subtask_dataset[i]
+            assert "subtask" in item
+            assert isinstance(item["subtask"], str)
+
+    def test_task_and_subtask_coexist(self, subtask_dataset):
+        """Test that both task and subtask are present in returned items."""
+        item = subtask_dataset[0]
+
+        # Both task and subtask should be present
+        assert "task" in item
+        assert "subtask" in item
+        assert isinstance(item["task"], str)
+        assert isinstance(item["subtask"], str)
+
+
+class TestSubtaskDatasetMissing:
+    """Tests for graceful handling when subtask data is missing."""
+
+    @pytest.fixture
+    def dataset_without_subtasks(self, tmp_path, empty_lerobot_dataset_factory):
+        """Create a dataset without subtask information."""
+        features = {"state": {"dtype": "float32", "shape": (2,), "names": None}}
+        dataset = empty_lerobot_dataset_factory(root=tmp_path / "no_subtask", features=features)
+
+        # Add some frames and save
+        for _ in range(5):
+            dataset.add_frame({"state": torch.randn(2), "task": "Test task"})
+        dataset.save_episode()
+        dataset.finalize()
+
+        # Reload the dataset
+        return LeRobotDataset(dataset.repo_id, root=dataset.root)
+
+    def test_no_subtask_in_features(self, dataset_without_subtasks):
+        """Test that subtask_index is not in features when not provided."""
+        assert "subtask_index" not in dataset_without_subtasks.features
+
+    def test_getitem_without_subtask(self, dataset_without_subtasks):
+        """Test that __getitem__ works when subtask is not present."""
+        item = dataset_without_subtasks[0]
+
+        # Item should still be retrievable
+        assert item is not None
+        assert "state" in item
+        assert "task" in item
+
+        # Subtask should NOT be present
+        assert "subtask" not in item
+
+    def test_subtasks_metadata_is_none(self, dataset_without_subtasks):
+        """Test that subtasks metadata is None when not present."""
+        assert dataset_without_subtasks.meta.subtasks is None
+
+
+class TestSubtaskEdgeCases:
+    """Edge case tests for subtask handling."""
+
+    def test_subtask_with_multiple_episodes(self):
+        """Test subtask handling with multiple episodes if available."""
+        try:
+            dataset = LeRobotDataset(
+                repo_id="lerobot/pusht-subtask",
+                episodes=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
+            )
+        except Exception:
+            pytest.skip("Could not load lerobot/pusht-subtask dataset")
+
+        # Check first and last items have valid subtasks
+        first_item = dataset[0]
+        last_item = dataset[len(dataset) - 1]
+
+        assert "subtask" in first_item
+        assert "subtask" in last_item
+        assert isinstance(first_item["subtask"], str)
+        assert isinstance(last_item["subtask"], str)
+
+    def test_subtask_index_consistency(self):
+        """Test that same subtask_index returns same subtask string."""
+        try:
+            dataset = LeRobotDataset(
+                repo_id="lerobot/pusht-subtask",
+                episodes=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
+            )
+        except Exception:
+            pytest.skip("Could not load lerobot/pusht-subtask dataset")
+
+        if len(dataset) < 2:
+            pytest.skip("Dataset too small for this test")
+
+        # Collect subtask_index to subtask mappings
+        subtask_map = {}
+        for i in range(min(len(dataset), 10)):
+            item = dataset[i]
+            idx = item["subtask_index"].item()
+            subtask = item["subtask"]
+
+            if idx in subtask_map:
+                # Same index should always return same subtask
+                assert subtask_map[idx] == subtask, (
+                    f"Inconsistent subtask for index {idx}: '{subtask_map[idx]}' vs '{subtask}'"
+                )
+            else:
+                subtask_map[idx] = subtask
--- a/tests/policies/eo1/test_eo1.py
+++ b/tests/policies/eo1/test_eo1.py
@@ -0,0 +1,186 @@
+#!/usr/bin/env python
+
+# Copyright 2026 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Smoke tests for EO1's public LeRobot policy interface."""
+
+from __future__ import annotations
+
+from types import SimpleNamespace
+
+import pytest
+import torch
+from torch import nn
+
+pytest.importorskip("transformers")
+
+from lerobot.configs.types import FeatureType, PolicyFeature
+from lerobot.policies.eo1.modeling_eo1 import EO1Policy
+from lerobot.utils.constants import ACTION, OBS_STATE
+
+HIDDEN_SIZE = 8
+STATE_DIM = 4
+ACTION_DIM = 3
+CHUNK_SIZE = 3
+N_ACTION_STEPS = 2
+MAX_ACTION_DIM = 6
+STATE_TOKEN_ID = 5
+ACTION_TOKEN_ID = 6
+
+
+class DummyVLMBackbone(nn.Module):
+    def __init__(self, hidden_size: int, vocab_size: int = 64):
+        super().__init__()
+        self.embedding = nn.Embedding(vocab_size, hidden_size)
+        self.config = SimpleNamespace(text_config=SimpleNamespace(hidden_size=hidden_size))
+
+    @property
+    def model(self):
+        return self
+
+    def get_input_embeddings(self):
+        return self.embedding
+
+    def get_rope_index(
+        self,
+        input_ids: torch.Tensor,
+        image_grid_thw: torch.Tensor | None = None,
+        attention_mask: torch.Tensor | None = None,
+        mm_token_type_ids: torch.Tensor | None = None,
+    ):
+        batch_size, seq_len = input_ids.shape
+        if attention_mask is None:
+            text_positions = torch.arange(seq_len, device=input_ids.device).expand(batch_size, -1)
+        else:
+            text_positions = attention_mask.long().cumsum(-1) - 1
+            text_positions = text_positions.masked_fill(attention_mask == 0, 0)
+        position_ids = text_positions.view(1, batch_size, seq_len).expand(3, batch_size, seq_len)
+        rope_deltas = torch.zeros(batch_size, 1, dtype=torch.long, device=input_ids.device)
+        return position_ids, rope_deltas
+
+    def gradient_checkpointing_enable(self, gradient_checkpointing_kwargs=None):
+        return gradient_checkpointing_kwargs
+
+    def gradient_checkpointing_disable(self):
+        return None
+
+    def forward(
+        self,
+        *,
+        input_ids: torch.Tensor | None = None,
+        inputs_embeds: torch.Tensor | None = None,
+        **kwargs,
+    ):
+        if inputs_embeds is None:
+            inputs_embeds = self.embedding(input_ids)
+        return SimpleNamespace(
+            last_hidden_state=inputs_embeds,
+            past_key_values=SimpleNamespace(crop=lambda prefix_len: None),
+        )
+
+
+def make_eo1_config():
+    from lerobot.policies.eo1.configuration_eo1 import EO1Config
+
+    return EO1Config(
+        device="cpu",
+        dtype="float32",
+        vlm_base="dummy-qwen",
+        vlm_config={},
+        chunk_size=CHUNK_SIZE,
+        n_action_steps=N_ACTION_STEPS,
+        max_state_dim=STATE_DIM,
+        max_action_dim=MAX_ACTION_DIM,
+        num_denoise_steps=2,
+        input_features={
+            OBS_STATE: PolicyFeature(type=FeatureType.STATE, shape=(STATE_DIM,)),
+            "observation.images.image": PolicyFeature(type=FeatureType.VISUAL, shape=(3, 16, 16)),
+        },
+        output_features={
+            ACTION: PolicyFeature(type=FeatureType.ACTION, shape=(ACTION_DIM,)),
+        },
+    )
+
+
+def make_policy_batch(include_action: bool) -> dict[str, torch.Tensor | int]:
+    batch_size = 1
+    seq_len = CHUNK_SIZE + 4
+    input_ids = torch.tensor(
+        [[11, STATE_TOKEN_ID, 12, ACTION_TOKEN_ID, ACTION_TOKEN_ID, ACTION_TOKEN_ID, 13]],
+        dtype=torch.long,
+    )
+    assert input_ids.shape == (batch_size, seq_len)
+
+    batch: dict[str, torch.Tensor | int] = {
+        OBS_STATE: torch.randn(batch_size, STATE_DIM, dtype=torch.float32),
+        "input_ids": input_ids,
+        "attention_mask": torch.ones(batch_size, seq_len, dtype=torch.long),
+        "pixel_values": torch.zeros(batch_size, 3, 4, 4, dtype=torch.float32),
+        "image_grid_thw": torch.tensor([[1, 2, 2]], dtype=torch.long),
+        "mm_token_type_ids": torch.zeros(batch_size, seq_len, dtype=torch.int32),
+        "state_token_id": STATE_TOKEN_ID,
+        "action_token_id": ACTION_TOKEN_ID,
+    }
+    if include_action:
+        batch[ACTION] = torch.randn(batch_size, CHUNK_SIZE, ACTION_DIM, dtype=torch.float32)
+    return batch
+
+
+def test_lerobot_eo1_forward_pass(monkeypatch):
+    monkeypatch.setattr(
+        "lerobot.policies.eo1.modeling_eo1.Qwen2_5_VLForConditionalGeneration.from_pretrained",
+        lambda *args, **kwargs: DummyVLMBackbone(HIDDEN_SIZE),
+    )
+    policy = EO1Policy(make_eo1_config())
+
+    loss, metrics = policy.forward(make_policy_batch(include_action=True))
+
+    assert loss.ndim == 0
+    assert torch.isfinite(loss)
+    assert metrics["loss"] == pytest.approx(loss.item())
+
+
+def test_lerobot_eo1_inference(monkeypatch):
+    monkeypatch.setattr(
+        "lerobot.policies.eo1.modeling_eo1.Qwen2_5_VLForConditionalGeneration.from_pretrained",
+        lambda *args, **kwargs: DummyVLMBackbone(HIDDEN_SIZE),
+    )
+    policy = EO1Policy(make_eo1_config())
+
+    sample_calls = {"count": 0}
+    fixed_chunk = torch.tensor(
+        [
+            [
+                [0.1, 0.2, 0.3, 9.0, 9.0, 9.0],
+                [1.1, 1.2, 1.3, 9.0, 9.0, 9.0],
+                [2.1, 2.2, 2.3, 9.0, 9.0, 9.0],
+            ]
+        ],
+        dtype=torch.float32,
+    )
+
+    def fake_sample_actions(**kwargs):
+        sample_calls["count"] += 1
+        return fixed_chunk
+
+    monkeypatch.setattr(policy.model, "sample_actions", fake_sample_actions)
+
+    batch = make_policy_batch(include_action=False)
+    action_0 = policy.select_action(batch)
+    action_1 = policy.select_action(batch)
+
+    torch.testing.assert_close(action_0, fixed_chunk[:, 0, :ACTION_DIM])
+    torch.testing.assert_close(action_1, fixed_chunk[:, 1, :ACTION_DIM])
+    assert sample_calls["count"] == 1
--- a/tests/processor/test_render_messages_processor.py
+++ b/tests/processor/test_render_messages_processor.py
@@ -1,60 +0,0 @@
-#!/usr/bin/env python
-
-import pytest
-
-pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
-
-import torch  # noqa: E402
-
-from lerobot.configs.recipe import MessageTurn, TrainingRecipe  # noqa: E402
-from lerobot.processor.converters import create_transition  # noqa: E402
-from lerobot.processor.render_messages_processor import RenderMessagesStep  # noqa: E402
-from lerobot.types import TransitionKey  # noqa: E402
-
-
-def test_render_messages_step_noops_without_language_columns():
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(role="user", content="${task}", stream="high_level"),
-            MessageTurn(role="assistant", content="${subtask}", stream="low_level", target=True),
-        ]
-    )
-    transition = create_transition(complementary_data={"task": "do it"})
-
-    assert RenderMessagesStep(recipe)(transition) == transition
-
-
-def test_render_messages_step_renders_and_drops_raw_language():
-    recipe = TrainingRecipe(
-        messages=[
-            MessageTurn(role="user", content="${task}", stream="high_level"),
-            MessageTurn(role="assistant", content="${subtask}", stream="low_level", target=True),
-        ]
-    )
-    transition = create_transition(
-        complementary_data={
-            "task": "do it",
-            "timestamp": torch.tensor(0.0),
-            "index": torch.tensor(7),
-            "language_persistent": [
-                {
-                    "role": "assistant",
-                    "content": "reach carefully",
-                    "style": "subtask",
-                    "timestamp": 0.0,
-                    "camera": None,
-                    "tool_calls": None,
-                }
-            ],
-            "language_events": [],
-        }
-    )
-
-    out = RenderMessagesStep(recipe)(transition)
-    data = out[TransitionKey.COMPLEMENTARY_DATA]
-
-    assert "language_persistent" not in data
-    assert "language_events" not in data
-    assert data["messages"][-1]["content"] == "reach carefully"
-    assert data["message_streams"] == ["high_level", "low_level"]
-    assert data["target_message_indices"] == [1]
--- a/tests/utils/test_collate.py
+++ b/tests/utils/test_collate.py
@@ -1,84 +0,0 @@
-#!/usr/bin/env python
-
-import pytest
-
-pytest.importorskip("datasets", reason="datasets is required (install lerobot[dataset])")
-
-import torch  # noqa: E402
-
-from lerobot.utils.collate import lerobot_collate_fn  # noqa: E402
-
-
-def test_lerobot_collate_preserves_messages_and_drops_raw_language():
-    batch = [
-        {
-            "index": torch.tensor(0),
-            "messages": [{"role": "assistant", "content": "a"}],
-            "message_streams": ["low_level"],
-            "target_message_indices": [0],
-            "language_persistent": [{"content": "raw"}],
-            "language_events": [],
-        },
-        {
-            "index": torch.tensor(1),
-            "messages": [{"role": "assistant", "content": "b"}],
-            "message_streams": ["low_level"],
-            "target_message_indices": [0],
-            "language_persistent": [{"content": "raw"}],
-            "language_events": [],
-        },
-    ]
-
-    out = lerobot_collate_fn(batch)
-
-    assert out["index"].tolist() == [0, 1]
-    assert out["messages"][0][0]["content"] == "a"
-    assert out["messages"][1][0]["content"] == "b"
-    assert out["message_streams"] == [["low_level"], ["low_level"]]
-    assert out["target_message_indices"] == [[0], [0]]
-    assert "language_persistent" not in out
-    assert "language_events" not in out
-
-
-def test_lerobot_collate_passes_through_standard_batch():
-    """On a non-language batch, the collate must match ``default_collate``.
-
-    Guards against silent regressions: ``lerobot_train.py`` only opts into
-    ``lerobot_collate_fn`` when the dataset declares language columns, but
-    if a future change ever wires it in unconditionally we want the
-    behavior to remain a transparent pass-through for ordinary tensor
-    batches.
-    """
-    from torch.utils.data._utils.collate import default_collate
-
-    batch = [
-        {
-            "observation.image": torch.zeros(3, 4, 4),
-            "action": torch.tensor([0.0, 1.0]),
-            "index": torch.tensor(0),
-        },
-        {
-            "observation.image": torch.ones(3, 4, 4),
-            "action": torch.tensor([2.0, 3.0]),
-            "index": torch.tensor(1),
-        },
-    ]
-
-    custom = lerobot_collate_fn(batch)
-    expected = default_collate(batch)
-
-    assert custom.keys() == expected.keys()
-    for key in expected:
-        assert torch.equal(custom[key], expected[key]), f"key={key} diverged"
-
-
-def test_lerobot_collate_drops_none_samples():
-    """Recipes that yielded no target message return ``None`` — those samples
-    must be filtered out, and an entirely-``None`` batch must collapse to ``None``.
-    """
-    batch = [None, {"index": torch.tensor(0)}, None]
-    out = lerobot_collate_fn(batch)
-    assert out is not None
-    assert out["index"].tolist() == [0]
-
-    assert lerobot_collate_fn([None, None]) is None
--- a/uv.lock
+++ b/uv.lock
@@ -960,7 +960,7 @@ name = "cuda-bindings"
 version = "12.9.4"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "cuda-pathfinder", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
+    { name = "cuda-pathfinder", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/a9/c1/dabe88f52c3e3760d861401bb994df08f672ec893b8f7592dc91626adcf3/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fda147a344e8eaeca0c6ff113d2851ffca8f7dfc0a6c932374ee5c47caa649c8", size = 12151019, upload-time = "2025-10-21T14:51:43.167Z" },
@@ -1047,7 +1047,7 @@ name = "decord"
 version = "0.6.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "numpy", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x') or (platform_machine != 's390x' and sys_platform != 'linux')" },
+    { name = "numpy", marker = "(platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l') or sys_platform != 'linux'" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/11/79/936af42edf90a7bd4e41a6cac89c913d4b47fa48a26b042d5129a9242ee3/decord-0.6.0-py3-none-manylinux2010_x86_64.whl", hash = "sha256:51997f20be8958e23b7c4061ba45d0efcd86bffd5fe81c695d0befee0d442976", size = 13602299, upload-time = "2021-06-14T21:30:55.486Z" },
@@ -2723,6 +2723,10 @@ dynamixel = [
    { name = "dynamixel-sdk" },
    { name = "pyserial" },
 ]
+eo1 = [
+    { name = "qwen-vl-utils" },
+    { name = "transformers" },
+]
 evaluation = [
    { name = "av" },
 ]
@@ -2937,7 +2941,7 @@ requires-dist = [
    { name = "av", marker = "extra == 'av-dep'", specifier = ">=15.0.0,<16.0.0" },
    { name = "cmake", specifier = ">=3.29.0.1,<4.2.0" },
    { name = "contourpy", marker = "extra == 'matplotlib-dep'", specifier = ">=1.3.0,<2.0.0" },
-    { name = "datasets", marker = "extra == 'dataset'", specifier = ">=4.7.0,<5.0.0" },
+    { name = "datasets", marker = "extra == 'dataset'", specifier = ">=4.0.0,<5.0.0" },
    { name = "debugpy", marker = "extra == 'dev'", specifier = ">=1.8.1,<1.9.0" },
    { name = "decord", marker = "(platform_machine == 'AMD64' and extra == 'groot') or (platform_machine == 'x86_64' and extra == 'groot')", specifier = ">=0.6.0,<1.0.0" },
    { name = "deepdiff", marker = "extra == 'deepdiff-dep'", specifier = ">=7.0.1,<9.0.0" },
@@ -3029,6 +3033,7 @@ requires-dist = [
    { name = "lerobot", extras = ["pyserial-dep"], marker = "extra == 'unitree-g1'" },
    { name = "lerobot", extras = ["pyzmq-dep"], marker = "extra == 'lekiwi'" },
    { name = "lerobot", extras = ["pyzmq-dep"], marker = "extra == 'unitree-g1'" },
+    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'eo1'" },
    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'sarm'" },
    { name = "lerobot", extras = ["qwen-vl-utils-dep"], marker = "extra == 'wallx'" },
    { name = "lerobot", extras = ["reachy2"], marker = "extra == 'all'" },
@@ -3043,6 +3048,7 @@ requires-dist = [
    { name = "lerobot", extras = ["smolvla"], marker = "extra == 'all'" },
    { name = "lerobot", extras = ["test"], marker = "extra == 'all'" },
    { name = "lerobot", extras = ["training"], marker = "extra == 'all'" },
+    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'eo1'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'groot'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'hilserl'" },
    { name = "lerobot", extras = ["transformers-dep"], marker = "extra == 'libero'" },
@@ -3112,7 +3118,7 @@ requires-dist = [
    { name = "transformers", marker = "extra == 'transformers-dep'", specifier = ">=5.4.0,<5.6.0" },
    { name = "wandb", marker = "extra == 'training'", specifier = ">=0.24.0,<0.25.0" },
 ]
-provides-extras = ["dataset", "training", "hardware", "viz", "core-scripts", "evaluation", "dataset-viz", "av-dep", "pygame-dep", "placo-dep", "transformers-dep", "grpcio-dep", "can-dep", "peft-dep", "scipy-dep", "diffusers-dep", "qwen-vl-utils-dep", "matplotlib-dep", "pyserial-dep", "deepdiff-dep", "pynput-dep", "pyzmq-dep", "feetech", "dynamixel", "damiao", "robstride", "openarms", "gamepad", "hopejr", "lekiwi", "unitree-g1", "reachy2", "kinematics", "intelrealsense", "phone", "diffusion", "wallx", "pi", "smolvla", "multi-task-dit", "groot", "sarm", "xvla", "hilserl", "async", "peft", "dev", "notebook", "test", "video-benchmark", "aloha", "pusht", "libero", "metaworld", "all"]
+provides-extras = ["dataset", "training", "hardware", "viz", "core-scripts", "evaluation", "dataset-viz", "av-dep", "pygame-dep", "placo-dep", "transformers-dep", "grpcio-dep", "can-dep", "peft-dep", "scipy-dep", "diffusers-dep", "qwen-vl-utils-dep", "matplotlib-dep", "pyserial-dep", "deepdiff-dep", "pynput-dep", "pyzmq-dep", "feetech", "dynamixel", "damiao", "robstride", "openarms", "gamepad", "hopejr", "lekiwi", "unitree-g1", "reachy2", "kinematics", "intelrealsense", "phone", "diffusion", "wallx", "pi", "smolvla", "multi-task-dit", "groot", "sarm", "xvla", "eo1", "hilserl", "async", "peft", "dev", "notebook", "test", "video-benchmark", "aloha", "pusht", "libero", "metaworld", "all"]

 [[package]]
 name = "librt"
@@ -3988,7 +3994,7 @@ name = "nvidia-cudnn-cu12"
 version = "9.10.2.21"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "nvidia-cublas-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
+    { name = "nvidia-cublas-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/ba/51/e123d997aa098c61d029f76663dedbfb9bc8dcf8c60cbd6adbe42f76d049/nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:949452be657fa16687d0930933f032835951ef0892b37d2d53824d1a84dc97a8", size = 706758467, upload-time = "2025-06-06T21:54:08.597Z" },
@@ -3999,7 +4005,7 @@ name = "nvidia-cufft-cu12"
 version = "11.3.3.83"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
+    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/1f/13/ee4e00f30e676b66ae65b4f08cb5bcbb8392c03f54f2d5413ea99a5d1c80/nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d2dd21ec0b88cf61b62e6b43564355e5222e4a3fb394cac0db101f2dd0d4f74", size = 193118695, upload-time = "2025-03-07T01:45:27.821Z" },
@@ -4026,9 +4032,9 @@ name = "nvidia-cusolver-cu12"
 version = "11.7.3.90"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "nvidia-cublas-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
-    { name = "nvidia-cusparse-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
-    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
+    { name = "nvidia-cublas-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
+    { name = "nvidia-cusparse-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
+    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/85/48/9a13d2975803e8cf2777d5ed57b87a0b6ca2cc795f9a4f59796a910bfb80/nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:4376c11ad263152bd50ea295c05370360776f8c3427b30991df774f9fb26c450", size = 267506905, upload-time = "2025-03-07T01:47:16.273Z" },
@@ -4039,7 +4045,7 @@ name = "nvidia-cusparse-cu12"
 version = "12.5.8.93"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and platform_machine != 's390x' and sys_platform == 'linux'" },
+    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'aarch64' and platform_machine != 'arm64' and platform_machine != 'armv7l' and sys_platform == 'linux'" },
 ]
 wheels = [
    { url = "https://files.pythonhosted.org/packages/c2/f5/e1854cb2f2bcd4280c44736c93550cc300ff4b8c95ebe370d0aa7d2b473d/nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1ec05d76bbbd8b61b06a80e1eaf8cf4959c3d4ce8e711b65ebd0443bb0ebb13b", size = 288216466, upload-time = "2025-03-07T01:48:13.779Z" },
@@ -4881,10 +4887,10 @@ name = "pyobjc-framework-applicationservices"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-cocoa", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-coretext", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-quartz", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
+    { name = "pyobjc-framework-cocoa", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
+    { name = "pyobjc-framework-coretext", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
+    { name = "pyobjc-framework-quartz", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/be/6a/d4e613c8e926a5744fc47a9e9fea08384a510dc4f27d844f7ad7a2d793bd/pyobjc_framework_applicationservices-12.1.tar.gz", hash = "sha256:c06abb74f119bc27aeb41bf1aef8102c0ae1288aec1ac8665ea186a067a8945b", size = 103247, upload-time = "2025-11-14T10:08:52.18Z" }
 wheels = [
@@ -4900,7 +4906,7 @@ name = "pyobjc-framework-cocoa"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/02/a3/16ca9a15e77c061a9250afbae2eae26f2e1579eb8ca9462ae2d2c71e1169/pyobjc_framework_cocoa-12.1.tar.gz", hash = "sha256:5556c87db95711b985d5efdaaf01c917ddd41d148b1e52a0c66b1a2e2c5c1640", size = 2772191, upload-time = "2025-11-14T10:13:02.069Z" }
 wheels = [
@@ -4916,9 +4922,9 @@ name = "pyobjc-framework-coretext"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-cocoa", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-quartz", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
+    { name = "pyobjc-framework-cocoa", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
+    { name = "pyobjc-framework-quartz", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/29/da/682c9c92a39f713bd3c56e7375fa8f1b10ad558ecb075258ab6f1cdd4a6d/pyobjc_framework_coretext-12.1.tar.gz", hash = "sha256:e0adb717738fae395dc645c9e8a10bb5f6a4277e73cba8fa2a57f3b518e71da5", size = 90124, upload-time = "2025-11-14T10:14:38.596Z" }
 wheels = [
@@ -4934,8 +4940,8 @@ name = "pyobjc-framework-quartz"
 version = "12.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "pyobjc-core", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
-    { name = "pyobjc-framework-cocoa", marker = "(platform_machine != 's390x' and sys_platform == 'win32') or (sys_platform != 'emscripten' and sys_platform != 'linux' and sys_platform != 'win32')" },
+    { name = "pyobjc-core", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
+    { name = "pyobjc-framework-cocoa", marker = "sys_platform != 'emscripten' and sys_platform != 'linux'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/94/18/cc59f3d4355c9456fc945eae7fe8797003c4da99212dd531ad1b0de8a0c6/pyobjc_framework_quartz-12.1.tar.gz", hash = "sha256:27f782f3513ac88ec9b6c82d9767eef95a5cf4175ce88a1e5a65875fee799608", size = 3159099, upload-time = "2025-11-14T10:21:24.31Z" }
 wheels = [
Author	SHA1	Message	Date
Maximellerbach	6021554770	chore(rollout): nice collored cli	2026-05-07 11:12:02 +02:00
Haoming Song	e99c55af4b	feat(policies): add EO-1 model (#3403 ) * feat(policies): add EO-1 model * chore(eo1): adjust policy_eo1_README.md to to avoid duplicate with eo1.mdx * chore(eo1): remove policy_eo1_README.md, link eo1.mdx in policy folder --------- Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>	2026-05-06 18:01:16 +02:00
Steven Palma	408e0ca763	fix(robots): openarm features with openarmmini (#3524 )	2026-05-06 17:03:09 +02:00