
# LeRobotDataset v3.0

`LeRobotDataset v3.0` is a standardized format for robot learning data. It provides unified access to multi-modal time-series data (sensorimotor signals and multi‑camera video), along with rich metadata for indexing, search, and visualization on the Hugging Face Hub.

This guide shows you how to:

- Understand the v3.0 design and directory layout
- Record a dataset and push it to the Hub
- Load datasets for training with `LeRobotDataset`
- Stream datasets without downloading using `StreamingLeRobotDataset`
- Apply image transforms for data augmentation during training
- Migrate existing `v2.1` datasets to `v3.0`
- Experiment with other `LeRobotDataset` formats and implementations like Lance

## What’s new in `v3`

- **File-based storage**: Many episodes per Parquet/MP4 file (v2 used one file per episode).
- **Relational metadata**: Episode boundaries and lookups are resolved through metadata, not filenames.
- **Hub-native streaming**: Consume datasets directly from the Hub with `StreamingLeRobotDataset`.
- **Lower file-system pressure**: Fewer, larger files ⇒ faster initialization and fewer issues at scale.
- **Unified organization**: Clean directory layout with consistent path templates across data and videos.

## Installation

`LeRobotDataset v3.0` will be included in `lerobot >= 0.4.0`.

Until that stable release, you can use the main branch by following the [build from source instructions](./installation#from-source).

## Record a dataset

Run the command below to record a dataset with the SO-101 and push to the Hub:

```bash
lerobot-record \
  --robot.type=so101_follower \
  --robot.port=/dev/tty.usbmodem585A0076841 \
  --robot.id=my_awesome_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 1920, height: 1080, fps: 30}}" \
  --teleop.type=so101_leader \
  --teleop.port=/dev/tty.usbmodem58760431551 \
  --teleop.id=my_awesome_leader_arm \
  --display_data=true \
  --dataset.repo_id=${HF_USER}/record-test \
  --dataset.num_episodes=5 \
  --dataset.single_task="Grab the black cube" \
  --dataset.streaming_encoding=true \
  --dataset.encoder_threads=2
# Optional extra flag (a comment cannot sit between continued lines):
#   --dataset.camera_encoder.vcodec=auto
```

See the [recording guide](./il_robots#record-a-dataset) for more details.

## Format design

A core v3 principle is **decoupling storage from the user API**: data is stored efficiently (few large files), while the public API exposes intuitive episode-level access.

`v3` has three pillars:

1. **Tabular data**: Low‑dimensional, high‑frequency signals (states, actions, timestamps) stored in **Apache Parquet**. Access is memory‑mapped or streamed via the `datasets` stack.
2. **Visual data**: Camera frames concatenated and encoded into **MP4**. Frames from the same episode are grouped; videos are sharded per camera for practical sizes.
3. **Metadata**: JSON/Parquet records describing schema (feature names, dtypes, shapes), frame rates, normalization stats, and **episode segmentation** (start/end offsets into shared Parquet/MP4 files).
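
To make the episode segmentation idea concrete, here is a minimal sketch of how an episode view could be rebuilt from offsets into shared files. The record fields (`file_index`, `start`, `stop`) and the `episode_rows` helper are hypothetical names chosen for illustration, and plain lists stand in for Parquet shards; this is not the library's actual schema or code.

```python
# Hypothetical per-episode metadata: which shared file holds the episode and
# which half-open row range [start, stop) inside that file belongs to it.
episodes = [
    {"episode_index": 0, "file_index": 0, "start": 0, "stop": 120},
    {"episode_index": 1, "file_index": 0, "start": 120, "stop": 300},
    {"episode_index": 2, "file_index": 1, "start": 0, "stop": 90},
]

# Stand-in for two concatenated shards: plain lists of row ids.
shards = {0: list(range(300)), 1: list(range(90))}


def episode_rows(shards, record):
    """Return the rows of one episode by slicing its shared shard."""
    return shards[record["file_index"]][record["start"]:record["stop"]]


rows = episode_rows(shards, episodes[1])
print(len(rows), rows[0], rows[-1])  # 180 120 299
```

The file boundary carries no meaning here: episode 1 is recovered purely from its metadata record, even though it shares a file with episode 0.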

> To scale to millions of episodes, tabular rows and video frames from multiple episodes are **concatenated** into larger files. Episode‑specific views are reconstructed **via metadata**, not file boundaries.

<div style="display:flex; justify-content:center; gap:12px; flex-wrap:wrap;">
  <figure style="margin:0; text-align:center;">
    <img
      src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobotdataset-v3/asset1datasetv3.png"
      alt="LeRobotDataset v3 diagram"
      width="220"
    />
    <figcaption style="font-size:0.9em; color:#666;">
      From episode‑based to file‑based datasets
    </figcaption>
  </figure>
</div>

### Directory layout (simplified)

- **`meta/info.json`**: canonical schema (features, shapes/dtypes), FPS, codebase version, and **path templates** to locate data/video shards.
- **`meta/stats.json`**: global feature statistics (mean/std/min/max) used for normalization; exposed as `dataset.meta.stats`.
- **`meta/tasks.jsonl`**: natural‑language task descriptions mapped to integer IDs for task‑conditioned policies.
- **`meta/episodes/`**: per‑episode records (lengths, tasks, offsets) stored as **chunked Parquet** for scalability.
- **`data/`**: frame‑by‑frame **Parquet** shards; each file typically contains **many episodes**.
- **`videos/`**: **MP4** shards per camera; each file typically contains **many episodes**.
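
As a rough illustration of how path templates locate a shard, the sketch below formats a template with chunk and file indices. The template strings and the helper names are assumptions made for this example; always read the real templates from your dataset's `meta/info.json`.

```python
# Assumed template strings, for illustration only.
info = {
    "data_path": "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet",
    "video_path": "videos/{video_key}/chunk-{chunk_index:03d}/file-{file_index:03d}.mp4",
}


def resolve_data_path(info, chunk_index, file_index):
    """Fill the (assumed) data template with shard indices."""
    return info["data_path"].format(chunk_index=chunk_index, file_index=file_index)


def resolve_video_path(info, video_key, chunk_index, file_index):
    """Fill the (assumed) video template with a camera key and shard indices."""
    return info["video_path"].format(
        video_key=video_key, chunk_index=chunk_index, file_index=file_index
    )


data_file = resolve_data_path(info, 0, 2)
video_file = resolve_video_path(info, "observation.images.front_left", 0, 2)
print(data_file)   # data/chunk-000/file-002.parquet
print(video_file)  # videos/observation.images.front_left/chunk-000/file-002.mp4
```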

## Load a dataset for training

`LeRobotDataset` returns Python dictionaries of PyTorch tensors and integrates with `torch.utils.data.DataLoader`. Here is a code example showing its use:

```python
import torch
from lerobot.datasets import LeRobotDataset

repo_id = "yaak-ai/L2D-v3"

# 1) Load from the Hub (cached locally)
dataset = LeRobotDataset(repo_id)

# 2) Random access by index
sample = dataset[100]
print(sample)
# {
#   'observation.state': tensor([...]),
#   'action': tensor([...]),
#   'observation.images.front_left': tensor([C, H, W]),
#   'timestamp': tensor(1.234),
#   ...
# }

# 3) Temporal windows via delta_timestamps (seconds relative to t)
delta_timestamps = {
    "observation.images.front_left": [-0.2, -0.1, 0.0]  # frames at t-0.2s, t-0.1s, and t
}

dataset = LeRobotDataset(repo_id, delta_timestamps=delta_timestamps)

# Accessing an index now returns a stack for the specified key(s)
sample = dataset[100]
print(sample["observation.images.front_left"].shape)  # [T, C, H, W], where T=3

# 4) Wrap with a DataLoader for training
batch_size = 16
data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size)

device = "cuda" if torch.cuda.is_available() else "cpu"
for batch in data_loader:
    observations = batch["observation.state"].to(device)
    actions = batch["action"].to(device)
    images = batch["observation.images.front_left"].to(device)
    # model.forward(batch)
```
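
To build intuition for `delta_timestamps`, the sketch below converts second offsets into frame indices for a given frame rate. The helper `delta_frame_indices` is hypothetical, and clamping at episode boundaries is an assumption made for this example; it is not a description of how `LeRobotDataset` handles boundaries internally.

```python
def delta_frame_indices(current_index, deltas_s, fps, episode_start, episode_stop):
    """Map time offsets (seconds) to frame indices within [episode_start, episode_stop)."""
    indices = []
    for delta in deltas_s:
        target = current_index + round(delta * fps)
        # Assumption: keep the index inside the episode by clamping.
        clamped = min(max(target, episode_start), episode_stop - 1)
        indices.append(clamped)
    return indices


window = delta_frame_indices(100, [-0.2, -0.1, 0.0], fps=10, episode_start=0, episode_stop=500)
print(window)  # [98, 99, 100]

near_start = delta_frame_indices(1, [-0.2, -0.1, 0.0], fps=10, episode_start=0, episode_stop=500)
print(near_start)  # [0, 0, 1]
```

At 10 FPS, an offset of 0.1 s corresponds to exactly one frame, so offsets should generally be multiples of `1 / fps`.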

## Stream a dataset (no downloads)

Use `StreamingLeRobotDataset` to iterate directly from the Hub without local copies. This lets you stream large datasets without downloading them to disk or loading them into memory, and is a key feature of the new dataset format.

```python
from lerobot.datasets import StreamingLeRobotDataset

repo_id = "yaak-ai/L2D-v3"
dataset = StreamingLeRobotDataset(repo_id)  # streams directly from the Hub
```

<div style="display:flex; justify-content:center; gap:12px; flex-wrap:wrap;">
  <figure style="margin:0; text-align:center;">
    <img
      src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobotdataset-v3/streaming-lerobot.png"
      alt="StreamingLeRobotDataset"
      width="520"
    />
    <figcaption style="font-size:0.9em; color:#666;">
      Stream directly from the Hub for on‑the‑fly training.
    </figcaption>
  </figure>
</div>

## Image transforms

Image transforms are data augmentations applied to camera frames during training to improve model robustness and generalization. LeRobot supports various transforms including brightness, contrast, saturation, hue, and sharpness adjustments.

### Using transforms during dataset creation/recording

Currently, transforms are applied during **training time only**, not during recording. When you create or record a dataset, the raw images are stored without transforms. This allows you to experiment with different augmentations later without re-recording data.

### Adding transforms to existing datasets (API)

Use the `image_transforms` parameter when loading a dataset for training:

```python
from lerobot.datasets import LeRobotDataset
from lerobot.transforms import ImageTransforms, ImageTransformsConfig, ImageTransformConfig

# Option 1: Use default transform configuration (disabled by default)
transforms_config = ImageTransformsConfig(
    enable=True,  # Enable transforms
    max_num_transforms=3,  # Apply up to 3 transforms per frame
    random_order=False,  # Apply in standard order
)
transforms = ImageTransforms(transforms_config)

dataset = LeRobotDataset(
    repo_id="your-username/your-dataset",
    image_transforms=transforms
)

# Option 2: Create custom transform configuration
custom_transforms_config = ImageTransformsConfig(
    enable=True,
    max_num_transforms=2,
    random_order=True,
    tfs={
        "brightness": ImageTransformConfig(
            weight=1.0,
            type="ColorJitter",
            kwargs={"brightness": (0.7, 1.3)}  # Adjust brightness range
        ),
        "contrast": ImageTransformConfig(
            weight=2.0,  # Higher weight = more likely to be selected
            type="ColorJitter",
            kwargs={"contrast": (0.8, 1.2)}
        ),
        "sharpness": ImageTransformConfig(
            weight=0.5,  # Lower weight = less likely to be selected
            type="SharpnessJitter",
            kwargs={"sharpness": (0.3, 2.0)}
        ),
    }
)

dataset = LeRobotDataset(
    repo_id="your-username/your-dataset",
    image_transforms=ImageTransforms(custom_transforms_config)
)

# Option 3: Use pure torchvision transforms
from torchvision.transforms import v2

torchvision_transforms = v2.Compose([
    v2.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
    v2.GaussianBlur(kernel_size=3, sigma=(0.1, 2.0)),
])

dataset = LeRobotDataset(
    repo_id="your-username/your-dataset",
    image_transforms=torchvision_transforms
)
```

### Available transform types

LeRobot provides several transform types:

- **`ColorJitter`**: Adjusts brightness, contrast, saturation, and hue
- **`SharpnessJitter`**: Randomly adjusts image sharpness
- **`Identity`**: No transformation (useful for testing)

You can also use any `torchvision.transforms.v2` transform by passing it directly to the `image_transforms` parameter.

### Configuration options

- **`enable`**: Enable/disable transforms (default: `False`)
- **`max_num_transforms`**: Maximum number of transforms applied per frame (default: `3`)
- **`random_order`**: Apply transforms in random order vs. standard order (default: `False`)
- **`weight`**: Relative sampling weight for each transform (higher = more likely to be selected). Weights that do not sum to 1 are normalized.
- **`kwargs`**: Transform-specific parameters (e.g., brightness range)

### Visualizing transforms

Use the visualization script to preview how transforms affect your data:

```bash
lerobot-imgtransform-viz \
  --repo-id=your-username/your-dataset \
  --output-dir=./transform_examples \
  --n-examples=5
```

This saves example images showing the effect of each transform, helping you tune parameters.

### Best practices

- **Start conservative**: Begin with small ranges (e.g., brightness 0.9-1.1) and increase gradually
- **Test first**: Use the visualization script to ensure transforms look reasonable
- **Monitor training**: Strong augmentations can hurt performance if too aggressive
- **Match your domain**: If your robot operates in varying lighting, use brightness/contrast transforms
- **Combine wisely**: Using too many transforms simultaneously can make training unstable

## Migrate `v2.1` → `v3.0`

A converter aggregates per‑episode files into larger shards and writes episode offsets/metadata. Convert your dataset using the instructions below.

```bash
# Pre-release build with v3 support:
pip install "https://github.com/huggingface/lerobot/archive/33cad37054c2b594ceba57463e8f11ee374fa93c.zip"

# Convert an existing v2.1 dataset hosted on the Hub:
python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=<HF_USER/DATASET_ID>
```

**What it does**

- Aggregates parquet files: `episode-0000.parquet`, `episode-0001.parquet`, … → **`file-0000.parquet`**, …
- Aggregates mp4 files: `episode-0000.mp4`, `episode-0001.mp4`, … → **`file-0000.mp4`**, …
- Updates `meta/episodes/*` (chunked Parquet) with per‑episode lengths, tasks, and byte/frame offsets.
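
The aggregation step can be pictured with the following sketch, which packs per-episode lengths into larger files and records each episode's offsets. The `plan_shards` helper and the frame-count budget are assumptions for illustration; the real converter may use a different sharding criterion and different field names.

```python
def plan_shards(episode_lengths, max_frames_per_file):
    """Assign each episode to a file and record its half-open row range."""
    plan = []
    file_index, rows_in_file = 0, 0
    for episode_index, length in enumerate(episode_lengths):
        # Start a new file when the current one would exceed the budget.
        if rows_in_file > 0 and rows_in_file + length > max_frames_per_file:
            file_index += 1
            rows_in_file = 0
        plan.append(
            {
                "episode_index": episode_index,
                "file_index": file_index,
                "start": rows_in_file,
                "stop": rows_in_file + length,
            }
        )
        rows_in_file += length
    return plan


plan = plan_shards([120, 180, 90, 200], max_frames_per_file=300)
for record in plan:
    print(record)
```

With these inputs, the first two episodes share file 0 (rows 0 to 120 and 120 to 300), and the last two share file 1 (rows 0 to 90 and 90 to 290).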

## Common Issues

### Always call `finalize()` before pushing

When creating or recording datasets, you **must** call `dataset.finalize()` to properly close the Parquet writers. See [PR #1903](https://github.com/huggingface/lerobot/pull/1903) for more details.

```python
from lerobot.datasets import LeRobotDataset

# Create dataset and record episodes
dataset = LeRobotDataset.create(...)

# `num_episodes` and `episode_data` are placeholders for your own recording loop
for _ in range(num_episodes):
    # Record frames
    for frame in episode_data:
        dataset.add_frame(frame)
    dataset.save_episode()

# Call finalize() when done recording and before push_to_hub()
dataset.finalize()  # Closes parquet writers, writes metadata footers
dataset.push_to_hub()
```

**Why is this necessary?**

Dataset v3.0 uses incremental parquet writing with buffered metadata for efficiency. The `finalize()` method:

- Flushes any buffered episode metadata to disk
- Closes the Parquet writers so that footer metadata is written; without the footer, the Parquet files are corrupt
- Ensures the dataset is valid for loading

Without calling `finalize()`, your parquet files will be incomplete and the dataset won't load properly.
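
One way to make the call hard to forget is a small context manager that runs `finalize()` even if recording raises. This is a sketch, not part of LeRobot: `finalizing` is a hypothetical helper that only assumes the object has a `finalize()` method, and `FakeDataset` is a stand-in used to demonstrate it.

```python
from contextlib import contextmanager


@contextmanager
def finalizing(dataset):
    """Yield the dataset and always call finalize() on exit."""
    try:
        yield dataset
    finally:
        dataset.finalize()


class FakeDataset:
    """Stand-in object; a real dataset would close its writers here."""

    def __init__(self):
        self.finalized = False

    def finalize(self):
        self.finalized = True


fake = FakeDataset()
with finalizing(fake) as ds:
    pass  # record episodes here
print(fake.finalized)  # True

failed = FakeDataset()
try:
    with finalizing(failed):
        raise RuntimeError("recording interrupted")
except RuntimeError:
    pass
print(failed.finalized)  # True
```

Whether finalizing after a failure is the right choice depends on your workflow; the sketch assumes you want already-saved episodes to remain loadable.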

## Other formats and implementations

### Lance

Lance is a useful format for multimodal AI datasets, especially for large-scale training that requires high-performance I/O and random access.

The `lerobot-lancedb` package implements `LeRobotLanceDataset` (for JPEG images) and `LeRobotLanceVideoDataset` (for MP4 videos).
Both classes subclass `LeRobotDataset` and can speed up data loading.

`LeRobotLanceDataset` is a drop-in replacement for `LeRobotDataset`:

```python
from lerobot.datasets import LeRobotDatasetMetadata
from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
from lerobot_lancedb import LeRobotLanceDataset, LeRobotLanceVideoDataset

cfg = DiffusionConfig(...)
meta = LeRobotDatasetMetadata(root=local_dataset_path)  # or use repo_id=... to load metadata from the Hub
delta_timestamps = {...}

# Use LeRobotLanceDataset for image datasets
dataset = LeRobotLanceDataset(
    root=local_dataset_path,                            # or use repo_id=... to stream from the Hub
    delta_timestamps=delta_timestamps,
    return_uint8=True,
)
# Or use LeRobotLanceVideoDataset for video datasets:
dataset = LeRobotLanceVideoDataset(
    root=local_dataset_path,                            # or use repo_id=... to stream from the Hub
    delta_timestamps=delta_timestamps,
    return_uint8=True,
)
```

Join the discussion on [GitHub](https://github.com/huggingface/lerobot/issues/3608) and explore the [`lerobot-lancedb` documentation](https://lancedb.github.io/lerobot-lancedb/).
-												docs(dataset): add dataset v3 documentation (#1956)

* add v3 doc

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* update changes

* iterate on review

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add changes

* create dataset section

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
											
										
										
											2025-09-16 17:45:38 +02:00
+								# LeRobotDataset v3.0
 								`LeRobotDataset v3.0` is a standardized format for robot learning data. It provides unified access to multi-modal time-series data, sensorimotor signals and multi‑camera video, as well as rich metadata for indexing, search, and visualization on the Hugging Face Hub.
 								This docs will guide you to:
 								- Understand the v3.0 design and directory layout
 								- Record a dataset and push it to the Hub
 								- Load datasets for training with `LeRobotDataset`
 								- Stream datasets without downloading using `StreamingLeRobotDataset`
-												Add docs for LeRobot Image transforms (#1972)

* Remove unused scripts, add docs for image transforms and add example

* fix(examples): move train_policy.py under examples, remove outdated readme parts

* remove script thats copied to train folder

* remove outdated links to examples and example tests
											
										
										
											2025-09-19 15:19:49 +02:00
+								- Apply image transforms for data augmentation during training
-												docs(dataset): add dataset v3 documentation (#1956)

* add v3 doc

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* update changes

* iterate on review

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add changes

* create dataset section

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
											
										
										
											2025-09-16 17:45:38 +02:00
+								- Migrate existing `v2.1` datasets to `v3.0`
-												Mention the new Lance LeRobotDataset implementation in the docs (#3609)

* Enhance documentation with Lance format details

Added information about Lance format and `lerobot-lancedb` package for multimodal AI datasets.

Signed-off-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
											
										
										
											2026-05-18 14:51:26 +02:00
+								- Experiment with other `LeRobotDataset` formats and implementations like Lance
-												docs(dataset): add dataset v3 documentation (#1956)

* add v3 doc

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* update changes

* iterate on review

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add changes

* create dataset section

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
											
										
										
											2025-09-16 17:45:38 +02:00
 								## What’s new in `v3`
 								- **File-based storage**: Many episodes per Parquet/MP4 file (v2 used one file per episode).
 								- **Relational metadata**: Episode boundaries and lookups are resolved through metadata, not filenames.
 								- **Hub-native streaming**: Consume datasets directly from the Hub with `StreamingLeRobotDataset`.
 								- **Lower file-system pressure**: Fewer, larger files ⇒ faster initialization and fewer issues at scale.
 								- **Unified organization**: Clean directory layout with consistent path templates across data and videos.
 								## Installation
 								`LeRobotDataset v3.0` will be included in `lerobot >= 0.4.0`.
 								Until that stable release, you can use the main branch by following the [build from source instructions](./installation#from-source).
 								## Record a dataset
 								Run the command below to record a dataset with the SO-101 and push to the Hub:
 								```bash
 								lerobot-record \
 								  --robot.type=so101_follower \
 								  --robot.port=/dev/tty.usbmodem585A0076841 \
 								  --robot.id=my_awesome_follower_arm \
 								  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 1920, height: 1080, fps: 30}}" \
 								  --teleop.type=so101_leader \
 								  --teleop.port=/dev/tty.usbmodem58760431551 \
 								  --teleop.id=my_awesome_leader_arm \
 								  --display_data=true \
 								  --dataset.repo_id=${HF_USER}/record-test \
 								  --dataset.num_episodes=5 \
-												feat(dataset): add streaming video encoding + HW encoder support (#2974)

* feat(dataset): init stream encoding

* feat(dataset): use threads to fix frame pickle latency

* refactor(dataset): remove HW encoded related changes

* add lp (#2977)

* feat(dataset): add Hw encoding + log drop frames (#2978)

* chore(docs): add streaming video encoding guide

* fix(dataset): style docs + testing

* chore(docs): simplify sttreaming video encoding guide

* chore(dataset): add commands + streaming encoding default false + print note if false + queue default is now 30

* chore(docs): add verification note advice

* chore(dataset): adjusting defaults & docs for streaming encoding

* docs(scripts): improve docstrings

* test(dataset): polish streaming encoding tests

* chore(dataset): move FYI log related to streaming

* chore(dataset): add arg vcodec to suggestions

* refactor(dataset): better handling for auto and available vcodec

* chore(dataset): change log level

* docs(dataset): add note related to training performance vcodec

* docs(dataset): add more notes to streaming encoding

---------

Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Co-authored-by: Pepijn <pepijn@huggingface.co>

											
										
										
											2026-02-23 13:57:43 +01:00
+								  --dataset.single_task="Grab the black cube" \
 								  --dataset.streaming_encoding=true \
-												feat(encoding parameters): adding support for user provided video encoding parameters  (#3455)

* chore(video backend): renaming codec into video_backend in get_safe_default_video_backend()

* feat(pyav utils): adding suport for PyAV encoding parameters validation

* feat(VideoEncoderConfig): creating a VideoEncoderConfig to encapsulate encoding parameters

* feat(VideoEncoderConfig): propagating the VideoEncoderConfig in the codebase

* chore(docs): updating the docs

* feat(metadata): adding encoding parameters in dataset metadata

* fix(concatenation compatibility): adding compatibility check when concatenating video files

* feat(VideoEncoderConfig init): making VideoEncoderConfig more robust and adaptable to multiple backends

* feat(pyav checks): making pyav parameters checks more robust

* chore(duplicate): removing duplicate get_codec_options definition

* test(existing): adapting existing tests

* test(new): adding new tests for encoding related features

* chore(format): fixing formatting issues

* chore(PyAV): cleaning up PyAV utils and encoding parameters checks to stick to the minimun required tooling.

* chore(format): formatting code

* chore(doctrings): updating docstrings

* fix(camera_encoder_config): Removing camera_encoder_config from LeRobotDataset, as it's only required in LeRobotDatasetWriter.

* feat(default values): applying a consistent naming convention for default RGB cameras video encoder parameters

* fix(rollout): propagating VideoEncoderConfig to the latest recording modes

* chore(format): formatting code, fixing error messages and variable names

* fix(arguments order): reverting changes in arguments order in StreamingVideoEncoder

* chore(relative imports): switching to relative local imports within lerobot.datasets

* test(artifacts): cleaning up artifacts for the video encoding tests

* chore(docs): updating docs

* chore(fromat): formatting code

* fix(imports): refactoring the file architecture to avoid circular imports. VideoEncoderConfig is now defined in lerobot.configs and lazily imports av at runtime.

* fix(typos): fixing typos and small mistakes

* test(factories): updating factories

* feat(aggregate): updating dataset aggregation procedure. Encoding tuning paramters (crf, g,...) are ignored for validation and changed to None in the aggregated dataset if incompatible.

* docs(typos): fixing typos

* fix(deletion): reverting unwanted deletion

* fix(typos): fixing multiple typos

* feat(codec options): passing codec options to lerobot_edit_dataset episode deletion tool

* typo(typo): typo

* fix(typos): fixing remaining typos

* chore(rename): renaming camera_encoder_config to camera_encoder

* docs(clean): cleaning and formating docs

* docs(dataset): addind details about datasets

* chore(format): formatting code

* docs(warning): adding warning regarding encoding parameters modification

* fix(re-encoding): removing inconsistent re-encoding option in lerobot_edit_dataset

* typos(typos): typos

* chore(format): resolving prettier issues

* fix(h264_nvenc): fixing crf handling for h264_nvenc

* docs(clean): removing too technical parts of the docs

* fix(imports): fixing imports at the __init__ level

* fix(imports): fixing not very pretty imports in video config file
											
										
										
											2026-05-14 23:46:42 +02:00
+								  # --dataset.camera_encoder.vcodec=auto \
-												feat(dataset): add streaming video encoding + HW encoder support (#2974)

* feat(dataset): init stream encoding

* feat(dataset): use threads to fix frame pickle latency

* refactor(dataset): remove HW encoded related changes

* add lp (#2977)

* feat(dataset): add Hw encoding + log drop frames (#2978)

* chore(docs): add streaming video encoding guide

* fix(dataset): style docs + testing

* chore(docs): simplify sttreaming video encoding guide

* chore(dataset): add commands + streaming encoding default false + print note if false + queue default is now 30

* chore(docs): add verification note advice

* chore(dataset): adjusting defaults & docs for streaming encoding

* docs(scripts): improve docstrings

* test(dataset): polish streaming encoding tests

* chore(dataset): move FYI log related to streaming

* chore(dataset): add arg vcodec to suggestions

* refactor(dataset): better handling for auto and available vcodec

* chore(dataset): change log level

* docs(dataset): add note related to training performance vcodec

* docs(dataset): add more notes to streaming encoding

---------

Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
Co-authored-by: Pepijn <pepijn@huggingface.co>

											
										
										
											2026-02-23 13:57:43 +01:00
+								  --dataset.encoder_threads=2
-												docs(dataset): add dataset v3 documentation (#1956)

* add v3 doc

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* update changes

* iterate on review

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add changes

* create dataset section

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

* Update docs/source/lerobot-dataset-v3.mdx

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

---------

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>
											
										
										
											2025-09-16 17:45:38 +02:00
+								```
 								See the [recording guide](./il_robots#record-a-dataset) for more details.
 								## Format design
 								A core v3 principle is **decoupling storage from the user API**: data is stored efficiently (few large files), while the public API exposes intuitive episode-level access.
 								`v3` has three pillars:
 . **Tabular data**: Low‑dimensional, high‑frequency signals (states, actions, timestamps) stored in **Apache Parquet**. Access is memory‑mapped or streamed via the `datasets` stack.
 . **Visual data**: Camera frames concatenated and encoded into **MP4**. Frames from the same episode are grouped; videos are sharded per camera for practical sizes.
 . **Metadata**: JSON/Parquet records describing schema (feature names, dtypes, shapes), frame rates, normalization stats, and **episode segmentation** (start/end offsets into shared Parquet/MP4 files).
 								> To scale to millions of episodes, tabular rows and video frames from multiple episodes are **concatenated** into larger files. Episode‑specific views are reconstructed **via metadata**, not file boundaries.
 								<div style="display:flex; justify-content:center; gap:12px; flex-wrap:wrap;">
 								  <figure style="margin:0; text-align:center;">
 								    <img
 								      src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobotdataset-v3/asset1datasetv3.png"
 								      alt="LeRobotDataset v3 diagram"
 								      width="220"
 								    />
 								    <figcaption style="font-size:0.9em; color:#666;">
 								      From episode‑based to file‑based datasets
 								    </figcaption>
 								  </figure>
 								</div>
 								### Directory layout (simplified)
 								- **`meta/info.json`**: canonical schema (features, shapes/dtypes), FPS, codebase version, and **path templates** to locate data/video shards.
 								- **`meta/stats.json`**: global feature statistics (mean/std/min/max) used for normalization; exposed as `dataset.meta.stats`.
 								- **`meta/tasks.jsonl`**: natural‑language task descriptions mapped to integer IDs for task‑conditioned policies.
 								- **`meta/episodes/`**: per‑episode records (lengths, tasks, offsets) stored as **chunked Parquet** for scalability.
 								- **`data/`**: frame‑by‑frame **Parquet** shards; each file typically contains **many episodes**.
 								- **`videos/`**: **MP4** shards per camera; each file typically contains **many episodes**.
 								## Load a dataset for training
 								`LeRobotDataset` returns Python dictionaries of PyTorch tensors and integrates with `torch.utils.data.DataLoader`. Here is a code example showing its use:
 								```python
 								import torch
 								from lerobot.datasets import LeRobotDataset
 								repo_id = "yaak-ai/L2D-v3"
 								# 1) Load from the Hub (cached locally)
 								dataset = LeRobotDataset(repo_id)
 								# 2) Random access by index
 								sample = dataset[100]
 								print(sample)
 								# {
 								#   'observation.state': tensor([...]),
 								#   'action': tensor([...]),
 								#   'observation.images.front_left': tensor([C, H, W]),
 								#   'timestamp': tensor(1.234),
 								#   ...
 								# }
 								# 3) Temporal windows via delta_timestamps (seconds relative to t)
 								delta_timestamps = {
 								    "observation.images.front_left": [-0.2, -0.1, 0.0]  # 0.2s and 0.1s before current frame
 								}
 								dataset = LeRobotDataset(repo_id, delta_timestamps=delta_timestamps)
 								# Accessing an index now returns a stack for the specified key(s)
 								sample = dataset[100]
 								print(sample["observation.images.front_left"].shape)  # [T, C, H, W], where T=3
 								# 4) Wrap with a DataLoader for training
 								batch_size = 16
 								data_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size)
 								device = "cuda" if torch.cuda.is_available() else "cpu"
 								for batch in data_loader:
 								    observations = batch["observation.state"].to(device)
 								    actions = batch["action"].to(device)
 								    images = batch["observation.images.front_left"].to(device)
 								    # model.forward(batch)
 								```
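To build intuition for `delta_timestamps`, the sketch below converts time offsets in seconds into frame indices at a given frame rate. It is a conceptual illustration, not the library's implementation: the helper names are hypothetical, and clamping out-of-range indices to the episode boundaries is an assumption made for this example.

```python
def delta_indices(deltas_s: list[float], fps: int) -> list[int]:
    """Convert time offsets in seconds into frame offsets at the given fps."""
    return [round(delta * fps) for delta in deltas_s]


def window_indices(
    t_index: int, deltas_s: list[float], fps: int, ep_start: int, ep_end: int
) -> list[int]:
    """Return the frame indices of a temporal window around t_index.

    ep_start is inclusive and ep_end is exclusive. Indices that fall outside
    the episode are clamped to its first or last frame (an assumption of this
    sketch).
    """
    indices = []
    for offset in delta_indices(deltas_s, fps):
        idx = t_index + offset
        indices.append(min(max(idx, ep_start), ep_end - 1))
    return indices


fps = 10  # assumed frame rate for the example
middle = window_indices(100, [-0.2, -0.1, 0.0], fps, ep_start=0, ep_end=500)
near_start = window_indices(1, [-0.2, -0.1, 0.0], fps, ep_start=0, ep_end=500)
print(middle)      # [98, 99, 100]
print(near_start)  # [0, 0, 1]
```

The second call shows why episode boundaries matter: a window requested near the start of an episode cannot reach back into the previous one.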
 								## Stream a dataset (no downloads)
 								Use `StreamingLeRobotDataset` to iterate directly from the Hub without local copies. This lets you stream large datasets without downloading them to disk or loading them fully into memory, and is a key feature of the new dataset format.
 								```python
 								from lerobot.datasets import StreamingLeRobotDataset
 								repo_id = "yaak-ai/L2D-v3"
 								dataset = StreamingLeRobotDataset(repo_id)  # streams directly from the Hub
 								```
 								<div style="display:flex; justify-content:center; gap:12px; flex-wrap:wrap;">
 								  <figure style="margin:0; text-align:center;">
 								    <img
 								      src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lerobotdataset-v3/streaming-lerobot.png"
 								      alt="StreamingLeRobotDataset"
 								      width="520"
 								    />
 								    <figcaption style="font-size:0.9em; color:#666;">
 								      Stream directly from the Hub for on‑the‑fly training.
 								    </figcaption>
 								  </figure>
 								</div>
 								## Image transforms
 								Image transforms are data augmentations applied to camera frames during training to improve model robustness and generalization. LeRobot supports various transforms including brightness, contrast, saturation, hue, and sharpness adjustments.
 								### Using transforms during dataset creation/recording
 								Currently, transforms are applied during **training time only**, not during recording. When you create or record a dataset, the raw images are stored without transforms. This allows you to experiment with different augmentations later without re-recording data.
 								### Adding transforms to existing datasets (API)
 								Use the `image_transforms` parameter when loading a dataset for training:
 								```python
 								from lerobot.datasets import LeRobotDataset
 								from lerobot.transforms import ImageTransforms, ImageTransformsConfig, ImageTransformConfig
 								# Option 1: Use default transform configuration (disabled by default)
 								transforms_config = ImageTransformsConfig(
 								    enable=True,  # Enable transforms
 								    max_num_transforms=3,  # Apply up to 3 transforms per frame
 								    random_order=False,  # Apply in standard order
 								)
 								transforms = ImageTransforms(transforms_config)
 								dataset = LeRobotDataset(
 								    repo_id="your-username/your-dataset",
 								    image_transforms=transforms
 								)
 								# Option 2: Create custom transform configuration
 								custom_transforms_config = ImageTransformsConfig(
 								    enable=True,
 								    max_num_transforms=2,
 								    random_order=True,
 								    tfs={
 								        "brightness": ImageTransformConfig(
 								            weight=1.0,
 								            type="ColorJitter",
 								            kwargs={"brightness": (0.7, 1.3)}  # Adjust brightness range
 								        ),
 								        "contrast": ImageTransformConfig(
 								            weight=2.0,  # Higher weight = more likely to be selected
 								            type="ColorJitter",
 								            kwargs={"contrast": (0.8, 1.2)}
 								        ),
 								        "sharpness": ImageTransformConfig(
 								            weight=0.5,  # Lower weight = less likely to be selected
 								            type="SharpnessJitter",
 								            kwargs={"sharpness": (0.3, 2.0)}
 								        ),
 								    }
 								)
 								dataset = LeRobotDataset(
 								    repo_id="your-username/your-dataset",
 								    image_transforms=ImageTransforms(custom_transforms_config)
 								)
 								# Option 3: Use pure torchvision transforms
 								from torchvision.transforms import v2
 								torchvision_transforms = v2.Compose([
 								    v2.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
 								    v2.GaussianBlur(kernel_size=3, sigma=(0.1, 2.0)),
 								])
 								dataset = LeRobotDataset(
 								    repo_id="your-username/your-dataset",
 								    image_transforms=torchvision_transforms
 								)
 								```
 								### Available transform types
 								LeRobot provides several transform types:
 								- **`ColorJitter`**: Adjusts brightness, contrast, saturation, and hue
 								- **`SharpnessJitter`**: Randomly adjusts image sharpness
 								- **`Identity`**: No transformation (useful for testing)
 								You can also use any `torchvision.transforms.v2` transform by passing it directly to the `image_transforms` parameter.
 								### Configuration options
 								- **`enable`**: Enable/disable transforms (default: `False`)
 								- **`max_num_transforms`**: Maximum number of transforms applied per frame (default: `3`)
 								- **`random_order`**: Apply transforms in random order vs. standard order (default: `False`)
 								- **`weight`**: Relative sampling weight for each transform (higher = more likely to be selected). Weights that do not sum to 1 are normalized.
 								- **`kwargs`**: Transform-specific parameters (e.g., brightness range)
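The interaction between `weight` and `max_num_transforms` can be pictured as weighted sampling over the configured transforms. The sketch below uses only the standard library and is not LeRobot's implementation: the function names are hypothetical, and sampling without replacement is an assumption made for illustration.

```python
import random


def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Scale weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("At least one weight must be positive.")
    return {name: weight / total for name, weight in weights.items()}


def pick_transforms(
    weights: dict[str, float], max_num_transforms: int, rng: random.Random
) -> list[str]:
    """Pick up to max_num_transforms distinct names, favoring higher weights."""
    remaining = normalize_weights(weights)
    chosen = []
    for _ in range(min(max_num_transforms, len(remaining))):
        names = list(remaining)
        name = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        chosen.append(name)
        del remaining[name]  # without replacement (assumption of this sketch)
    return chosen


weights = {"brightness": 1.0, "contrast": 2.0, "sharpness": 0.5}
probs = normalize_weights(weights)
chosen = pick_transforms(weights, max_num_transforms=2, rng=random.Random(0))
print(probs)
print(chosen)
```

With these weights, `contrast` is selected first about 57% of the time (2.0 / 3.5), which is the sense in which a higher weight makes a transform more likely.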
 								### Visualizing transforms
 								Use the visualization script to preview how transforms affect your data:
 								```bash
 								lerobot-imgtransform-viz \
 								  --repo-id=your-username/your-dataset \
 								  --output-dir=./transform_examples \
 								  --n-examples=5
 								```
 								This saves example images showing the effect of each transform, helping you tune parameters.
 								### Best practices
 								- **Start conservative**: Begin with small ranges (e.g., brightness 0.9-1.1) and increase gradually
 								- **Test first**: Use the visualization script to ensure transforms look reasonable
 								- **Monitor training**: Strong augmentations can hurt performance if too aggressive
 								- **Match your domain**: If your robot operates in varying lighting, use brightness/contrast transforms
 								- **Combine wisely**: Using too many transforms simultaneously can make training unstable
 								## Migrate `v2.1` → `v3.0`
 								A converter aggregates per‑episode files into larger shards and writes episode offsets/metadata. Convert your dataset using the instructions below.
 								```bash
 								# Pre-release build with v3 support:
 								pip install "https://github.com/huggingface/lerobot/archive/33cad37054c2b594ceba57463e8f11ee374fa93c.zip"
 								# Convert an existing v2.1 dataset hosted on the Hub:
 								python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=<HF_USER/DATASET_ID>
 								```
 								**What it does**
 								- Aggregates parquet files: `episode-0000.parquet`, `episode-0001.parquet`, … → **`file-0000.parquet`**, …
 								- Aggregates mp4 files: `episode-0000.mp4`, `episode-0001.mp4`, … → **`file-0000.mp4`**, …
 								- Updates `meta/episodes/*` (chunked Parquet) with per‑episode lengths, tasks, and byte/frame offsets.
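Once many episodes share a file, per-episode lengths and offsets are what let a reader map a global frame index back to an episode. The sketch below shows that idea with a cumulative-offset table and a binary search. The episode lengths and the `locate` helper are made up for illustration and do not reflect the converter's actual metadata schema.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical per-episode frame counts, as they might be recorded in metadata.
episode_lengths = [120, 80, 200]

# Global index of the first frame of each episode: [0, 120, 200].
episode_starts = [0] + list(accumulate(episode_lengths))[:-1]
total_frames = sum(episode_lengths)


def locate(frame_index: int) -> tuple[int, int]:
    """Map a global frame index to (episode_index, frame_index_within_episode)."""
    if not 0 <= frame_index < total_frames:
        raise IndexError(f"frame_index {frame_index} is out of range")
    episode_index = bisect_right(episode_starts, frame_index) - 1
    return episode_index, frame_index - episode_starts[episode_index]


print(locate(119))  # (0, 119)
print(locate(120))  # (1, 0)
print(locate(250))  # (2, 50)
```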
 								## Common Issues
 								### Always call `finalize()` before pushing
 								When creating or recording datasets, you **must** call `dataset.finalize()` to properly close the Parquet writers. See [PR #1903](https://github.com/huggingface/lerobot/pull/1903) for more details.
 								```python
 								from lerobot.datasets import LeRobotDataset
 								# Create dataset and record episodes
 								dataset = LeRobotDataset.create(...)
 								for episode in range(num_episodes):
 								    # Record frames
 								    for frame in episode_data:
 								        dataset.add_frame(frame)
 								    dataset.save_episode()
 								# Call finalize() when done recording and before push_to_hub()
 								dataset.finalize()  # Closes parquet writers, writes metadata footers
 								dataset.push_to_hub()
 								```
 								**Why is this necessary?**
 								Dataset v3.0 uses incremental parquet writing with buffered metadata for efficiency. The `finalize()` method:
 								- Flushes any buffered episode metadata to disk
 								- Closes parquet writers to write footer metadata, otherwise the parquet files will be corrupt
 								- Ensures the dataset is valid for loading
 								Without calling `finalize()`, your parquet files will be incomplete and the dataset won't load properly.
 								## Other formats and implementations
 								### Lance
 								Lance is a format well suited to multimodal AI datasets, especially large-scale training that needs high-performance I/O and random access.
 								The `lerobot-lancedb` package implements `LeRobotLanceDataset` (for JPEG images) and `LeRobotLanceVideoDataset` (for MP4 videos).
 								Both classes subclass `LeRobotDataset` and can speed up data loading.
 								`LeRobotLanceDataset` is a drop-in replacement for `LeRobotDataset`:
 								```python
 								from lerobot.datasets import LeRobotDatasetMetadata
 								from lerobot.policies.diffusion.configuration_diffusion import DiffusionConfig
 								from lerobot_lancedb import LeRobotLanceDataset, LeRobotLanceVideoDataset
 								cfg = DiffusionConfig(...)
 								meta = LeRobotDatasetMetadata(root=local_dataset_path)  # or use repo_id=... to load metadata from the Hub
 								delta_timestamps = {...}
 								# Use LeRobotLanceDataset for image datasets
 								dataset = LeRobotLanceDataset(
 								    root=local_dataset_path,                            # or use repo_id=... to stream from the Hub
 								    delta_timestamps=delta_timestamps,
 								    return_uint8=True,
 								)
 								# Or use LeRobotLanceVideoDataset for video datasets:
 								dataset = LeRobotLanceVideoDataset(
 								    root=local_dataset_path,                            # or use repo_id=... to stream from the Hub
 								    delta_timestamps=delta_timestamps,
 								    return_uint8=True,
 								)
 								```
 								Join the discussion on [GitHub](https://github.com/huggingface/lerobot/issues/3608) and explore the [`lerobot-lancedb` documentation](https://lancedb.github.io/lerobot-lancedb/).