# π₀.₅ (Pi05) Policy

π₀.₅ is a **Vision-Language-Action model with open-world generalization**, from Physical Intelligence. The LeRobot implementation is adapted from their open-source [OpenPI](https://github.com/Physical-Intelligence/openpi) repository.

## Model Overview

π₀.₅ is the successor to π₀, developed by [Physical Intelligence](https://www.physicalintelligence.company/blog/pi05) to address a central challenge in robotics: **open-world generalization**. Robots can already perform impressive tasks in controlled environments; π₀.₅ is designed to generalize to entirely new environments and situations that were never seen during training.

### The Generalization Challenge

As Physical Intelligence explains, the fundamental challenge isn't performing feats of agility or dexterity but generalization: the ability to correctly perform tasks in new settings with new objects. Consider a robot cleaning different homes: each home has different objects in different places. Generalization must occur at multiple levels:

- **Physical Level**: Understanding how to pick up a spoon (by the handle) or plate (by the edge), even with unseen objects in cluttered environments
- **Semantic Level**: Understanding task semantics, such as where to put clothes and shoes (in the laundry hamper, not on the bed) and which tools are appropriate for cleaning spills
- **Environmental Level**: Adapting to "messy" real-world environments like homes, grocery stores, offices, and hospitals

### Co-Training on Heterogeneous Data

The breakthrough innovation in π₀.₅ is **co-training on heterogeneous data sources**. The model learns from:

1. **Multimodal Web Data**: Image captioning, visual question answering, object detection
2. **Verbal Instructions**: Humans coaching robots through complex tasks step-by-step
3. **Subtask Commands**: High-level semantic behavior labels (e.g., "pick up the pillow" for an unmade bed)
4. **Cross-Embodiment Robot Data**: Data from various robot platforms with different capabilities
5. **Multi-Environment Data**: Static robots deployed across many different homes
6. **Mobile Manipulation Data**: ~400 hours of mobile robot demonstrations

This diverse training mixture creates a "curriculum" that enables generalization across physical, visual, and semantic levels simultaneously.

## Installation Requirements

⚠️ **Warning**: This policy requires patching the Hugging Face `transformers` library.

### Prerequisites

1. Ensure you have the exact version installed:

   ```bash
   pip show transformers
   ```

   It must be version **4.53.2**.

2. Apply the custom patches:
   ```bash
   cp -r ./src/lerobot/policies/pi05/transformers_replace/* \
     $(python -c "import transformers, os; print(os.path.dirname(transformers.__file__))")
   ```
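The version check in step 1 can also be scripted, so that the patch is never copied over a mismatched install. The following is a minimal sketch using only the Python standard library; the helper names `is_required_version` and `installed_transformers_version` are illustrative and not part of LeRobot.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

REQUIRED_VERSION = "4.53.2"  # the version this guide requires


def is_required_version(installed: str, required: str = REQUIRED_VERSION) -> bool:
    """Return True only when the installed version string matches exactly."""
    return installed.strip() == required


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if it is not installed."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif is_required_version(found):
        print(f"transformers {found} matches the required version")
    else:
        print(f"transformers {found} found, but {REQUIRED_VERSION} is required")
```

An exact string comparison is used on purpose: the patch files replace specific modules, so a "close enough" version is not assumed to be compatible.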

### What the patches do:

- Support **AdaRMS** (adaptive RMS normalization)
- Correctly control the precision of activations
- Allow the KV cache to be used without updates

**Important Notes:**

- This overwrites files inside your installed `transformers` package in place
- The patched files can survive a reinstall when the package manager reuses cached or linked copies; if that happens, remove the patched files explicitly or recreate the environment

### Restoring Clean State

To undo the patches and restore a clean state:

```bash
pip uninstall transformers
pip install transformers==4.53.2
```
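To check whether an environment is currently patched or clean, the files in the patch directory can be compared byte-for-byte with the installed package. This is a sketch under the assumption that the patch directory mirrors the package layout (as the `cp -r` command implies); `patch_status` is a hypothetical helper, not a LeRobot function.

```python
import filecmp
from pathlib import Path
from typing import Dict, List


def patch_status(patch_dir: Path, install_dir: Path) -> Dict[str, List[str]]:
    """Classify every file under patch_dir by how it compares to install_dir."""
    status: Dict[str, List[str]] = {"identical": [], "different": [], "missing": []}
    for src in sorted(patch_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(patch_dir)
        dst = install_dir / rel
        if not dst.is_file():
            # The installed package has no file at this relative path.
            status["missing"].append(rel.as_posix())
        elif filecmp.cmp(src, dst, shallow=False):
            # shallow=False forces a content comparison instead of a stat check.
            status["identical"].append(rel.as_posix())
        else:
            status["different"].append(rel.as_posix())
    return status
```

For example, calling `patch_status(Path("src/lerobot/policies/pi05/transformers_replace"), Path(transformers.__file__).parent)` should report every file as `identical` in a patched environment. After a successful restore, the files should instead appear under `different` or `missing`.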

## Usage

To use π₀.₅ in your LeRobot configuration, specify the policy type as:

```bash
--policy.type=pi05
```

## Training

### Training Command Example

Here's a complete training command for finetuning the base π₀.₅ model on your own dataset:

```bash
python src/lerobot/scripts/train.py \
    --dataset.repo_id=your_dataset \
    --policy.type=pi05 \
    --output_dir=./outputs/pi05_training \
    --job_name=pi05_training \
    --policy.repo_id=your_repo_id \
    --policy.pretrained_path=pepijn223/pi05_base_fp32 \
    --policy.compile_model=true \
    --policy.gradient_checkpointing=true \
    --wandb.enable=true \
    --policy.dtype=bfloat16 \
    --steps=3000 \
    --policy.scheduler_decay_steps=3000 \
    --policy.device=cuda \
    --batch_size=32
```

### Key Training Parameters

- **`--policy.compile_model=true`**: Enables model compilation for faster training
- **`--policy.gradient_checkpointing=true`**: Reduces memory usage significantly during training
- **`--policy.dtype=bfloat16`**: Uses bfloat16 precision for more efficient training
- **`--batch_size=32`**: Batch size for training; adjust it to fit your GPU memory
- **`--policy.pretrained_path=pepijn223/pi05_base_fp32`**: The pretrained π₀.₅ checkpoint to finetune. The available options are:
  - [pepijn223/pi05_base_fp32](https://huggingface.co/pepijn223/pi05_base_fp32)
  - [pepijn223/pi05_libero_fp32](https://huggingface.co/pepijn223/pi05_libero_fp32) (specifically trained on the Libero dataset)
  - [pepijn223/pi05_droid_fp32](https://huggingface.co/pepijn223/pi05_droid_fp32) (specifically trained on the Droid dataset)
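When several runs differ only in one or two of these parameters, it can help to assemble the command programmatically. The sketch below builds an argument list from a dictionary of the flags described above. `TRAIN_FLAGS` and `build_train_command` are hypothetical names for illustration, `your_dataset` and `your_repo_id` are placeholders, and output and logging flags are left out but can be added the same way.

```python
import shlex

# Flag names and values taken from this guide; placeholders must be replaced.
TRAIN_FLAGS = {
    "dataset.repo_id": "your_dataset",
    "policy.type": "pi05",
    "policy.pretrained_path": "pepijn223/pi05_base_fp32",
    "policy.repo_id": "your_repo_id",
    "policy.compile_model": "true",
    "policy.gradient_checkpointing": "true",
    "policy.dtype": "bfloat16",
    "policy.device": "cuda",
    "steps": "3000",
    "policy.scheduler_decay_steps": "3000",
    "batch_size": "32",
}


def build_train_command(flags, script="src/lerobot/scripts/train.py"):
    """Return the training command as an argument list suitable for subprocess.run."""
    return ["python", script] + [f"--{key}={value}" for key, value in flags.items()]


if __name__ == "__main__":
    # Example: halve the batch size for a GPU with less memory.
    small_gpu_flags = {**TRAIN_FLAGS, "batch_size": "16"}
    print(shlex.join(build_train_command(small_gpu_flags)))
```

Passing the resulting list to `subprocess.run(..., check=True)` avoids shell quoting issues; printing it with `shlex.join` gives a line that can be pasted into a terminal.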

## Performance Results

### Libero Benchmark Results

π₀.₅ has demonstrated strong performance on the Libero benchmark suite. To validate the LeRobot implementation, we finetuned the Libero base model for an additional 6k steps on the Libero dataset and compared the results to the OpenPI reference results.

| Benchmark          | LeRobot Implementation | OpenPI Reference |
| ------------------ | ---------------------- | ---------------- |
| **Libero Spatial** | 98.0%                  | 98.8%            |
| **Libero Object**  | 99.0%                  | 98.2%            |
| **Libero Goal**    | 97.0%                  | 98.0%            |
| **Libero 10**      | 93.0%                  | 92.4%            |
| **Average**        | 96.75%                 | 96.85%           |

These results show strong performance across diverse robotic manipulation tasks: the LeRobot implementation is within 1.0 percentage point of the OpenPI reference on every suite, and the averages differ by 0.1 points. To reproduce these results, follow the instructions in the [Libero](#libero) section.
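The averages and the per-suite gaps in the table can be recomputed directly from the four success rates. The snippet below is a small sanity check; the numbers are copied from the table, and the function names are illustrative only.

```python
# Success rates in percent: (LeRobot implementation, OpenPI reference).
RESULTS = {
    "Libero Spatial": (98.0, 98.8),
    "Libero Object": (99.0, 98.2),
    "Libero Goal": (97.0, 98.0),
    "Libero 10": (93.0, 92.4),
}


def averages(results):
    """Return the mean success rate of each implementation, rounded to 2 decimals."""
    count = len(results)
    lerobot = sum(ours for ours, _ in results.values()) / count
    openpi = sum(ref for _, ref in results.values()) / count
    return round(lerobot, 2), round(openpi, 2)


def max_gap(results):
    """Return the largest absolute per-suite difference in percentage points."""
    return round(max(abs(ours - ref) for ours, ref in results.values()), 2)


if __name__ == "__main__":
    print(averages(RESULTS))  # (96.75, 96.85)
    print(max_gap(RESULTS))   # 1.0
```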

## License

This model follows the **Apache 2.0 License**, consistent with the original [OpenPI repository](https://github.com/Physical-Intelligence/openpi).