From 50e13da845bbc7ae8f03e93149d4b1adcdfe0f52 Mon Sep 17 00:00:00 2001
From: Bryson Jones
Date: Sat, 31 Jan 2026 08:03:52 -0800
Subject: [PATCH] update docs and readme files, fixing some typos and adding multitask dit to readme

---
 README.md                                     | 10 ++---
 docs/source/_toctree.yml                      |  4 +-
 .../{multitask_dit.mdx => multi_task_dit.mdx} | 10 ++---
 docs/source/policy_multi_task_dit_README.md   | 37 +++++++++++++++++++
 src/lerobot/policies/multi_task_dit/README.md |  2 +-
 5 files changed, 50 insertions(+), 13 deletions(-)
 rename docs/source/{multitask_dit.mdx => multi_task_dit.mdx} (94%)
 create mode 100644 docs/source/policy_multi_task_dit_README.md

diff --git a/README.md b/README.md
index d60cd35a9..5711b3c7b 100644
--- a/README.md
+++ b/README.md
@@ -100,11 +100,11 @@ lerobot-train \
   --dataset.repo_id=lerobot/aloha_mobile_cabinet
 ```
 
-| Category | Models |
-| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| **Imitation Learning** | [ACT](./docs/source/policy_act_README.md), [Diffusion](./docs/source/policy_diffusion_README.md), [VQ-BeT](./docs/source/policy_vqbet_README.md) |
-| **Reinforcement Learning** | [HIL-SERL](./docs/source/hilserl.mdx), [TDMPC](./docs/source/policy_tdmpc_README.md) & QC-FQL (coming soon) |
-| **VLAs Models** | [Pi0Fast](./docs/source/pi0fast.mdx), [Pi0.5](./docs/source/pi05.mdx), [GR00T N1.5](./docs/source/policy_groot_README.md), [SmolVLA](./docs/source/policy_smolvla_README.md), [XVLA](./docs/source/xvla.mdx) |
+| Category | Models |
+| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Imitation Learning** | [ACT](./docs/source/policy_act_README.md), [Diffusion](./docs/source/policy_diffusion_README.md), [VQ-BeT](./docs/source/policy_vqbet_README.md), [Multitask DiT Policy](./docs/source/policy_multi_task_dit_README.md) |
+| **Reinforcement Learning** | [HIL-SERL](./docs/source/hilserl.mdx), [TDMPC](./docs/source/policy_tdmpc_README.md) & QC-FQL (coming soon) |
+| **VLAs Models** | [Pi0Fast](./docs/source/pi0fast.mdx), [Pi0.5](./docs/source/pi05.mdx), [GR00T N1.5](./docs/source/policy_groot_README.md), [SmolVLA](./docs/source/policy_smolvla_README.md), [XVLA](./docs/source/xvla.mdx) |
 
 Similarly to the hardware, you can easily implement your own policy & leverage LeRobot's data collection, training, and visualization tools, and share your model to the HF Hub
 
diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index e0c1c30ae..2f6df017e 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -45,8 +45,8 @@
       title: NVIDIA GR00T N1.5
     - local: xvla
       title: X-VLA
-    - local: multitask_dit
-      title: Multi-Task DiT
+    - local: multi_task_dit
+      title: Multitask DiT Policy
     - local: walloss
       title: WALL-OSS
   title: "Policies"
diff --git a/docs/source/multitask_dit.mdx b/docs/source/multi_task_dit.mdx
similarity index 94%
rename from docs/source/multitask_dit.mdx
rename to docs/source/multi_task_dit.mdx
index c4a7d5f2b..c3cced708 100644
--- a/docs/source/multitask_dit.mdx
+++ b/docs/source/multi_task_dit.mdx
@@ -1,6 +1,6 @@
-# Multi-Task DiT Policy
+# Multitask DiT Policy
 
-Multi-Task Diffusion Transformer (DiT) Policy is an evolution of the original Diffusion Policy architecture, which leverages a large DiT with text and vision conditioning for multi-task robot learning. This implementation supports both diffusion and flow matching objectives for action generation, enabling robots to perform diverse manipulation tasks conditioned on language instructions.
+Multitask Diffusion Transformer (DiT) Policy is an evolution of the original Diffusion Policy architecture, which leverages a large DiT with text and vision conditioning for multitask robot learning. This implementation supports both diffusion and flow matching objectives for action generation, enabling robots to perform diverse manipulation tasks conditioned on language instructions.
 
 ## Model Overview
 
@@ -16,7 +16,7 @@ VLAs, with only ~450M parameters and significantly less training.
 
 ## Installation Requirements
 
-Multi-Task DiT Policy has additional dependencies. Install it with:
+Multitask DiT Policy has additional dependencies. Install it with:
 
 ```bash
 pip install lerobot[multi_task_dit]
@@ -26,7 +26,7 @@ This will install all necessary dependencies including the HuggingFace Transform
 
 ## Usage
 
-To use Multi-Task DiT in your LeRobot configuration, specify the policy type as:
+To use Multitask DiT in your LeRobot configuration, specify the policy type as:
 
 ```python
 policy.type=multi_task_dit
@@ -36,7 +36,7 @@ policy.type=multi_task_dit
 
 ### Basic Training Command
 
-Here's a complete training command for training Multi-Task DiT on your dataset:
+Here's a complete training command for training Multitask DiT on your dataset:
 
 ```bash
 lerobot-train \
diff --git a/docs/source/policy_multi_task_dit_README.md b/docs/source/policy_multi_task_dit_README.md
new file mode 100644
index 000000000..f24fa927e
--- /dev/null
+++ b/docs/source/policy_multi_task_dit_README.md
@@ -0,0 +1,37 @@
+# Multitask DiT Policy
+
+## Citation
+
+If you use this work, please cite the following works:
+
+```bibtex
+@misc{jones2025multitaskditpolicy,
+  author = {Bryson Jones},
+  title = {Dissecting and Open-Sourcing Multitask Diffusion Transformer Policy},
+  year = {2025},
+  url = {https://brysonkjones.substack.com/p/dissecting-and-open-sourcing-multitask-diffusion-transformer-policy},
+  note = {Blog post}
+}
+```
+
+```bibtex
+@misc{trilbmteam2025carefulexaminationlargebehaviormodels,
+  author = {TRI LBM Team},
+  title = {A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation},
+  year = {2025},
+  eprint = {arXiv:2507.05331},
+  archivePrefix = {arXiv},
+  primaryClass = {cs.RO},
+  url = {https://arxiv.org/abs/2507.05331}
+}
+```
+
+```bibtex
+@misc{bostondynamics2025largebehaviormodelsatlas,
+  author = {Boston Dynamics and TRI Research Team},
+  title = {Large Behavior Models and Atlas Find New Footing},
+  year = {2025},
+  url = {https://bostondynamics.com/blog/large-behavior-models-atlas-find-new-footing/},
+  note = {Blog post}
+}
+```
diff --git a/src/lerobot/policies/multi_task_dit/README.md b/src/lerobot/policies/multi_task_dit/README.md
index 3ede6f174..f24fa927e 100644
--- a/src/lerobot/policies/multi_task_dit/README.md
+++ b/src/lerobot/policies/multi_task_dit/README.md
@@ -1,4 +1,4 @@
-# Multi-Task DiT Policy
+# Multitask DiT Policy
 
 ## Citation
 