Michel Aractingi
35f36e8fba
removed outdated todos
2025-08-28 10:10:17 +02:00
Michel Aractingi
db36f01e8b
add update_chunk_settings method for LeRobotDatasetMetadata. Introduce tests for chunk settings updates and validation of parameters.
2025-08-18 00:00:23 +02:00
Michel Aractingi
c7a3b02625
fixed tensor indices in _check_cached_episode_sufficient in lerobot_dataset.py, added test
2025-08-13 16:16:32 +02:00
Michel Aractingi
4048b02d4a
improved typing in datasets/utils.py
2025-07-31 14:32:29 +02:00
Francesco Capuano
527ae8e557
Add variable-size test datasets (#1610)
* fix: dummy datasets can be written to multiple files in multiple folders based on arbitrary data size
* fix: writing atomic episodes to multiple files (maybe)
* fix: moving unused write dataset function to test code
2025-07-30 11:26:28 +02:00
Michel Aractingi
890b1e473d
Merge branch 'main' into user/michel-aractingi/2025_06_30_dataset_v3
2025-07-30 00:43:53 +02:00
Michel Aractingi
6447352439
added a check that compares cached episodes and triggers a new download if the requested episodes don't match the cached ones
2025-07-30 00:32:28 +02:00
Michel Aractingi
788544d936
update lerobot_dataset docstring
2025-07-30 00:12:23 +02:00
Michel Aractingi
59d108a807
fix(convert_v2_v3) reverted concat data files from previous commit
fixed bug in metadata related to chunk_index and file_index when concatenating video files, added a clearer condition so that an episode doesn't span multiple videos
2025-07-29 22:58:24 +02:00
HUANG TZU-CHUN
b2a71c6fe4
fix: Rename sync_cache_first to force_cache_sync in LeRobotDataset docstring (#1310)
2025-07-25 15:08:00 +02:00
Michel Aractingi
218ebed3ef
feat(convert_dataset_v21_to_v3) added the use of more efficient Dataset.from_parquet and concatenate_datasets
2025-07-22 17:27:41 +02:00
Caroline Pascal
9b9f4757fb
style(deprecated method): remove no longer used get_features_from_robot function (replaced by hw_to_dataset_features) (#1560)
2025-07-21 19:12:03 +02:00
Michel Aractingi
066b81aec2
moved concat_video function to video_utils, cleaned some code
2025-07-21 14:47:16 +02:00
Michel Aractingi
dcb02a951d
fix(convert_v1) use correct legacy path, remove comments from scripts, revert lekiwi/record.py to main
2025-07-21 11:49:15 +02:00
Michel Aractingi
23375cce3a
fix(tests) bug in clear_episode_buffer
2025-07-20 01:39:19 +02:00
Michel Aractingi
5ec70f704e
removed check_timestamps_sync that is no longer used in the code,
removed tests in datasets related to check_timestamps_sync
added the use of `clear_episode_buffer` that was not used in `save_episode`
added the creation of the codebase_version tag that was missing in `slurm_upload`
2025-07-18 16:33:20 +02:00
Michel Aractingi
4c0ac93eb6
nit
2025-07-18 16:33:20 +02:00
pre-commit-ci[bot]
788dde3a34
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-07-18 16:33:20 +02:00
Michel Aractingi
e05d22cb7b
Merge branch 'main' into user/michel-aractingi/2025_06_30_dataset_v3
Signed-off-by: Michel Aractingi <michel.aractingi@huggingface.co>
2025-07-18 16:33:18 +02:00
Xingdong Zuo
e6e1f085d4
Feat: Add Batched Video Encoding for Faster Dataset Recording (#1390)
* LeRobotDataset video encoding: updated `save_episode` method and added `batch_encode_videos` method to handle video encoding based on `batch_encoding_size`, allowing for both immediate and batched encoding.
* LeRobotDataset video cleanup: Enabled individual episode cleanup and check for remaining PNG files before removing the `images` directory.
* LeRobotDataset - VideoEncodingManager: added proper handling of pending episodes (encoding, cleaning) on exit or recording failures.
* LeRobotDatasetMetadata: removed `update_video_info` to only update video info at episode index 0 encoding.
* Adjusted the `record` function to utilize the new encoding management logic.
* Removed `encode_videos` method from `LeRobotDataset` and `encode_episode_videos` outputs as they are not used anywhere.
---------
Signed-off-by: Xingdong Zuo <zuoxingdong@users.noreply.github.com>
Co-authored-by: Xingdong Zuo <xingdong.zuo@navercorp.com>
Co-authored-by: Caroline Pascal <caroline8.pascal@gmail.com>
2025-07-18 12:18:52 +02:00
Steven Palma
378e1f0338
Update pre-commit-config.yaml + pyproject.toml + ceil rerun & transformer dependencies version (#1520)
* chore: update .gitignore
* chore: update pre-commit
* chore(deps): update pyproject
* fix(ci): multiple fixes
* chore: pre-commit apply
* chore: address review comments
* Update pyproject.toml
Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
* chore(deps): add todo
---------
Signed-off-by: Steven Palma <imstevenpmwork@ieee.org>
Co-authored-by: Ben Zhang <5977478+ben-z@users.noreply.github.com>
2025-07-17 14:30:20 +02:00
Michel Aractingi
a4d3a414ca
Added Francesco's PRs for fixing aggregate.py
2025-07-08 14:17:01 +02:00
fracapuano
a49760e2ba
fix: tests depending on various sizes, and duration is updated
2025-07-08 13:43:19 +02:00
Michel Aractingi
4a466d94b6
moved legacy functions to convert_stats.py
2025-07-06 22:32:51 +02:00
Michel Aractingi
9287c36f37
- Added missing license in the new scripts
- Added back legacy functions in conversion script of v2 to v21
- Updated README description for dataset_v3
2025-07-06 22:29:05 +02:00
Michel Aractingi
83bf24cc9a
fix(tests) add features argument to load_nested_dataset
2025-07-05 10:16:29 +02:00
Michel Aractingi
3dbc3e60fb
Added docstrings to aggregate, fix test_policies.py
2025-07-04 11:27:00 +02:00
Michel Aractingi
66454a0fbf
Remove more references to lerobot.common
2025-07-02 18:18:19 +02:00
Michel Aractingi
1c17419224
Reverted back files that were changed during the rebase
2025-07-02 17:26:34 +02:00
Michel Aractingi
9dde8829e6
style nit
2025-07-02 17:10:56 +02:00
Michel Aractingi
0f66bbe2f9
Migrate PR to new folder structure introduced in #1417
2025-07-02 17:10:26 +02:00
fracapuano
378c147be6
fix: debug aggregation code
2025-07-02 11:45:27 +02:00
Remi Cadene
8c1503dafa
WIP after Francesco discussion
2025-07-02 11:45:11 +02:00
Remi Cadene
13a1f68b8e
WIP aggregate
2025-07-02 11:44:29 +02:00
Remi Cadene
a231930044
Fix aggregate (num_frames, dataset_from_index, index)
2025-07-02 11:43:46 +02:00
Remi Cadene
6f0fc7f386
Aggregate: Add concatenation
2025-07-02 11:43:36 +02:00
Remi Cadene
fde67dbae7
Fix convert v30 with image datasets
2025-07-02 11:43:35 +02:00
Remi Cadene
01bc89b6f4
Merge remote-tracking branch 'origin/user/rcadene/2025_04_11_dataset_v3' into user/rcadene/2025_04_11_dataset_v3
2025-07-02 11:43:24 +02:00
Remi Cadene
8c43b3d05e
Faster self.meta.episodes[...]
switch back to set_transform instead of set_format
Add video_files_size_in_mb
pre-commit run --all-files
2025-07-02 11:43:22 +02:00
Remi Cadene
d4af22418b
Fix unit tests
2025-07-02 11:42:52 +02:00
Remi Cadene
eaec52a7b7
Merge remote-tracking branch 'origin/user/rcadene/2025_04_11_dataset_v3' into user/rcadene/2025_04_11_dataset_v3
2025-07-02 11:42:49 +02:00
Remi Cadene
0a390de361
Merge remote-tracking branch 'origin/main' into user/rcadene/2025_04_11_dataset_v3
2025-07-02 11:41:53 +02:00
Simon Alibert
d4ee470b00
Package folder structure (#1417)
* Move files
* Replace imports & paths
* Update relative paths
* Update doc symlinks
* Update instructions paths
* Fix imports
* Update grpc files
* Update more instructions
* Downgrade grpc-tools
* Update manifest
* Update more paths
* Update config paths
* Update CI paths
* Update bandit exclusions
* Remove walkthrough section
2025-07-01 16:34:46 +02:00