# 🦊 Robo-DM

🦊 Robo-DM: An Efficient and Scalable Data Collection and Management Framework for Robotics Learning. Supports [Open-X-Embodiment](https://robotics-transformer-x.github.io/) and 🤗[HuggingFace](https://huggingface.co/).

🦊 Robo-DM (formerly fog_x) is designed for both speed 🚀 and memory efficiency 📈, combining active metadata with lazily loaded trajectory data. It supports flexible and distributed dataset partitioning and provides native support for cloud storage.

[Design Doc](https://docs.google.com/document/d/1woLQVLWsySGjFuz8aCsaLoc74dXQgIccnWRemjlNDws/edit#heading=h.irrfcedesnvr) | [Dataset Visualization](https://keplerc.github.io/openxvisualizer/)

## Note to ICRA Reviewers
We are actively developing the framework. See commit `a35a6` for the version corresponding to our submission.

## Install

```bash
git clone https://github.com/BerkeleyAutomation/fog_x.git
cd fog_x
pip install -e .
```
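
If the editable install succeeded, importing the package from anywhere should resolve back to the clone. A minimal sanity check (nothing here beyond standard Python; `fog_x` is the package name used in the Usage section below):

```py
# The import should succeed, and __file__ should point inside the cloned repo.
import fog_x

print(fog_x.__file__)
```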

## Usage

```py
import fog_x

path = "/tmp/output.vla"

# 🦊 Data collection:
# create a new trajectory at the given path
traj = fog_x.Trajectory(path=path)

# add a feature value to the trajectory
traj.add(feature="arm_view", value="image1.jpg")

# automatically time-aligns and saves the trajectory
traj.close()

# 🦊 Data loading:
# reopen the saved trajectory from the same path
traj = fog_x.Trajectory(path=path)
```
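
A slightly fuller sketch of the same collection API, using only the calls shown above (`fog_x.Trajectory`, `add`, `close`). The feature names, shapes, and the assumption that `add` accepts numpy arrays for non-image features are illustrative rather than a fixed schema; see [Data Collection and Loading](./examples/data_collection_and_load.py) for reading the data back.

```py
import numpy as np
import fog_x

path = "/tmp/demo_episode.vla"

# 🦊 Collect a short episode with several feature streams.
traj = fog_x.Trajectory(path=path)
for step in range(10):
    # Hypothetical feature names and shapes; replace with your robot's streams.
    traj.add(feature="arm_view", value=np.zeros((224, 224, 3), dtype=np.uint8))
    traj.add(feature="joint_positions", value=np.random.rand(7).astype(np.float32))
    traj.add(feature="gripper_open", value=np.float32(step % 2))
traj.close()  # time-aligns the streams and writes the .vla container

# 🦊 Reopen the saved trajectory later.
traj = fog_x.Trajectory(path=path)
```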

## Examples

* [Data Collection and Loading](./examples/data_collection_and_load.py)
* [Convert From Open_X](./examples/openx_loader.py)
* [Convert From H5](./examples/h5_loader.py) (see the sketch below)
* [Running Benchmarks](./benchmarks/openx.py)
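
Converting existing data generally reduces to the same pattern as the Usage section: iterate over the source data and `add` each value into a `fog_x.Trajectory`. Below is a minimal sketch for the HDF5 case, assuming a flat file whose top-level datasets are per-feature arrays with time as the leading axis; the helper name and file layout are hypothetical, and the real loader in [./examples/h5_loader.py](./examples/h5_loader.py) may differ.

```py
import h5py
import fog_x

def h5_to_vla(h5_path: str, vla_path: str) -> None:
    """Copy an HDF5 episode into a .vla trajectory, step by step."""
    # Load every top-level dataset; assume axis 0 indexes time steps.
    with h5py.File(h5_path, "r") as f:
        features = {name: dset[...] for name, dset in f.items()}

    traj = fog_x.Trajectory(path=vla_path)
    n_steps = min(len(values) for values in features.values())
    for t in range(n_steps):
        for name, values in features.items():
            traj.add(feature=name, value=values[t])
    traj.close()

# Hypothetical paths; h5_to_vla is an illustrative helper, not part of the API.
h5_to_vla("/tmp/episode.h5", "/tmp/episode.vla")
```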

## Development