
Commit e1764a9

further updates to documentation
1 parent 1b37fc4 commit e1764a9

File tree

1 file changed: 16 additions, 4 deletions

docs/developer_guides/pipelines/datamanagers.md

Lines changed: 16 additions & 4 deletions
````diff
@@ -95,9 +95,9 @@ See the code!
 
 We currently don't have other implementations because most papers follow the VanillaDataManager implementation. However, it should be straightforward to add a VanillaDataManager with logic that progressively adds cameras, for instance, by relying on the step and modifying RayBundle and RayGT generation logic.
 
-## Migrating Your Datamanager to the New Datamanager
+## Migrating Your DataManager to the New DataManager
 
-As of January 2025, the FullImageDatamanager and ParallelImageDatamanager implementation now supports parallelized dataloading and dataloading from disk to preserve CPU RAM. If you would like your custom datamanager to also support these new features, you can migrate any custom dataloading logic to the `custom_view_processor` API. Let's take a look at an example for the LERF method, which was built on Nerfstudio's VanillaDataManager.
+As of January 2025, the FullImageDatamanager and ParallelImageDatamanager implementations now support parallelized dataloading and dataloading from disk to avoid Out-Of-Memory errors. If you would like your custom datamanager to also support these new features, you can migrate any custom dataloading logic to the new `custom_view_processor()` API. Let's take a look at an example for the LERF method, which was built on Nerfstudio's VanillaDataManager.
 
 ```python
 class LERFDataManager(VanillaDataManager):  # pylint: disable=abstract-method
@@ -175,7 +175,7 @@ class LERFDataManager(VanillaDataManager):  # pylint: disable=abstract-method
         return ray_bundle, batch
 ```
 
-To migrate this custom datamanager to the new datamanager, we can shift the data customization process in `next_train()` to `custom_view_processor()`.
+To migrate this custom datamanager to the new datamanager, we'll subclass the new ParallelDataManager and shift the data customization process from `next_train()` to `custom_view_processor()`.
 
 ```python
 class LERFDataManager(ParallelDataManager, Generic[TDataset]):
@@ -185,7 +185,7 @@ class LERFDataManager(ParallelDataManager, Generic[TDataset]):
     def custom_ray_processor(
         self, ray_bundle: RayBundle, batch: Dict
     ) -> Tuple[RayBundle, Dict]:
-        """An API to add latents, metadata, or other further customization to the RayBundle dataloading process that is parallelized"""
+        """An API to add latents, metadata, or other further customization to the RayBundle dataloading process that is parallelized."""
         ray_indices = batch["indices"]
         batch["clip"], clip_scale = self.clip_interpolator(ray_indices)
         batch["dino"] = self.dino_dataloader(ray_indices)
@@ -196,4 +196,16 @@ class LERFDataManager(ParallelDataManager, Generic[TDataset]):
         ray_bundle.metadata["fy"] = self.train_dataset.cameras[0].fy.item()
         ray_bundle.metadata["height"] = self.train_dataset.cameras[0].height.item()
         return ray_bundle, batch
+```
+
+## How to Use the New DataManagers
+
+To train a NeRF-based method with a large dataset that's unable to fit in memory, please add the `load_from_disk` flag to your `ns-train` command. For example with nerfacto:
+```bash
+ns-train nerfacto --data {PROCESSED_DATA_DIR} --pipeline.datamanager.load_from_disk
+```
+
+To train a Gaussian Splatting method with a large dataset that's unable to fit in memory, please set the device of `cache_images` to disk. For example with splatfacto:
+```bash
+ns-train splatfacto --data {PROCESSED_DATA_DIR} --pipeline.datamanager.cache_images disk
 ```
````
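The migration in this diff follows a template-method ("hook") pattern: the parallelized base class owns RayBundle/batch generation and calls a single overridable hook, so a subclass only customizes that hook. Below is a minimal, self-contained sketch of that pattern; `BaseDataManager`, `LatentDataManager`, and all values are illustrative stand-ins, not nerfstudio's actual classes.

```python
# Illustrative sketch of the hook pattern behind custom_ray_processor().
# BaseDataManager and LatentDataManager are hypothetical stand-ins for
# nerfstudio's classes: the base class owns (parallelized) batch
# generation and calls one overridable hook, so a subclass like LERF's
# datamanager only customizes that hook.
from typing import Dict, Tuple


class BaseDataManager:
    """Owns RayBundle/batch generation; exposes one customization hook."""

    def custom_ray_processor(
        self, ray_bundle: Dict, batch: Dict
    ) -> Tuple[Dict, Dict]:
        # Default: pass the data through unchanged.
        return ray_bundle, batch

    def next_train(self) -> Tuple[Dict, Dict]:
        # Stand-in for the parallelized dataloading step.
        ray_bundle: Dict = {"metadata": {}}
        batch: Dict = {"indices": [0, 1, 2]}
        # The hook runs as part of dataloading, so its work is parallelized.
        return self.custom_ray_processor(ray_bundle, batch)


class LatentDataManager(BaseDataManager):
    """Overrides only the hook, as the migrated LERF datamanager does."""

    def custom_ray_processor(self, ray_bundle, batch):
        batch["clip"] = [i * 2 for i in batch["indices"]]  # fake per-ray latents
        ray_bundle["metadata"]["fy"] = 1000.0  # fake camera intrinsic
        return ray_bundle, batch


ray_bundle, batch = LatentDataManager().next_train()
# batch now carries the extra "clip" entry added inside the hook.
```

Because the customization lives in the hook rather than in `next_train()`, the base class is free to run it inside parallel dataloader workers without the subclass knowing about the parallelism.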

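As background on why the commit's `load_from_disk` and `cache_images disk` options reduce CPU RAM, here is a hedged sketch (not nerfstudio code) contrasting eager in-memory caching with lazy per-access loading; the class names and the string-returning loader are hypothetical stand-ins for real image decoding.

```python
# Hypothetical sketch (not nerfstudio code) of why loading from disk
# preserves CPU RAM: an eager dataset decodes every image up front, while
# a lazy dataset keeps only file paths and decodes one image per access.
from typing import Callable, List


class CachedDataset:
    """Eager: decodes all images at init; RAM grows with dataset size."""

    def __init__(self, paths: List[str], loader: Callable[[str], str]):
        self.images = [loader(p) for p in paths]

    def __getitem__(self, i: int) -> str:
        return self.images[i]


class DiskDataset:
    """Lazy: stores paths only; decodes a single image per __getitem__."""

    def __init__(self, paths: List[str], loader: Callable[[str], str]):
        self.paths = paths
        self.loader = loader

    def __getitem__(self, i: int) -> str:
        return self.loader(self.paths[i])


decoded = []  # tracks which files the loader has actually touched


def fake_loader(path: str) -> str:
    decoded.append(path)
    return f"decoded:{path}"  # stand-in for reading/decoding an image file


ds = DiskDataset(["a.png", "b.png"], fake_loader)
assert decoded == []  # nothing decoded at construction time
item = ds[1]
assert decoded == ["b.png"]  # only the accessed image was decoded
```

The lazy variant trades extra per-step disk I/O for a memory footprint that stays flat as the dataset grows, which is the trade-off the `ns-train` flags above opt into.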