Commit 37f4ca4

add docs, fix depth dataset with parallel datamanager, fix mask sampling bug
1 parent b0fc764 commit 37f4ca4

4 files changed (+8, -5 lines)

docs/developer_guides/pipelines/datamanagers.md (3 additions, 1 deletion)

@@ -109,7 +109,9 @@ ns-train splatfacto --data {PROCESSED_DATA_DIR} --pipeline.datamanager.cache_ima
 ## Migrating Your DataManager to the new DataManager
 Many methods subclass a DataManager and add extra data to it. If you would like your custom datamanager to also support new parallel features, you can migrate any custom dataloading logic to the new `custom_view_processor()` API. Let's take a look at an example for the LERF method, which was built on Nerfstudio's VanillaDataManager. This API provides an interface to attach new information to the RayBundle (for ray based methods), Cameras object (for splatting based methods), or ground truth dictionary. It runs in a background process if disk caching is enabled, otherwise it runs in the main process.
 
-**Note**: naively transfering code to `custom_view_processor` may still OOM on very large datasets if initialization code requires computing something over the whole dataset. To fully take advantage of parallelization make sure your subclassed datamanager computes new information inside the `custom_view_processor`, or caches a subset of the whole dataset. This can also still be slow if pre-computation requires GPU-heavy steps on the same GPU used for training.
+Naively transferring code to `custom_view_processor` may still OOM on very large datasets if initialization code requires computing something over the whole dataset. To take full advantage of parallelization, make sure your subclassed datamanager computes new information inside `custom_view_processor`, or caches a subset of the whole dataset. This can also still be slow if pre-computation requires GPU-heavy steps on the same GPU used for training.
+
+**Note**: Because the parallel DataManager uses background processes, any member of the DataManager needs to be *picklable* to be used inside `custom_view_processor`.
 
 ```python
 class LERFDataManager(VanillaDataManager):
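The migration pattern described in the doc text above can be sketched schematically. The base class, method signature, and batch layout below are simplified stand-ins rather than Nerfstudio's real interfaces; only the `custom_view_processor` override pattern is taken from the documentation.

```python
# Stand-in for a parallel datamanager: the base class calls
# custom_view_processor on each view, possibly in a background process.
class DataManagerStub:
    def custom_view_processor(self, camera, batch):
        # Default: pass the view through unchanged.
        return camera, batch

    def next_train(self, camera, batch):
        return self.custom_view_processor(camera, batch)


class LERFLikeDataManager(DataManagerStub):
    """Attaches a per-view feature lazily instead of precomputing it
    over the whole dataset at init time (the OOM risk noted above)."""

    def custom_view_processor(self, camera, batch):
        # Hypothetical per-view quantity; the real LERF would compute a
        # CLIP embedding of the image here.
        batch = dict(batch, feature=sum(batch["image"]) / len(batch["image"]))
        return camera, batch


dm = LERFLikeDataManager()
camera, out = dm.next_train("cam0", {"image": [0.2, 0.4, 0.6]})
```

Because the per-view computation happens at load time, nothing dataset-sized needs to be held in memory up front.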

nerfstudio/configs/method_configs.py (2 additions, 1 deletion)

@@ -177,6 +177,7 @@
     mixed_precision=True,
     pipeline=VanillaPipelineConfig(
         datamanager=VanillaDataManagerConfig(
+            _target=ParallelDataManager[InputDataset],
             dataparser=NerfstudioDataParserConfig(),
             train_num_rays_per_batch=16384,
             eval_num_rays_per_batch=4096,
@@ -226,7 +227,7 @@
     mixed_precision=True,
     pipeline=VanillaPipelineConfig(
         datamanager=VanillaDataManagerConfig(
-            _target=VanillaDataManager[DepthDataset],
+            _target=ParallelDataManager[DepthDataset],
             dataparser=NerfstudioDataParserConfig(),
             train_num_rays_per_batch=4096,
             eval_num_rays_per_batch=4096,
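The `_target=ParallelDataManager[DepthDataset]` syntax works because the datamanager class is a runtime-subscriptable generic; subscripting yields an alias a config system can store and later introspect. A minimal sketch of that mechanism, with the class names reused for illustration only:

```python
from typing import Generic, TypeVar, get_args, get_origin

TDataset = TypeVar("TDataset")


class InputDataset: ...
class DepthDataset(InputDataset): ...


class ParallelDataManager(Generic[TDataset]):
    """Stub; the real class lives in nerfstudio."""


# Subscripting produces an alias usable as a _target; the dataset
# type can be recovered from it at runtime.
target = ParallelDataManager[DepthDataset]
assert get_origin(target) is ParallelDataManager
assert get_args(target) == (DepthDataset,)
```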

nerfstudio/data/datasets/depth_dataset.py (2 additions, 2 deletions)

@@ -79,7 +79,7 @@ def __init__(
         filenames = dataparser_outputs.image_filenames
 
         repo = "isl-org/ZoeDepth"
-        self.zoe = torch_compile(torch.hub.load(repo, "ZoeD_NK", pretrained=True).to(device))
+        zoe = torch_compile(torch.hub.load(repo, "ZoeD_NK", pretrained=True).to(device))
 
         for i in track(range(len(filenames)), description="Generating depth images"):
             image_filename = filenames[i]
@@ -93,7 +93,7 @@ def __init__(
             image = torch.permute(image, (2, 0, 1)).unsqueeze(0).to(device)
             if image.shape[1] == 4:
                 image = image[:, :3, :, :]
-            depth_tensor = self.zoe.infer(image).squeeze().unsqueeze(-1)
+            depth_tensor = zoe.infer(image).squeeze().unsqueeze(-1)
 
             depth_tensors.append(depth_tensor)
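The change above keeps the ZoeDepth model out of `self`, so only the computed depth tensors survive `__init__` and the dataset stays picklable for the parallel DataManager's background processes (per the docs note in this commit). A toy illustration of why, with a lambda standing in for the compiled torch module:

```python
import pickle


def build_state(store_model: bool) -> dict:
    # A lambda stands in for torch_compile(torch.hub.load(...));
    # lambdas, like many model objects, cannot be pickled.
    model = lambda x: x * 2.0
    state = {"depths": [model(i) for i in range(3)]}  # cached outputs only
    if store_model:
        state["model"] = model  # mimics the old self.zoe = ...
    return state


def is_picklable(obj) -> bool:
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


# Keeping the model around breaks pickling; keeping only its
# outputs (the depth values) does not.
assert not is_picklable(build_state(store_model=True))
assert is_picklable(build_state(store_model=False))
```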

nerfstudio/data/pixel_samplers.py (1 addition, 1 deletion)

@@ -106,7 +106,7 @@ def rejection_sample_mask(
         num_valid = 0
         for _ in range(self.config.max_num_iterations):
             c, y, x = (i.flatten() for i in torch.split(indices, 1, dim=-1))
-            chosen_indices_validity = mask.squeeze()[c, y, x].bool()
+            chosen_indices_validity = mask.squeeze(-1)[c, y, x].bool()
             num_valid = int(torch.sum(chosen_indices_validity).item())
             if num_valid == num_samples:
                 break
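The mask sampling bug fixed above: `squeeze()` with no argument drops every size-1 dimension, so for a dataset with a single image the image axis vanished along with the channel axis, and the `[c, y, x]` indexing no longer matched the tensor's rank. Illustrated here with NumPy shapes for brevity; the repo code uses torch, where `squeeze` behaves the same way:

```python
import numpy as np

# Mask batch for a single image: (num_images, H, W, 1)
mask = np.ones((1, 4, 5, 1), dtype=bool)
c, y, x = np.array([0]), np.array([3]), np.array([2])

# Old code: squeeze() also removes the num_images axis when it is 1,
# leaving (H, W) -- three index arrays no longer fit.
assert mask.squeeze().shape == (4, 5)

# Fixed code: squeeze(-1) removes only the trailing channel axis.
assert mask.squeeze(-1).shape == (1, 4, 5)
valid = mask.squeeze(-1)[c, y, x]
assert valid.shape == (1,)
```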
