Commit b67b227

Fix JaggedTensor single-element constructor unconditionally initializing CUDA via pinned_memory (openvdb#468)
## Summary

- Make `pinned_memory` conditional on the tensor device being CUDA in two locations where single-element `JaggedTensor` construction unconditionally allocated pinned (page-locked) memory via `cudaHostAlloc`, which forced CUDA runtime initialization even for CPU-only tensors.
- This caused crashes in forked `DataLoader` worker processes (where re-initializing CUDA after `fork()` is forbidden) and added unnecessary overhead for CPU-only workloads.
- Add a test verifying that CPU single-element `JaggedTensor` offsets are not pinned.

Fixes openvdb#467

## Changes

### `src/fvdb/JaggedTensor.cpp`

`.pinned_memory(true)` → `.pinned_memory(mData.device().is_cuda())` in the `JaggedTensor(const std::vector<torch::Tensor>&)` single-element branch.

### `src/fvdb/detail/ops/JOffsetsFromJIdx.cu`

`.pinned_memory(true)` → `.pinned_memory(jdata.device().is_cuda())` in `joffsetsFromJIdx()`, which is the shared implementation called by CPU, CUDA, and PrivateUse1 dispatch paths.

### `tests/unit/test_jagged_tensor.py`

New `test_cpu_single_element_no_cuda_init` verifying both constructor paths produce non-pinned offsets for CPU tensors.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
1 parent f42ec34 commit b67b227
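
The pinning decision described in the summary can be sketched in isolation. The snippet below is a minimal, hypothetical illustration and not code from fVDB: `should_pin_offsets` and `offsets_options` are invented names, and the device-type strings `"cuda"` and `"privateuseone"` are assumed to match what `tensor.device.type` reports in a default PyTorch build. It mirrors the predicate used in the diff (`is_cuda() || is_privateuseone()`) without importing PyTorch, so it runs anywhere.

```python
# Hypothetical helpers mirroring the predicate in this commit's diff.
# These names are illustrative only and are not part of fVDB or PyTorch.

# Device types for which page-locked host memory is worth allocating.
# Assumption: these strings match `tensor.device.type` in a default build.
PINNABLE_DEVICE_TYPES = frozenset({"cuda", "privateuseone"})


def should_pin_offsets(device_type: str) -> bool:
    """Return True only when the data tensor lives on an accelerator.

    For CPU data this returns False, so no pinned allocation is requested
    and the CUDA runtime is never touched on this code path.
    """
    return device_type in PINNABLE_DEVICE_TYPES


def offsets_options(device_type: str) -> dict:
    """Build keyword arguments for the host-side two-element offsets tensor."""
    return {"pin_memory": should_pin_offsets(device_type)}


if __name__ == "__main__":
    for dev in ("cpu", "cuda", "privateuseone"):
        print(dev, offsets_options(dev))
```

With PyTorch available, the returned options could plausibly be passed to a host allocation such as `torch.empty(2, dtype=torch.int64, **offsets_options(t.device.type))`. Keeping the predicate in one place means both construction paths make the same choice.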

3 files changed: 33 additions and 7 deletions

src/fvdb/JaggedTensor.cpp

Lines changed: 6 additions & 5 deletions
@@ -79,11 +79,12 @@ JaggedTensor::JaggedTensor(const std::vector<torch::Tensor> &tensors) {
                     "assigned data must have shape [N, ...], but got data.dim() = 0");
         mBatchIdx =
             torch::empty({0}, torch::TensorOptions().dtype(JIdxScalarType).device(mData.device()));
-        mOffsets = torch::tensor({JOffsetsType(0), mData.size(0)},
-                                 torch::TensorOptions()
-                                     .dtype(JOffsetsScalarType)
-                                     .device(mData.device())
-                                     .pinned_memory(true));
+        mOffsets = torch::tensor(
+            {JOffsetsType(0), mData.size(0)},
+            torch::TensorOptions()
+                .dtype(JOffsetsScalarType)
+                .device(mData.device())
+                .pinned_memory(mData.device().is_cuda() || mData.device().is_privateuseone()));
         mListIdx = torch::empty(
             {0, 1}, torch::TensorOptions().dtype(JLIdxScalarType).device(mData.device()));
         mNumOuterLists = 1;

src/fvdb/detail/ops/JOffsetsFromJIdx.cu

Lines changed: 5 additions & 2 deletions
@@ -23,8 +23,11 @@ joffsetsFromJIdx(torch::Tensor jidx, torch::Tensor jdata, int64_t numTensors) {
     TORCH_CHECK_VALUE(jidx.dim() == 1, "jidx must be a 1D tensor");
 
     if (jidx.size(0) == 0 && numTensors == 1) {
-        torch::Tensor ret =
-            torch::empty({2}, torch::TensorOptions().dtype(JOffsetsScalarType).pinned_memory(true));
+        torch::Tensor ret = torch::empty(
+            {2},
+            torch::TensorOptions()
+                .dtype(JOffsetsScalarType)
+                .pinned_memory(jdata.device().is_cuda() || jdata.device().is_privateuseone()));
         auto acc = ret.accessor<JOffsetsType, 1>();
         acc[0] = 0;
         acc[1] = jdata.size(0);

tests/unit/test_jagged_tensor.py

Lines changed: 22 additions & 0 deletions
@@ -2738,6 +2738,28 @@ def test_from_data_indices_and_list_ids(self, device, dtype):
         # self.assertTrue(torch.all(data_sorted == jt_s[i].jdata).item())
         # self.assertTrue(torch.all(data_sorted == jt[i].jdata[idx[i].jdata]).item())
 
+    def test_cpu_single_element_no_cuda_init(self):
+        """Test that constructing a single-element CPU JaggedTensor does not use pinned memory.
+
+        Pinned memory allocation triggers CUDA runtime initialization, which causes crashes
+        in forked DataLoader worker processes. This test verifies the fix for issue #467.
+        """
+        # Test the list-of-tensors constructor (single element)
+        cpu_tensor = torch.randn(5, 3)
+        jt = fvdb.JaggedTensor([cpu_tensor])
+        self.assertFalse(jt.joffsets.is_pinned(), "CPU single-element JaggedTensor offsets should not be pinned")
+        self.assertEqual(jt.joffsets.device.type, "cpu")
+        self.assertEqual(jt.jdata.shape, torch.Size([5, 3]))
+        self.assertTrue(torch.equal(jt.jdata, cpu_tensor))
+
+        # Test the bare-tensor constructor (dispatches through joffsetsFromJIdx)
+        cpu_tensor2 = torch.randn(10)
+        jt2 = fvdb.JaggedTensor(cpu_tensor2)
+        self.assertFalse(jt2.joffsets.is_pinned(), "CPU bare-tensor JaggedTensor offsets should not be pinned")
+        self.assertEqual(jt2.joffsets.device.type, "cpu")
+        self.assertEqual(jt2.jdata.shape, torch.Size([10]))
+        self.assertTrue(torch.equal(jt2.jdata, cpu_tensor2))
+
 
 if __name__ == "__main__":
     unittest.main()
