
Commit a6bbf4a

Don't use dl_tensor.byte_offset when exporting capsules. (#21153)
PyTorch ignores byte_offset, at least in some circumstances, which means the wrong data ends up being exported. This fixes the issue by baking the byte_offset directly into the returned data pointer. See: https://github.com/dmlc/dlpack/blob/7f393bbb86a0ddd71fde3e700fc2affa5cdce72d/include/dlpack/dlpack.h#L225

Signed-off-by: Andrew Woloszyn <[email protected]>
1 parent 1110ac1 commit a6bbf4a

File tree

1 file changed (+6, -2 lines)

runtime/bindings/python/hal.cc

Lines changed: 6 additions & 2 deletions
@@ -739,9 +739,13 @@ py::object HalDevice::CreateDLPackCapsule(HalBufferView& buffer_view,
       "Cannot export device buffer");
   static_assert(sizeof(dl_tensor.data) >=
                 sizeof(external_buffer.handle.device_allocation.ptr));
+  // Set the data pointer to the offset, and the byte_offset to 0.
+  // This SHOULD not be required, but some backends (torch GPU for example),
+  // ignore the byte_offset entirely.
   dl_tensor.data =
-      reinterpret_cast<void*>(external_buffer.handle.device_allocation.ptr);
-  dl_tensor.byte_offset = offset;
+      reinterpret_cast<uint8_t*>(external_buffer.handle.device_allocation.ptr) +
+      offset;
+  dl_tensor.byte_offset = 0;
 
   // Create and return capsule.
   PyObject* capsule = PyCapsule_New(static_cast<DLManagedTensor*>(tensor.get()),
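
To see why folding the offset into the pointer is robust, here is a minimal, self-contained sketch. It is not IREE code: MiniDLTensor is a hypothetical stand-in for the two DLTensor fields involved, and the two producer styles correspond to the old and new behavior in this commit. A spec-compliant consumer reads from data + byte_offset; an offset-ignoring consumer (the PyTorch behavior described in the commit message) reads from data alone.

#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for the two DLTensor fields that matter here; the
// real struct lives in dlpack/dlpack.h and has more members.
struct MiniDLTensor {
  void* data;
  uint64_t byte_offset;
};

// What the DLPack spec prescribes: tensor contents start at data + byte_offset.
static const uint8_t* EffectiveStart(const MiniDLTensor& t) {
  return static_cast<const uint8_t*>(t.data) + t.byte_offset;
}

int main() {
  uint8_t storage[16] = {0};
  storage[8] = 42;  // the byte the exported view should expose
  const uint64_t offset = 8;

  // Old behavior: base pointer plus a separate byte_offset. Correct per the
  // spec, but a consumer that ignores byte_offset reads storage[0] instead.
  MiniDLTensor spec_style{storage, offset};

  // New behavior (this commit): fold the offset into data, set byte_offset = 0.
  // Both kinds of consumers now land on the same byte.
  MiniDLTensor folded_style{storage + offset, 0};

  std::printf("spec-compliant consumer, old style: %d\n", *EffectiveStart(spec_style));
  std::printf("spec-compliant consumer, new style: %d\n", *EffectiveStart(folded_style));
  std::printf("offset-ignoring consumer, old style: %d\n",
              *static_cast<const uint8_t*>(spec_style.data));    // wrong byte: 0
  std::printf("offset-ignoring consumer, new style: %d\n",
              *static_cast<const uint8_t*>(folded_style.data));  // 42
  return 0;
}

Run as written, the old style prints 0 for the offset-ignoring consumer while the new style prints 42 for both readers, which is the mismatch the commit message describes.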
