Allow passing tensor arguments in reader constructors #6252
rostan-t wants to merge 19 commits into NVIDIA:main
Conversation
Greptile Summary: This PR extends DALI's experimental dynamic mode to allow tensor (and array-like) arguments to be passed directly in reader constructors, not only at call time. Previously, tensor arguments could only be provided via __call__. Key changes:
Confidence Score: 3/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Init as "Reader.__init__ (generated)"
    participant Base as "Reader base __init__"
    participant PTA as "_process_tensor_args"
    participant Backend as "_init_backend"
    participant Run as "super()._run"
    User->>Init: "Reader(filenames=..., resize_x=ndd.tensor(w), resize_y=192)"
    Init->>Init: "Separate tensor args from scalar args"
    Note over Init: "tensor_args = {resize_x: Tensor}, scalar kwargs = {resize_y: 192}"
    Init->>Base: "__init__(max_batch_size, name, resize_y=192, ...)"
    Base->>Base: "Set _raw_tensor_args={}, _tensor_args={}, _previous_batch_size=None"
    Base-->>Init: "return"
    Init->>Init: "Override _raw_tensor_args={resize_x: Tensor}, _tensor_arg_names={resize_x, resize_y}"
    User->>Init: "reader.next_epoch(batch_size=4)"
    Init->>PTA: "_process_tensor_args(batch_size=4)"
    PTA->>PTA: "Cache miss (4 != None) → _process_params"
    PTA-->>Init: "{resize_x: Batch(tensor, 4)}"
    Init->>Backend: "_init_backend(ctx, (), {resize_x: Batch})"
    loop "Each batch in epoch"
        Init->>PTA: "_process_tensor_args(4)"
        PTA-->>Init: "cached {resize_x: Batch}"
        Init->>Run: "_run(ctx, batch_size=4, resize_x=Batch)"
        Run-->>User: "yield Batch outputs"
    end
```
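The flow above can be sketched in plain Python. This is a hypothetical stand-in, not DALI's actual implementation: the `Tensor` class below substitutes for `ndd.tensor`, and the attribute names mirror the diagram (`_raw_tensor_args`, `_tensor_args`, `_previous_batch_size`). The point it illustrates is the separation of tensor arguments from scalars at construction time, plus the per-batch-size cache reused across an epoch.

```python
class Tensor:
    """Stand-in for ndd.tensor in this sketch."""
    def __init__(self, data):
        self.data = data


class Reader:
    def __init__(self, **kwargs):
        # Split constructor arguments: tensors are kept aside for per-batch
        # processing, scalars would be forwarded to the base constructor.
        self._raw_tensor_args = {k: v for k, v in kwargs.items()
                                 if isinstance(v, Tensor)}
        self._scalar_args = {k: v for k, v in kwargs.items()
                             if not isinstance(v, Tensor)}
        self._tensor_args = {}
        self._previous_batch_size = None

    def _process_tensor_args(self, batch_size):
        # Cache miss only when the batch size changed; otherwise the
        # previously processed arguments are reused for every batch.
        if batch_size != self._previous_batch_size:
            self._tensor_args = {k: ("Batch", v.data, batch_size)
                                 for k, v in self._raw_tensor_args.items()}
            self._previous_batch_size = batch_size
        return self._tensor_args


reader = Reader(filenames=["img0.jpg"], resize_x=Tensor([200, 210]), resize_y=192)
args = reader._process_tensor_args(4)   # cache miss: processed now
same = reader._process_tensor_args(4)   # cache hit: same dict reused
```

The cache keyed on the previous batch size is what makes constant-batch-size epochs cheap: the tensor arguments are processed once, not once per batch.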
Force-pushed from ed9a066 to c498545 (compare)

!build

CI MESSAGE: [45875629]: BUILD STARTED

CI MESSAGE: [45875629]: BUILD FAILED
        kwargs = {k: _scalar_decay(v) for k, v in kwargs.items()}
        op_class.__base__.__init__(self, max_batch_size, name, **kwargs)
        if is_reader:
            self._tensor_args = {k: v for k, v in tensor_kwargs.items() if v is not None}
This is potentially quite inefficient, as we'd aggressively promote arguments to tensors - and we should avoid it in __init__. We should only add tensor types as tensors - and possibly even demote 0D tensors back to scalar values. This will have considerable benefits at run-time, as we won't have to pass the tensors to each call (which, in case of readers, will mean many, many times).
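A sketch of the policy this comment suggests, with NumPy arrays standing in for DALI tensors (an assumption for illustration, not the PR's actual code): only genuine array inputs are kept as tensor arguments, and 0-D arrays are demoted back to plain Python scalars so they need not be shipped to the backend on every call.

```python
import numpy as np

def classify_arg(value):
    """Return ("tensor", value) or ("scalar", value) for a constructor argument."""
    if isinstance(value, np.ndarray):
        if value.ndim == 0:
            # Demote a 0-D tensor back to a Python scalar.
            return "scalar", value.item()
        return "tensor", value
    # Non-tensor arguments are never promoted to tensors.
    return "scalar", value
```

Under this scheme `classify_arg(192)` and `classify_arg(np.array(192))` both come back as scalars, while only a genuinely non-trivial array such as `np.array([480, 640])` stays a tensor argument.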
Also - if we get a foreign tensor in kwargs, we should make a copy - otherwise user might get a nasty surprise.
    shape = np.array([480, 640])
    r = readers.Numpy(files=[...], shape=shape)
    shape[0] = 320  # oops, the reader will now see that

        _, kwargs = self._process_params(
            self._backend, self._device, batch_size, **self._tensor_args
        )
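The aliasing pitfall in the reviewer's snippet above, and the defensive copy that avoids it, can be demonstrated with plain NumPy. `FakeReader` here is a hypothetical stand-in, not DALI's reader class:

```python
import numpy as np

class FakeReader:
    """Stand-in reader that copies its array argument at construction time."""
    def __init__(self, shape):
        # Without copy=True, self._shape would alias the caller's array and
        # silently observe any later in-place mutation.
        self._shape = np.array(shape, copy=True)

shape = np.array([480, 640])
r = FakeReader(shape=shape)
shape[0] = 320  # caller mutates their array after construction
# r._shape still holds [480, 640]: the reader is isolated from the mutation
```

The copy costs one allocation per tensor argument at construction time, which is negligible next to the surprise of a reader's configuration changing mid-run.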
I'd recommend at least caching the processed kwargs when the batch size is constant.
mzient left a comment:
To change:
- make a copy of tensor arguments
- do not promote non-tensor arguments to tensors
- simplify caller_depth logic
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
… traces Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
…essing Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
…al tensors Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Force-pushed from 51fb904 to f283da0 (compare)
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
@greptileai This is not necessary. Readers are now the only operators publicly exposed as classes.

That makes sense - if readers are the only operators exposed as classes, then the change is already naturally scoped correctly. Thanks for the clarification!
!build

CI MESSAGE: [46240320]: BUILD STARTED

CI MESSAGE: [46240320]: BUILD PASSED
Category:
New feature (non-breaking change which adds functionality)
Description:
Currently, it is necessary to invoke readers in order to pass tensor arguments. The recommended way to use readers is with next_epoch, and the __call__ API is not even documented. This PR allows constructing readers with tensor arguments.
Additional information:
Affected modules and functionalities:
Dynamic mode.
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-4600