Allow passing tensor arguments in reader constructors#6252

Open
rostan-t wants to merge 19 commits into NVIDIA:main from rostan-t:ndd-reader-tensor-args

Conversation

@rostan-t
Collaborator

Category:

New feature (non-breaking change which adds functionality)

Description:

Currently, tensor arguments can only be passed to readers by invoking them. However, the recommended way to use readers is next_epoch, and the __call__ API is not even documented.

This PR allows constructing readers with tensor arguments.

Additional information:

Affected modules and functionalities:

Dynamic mode.

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-4600

@review-notebook-app

Check out this pull request on ReviewNB.

@greptile-apps
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR extends DALI's experimental dynamic mode to allow tensor (and array-like) arguments to be passed directly in reader constructors, not only at call time. Previously, tensor arguments could only be provided via __call__, which conflicted with the recommended next_epoch() API that is the primary interface for readers.

Key changes:

  • _op_builder.py: The generated __init__ for reader operators now separates non-scalar tensor arguments into _raw_tensor_args (stored for lazy processing) while scalar arguments continue through the normal _scalar_decay path. A _tensor_arg_names set is populated to guard against duplicate argument passing in __call__.
  • _ops.py: Reader gains _process_tensor_args(batch_size) — a caching helper that converts _raw_tensor_args to their processed Batch/Tensor form and passes them through all execution paths (_samples, _batches, get_metadata).
  • _invocation.py / _batch2tensor.py: Default caller_depth unified to 3, removing a now-redundant explicit override.
  • _signatures.py: Reader __init__ type stubs updated to expose tensor kwargs with appropriate type constraints (allow_data_node_kwargs=False, allow_batch_kwargs=False).
  • pytorch/nodes.py: get_metadata updated to the new required batch_size parameter; fixes a positional vs keyword argument call for _stream.
  • New tests cover: ndd.tensor() arg with next_epoch, mixed constructor/call-time scalar split, and the duplicate-arg error path.
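The constructor-side split described in the first bullet can be sketched roughly as follows. This is an illustrative simplification, not DALI's actual code: `_scalar_decay` and `is_tensor_like` are stand-in helpers for the real internals.

```python
import numpy as np

def _scalar_decay(v):
    # Stand-in for DALI's scalar normalization; identity here.
    return v

def is_tensor_like(v):
    # For this sketch, treat non-0-D NumPy arrays as tensor arguments.
    return isinstance(v, np.ndarray) and v.ndim > 0

def split_reader_kwargs(**kwargs):
    # Tensor-like values are kept raw for lazy per-batch processing;
    # everything else continues through the normal scalar path.
    raw_tensor_args = {k: v for k, v in kwargs.items() if is_tensor_like(v)}
    scalar_kwargs = {k: _scalar_decay(v) for k, v in kwargs.items()
                     if k not in raw_tensor_args}
    return raw_tensor_args, scalar_kwargs
```

With `resize_x=np.array([320, 240])` and `resize_y=192`, only `resize_x` lands in the raw tensor dict; `resize_y` stays a scalar kwarg.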

Confidence Score: 3/5

  • Safe to merge after addressing the latent _previous_batch_size = None cache-hit edge case and confirming the _signatures.py stub change is limited to reader schemas.
  • The core logic for separating tensor args at construction time and caching their processed forms is sound, and the three new tests exercise the main paths. Two issues lower confidence: (1) _previous_batch_size is initialized to None, which creates a false cache hit if _process_tensor_args(None) is ever called with non-empty tensor args — currently not triggered but fragile; (2) the removal of include_inputs=False and include_kwarg_inputs=False from the __init__ stub generator could affect non-reader CamelCase class operators if any exist. Neither issue blocks the primary feature, but both warrant a quick fix or confirmation before merging.
  • _ops.py (latent cache-hit in _process_tensor_args) and _signatures.py (stub generation scope change).

Important Files Changed

  • dali/python/nvidia/dali/experimental/dynamic/_ops.py — Core reader class extended with _raw_tensor_args, _tensor_args, and _previous_batch_size fields; adds the _process_tensor_args caching helper and wires tensor args through _samples, _batches, and get_metadata. The _previous_batch_size = None initialization creates a latent false cache hit if _process_tensor_args(None) is ever called with non-empty tensor args.
  • dali/python/nvidia/dali/experimental/dynamic/_op_builder.py — The generated __init__ now separates tensor args from scalar args for reader operators, stores them in _raw_tensor_args/_tensor_arg_names, and validates against Batch usage. The __call__ path correctly merges and deduplicates constructor tensor args with call-time args.
  • dali/python/nvidia/dali/ops/_signatures.py — Reader __init__ type stub updated to include tensor keyword args (constrained to non-DataNode/non-Batch types). Removal of include_inputs=False and include_kwarg_inputs=False could expose positional inputs in __init__ stubs for any non-reader CamelCase class operators.
  • dali/test/python/experimental_mode/test_reader_decoder.py — Adds three new tests: a tensor arg via ndd.tensor() in next_epoch, a partial scalar/call-time arg split, and a duplicate-arg error guard. Also fixes a pre-existing glob pattern missing a wildcard in the shards error test.

Sequence Diagram

sequenceDiagram
    participant User
    participant Init as "Reader.__init__ (generated)"
    participant Base as "Reader base __init__"
    participant PTA as "_process_tensor_args"
    participant Backend as "_init_backend"
    participant Run as "super()._run"

    User->>Init: "Reader(filenames=..., resize_x=ndd.tensor(w), resize_y=192)"
    Init->>Init: "Separate tensor args from scalar args"
    Note over Init: "tensor_args = {resize_x: Tensor}, scalar kwargs = {resize_y: 192}"
    Init->>Base: "__init__(max_batch_size, name, resize_y=192, ...)"
    Base->>Base: "Set _raw_tensor_args={}, _tensor_args={}, _previous_batch_size=None"
    Base-->>Init: "return"
    Init->>Init: "Override _raw_tensor_args={resize_x: Tensor}, _tensor_arg_names={resize_x, resize_y}"

    User->>Init: "reader.next_epoch(batch_size=4)"
    Init->>PTA: "_process_tensor_args(batch_size=4)"
    PTA->>PTA: "Cache miss (4 != None) → _process_params"
    PTA-->>Init: "{resize_x: Batch(tensor, 4)}"
    Init->>Backend: "_init_backend(ctx, (), {resize_x: Batch})"

    loop "Each batch in epoch"
        Init->>PTA: "_process_tensor_args(4)"
        PTA-->>Init: "cached {resize_x: Batch}"
        Init->>Run: "_run(ctx, batch_size=4, resize_x=Batch)"
        Run-->>User: "yield Batch outputs"
    end

Comments Outside Diff (2)

  1. dali/python/nvidia/dali/experimental/dynamic/_ops.py, line 610 (link)

    None initial value creates false cache hit in _process_tensor_args

    _previous_batch_size is initialized to None on line 610. The cache check in _process_tensor_args is:

    if self._previous_batch_size == batch_size:
        return self._tensor_args   # _tensor_args is still {} at this point

    If _process_tensor_args(None) is called while _raw_tensor_args is non-empty, None == None is True and the method silently returns an empty {} instead of processing the tensor arguments.

    In the current code this doesn't manifest — _samples sets _actual_batch_size = 1 before calling _process_tensor_args, _batches rejects None batch_size outright, and get_metadata is typed int. However, the initialization is fragile: any future caller that passes batch_size=None (e.g., calling get_metadata(None) despite the type hint) would receive silently incorrect results.

    A safer sentinel value would avoid the ambiguity. Alternatively, guard the initial state explicitly in _process_tensor_args.
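A minimal sketch of the sentinel approach, heavily simplified from the real class (the `_process_params` body here is a stand-in, not DALI's actual processing):

```python
_UNSET = object()  # sentinel: never compares equal to any real batch size

class ReaderSketch:
    def __init__(self, raw_tensor_args):
        self._raw_tensor_args = raw_tensor_args
        self._tensor_args = {}
        self._previous_batch_size = _UNSET  # instead of None

    def _process_params(self, batch_size, **raw):
        # Stand-in for the real per-batch processing step.
        return {k: (v, batch_size) for k, v in raw.items()}

    def _process_tensor_args(self, batch_size):
        # _UNSET never equals a real batch size (or None), so the first
        # call always processes, even for batch_size=None.
        if self._previous_batch_size == batch_size:
            return self._tensor_args
        self._tensor_args = self._process_params(batch_size, **self._raw_tensor_args)
        self._previous_batch_size = batch_size
        return self._tensor_args
```

With this initialization, `_process_tensor_args(None)` on a fresh instance processes the raw args instead of silently returning the empty cache.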

  2. dali/python/nvidia/dali/ops/_signatures.py, line 797-806 (link)

    include_inputs=False removal may expose positional inputs in non-reader class stubs

    The previous code explicitly passed include_inputs=False and include_kwarg_inputs=False to _call_signature for the __init__ stub of all class-based operators. The new code relies on defaults (include_inputs=True, include_kwarg_inputs=True) while constraining annotations via allow_data_node_kwargs=False and allow_batch_kwargs=False.

    For readers (which have MinNumInput() == 0) this is fine — no positional inputs appear. But for any non-reader CamelCase operator that has positional inputs, the generated __init__ stub would now incorrectly include those inputs, even though the actual runtime build_constructor skips tensor args for non-readers (if is_tensor and not is_reader: continue).

    If all CamelCase class operators in the dynamic API are exclusively readers, this is a non-issue. But if that assumption breaks in the future, the generated .pyi stubs would mislead type-checkers and IDEs into accepting inputs in __init__ that are rejected at runtime. Consider confirming that _gen_dynamic_cls_signature is only reachable for reader schemas, or restoring an explicit guard.

Last reviewed commit: e145ef5

@rostan-t rostan-t force-pushed the ndd-reader-tensor-args branch 2 times, most recently from ed9a066 to c498545 Compare March 11, 2026 11:08
@rostan-t
Collaborator Author

!build

@dali-automaton
Collaborator

CI MESSAGE: [45875629]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [45875629]: BUILD FAILED

kwargs = {k: _scalar_decay(v) for k, v in kwargs.items()}
op_class.__base__.__init__(self, max_batch_size, name, **kwargs)
if is_reader:
    self._tensor_args = {k: v for k, v in tensor_kwargs.items() if v is not None}
Contributor

This is potentially quite inefficient, as we'd aggressively promote arguments to tensors - and we should avoid it in __init__. We should only add tensor types as tensors - and possibly even demote 0D tensors back to scalar values. This will have considerable benefits at run-time, as we won't have to pass the tensors to each call (which, in case of readers, will mean many, many times).

Contributor

Also - if we get a foreign tensor in kwargs, we should make a copy - otherwise the user might get a nasty surprise:

shape = np.array([480, 640])
r = readers.Numpy(files=[...], shape=shape)
shape[0] = 320  # oops, the reader will now see that
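A defensive copy at construction time avoids this aliasing. The sketch below is illustrative only, using NumPy to stand in for whatever array types the constructor accepts:

```python
import numpy as np

def snapshot_tensor_args(tensor_args):
    # Copy array-like arguments so later in-place mutation by the caller
    # does not leak into the stored reader state.
    return {k: v.copy() if isinstance(v, np.ndarray) else v
            for k, v in tensor_args.items()}

shape = np.array([480, 640])
stored = snapshot_tensor_args({"shape": shape})
shape[0] = 320  # caller mutates the original array...
# ...but stored["shape"][0] is still 480
```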

Comment on lines +675 to +677
_, kwargs = self._process_params(
    self._backend, self._device, batch_size, **self._tensor_args
)
Contributor

I'd recommend at least caching the processed kwargs when the batch size is constant.
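One possible shape for that cache, keyed on the last seen batch size (a hypothetical helper, not the actual DALI implementation):

```python
class TensorArgCache:
    def __init__(self, raw_args, process_fn):
        self._raw = raw_args
        self._process = process_fn    # e.g. a bound _process_params
        self._cached = None
        self._cached_bs = object()    # sentinel: never equals a real batch size

    def get(self, batch_size):
        # Reprocess only when the batch size actually changes.
        if batch_size != self._cached_bs:
            self._cached = self._process(batch_size, **self._raw)
            self._cached_bs = batch_size
        return self._cached
```

With a constant batch size across an epoch, the processing function runs once and every subsequent call returns the cached kwargs.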

Contributor

@mzient mzient left a comment

To change:

  • make a copy of tensor arguments
  • do not promote non-tensor arguments to tensors
  • simplify caller_depth logic
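The first two points could be combined into a single normalization step at construction, sketched here with NumPy as a stand-in for the accepted tensor types (illustrative, not the actual patch):

```python
import numpy as np

def normalize_ctor_arg(v):
    if isinstance(v, np.ndarray):
        if v.ndim == 0:
            return v.item()   # demote a 0-D tensor back to a plain scalar
        return v.copy()       # defensive copy of genuine tensor arguments
    return v                  # non-tensor arguments stay scalar (no promotion)
```

Demoted scalars then follow the cheap scalar path and need not be re-passed to every call.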

rostan-t added 11 commits March 13, 2026 15:23
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
… traces

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
…essing

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
…al tensors

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
@rostan-t rostan-t force-pushed the ndd-reader-tensor-args branch from 51fb904 to f283da0 Compare March 13, 2026 17:00
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
@rostan-t
Collaborator Author

> the init stub change must be scoped to reader operators only

@greptileai This is not necessary. Readers are now the only operators publicly exposed as classes.

@greptile-apps
Contributor

greptile-apps bot commented Mar 16, 2026

That makes sense - if readers are the only operators exposed as classes, then the change is already naturally scoped correctly. Thanks for the clarification!

@rostan-t
Collaborator Author

!build

@rostan-t rostan-t requested a review from mzient March 16, 2026 10:11
@dali-automaton
Collaborator

CI MESSAGE: [46240320]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [46240320]: BUILD PASSED
