Skip to content

Pipeline building in C API#6253

Open
mzient wants to merge 11 commits intoNVIDIA:mainfrom
mzient:c_api2_pipeline_building
Open

Pipeline building in C API#6253
mzient wants to merge 11 commits intoNVIDIA:mainfrom
mzient:c_api2_pipeline_building

Conversation

@mzient
Copy link
Contributor

@mzient mzient commented Mar 12, 2026

Category:

New feature (non-breaking change which adds functionality)

Description:

This PR adds an ability to build pipelines in C API by defining operator descriptors and adding those to the pipeline via daliPipelineAddOperator.

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [45989005]: BUILD STARTED

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR adds programmatic pipeline building to the C API, providing C-callable counterparts to the C++ Pipeline::AddExternalInput, Pipeline::AddOperator, and Pipeline::SetOutputDescs methods. It introduces five new public types (daliBackend_t, daliIODesc_t, daliArgDesc_t, daliArgInputDesc_t, daliOperatorDesc_t) and three new functions (daliPipelineAddExternalInput, daliPipelineAddOperator, daliPipelineSetOutputs), all implemented in pipeline.cc with thorough argument validation. Validation helpers in validation.h are extended with a new Validate(daliPipelineIODesc_t) overload and lazy evaluation overloads for CheckArg/CheckNotNull. Tests cover the counter + TestOp graph, external source feeding, argument inputs (per-sample resize), and checkpointing through both the serialized and builder code paths.

Key points:

  • All previously-flagged issues (negative count validation, unconditional NOT_NULL on outputs, DALI_FLOAT64 documentation, dtype validation gating, empty-output SetOutputDescs forwarding) have been addressed in this revision.
  • The daliIODesc_t::device_type field name is inconsistent with the pre-existing daliPipelineIODesc_t::device field — both are daliStorageDevice_t but named differently, which could confuse callers working with both struct types in a single file.
  • pipeline_builder_test.cc:AddExternalInput erroneously uses device_id = 0 (forces GPU initialization) for a test that exclusively uses CPU tensors and should use std::nullopt (CPU_ONLY_DEVICE_ID), causing the test to fail on CPU-only machines.
  • pipeline_test_utils.h correctly adds inline to free functions defined in a header, fixing a latent ODR issue.

Confidence Score: 4/5

  • Safe to merge after fixing the GPU device ID in AddExternalInput test; the implementation and validation logic are solid.
  • The core implementation is well-structured with thorough validation (negative counts, null pointers, device types) and the previously-reported issues have all been addressed. The one functional bug is in the test suite: AddExternalInput uses device_id = 0 for a CPU-only test, which will fail on CPU-only CI machines. The public API naming inconsistency (device_type vs device) is a design concern before the API is frozen but not a runtime bug. Otherwise the changes are straightforward and well-tested.
  • dali/c_api_2/pipeline_builder_test.cc (wrong device_id in AddExternalInput test) and include/dali/dali.h (device field naming before API freeze).

Important Files Changed

Filename Overview
include/dali/dali.h Adds daliBackend_t, daliIODesc_t, daliArgDesc_t, daliArgInputDesc_t, and daliOperatorDesc_t struct/enum types, plus three new public functions (daliPipelineAddExternalInput, daliPipelineAddOperator, daliPipelineSetOutputs). The field name for the storage device in the new daliIODesc_t (device_type) is inconsistent with daliPipelineIODesc_t (device) — a minor but potentially confusing API surface issue before the API is released.
dali/c_api_2/pipeline.cc Implements the three new public API functions plus internal helpers BackendToString, AddArgToSpec, and MakeOpSpec. Validation of counts, null pointers, and device types is thorough. The previous issues with negative counts, unconditional NULL checks, and unvalidated device types have all been addressed in this revision.
dali/c_api_2/validation.h Adds Validate(daliPipelineIODesc_t), lazy CheckArg (callable overload), and lazy CheckNotNull (callable overload). The dtype validation is now correctly gated by dtype_present, addressing the previously reported issue. Implementation looks correct.
dali/c_api_2/pipeline_builder_test.cc New test file with unit tests for daliPipelineAddOperator and daliPipelineAddExternalInput. Good coverage for the counter+TestOp pipeline and argument inputs. However, AddExternalInput incorrectly uses device_id = 0 (GPU) for a CPU-only external source test, which will fail on CPU-only machines.
dali/c_api_2/op_test/complex_pipeline_test.cc Extends existing tests to cover the C API builder path alongside the serialized pipeline path. Adds ReaderDecoderCApiPipe helper and corresponding CAPI2_PipelineBuilderTest test cases including checkpointing and argument-input (per-sample resize sizes). The refactoring to extract RunCheckpointingTest is clean. No issues found.
dali/c_api_2/pipeline_test_utils.h Minor fix adding inline to free functions defined in the header (CompareTensorLists, ComparePipelineOutput, ComparePipelineOutputs) to avoid ODR violations when the header is included in multiple translation units. Straightforward and correct.

Sequence Diagram

sequenceDiagram
    participant C as C Caller
    participant API as C API (pipeline.cc)
    participant Val as Validate (validation.h)
    participant Pipe as Pipeline (C++)

    C->>API: daliPipelineCreate(&h, &params)
    API->>Pipe: new Pipeline(params)
    Pipe-->>API: pipeline object
    API-->>C: daliPipeline_h

    C->>API: daliPipelineAddExternalInput(h, &input_desc)
    API->>Val: Validate(*input_desc)
    Val-->>API: ok / throws
    API->>Pipe: AddExternalInput(name, device, dtype, ndim, layout)

    C->>API: daliPipelineAddOperator(h, &op_desc)
    API->>API: MakeOpSpec(op_desc)
    Note over API: Validates counts, nulls, backend<br/>Builds OpSpec with inputs/outputs/<br/>args/arg_inputs
    API->>Val: Validate(device_type) per I/O
    Val-->>API: ok / throws
    API->>Pipe: AddOperator(spec [, instance_name])

    C->>API: daliPipelineSetOutputs(h, n, outputs)
    API->>Val: Validate(outputs[i]) per output
    Val-->>API: ok / throws
    API->>Pipe: SetOutputDescs(descs)

    C->>API: daliPipelineBuild(h)
    API->>Pipe: Build()
    Pipe-->>API: ready
    API-->>C: DALI_SUCCESS
Loading

Last reviewed commit: 681bc56

@NVIDIA NVIDIA deleted a comment from dali-automaton Mar 13, 2026
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46046566]: BUILD STARTED

@mzient mzient force-pushed the c_api2_pipeline_building branch from 4f37b46 to 14c69cd Compare March 13, 2026 10:05
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46051079]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46064404]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46065411]: BUILD STARTED

Comment on lines +593 to +609
case DALI_INT_VEC: {
check_arr();
auto *d = static_cast<const int *>(arg.arr);
spec.AddArg(name, std::vector<int>(d, d + arg.size));
break;
}
case DALI_FLOAT_VEC: {
check_arr();
auto *d = static_cast<const float *>(arg.arr);
spec.AddArg(name, std::vector<float>(d, d + arg.size));
break;
}
case DALI_BOOL_VEC: {
check_arr();
auto *d = static_cast<const bool *>(arg.arr);
spec.AddArg(name, std::vector<bool>(d, d + arg.size));
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Null-pointer arithmetic UB for zero-size vector args

When arg.size == 0 and arg.arr == nullptr (a valid combination — check_arr() only requires non-NULL when arg.size > 0), the expression d + arg.size where d == nullptr performs pointer arithmetic on a null pointer. This is technically undefined behaviour under the C++14/17 standard (a pointer must point to an array element or one-past-the-end to support arithmetic). In practice all major compilers produce correct code for nullptr + 0, but the UB could be silently miscompiled under aggressive optimisation.

The same issue exists on the equivalent lines for DALI_FLOAT_VEC (line 602) and DALI_BOOL_VEC (line 607).

A minimal fix is to guard the vector constructor call, matching the pattern already used for DALI_STRING_VEC:

case DALI_INT_VEC: {
  check_arr();
  auto *d = static_cast<const int *>(arg.arr);
  spec.AddArg(name, arg.size > 0 ? std::vector<int>(d, d + arg.size)
                                 : std::vector<int>());
  break;
}

Or equivalently, skip the pointer arithmetic entirely when the size is zero.

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46065411]: BUILD FAILED

mzient added 11 commits March 17, 2026 16:30
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
…guments.

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@mzient mzient force-pushed the c_api2_pipeline_building branch from c493a73 to 681bc56 Compare March 17, 2026 15:31
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46347401]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [46347401]: BUILD PASSED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants