`AssertionError: no app_id collisions expected` when scheduling JobGroup with multiple executables (Local Executor) #403

@bzantium

Describe the bug

When a pipeline runs locally (e.g., executor="none") and a single step contains a JobGroup or multiple bundled tasks, execution fails with `AssertionError: no app_id collisions expected`.

This occurs because the local scheduler in nemo_run (built on torchx) does not correctly handle the dryrun_info for multiple executables within a single group. Instead of keeping separate dry-run information for each executable, it appears to overwrite that information during iteration, so the underlying torchx scheduler detects an app ID collision when it attempts to schedule the tasks.
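To illustrate the suspected mechanism, here is a minimal toy model. It does not use nemo_run or torchx code: `ToyDryRunInfo`, `ToyLocalScheduler`, and `schedule_group_buggy` are hypothetical names, and the overwrite pattern is an assumption based on the behavior described above, not a confirmed reading of the source.

```python
import uuid
from dataclasses import dataclass


@dataclass
class ToyDryRunInfo:
    """Hypothetical stand-in for a scheduler's dry-run result."""
    app_id: str
    command: str


class ToyLocalScheduler:
    """Hypothetical scheduler that rejects duplicate app IDs, like the assertion in the traceback."""

    def __init__(self):
        self._apps = {}

    def submit_dryrun(self, name, command):
        # A fresh uuid4-based suffix is generated once per dry run.
        return ToyDryRunInfo(app_id=f"{name}-{uuid.uuid4().hex[:8]}", command=command)

    def schedule(self, dryrun_info):
        assert dryrun_info.app_id not in self._apps, "no app_id collisions expected"
        self._apps[dryrun_info.app_id] = dryrun_info.command
        return dryrun_info.app_id


def schedule_group_buggy(scheduler, name, commands):
    # Assumed failure mode: a single dry-run object is reused for the whole
    # group, and only its command is overwritten on each iteration.
    info = scheduler.submit_dryrun(name, commands[0])
    app_ids = []
    for command in commands:
        info.command = command  # the app_id stays the same
        app_ids.append(scheduler.schedule(info))  # the second call collides
    return app_ids
```

In this model, scheduling a group of two commands succeeds for the first command and raises `AssertionError` for the second, because both calls present the same `app_id`.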

Steps/Code to reproduce bug

  1. Define a simple experiment with multiple tasks bundled together.
  2. Configure the run to use the local executor.
  3. Execute the experiment.

Here is a minimal example snippet:

import nemo_run as run

# Define two simple tasks
task1 = run.Script(inline='echo "Hello Task 1"')
task2 = run.Script(inline='echo "Hello Task 2"')

# Create an experiment
with run.Experiment("local_collision_test") as exp:
    # Add tasks as a bundle/group (this creates a JobGroup internally)
    exp.add([task1, task2], name="my_job_group")

    # Run locally
    # This triggers the AssertionError
    exp.run()

Expected behavior

The local executor should be able to accept a list of tasks (a JobGroup), generate unique App IDs for each task, and execute them sequentially or in parallel without crashing due to ID collisions.
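As a rough sketch of the behavior I would expect, the snippet below gives each task in a group its own ID and runs the tasks sequentially. It uses only the Python standard library; `run_group` and the ID format are hypothetical and are not part of nemo_run or torchx.

```python
import subprocess
import sys
import uuid


def run_group(name, commands):
    """Run each command sequentially under its own unique app ID (hypothetical helper)."""
    results = {}
    for index, argv in enumerate(commands):
        # A new ID is generated per task, so no two tasks share an app ID.
        app_id = f"{name}-{index}-{uuid.uuid4().hex[:8]}"
        assert app_id not in results, "no app_id collisions expected"
        completed = subprocess.run(argv, capture_output=True, text=True, check=True)
        results[app_id] = completed.stdout.strip()
    return results


# Two tasks equivalent to the echo scripts in the reproduction above.
group = [
    [sys.executable, "-c", "print('Hello Task 1')"],
    [sys.executable, "-c", "print('Hello Task 2')"],
]
results = run_group("my_job_group", group)
```

Here `results` ends up with two entries, one per task, each keyed by a distinct ID.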

Additional context

Traceback:

  File "/.../site-packages/nemo_run/run/torchx_backend/schedulers/local.py", line 106, in schedule
    app_id = super().schedule(dryrun_info=dryrun_info)
  File "/.../site-packages/torchx/schedulers/local_scheduler.py", line 791, in schedule
    app_id not in self._apps
AssertionError: no app_id collisions expected since uuid4 suffix is used
