datadeps: Implement an optimizing scheduler #592

jpsamaroo · 2025-03-31T19:23:48Z

At its core, this PR implements a numerical-optimizer-based scheduler for Datadeps. It uses JuMP to implement the scheduler designed by @pszufe and documented at https://github.com/pszufe/DagScheduler. The idea is to aggressively optimize a Datadeps DAG ahead of time, using all available information. By its nature, this scheduler can make nearly optimal scheduling decisions. This differs from our existing JIT-style schedulers, which don't optimize over the entire DAG and only look at the few tasks currently in front of them.
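To illustrate the difference between whole-DAG and task-at-a-time scheduling, here is a minimal Python sketch. It is not the JuMP formulation from this PR or the DagScheduler model: it swaps in brute-force enumeration for the numerical optimizer, and the task names, cost table, and flat `transfer` cost are all made-up values for illustration.

```python
from itertools import product

def finish_times(tasks, deps, cost, transfer, assignment):
    """Simulate the tasks in the given (topological) order; return task -> finish time."""
    proc_free = {}  # processor -> time it becomes idle
    finish = {}
    for t in tasks:
        p = assignment[t]
        ready = 0.0
        for d in deps.get(t, ()):
            # Pay the transfer cost only when the producer ran on another processor.
            move = transfer if assignment[d] != p else 0.0
            ready = max(ready, finish[d] + move)
        start = max(ready, proc_free.get(p, 0.0))
        finish[t] = start + cost[(t, p)]
        proc_free[p] = finish[t]
    return finish

def makespan(tasks, deps, cost, transfer, assignment):
    return max(finish_times(tasks, deps, cost, transfer, assignment).values())

def aot_schedule(tasks, deps, procs, cost, transfer):
    """Ahead-of-time: search every assignment of the whole DAG (exponential, toy only)."""
    best = None
    for choice in product(procs, repeat=len(tasks)):
        assignment = dict(zip(tasks, choice))
        span = makespan(tasks, deps, cost, transfer, assignment)
        if best is None or span < best[0]:
            best = (span, assignment)
    return best

def greedy_schedule(tasks, deps, procs, cost, transfer):
    """JIT-style: each task picks the processor where it alone finishes earliest."""
    assignment = {}
    for i, t in enumerate(tasks):
        prefix = tasks[: i + 1]
        assignment[t] = min(
            procs,
            key=lambda p: finish_times(
                prefix, deps, cost, transfer, {**assignment, t: p}
            )[t],
        )
    return makespan(tasks, deps, cost, transfer, assignment), assignment

# Hypothetical two-task chain: b consumes the output of a.
tasks = ["a", "b"]
deps = {"b": ["a"]}
procs = ["cpu", "gpu"]
cost = {("a", "cpu"): 2.0, ("a", "gpu"): 3.0, ("b", "cpu"): 10.0, ("b", "gpu"): 1.0}
transfer = 5.0

best_span, best_assignment = aot_schedule(tasks, deps, procs, cost, transfer)
greedy_span, greedy_assignment = greedy_schedule(tasks, deps, procs, cost, transfer)
print(best_span, best_assignment)
print(greedy_span, greedy_assignment)
```

In this toy case the greedy pass places `a` on the CPU because that is locally fastest, then has to pay the transfer to run `b` on the GPU. The whole-DAG search accepts a slower `a` on the GPU to avoid the transfer and finishes sooner overall.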

To make this scheduler work, some additional improvements were made:

A new library, MetricsTracker.jl, was implemented to make it easy to declaratively configure which metrics to collect during task scheduling and execution. It also provides mechanisms to efficiently search through collected metric values for those matching a certain combination of target keys, such as the selected processor and the task signature. The scheduler uses this to look up information relevant to each task, like estimated execution time and transfer costs.
Schedules generated by Datadeps are now cached and reused, when possible, within the same session. Submitted DAGs are compared for similarity, and if a match is found, the previously-generated schedule is reused. This allows potentially expensive scheduling operations to be amortized when Datadeps operations are being called repeatedly.
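The schedule reuse described above can be sketched as a cache keyed by a structural signature of the DAG. This Python sketch is only an illustration under assumptions: the real similarity comparison in Datadeps is not spelled out here, and `dag_signature`, `ScheduleCache`, the task dictionary layout, and the `round_robin` stand-in scheduler are all hypothetical names.

```python
def dag_signature(dag):
    """Hashable structural key: keeps function names, argument types, and edges.

    Assumption: two DAGs count as "similar" when these three things match,
    regardless of the actual data values passed in.
    """
    return tuple(
        (task["func"], tuple(task["arg_types"]), tuple(task["deps"]))
        for task in dag
    )

class ScheduleCache:
    """Runs the (possibly expensive) scheduler once per distinct DAG structure."""

    def __init__(self, scheduler):
        self._scheduler = scheduler
        self._cache = {}
        self.misses = 0

    def schedule(self, dag):
        key = dag_signature(dag)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._scheduler(dag)
        return self._cache[key]

def round_robin(dag, procs=("cpu", "gpu")):
    # Stand-in for an expensive optimizer: task i goes to processor i mod len(procs).
    return [procs[i % len(procs)] for i in range(len(dag))]

def make_dag(arg_type):
    # Hypothetical two-task DAG; "deps" holds indices of upstream tasks.
    return [
        {"func": "fill", "arg_types": [arg_type], "deps": []},
        {"func": "sum", "arg_types": [arg_type], "deps": [0]},
    ]

cache = ScheduleCache(round_robin)
first = cache.schedule(make_dag("Matrix{Float64}"))
second = cache.schedule(make_dag("Matrix{Float64}"))  # same structure: cache hit
third = cache.schedule(make_dag("Matrix{Float32}"))   # different arg type: miss
print(cache.misses)
```

Note that this sketch never invalidates an entry, so a cached schedule keeps reflecting the metrics available when it was first computed. That is the stale-metrics question listed in the Todo.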

Todo:

Think about a solution for stale metrics when reusing schedules
Add tests
Add docs

jpsamaroo added 5 commits December 11, 2024 15:56

MetricsTracker: Add metrics tracking package

f1617f4

Sch: Use MetricsTracker

db5eecd

datadeps: AOT scheduling, add JuMP scheduler

c52b7d6

datadeps: Add logic to compare similar DAGs

e33216c

Sch: Bifurcate signature metrics on processor

2888fa4

jpsamaroo added the needs tests, needs docs, and datadeps labels Mar 31, 2025

jpsamaroo marked this pull request as draft March 31, 2025 19:23

jpsamaroo and others added 4 commits March 31, 2025 12:41

datadeps: Support within-DAG deps for AOT scheduler

82df8f5

datadeps: Hide argument type instability

a10e23b

submission: Add asserts to detect self-syncdep/unscheduled task

cdb3253

datadeps: Remove launch_wait

256057d

jpsamaroo force-pushed the jps/datadeps-opt-sched branch from c85eed4 to 256057d Compare March 31, 2025 19:41
