Manage concurrency and dependency of executable content #2413

agahkarakuzu · 2025-11-08T00:27:09Z

Need

In some projects, the output of one notebook serves as the input for another. The default parallel execution can break such dependency chains.
When multiple executable resources are present, parallel execution can overwhelm available computational resources.

Proposed solution

This PR introduces batching and sequencing logic to manage execution order based on the toc definition in myst.yml. The implementation allows users to control both concurrency and dependency ordering of executable documents.

Adding the documentation snippet for clarity:

How to manage the order of execution?

Implicit TOC

If no table of contents (toc) is defined in your myst.yml, all executable sources are run in parallel by default.

Explicit TOC

Managing concurrency without dependency order

By default, executable files are processed concurrently in batches of 5.

You can change this behavior by passing the --execute-concurrency <n> option to your build command, where <n> is the number of executable documents that run simultaneously.
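As a rough illustration of what "batches of 5" means, here is a minimal TypeScript sketch of batched execution. It is not the code in this PR: `runInBatches` and `Task` are hypothetical names, and the sketch assumes each batch finishes completely before the next one starts.

```typescript
// Hypothetical helper: run async tasks in consecutive batches of `concurrency`.
// None of these names come from mystmd; they only illustrate the batching idea.
type Task<T> = () => Promise<T>;

async function runInBatches<T>(tasks: Task<T>[], concurrency = 5): Promise<T[]> {
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error('concurrency must be a positive integer');
  }
  const results: T[] = [];
  for (let start = 0; start < tasks.length; start += concurrency) {
    // Start every task in this batch together...
    const batch = tasks.slice(start, start + concurrency);
    // ...and wait for the whole batch before starting the next one.
    const batchResults = await Promise.all(batch.map((task) => task()));
    results.push(...batchResults);
  }
  return results;
}
```

With this model, a slow notebook holds up its whole batch, which is the trade-off of batching compared to a sliding-window limiter.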

Defining a specific execution order

To define a sequential execution order, use the execution_order field within the toc element. For example:

  toc:
  - file: paper.md
  - file: evidence/figure_1.ipynb
    execution_order: 0
  - file: evidence/figure_2.ipynb
    execution_order: 1
  - file: evidence/figure_3.ipynb
    execution_order: 1

In this example, figure_2.ipynb and figure_3.ipynb will both wait for figure_1.ipynb to finish before being executed concurrently.
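To make the grouping concrete, the following TypeScript sketch turns the toc entries above into sequential stages. Only the `file` and `execution_order` fields come from the example; `TocEntry` and `planStages` are hypothetical names, and because the description does not say how entries without `execution_order` are scheduled, the sketch simply returns them separately.

```typescript
// Shape of a toc entry as used in the example above; `execution_order` is the
// field proposed in this PR. Everything else here is hypothetical.
interface TocEntry {
  file: string;
  execution_order?: number;
}

// Group ordered entries into stages, sorted by ascending `execution_order`.
// Entries without the field are returned separately, because how they are
// scheduled is not specified in the description above.
function planStages(toc: TocEntry[]): { stages: string[][]; unordered: string[] } {
  const byOrder = new Map<number, string[]>();
  const unordered: string[] = [];
  for (const entry of toc) {
    if (entry.execution_order === undefined) {
      unordered.push(entry.file);
      continue;
    }
    const group = byOrder.get(entry.execution_order) ?? [];
    group.push(entry.file);
    byOrder.set(entry.execution_order, group);
  }
  const stages = Array.from(byOrder.entries())
    .sort(([a], [b]) => a - b)
    .map(([, files]) => files);
  return { stages, unordered };
}

const plan = planStages([
  { file: 'paper.md' },
  { file: 'evidence/figure_1.ipynb', execution_order: 0 },
  { file: 'evidence/figure_2.ipynb', execution_order: 1 },
  { file: 'evidence/figure_3.ipynb', execution_order: 1 },
]);
// plan.stages    -> [['evidence/figure_1.ipynb'],
//                    ['evidence/figure_2.ipynb', 'evidence/figure_3.ipynb']]
// plan.unordered -> ['paper.md']
```

Each inner array is one stage: stages run one after another, and the files within a stage can run concurrently.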

Warning

If a notebook that other notebooks depend on fails during execution, the build process will continue by default. To stop the build whenever an error occurs (including for notebooks without dependencies), pass the --strict flag to your build command.
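The difference between the default and strict behavior can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: `runStages` and `RunFile` are hypothetical, and the sketch assumes strict mode stops after the failing stage has finished rather than interrupting notebooks that are already running.

```typescript
// Hypothetical runner: `stages` run one after another, files within a stage
// run together. `strict` mirrors the behavior described for --strict.
type RunFile = (file: string) => Promise<void>;

async function runStages(
  stages: string[][],
  run: RunFile,
  strict = false,
): Promise<string[]> {
  const failed: string[] = [];
  for (const stage of stages) {
    // Wait for every file in the stage, recording failures instead of rejecting.
    const succeeded = await Promise.all(
      stage.map(async (file) => {
        try {
          await run(file);
          return true;
        } catch {
          return false;
        }
      }),
    );
    succeeded.forEach((ok, i) => {
      if (!ok) failed.push(stage[i]);
    });
    // Strict mode (assumption): abort before starting any later stage.
    if (strict && failed.length > 0) {
      throw new Error(`Execution failed for: ${failed.join(', ')}`);
    }
  }
  // Default mode: later stages still ran; the caller only gets a failure list.
  return failed;
}
```

In the default mode, a dependent notebook may therefore run against missing or stale outputs of a failed upstream notebook, which is why the warning above matters.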

Tests

I tested this locally across several builds and it appears to function correctly. The current implementation provides a straightforward mechanism for managing execution order. It is not a full-fledged workflow-like approach, but I believe it is still a meaningful improvement for build control.

@fwkoch and I are not sure if toc is the right place to define this logic, but it serves as a reasonable starting point for now.

Relates to: #1794, #2055

changeset-bot · 2025-11-08T00:27:13Z

⚠️ No Changeset found

Latest commit: 21af2c8

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


agoose77 · 2025-11-08T17:15:35Z

@agahkarakuzu it's funny that you looked at this — I was just thinking about this over the JupyterCon flights.

I like the concurrency limit, and the idea of a hard-coded ordering. Here are some of my unstructured thoughts:

Relative vs Absolute Ordering

When I was thinking about this, I considered the idea of relative ordering using before and after in addition to absolute ordering (at the ToC level). I wonder whether it makes sense to implement ordering in a relative sense, because at the notebook level we are often thinking "this needs to execute after XXX", rather than "this needs to execute fifth in the order".

There are pros/cons to both approaches, and I'm easily convinced either way.

Render vs Execution Ordering

As it stands, this PR changes the rendering ordering. This is a problem if people use continuous numbering, which will break with this PR if the ordering disagrees.

I think we need to introduce a higher level abstraction for execution (an ExecutionOrchestrator) that exposes an async locking interface. This would let us define the execution strategy (such as batching, and batch order), but keep it invisible in the execution transform. Something like

async function executionTransform(mdast, vfile, frontmatter, session) {
   // Wait for our turn (the function must be async to use await)
   await session.executionOrchestrator.wantsExecution(frontmatter.path);
}

where this promise is resolved by the orchestrator once the previous notebooks have executed.
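A minimal TypeScript sketch of such an orchestrator might look like the following. Only the names `ExecutionOrchestrator` and `wantsExecution` appear in the snippet above; the constructor, the `finishedExecution` method, and the strictly sequential strategy are assumptions made purely for illustration.

```typescript
// Hypothetical sketch of an orchestrator with an async "wait for my turn" API.
// The constructor, finishedExecution, and the one-at-a-time strategy are
// assumptions; a real version could release whole batches instead.
class ExecutionOrchestrator {
  private order: string[];
  private next = 0;
  private waiters = new Map<string, () => void>();

  constructor(order: string[]) {
    this.order = order;
  }

  // Resolves once every path earlier in `order` has finished executing.
  wantsExecution(path: string): Promise<void> {
    const index = this.order.indexOf(path);
    if (index < 0) return Promise.reject(new Error(`Unknown path: ${path}`));
    if (index <= this.next) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.set(path, resolve);
    });
  }

  // Called after a path has executed; releases the next path in line.
  finishedExecution(path: string): void {
    if (this.order[this.next] !== path) {
      throw new Error(`${path} finished out of turn`);
    }
    this.next += 1;
    const nextPath = this.order[this.next];
    if (nextPath === undefined) return;
    const release = this.waiters.get(nextPath);
    if (release) {
      this.waiters.delete(nextPath);
      release();
    }
  }
}
```

The point of this shape is that the execution transform only awaits its turn and reports completion, while the strategy (sequential, batched, or dependency-based) stays inside the orchestrator.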

I'd love to work with you on this, and welcome any other thoughts about things like ordering from the team!

fwkoch · 2025-11-08T17:54:48Z

@agoose77 - regarding your second point, I'm less worried about the ordering - this concurrency management only applies to the first processMdast phase of project processing. This function operates on individual files and requires no shared state or assumptions about ordering. Enumeration happens after this (in selectPageReferenceStates, specifically order is maintained by this map; https://github.com/jupyter-book/mystmd/blob/main/packages/myst-cli/src/process/site.ts#L340).

Now, I haven't actually tested this - we should probably have some end-to-end tests that combine (1) batched execution and (2) page enumeration.

(@agoose77 - I think your code snippet is how we would need to do this once we streamline into a single "better" processing pipeline, as we prototyped in #1699)

fwkoch · 2025-11-08T18:19:37Z

Regarding how we define this ordering - that question feels fuzzier and more flexible.

Where should it go (toc vs. page vs. elsewhere in the project frontmatter) and how should it be defined (number vs. before/after vs. something else...)? Currently it's in the toc, which is nice and centralized. But not all projects necessarily have an explicit toc. We could also define it on the individual pages - this makes sense for named before/after dependencies (we probably just need after I think?). Spreading ordering-by-number across different pages feels not-so-good.

We probably don't want to allow multiple ways to do this - then we get into resolving conflicts... 😕

My vague preference is individual pages keep track of their execution_dependencies and we combine that with --execution-concurrency, no changes to the toc. But my opinions are not very strong.

agoose77 · 2025-11-09T11:47:25Z

> @agoose77 - regarding your second point, I'm less worried about the ordering - this concurrency management only applies to the first processMdast phase of project processing. This function operates on individual files and requires no shared state or assumptions about ordering. Enumeration happens after this (in selectPageReferenceStates, specifically order is maintained by this map; https://github.com/jupyter-book/mystmd/blob/main/packages/myst-cli/src/process/site.ts#L340).
>
> Now, I haven't actually tested this - we should probably have some end-to-end tests that combine (1) batched execution and (2) page enumeration.
>
> (@agoose77 - I think your code snippet is how we would need to do this once we streamline into a single "better" processing pipeline, as we prototyped in #1699)

Ah, of course. I didn't test that yesterday when I was musing about this!

agahkarakuzu · 2025-11-09T19:44:51Z

@fwkoch @agoose77 thank you so much for looking into this! Sorry for the late response, I've been navigating the insane air travel conditions in the States.

> Currently it's in the toc, which is nice and centralized. But not all projects necessarily have an explicit toc.

Indeed, it does not take effect when the TOC is implicit.

Switching to a DAG-based model with explicit dependencies (instead of numerical expressions, dependencies can be defined as notebook or page names) would bring more sophistication. Maybe using p-graph would be a good option for that, moving a bit closer to the orchestration approach.

Then again, I was not really sure about defining these at the page level from the get-go. If you see a better middle ground between impeccable orchestration and this simple implementation, I'm happy to take a stab at it.
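For reference, a dependency-based model does not need much machinery. The sketch below is a simple layered topological sort written from scratch (it does not use p-graph, and `Dependencies` and `executionLevels` are hypothetical names): each page lists the pages it must run after, and the result is a list of levels whose members could run concurrently.

```typescript
// Hypothetical: map each page to the pages it must run after.
type Dependencies = Record<string, string[]>;

// Layered topological sort: each returned level only depends on earlier
// levels, so the pages inside one level could run concurrently.
function executionLevels(deps: Dependencies): string[][] {
  // Collect every page, including ones only mentioned as a dependency.
  const pending = new Set<string>(Object.keys(deps));
  Object.keys(deps).forEach((page) => deps[page].forEach((dep) => pending.add(dep)));

  const done = new Set<string>();
  const levels: string[][] = [];
  while (pending.size > 0) {
    // A page is ready once everything it must run after is done.
    const ready = Array.from(pending).filter((page) =>
      (deps[page] ?? []).every((dep) => done.has(dep)),
    );
    if (ready.length === 0) {
      throw new Error(`Dependency cycle among: ${Array.from(pending).join(', ')}`);
    }
    ready.forEach((page) => {
      pending.delete(page);
      done.add(page);
    });
    levels.push(ready);
  }
  return levels;
}

const levels = executionLevels({
  'evidence/figure_2.ipynb': ['evidence/figure_1.ipynb'],
  'evidence/figure_3.ipynb': ['evidence/figure_1.ipynb'],
});
// levels -> [['evidence/figure_1.ipynb'],
//            ['evidence/figure_2.ipynb', 'evidence/figure_3.ipynb']]
```

This reproduces the same two stages as the numeric example in the PR description, but a misconfiguration such as a cycle becomes an explicit error instead of a silent ordering problem.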

bsipocz · 2025-11-10T07:42:15Z

Could we separate out the execution-concurrency part, about which I don't see any objections or questions, from execution_order?

(Having control over the parallelism would solve #1831, avoid potentially hitting resource limits, etc.)

agahkarakuzu · 2025-11-10T20:28:05Z

@bsipocz by separating out, do you mean addressing it in another PR?

agoose77 · 2025-11-10T22:58:53Z

@agahkarakuzu yes, I believe so.

Let's do that, if it's not too much work. It's a bit more effort overall, but it makes the changes easier to review and avoids blocking the useful fix!

agahkarakuzu · 2025-11-11T03:33:41Z

@agoose77 no worries, sent it separately at #2428.

bsipocz · 2025-11-12T12:12:42Z

> @bsipocz by separating out, do you mean addressing it in another PR?

I'm sorry for getting back to this just now, but Angus has answered with exactly what I meant. Overall, it's very helpful to separate new features, fixes, etc. into separate PRs. You can expect a much quicker turnaround, because such contributions are cleaner and easier to review, and they don't block each other from getting in when only some parts are under discussion.

agahkarakuzu added 3 commits November 7, 2025 15:18

[ENH] Add concurrency limiting option

09151eb

[ENH] Toc & plim based execution order

4dfecec

Add documentation about concurrency and dependency

c0f333d

github-actions bot added the documentation Improvements or additions to documentation label Nov 8, 2025

fwkoch added enhancement New feature or request and removed documentation Improvements or additions to documentation labels Nov 8, 2025

github-actions bot added the documentation Improvements or additions to documentation label Nov 8, 2025

fwkoch mentioned this pull request Nov 8, 2025

🤖 Stop over-eager documentation labeling #2417

Merged

agoose77 removed the documentation Improvements or additions to documentation label Nov 9, 2025

agoose77 added 3 commits November 9, 2025 12:20

Merge remote-tracking branch 'origin/main' into concurndepend

4dc1949

chore: run linter

36c1b50

fix: batch

21af2c8

agahkarakuzu mentioned this pull request Nov 11, 2025

🌍↔️🐍 Limit the number of simultaneous executions #2428

Open

jupyter-book-pr-triage-bot bot added this to PR triage (experimental) Nov 29, 2025

davidorme mentioned this pull request Dec 1, 2025

Split the GIS practical up into different pages ImperialCollegeLondon/living_planet_eco_evo_data#2

Open
