[BUG] Benchmarks workflow failing on every PR and master — broken imports after Monte Carlo transport refactoring #3443

@sakirr05

Description

Summary

The benchmarks / build (pull_request_target) GitHub Actions workflow has been failing on every PR and every master push since December 28, 2025 (~2 months). This means benchmark comparisons are completely non-functional and every PR shows a failing check regardless of what the PR actually changes (including trivial changes like adding an ORCID ID in CITATION.cff).

Root Cause

The benchmark Python files were not updated after the major Monte Carlo transport refactoring introduced in PRs #3423 and #3427. The benchmark suite imports symbols that no longer exist or have changed signatures in the current codebase.

Failing Workflow

  • Workflow: benchmarks / build (pull_request_target)
  • Fails after: ~7 minutes
  • Affects: ALL PRs + master pushes
  • First broken: Dec 28, 2025 (post-release 2025.12.28)

Why It's Hard to Spot

ASV (airspeed velocity) exits with code 0 even when benchmark discovery fails due to ImportErrors. As a result, Step 9 "Run benchmarks" appears green in the workflow logs, but benchmarks.json is never created because no benchmarks were discovered. Step 10 "Generate Graphs and HTML" then crashes with this misleading error:

Benchmark list file .asv/results/benchmarks.json missing!
Use `asv run --bench just-discover` to regenerate benchmarks.json

The real errors are ImportErrors happening silently inside ASV's discovery phase.
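The swallowed-ImportError behavior can be reproduced in isolation. Below is a minimal stand-alone simulation of the failure mode (this is illustrative code written for this issue, not ASV's actual discovery implementation):

```python
import importlib
import os
import sys
import tempfile

# A benchmark file whose top-level import fails, like the broken TARDIS
# benchmark modules described in this issue:
BROKEN_BENCHMARK = (
    "from nonexistent_package_xyz import something\n"
    "def time_main_loop():\n"
    "    pass\n"
)

discovered = []
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "bench_mod.py"), "w") as fh:
        fh.write(BROKEN_BENCHMARK)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module("bench_mod")
        discovered = [name for name in dir(mod) if name.startswith("time_")]
    except ImportError:
        # Discovery that swallows the error reports zero benchmarks instead
        # of failing loudly -- the process exit code stays 0.
        pass
    finally:
        sys.path.remove(tmp)

print(discovered)  # [] -- time_main_loop silently vanishes from discovery
```

The `time_main_loop` function is never registered, yet nothing in the output hints at the ImportError. This is exactly why the workflow only fails later, at the graph-generation step.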

The 3 Broken Files and Errors

Error 1 — benchmarks/transport_montecarlo_single_packet_loop.py

# Line 8 — BROKEN IMPORT
from tardis.transport.montecarlo import single_packet_loop
# ImportError: cannot import name 'single_packet_loop'

single_packet_loop was replaced by packet_propagation in tardis.transport.montecarlo.modes.classic.packet_propagation. Function signature also changed — now takes separate estimators_bulk and estimators_line instead of a combined object.

Error 2 — benchmarks/transport_montecarlo_main_loop.py

# Line 8 — BREAKS ON OLDER COMMITS
from tardis.transport.montecarlo.modes.classic.montecarlo_transport import montecarlo_transport
# ModuleNotFoundError: No module named 'tardis.transport.montecarlo.modes'

The modes subpackage didn't exist in commits before the refactoring. Since ASV tests against the last 4 commits, older commits fail discovery. Additionally, montecarlo_transport no longer accepts radfield_mc_estimators.
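One way to keep discovery working across the whole commit range ASV tests is a compatibility import with a fallback. The pattern is sketched below with stdlib names, since the exact pre-refactor TARDIS import path is not reproduced in this issue; the real new/old paths would be substituted in when fixing the benchmark file:

```python
# Generic try/except import fallback. In the benchmark file, the "new" branch
# would import from tardis.transport.montecarlo.modes.classic and the "old"
# branch from the pre-refactor location.
try:
    from math import tau as two_pi  # "new" name (Python >= 3.6)
except ImportError:
    from math import pi             # "old" fallback
    two_pi = 2 * pi

print(two_pi)
```

With this shape, discovery succeeds on both pre- and post-refactor commits; whether ASV should instead simply skip older commits is a judgment call for maintainers.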

Error 3 — benchmarks/benchmark_base.py

# Lines 22-26 — BROKEN SIGNATURES
from tardis.transport.montecarlo.estimators import (
    init_estimators_bulk,
    init_estimators_line,
    init_estimators_continuum,
)
# ImportError: cannot import name 'init_estimators_bulk'

All three factory functions changed signatures:

  • Old: init_estimators_bulk(mean_intensity_total=..., mean_frequency=...)
  • New: init_estimators_bulk(n_cells: int) -> EstimatorsBulk
  • Old: init_estimators_line(keyword args)
  • New: init_estimators_line(n_lines_by_n_cells_tuple: tuple[int, int])
  • Old: init_estimators_continuum(keyword args)
  • New: init_estimators_continuum(n_levels_bf_species_by_n_cells_tuple, n_cells)
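As a sanity check of the new call shapes, here is a minimal stand-in exercising the new-style calls. These are hypothetical stubs written for this issue to illustrate the signature change, not the real TARDIS factories:

```python
from dataclasses import dataclass


@dataclass
class EstimatorsBulk:
    # Per-cell accumulators; field names follow the old keyword arguments.
    mean_intensity_total: list
    mean_frequency: list


def init_estimators_bulk(n_cells: int) -> EstimatorsBulk:
    # New-style factory: a single size argument instead of pre-built arrays.
    return EstimatorsBulk([0.0] * n_cells, [0.0] * n_cells)


def init_estimators_line(n_lines_by_n_cells_tuple: tuple) -> list:
    # New-style factory: a (n_lines, n_cells) shape tuple.
    n_lines, n_cells = n_lines_by_n_cells_tuple
    return [[0.0] * n_cells for _ in range(n_lines)]


bulk = init_estimators_bulk(20)
line = init_estimators_line((1000, 20))
print(len(bulk.mean_intensity_total), len(line), len(line[0]))  # 20 1000 20
```

The benchmark fix in benchmark_base.py amounts to replacing the old keyword-argument calls with size-based calls of this shape, using the real sizes from the benchmark setup.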

Timeline

  • Dec 22, 2025: ✅ Last successful benchmark run
  • Dec 28, 2025: ❌ First failure (post-release 2025.12.28)
  • Feb 10, 2026: ❌ PR #3423 "Excise the estimators" merged; benchmarks drift further
  • Feb 17, 2026: ❌ PR #3427 "New Monte Carlo transport modes" merged; benchmarks drift further
  • Feb 18, 2026: ❌ Current state: all runs fail

What Needs to Be Fixed

Part A — Fix benchmark files (needs MC transport expertise)

  • benchmark_base.py — update estimator factory call signatures
  • transport_montecarlo_single_packet_loop.py — update import + function call args
  • transport_montecarlo_main_loop.py — remove radfield_mc_estimators from call

I need confirmation from someone with Monte Carlo transport knowledge that packet_propagation is the correct replacement for single_packet_loop and that the benchmark logic still measures the right thing before I raise a fix PR.

Part B — Add path filtering to benchmarks.yml (easy win, can be done independently)

Metadata-only PRs (ORCID, CITATION, docs) should not trigger benchmarks:

on:
  pull_request_target:
    paths:
      - 'tardis/**'
      - 'benchmarks/**'
      - 'setup.cfg'
      - 'setup.py'
      - 'pyproject.toml'
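Since master pushes also trigger this workflow, the same filter would presumably need mirroring on the push trigger as well. A sketch, assuming the workflow currently runs on pushes to master (adjust branch names to the repository's actual configuration):

```yaml
on:
  push:
    branches:
      - master
    paths:
      - 'tardis/**'
      - 'benchmarks/**'
      - 'setup.cfg'
      - 'setup.py'
      - 'pyproject.toml'
```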

Question for Maintainers

  1. Can someone confirm packet_propagation is the correct replacement for single_packet_loop in the benchmark context?
