Skip to content

Comments

enh: reduce per-task overhead#858

Open
satra wants to merge 1 commit intomainfrom
enh/profiling
Open

enh: reduce per-task overhead#858
satra wants to merge 1 commit intomainfrom
enh/profiling

Conversation

@satra
Copy link
Contributor

@satra satra commented Feb 21, 2026

closes #850

Five targeted optimisations identified via cProfile on the audio-file benchmark from issue #850:

  1. Etelemetry sentinel (job.py, submitter.py): Replace the None initial value of Job._etelemetry_version_data with a distinct _ETELEMETRY_UNCHECKED sentinel. Previously a failed network check returned None, keeping the is None guard True forever and re-issuing the HTTP request on every task (~93 ms each).

  2. Function bytes cache (hash.py): Add a module-level _function_bytes_cache keyed by (module, qualname, mtime_ns). inspect.getsource() + ast.parse() now runs at most once per function per session; subsequent calls cost only a single os.stat().

  3. Skip hash-change check for non-FileSet tasks (job.py): Call TypeParser.contains_type(FileSet, ...) on each input field; if none match, skip the expensive full hash recomputation in _check_for_hash_changes(). Scalar/pure-Python values cannot mutate under Pydra.

  4. In-memory result cache (job.py): Store the completed Result on self._cached_result at the end of run() so same-process callers (e.g. Submitter.__call__ with DebugWorker) do not need to deserialise it back from disk. The field is excluded from __getstate__ so subprocess/Slurm workers still use the disk path.

  5. Once-per-location PersistentCache.clean_up() (hash.py): Track which cache locations have already been scanned in a class-level set (_session_cleanups_done). The O(n) iterdir() + stat() loop no longer runs after every task. path.unlink(missing_ok=True) makes concurrent cleanup by multiple Slurm nodes on shared NFS safe.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Summary

Checklist

  • I have added tests to cover my changes (if necessary)
  • I have updated documentation (if necessary)

Five targeted optimisations identified via cProfile on the audio-file
benchmark from issue #850:

1. **Etelemetry sentinel** (`job.py`, `submitter.py`): Replace the
   `None` initial value of `Job._etelemetry_version_data` with a
   distinct `_ETELEMETRY_UNCHECKED` sentinel.  Previously a failed
   network check returned `None`, keeping the `is None` guard True
   forever and re-issuing the HTTP request on every task (~93 ms each).

2. **Function bytes cache** (`hash.py`): Add a module-level
   `_function_bytes_cache` keyed by `(module, qualname, mtime_ns)`.
   `inspect.getsource()` + `ast.parse()` now runs at most once per
   function per session; subsequent calls cost only a single `os.stat()`.

3. **Skip hash-change check for non-FileSet tasks** (`job.py`): Call
   `TypeParser.contains_type(FileSet, ...)` on each input field; if
   none match, skip the expensive full hash recomputation in
   `_check_for_hash_changes()`.  Scalar/pure-Python values cannot
   mutate under Pydra.

4. **In-memory result cache** (`job.py`): Store the completed `Result`
   on `self._cached_result` at the end of `run()` so same-process
   callers (e.g. `Submitter.__call__` with DebugWorker) do not need to
   deserialise it back from disk.  The field is excluded from
   `__getstate__` so subprocess/Slurm workers still use the disk path.

5. **Once-per-location PersistentCache.clean_up()** (`hash.py`): Track
   which cache locations have already been scanned in a class-level set
   (`_session_cleanups_done`).  The O(n) `iterdir()` + `stat()` loop no
   longer runs after every task.  `path.unlink(missing_ok=True)` makes
   concurrent cleanup by multiple Slurm nodes on shared NFS safe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov
Copy link

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 71.42857% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.58%. Comparing base (876be56) to head (cab9af2).

Files with missing lines Patch % Lines
pydra/utils/hash.py 67.34% 9 Missing and 7 partials ⚠️
pydra/engine/job.py 84.61% 2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (71.42%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

❗ There is a different number of reports uploaded between BASE (876be56) and HEAD (cab9af2). Click for more details.

HEAD has 46 uploads less than BASE
Flag BASE (876be56) HEAD (cab9af2)
52 6
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #858       +/-   ##
===========================================
- Coverage   88.35%   34.58%   -53.77%     
===========================================
  Files          88       66       -22     
  Lines       18198    11316     -6882     
  Branches     3565     1504     -2061     
===========================================
- Hits        16079     3914    -12165     
- Misses       1737     6975     +5238     
- Partials      382      427       +45     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

reduce pydra overhead

1 participant