Skip to content

Do not set different cache directories per job#4460

Merged
connorjward merged 4 commits intoreleasefrom
connorjward/unset-per-job-cache-dir
Jul 30, 2025
Merged

Do not set different cache directories per job#4460
connorjward merged 4 commits intoreleasefrom
connorjward/unset-per-job-cache-dir

Conversation

@connorjward
Copy link
Contributor

@connorjward connorjward commented Jul 24, 2025

This should not be necessary as the disk caches are now safer in how they handle race conditions. Using a unified cache directory should improve performance.

This should not be necessary as the disk caches are now safer in how
they handle race conditions. Using a unified cache directory should
improve performance.
@connorjward connorjward marked this pull request as ready for review July 24, 2025 15:25
@connorjward
Copy link
Contributor Author

This appears to work fine. The latest failure is just the macro solve one.

@connorjward connorjward requested a review from JHopeCollins July 30, 2025 11:04
@JHopeCollins
Copy link
Member

  1. What were the changes to the caching?
  2. Do those changes improve the behaviour for race conditions between different firedrake instances, not just between different ranks in the same firedrake instance?

@connorjward
Copy link
Contributor Author

  1. What were the changes to the caching?

9fa112d mostly. Though I have touched caching.py a couple of times since firedrake-run-split-tests went in and they may also have impacted this.

  1. Do those changes improve the behaviour for race conditions between different firedrake instances, not just between different ranks in the same firedrake instance?

I think so. This is all very hard to reason about but I don't think there should be any issues provided that each process/rank can only add cache entries (i.e. I imagine one process running firedrake-clean would totally ruin everything).

With our test suite there is a lot of contention for disk access between processes so I think we are stress testing this quite well. When I originally added this functionality we were reliably getting hangs so the fact that this hasn't been an issue for a couple of runs bodes well. This change also only affects CI so we can be a little more gung ho about things.

@JHopeCollins
Copy link
Member

With our test suite there is a lot of contention for disk access between processes so I think we are stress testing this quite well. When I originally added this functionality we were reliably getting hangs so the fact that this hasn't been an issue for a couple of runs bodes well. This change also only affects CI so we can be a little more gung ho about things.

Ok, this looks ok then. Is sounds like if this change does end up harming things then we'll see if very quickly at least. I'm happy for this to go in once the stokes fix is merged in and CI is passing.

@connorjward connorjward merged commit b948d8f into release Jul 30, 2025
6 of 7 checks passed
@connorjward connorjward deleted the connorjward/unset-per-job-cache-dir branch July 30, 2025 15:03
pbrubeck pushed a commit that referenced this pull request Jul 30, 2025
* Do not set different cache directories per job

This should not be necessary as the disk caches are now safer in how
they handle race conditions. Using a unified cache directory should
improve performance.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants