CacheDir Improvements
Mats Wichmann edited this page Oct 4, 2024 · 14 revisions
Here are some collected thoughts on getting more benefit from the SCons derived-file cache. At the moment, there is no ordering: where things appear on this page should not imply anything about priority, usefulness, ease of implementation, or any other metric. Adding such a ranking would be useful.
- Cache management support: SCons has no cache size limit or file count limit, and does not prune the cache. This should be provided. If SCons itself implemented a (settable) size limit, it would have to implement a policy for when to evict old entries (e.g. LRU). A separate cache management utility would leave the SCons code simpler and might be desirable for performance reasons as well. There is a trick that was contributed to this wiki that adds some support for size limiting: LimitCacheSizeWithProgress
- Automatic caching: would it be desirable to enable caching by default, with an option to disable it entirely or selectively? Some other systems do this; others, like SCons, are opt-in. Users might be surprised that builds consume disk space, but caching by default might give better performance out of the box.
- Set the cache directory from the command line. An issue has been filed for this; see 3618.
- Produce cache logs in a machine-parsable format. See 3696. For an illustration, `ccache` has these command-line options: `--show-stats` to produce a readable statistics output, `--print-stats` to produce a predictable TSV format.
- Enhanced/persistent cache statistics: at the moment, cache requests and hits are collected for the duration of an SCons run, and emitted only if `--cache-debug` is requested. Cache information could be stored in the cache to allow examination of longer-term cache statistics, plus some more general information could be shown, like cache disk/file size, ages, etc.
- Cache compression: should there be an option (or default) to compress cache entries?
- Remote cache storage. This is a highly desired topic. As developers mostly work on individual machines, sharing cache across a team needs a network mode. This comes with a bunch of considerations, and will likely add a bunch of options. A few bullet points, see discussion below for more details.
  - Cache policy: how a local and remote cache interact needs to be defined. There could be a settable policy (SCons uses a "policy" approach for options like `--duplicate` and `--decider`), or the policy could be derived from the combination of flags. This needs to address: is this run local-only (L), remote-only (R), or both (L+R); if both, which to prefer; and does an L-miss plus R-hit cause a write to L?
  - Remote backends: a proposed implementation (3971) is written to a specific back-end style, the one used by bazel-remote, which is a WebDAV style and uses two storage buckets (named `/ac` and `/cas`) because bazel also caches its idea of Actions. A more general approach would be to allow any key-value store, such as Redis/Valkey, Memcached, various cloud options, etc.
  - Remote configuration: there should be a way to set attributes for use of the remote, such as considering it read-only. This should account for local settings and possibly remote ones: to follow the example, we could decide to treat the remote as read-only, or the remote could decide that it is read-only to us. It is not very nice if we think we can update the remote and then fail a build because it won't let us, but we also don't want to keep trying and ignoring the failures, which would waste considerable resources.
  - Network retrieval is "slow" in computer terms. SCons currently does cache retrieval in a blocking fashion, which wouldn't be desirable for a remote cache. How should the asynchronous fetching of (possibly) network-cached files be implemented?
  - Invocation: adding more cache-related command-line options means more typing (or more configuring in wrappers, IDEs, etc.). The suspicion is that developers will use the cache the same way (almost) every time, so there should probably be a way to make the settings more permanent. At the very least, they need to be settable via SetOption.
  - Cleanup: SCons doesn't currently prune a local cache (that's a top-level bullet on this page). It doesn't make much sense for a local SCons to try to manage remote storage that is presumably set up to serve many users.
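
To make the cache management item above more concrete, here is a minimal sketch of what a separate pruning utility could look like. It is not part of SCons: the function name `prune_cache` and the `max_bytes` parameter are hypothetical, and it uses file modification time as a stand-in for "least recently used", which is only an assumption about how recency would be tracked. It also skips a top-level file named `config`, on the assumption that this file holds cache metadata rather than a cached artifact.

```python
from pathlib import Path


def prune_cache(cache_dir, max_bytes):
    """Remove the oldest cache files until the total size fits in max_bytes.

    Hypothetical helper, not an SCons API. Returns the removed paths,
    oldest first.
    """
    root = Path(cache_dir)
    entries = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Assumption: a top-level "config" file is metadata, not an artifact.
        if path.parent == root and path.name == "config":
            continue
        st = path.stat()
        entries.append((st.st_mtime, str(path), st.st_size, path))

    total = sum(size for _, _, size, _ in entries)
    removed = []
    # Sort by modification time (path as a tie-breaker) so the oldest go first.
    for _, _, size, path in sorted(entries, key=lambda e: (e[0], e[1])):
        if total <= max_bytes:
            break
        path.unlink()
        total -= size
        removed.append(path)
    return removed
```

A utility like this could be run periodically or from a wrapper script, e.g. `prune_cache("/path/to/cache", 10 * 1024**3)` for a 10 GiB limit. A real tool would also need to consider concurrent builds that are reading or writing the cache while it runs.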
Existing saved discussions to be pasted
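
As a starting point for the cache-policy discussion, the questions raised in the remote cache item (local-only, remote-only, or both; which to prefer; whether a local miss plus remote hit writes to local; whether the remote is read-only) can be sketched as a small policy object. Everything here is hypothetical: `CachePolicy`, `lookup`, and `push` are illustrative names, not SCons APIs, and the two caches are modeled as plain key-value mappings rather than any real backend.

```python
from dataclasses import dataclass
from typing import MutableMapping, Optional, Tuple


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical policy settings; none of these exist in SCons."""
    use_local: bool = True
    use_remote: bool = False
    prefer_remote: bool = False    # which cache to consult first if both are on
    fill_local: bool = True        # copy a remote hit into the local cache
    remote_readonly: bool = False  # never attempt to write to the remote


def lookup(
    key: str,
    local: MutableMapping[str, bytes],
    remote: MutableMapping[str, bytes],
    policy: CachePolicy,
) -> Tuple[Optional[str], Optional[bytes]]:
    """Return (source, data) for the first cache that has key, else (None, None)."""
    order = []
    if policy.use_local:
        order.append(("local", local))
    if policy.use_remote:
        order.append(("remote", remote))
    if policy.prefer_remote:
        order.reverse()
    for name, cache in order:
        data = cache.get(key)
        if data is not None:
            # L-miss + R-hit: optionally populate the local cache.
            if name == "remote" and policy.use_local and policy.fill_local:
                local[key] = data
            return name, data
    return None, None


def push(key, data, local, remote, policy):
    """Store a newly built entry; return the names of the caches written."""
    written = []
    if policy.use_local:
        local[key] = data
        written.append("local")
    if policy.use_remote and not policy.remote_readonly:
        remote[key] = data
        written.append("remote")
    return written
```

With `CachePolicy(use_local=True, use_remote=True)`, a first `lookup` of a key that exists only remotely reports a remote hit and fills the local cache, so a second lookup is served locally. Treating the remote as read-only up front avoids the "try to write, fail the build" situation described above, though it does not cover the case where the remote itself rejects writes.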