Skip to content

ENH Cache mime type info to reduce memory usage#702

Closed
lozcalver wants to merge 1 commit intosilverstripe:3from
lozcalver:performance
Closed

ENH Cache mime type info to reduce memory usage#702
lozcalver wants to merge 1 commit intosilverstripe:3from
lozcalver:performance

Conversation

@lozcalver
Copy link
Copy Markdown
Member

@lozcalver lozcalver commented Jan 26, 2026

Description

This is a possible solution to #700 for discussion. The cache key includes the file hash, and I think it’s reasonable to expect that if someone updates the file in the filesystem manually, they’ll also update the hash in the database as there’s a bunch of logic throughout core that depends on that.

Manual testing steps

XHProf, this snippet should be a reasonable benchmark:

<% loop $List('SilverStripe\Assets\Image').Limit(50) %>
	{$ScaleWidth('500')}<br />
<% end_loop %>

You should see a difference in wall time, memory usage and the number of function calls.

Issues

Pull request checklist

  • The target branch is correct
  • All commits are relevant to the purpose of the PR (e.g. no debug statements, unrelated refactoring, or arbitrary linting)
    • Small amounts of additional linting are usually okay, but if it makes it hard to concentrate on the relevant changes, ask for the unrelated changes to be reverted, and submitted as a separate PR.
  • The commit messages follow our commit message guidelines
  • The PR follows our contribution guidelines
  • Code changes follow our coding conventions
  • This change is covered with tests (or tests aren't necessary for this change)
  • Any relevant User Help/Developer documentation is updated; for impactful changes, information is added to the changelog for the intended release
  • CI is green

@GuySartorelli
Copy link
Copy Markdown
Member

Please do split these into separate PRs - I see no reason to delay merging the straight forward enhancement while we discuss the caching of the mimetype.

With the CMS 6.2 beta release just a few weeks away, it'd be nice to get the easy win in even if it takes a while to discuss the other one.

@lozcalver lozcalver changed the title Performance improvements ENH Cache mime type info to reduce memory usage Jan 27, 2026
Copy link
Copy Markdown
Member

@GuySartorelli GuySartorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting that out.

Cache type

My immediate thought was that the database feels like a more suitable place for this - we already, but I immediately discounted that when I realised that won't work for some variants (e.g. $file->Convert('webp') will create a variant with a different mimetype than the file stored in the db). So this is probably the correct type of cache for this purpose.

Actual improvement

Do you have any benchmarks for this approach? In the issue you showed the difference between the current implementation of getIsImage() and simply returning true. This change while obviously more appropriate also still has more going on than that, so I'd like to see there is a speed improvement that outweighs the increased memory usage.

Your own questions

You posed these questions in the issue:

(A) how big might that cache end up being?
(B) presumably we’d need to flush it whenever the file hash changed (or at the very least include the file hash in the key)?

Do you have answers to these?

For B I see you've included the hash in the key - though of course that approach means the cache size will grow whenever that hash changes, so the answer to A seems important here.

Comment on lines +1290 to +1292
/** @var CacheInterface $cache */
$cache = Injector::inst()->get(CacheInterface::class . '.FlysystemAssetStore');
$cache->clear();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the cache adapters not get flushed on their own? (I genuinely can't remember)

{
// Flysystem calculates the mime type from the file contents, which is memory intensive, so we cache the result
$cache = $this->getCache();
$key = md5($filename . $hash . $variant);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
$key = md5($filename . $hash . $variant);
$key = 'mimetype__' . md5($filename . $hash . $variant);

This is a cache for the whole store - we aren't using it for anything right now, but we easily could. Because of that we should namespace the cache key.

@lozcalver
Copy link
Copy Markdown
Member Author

You posed these questions in the issue:

(A) how big might that cache end up being?
(B) presumably we’d need to flush it whenever the file hash changed (or at the very least include the file hash in the key)?

Do you have answers to these?
For B I see you've included the hash in the key - though of course that approach means the cache size will grow whenever that hash changes, so the answer to A seems important here.

I was thinking about this a bit more earlier, it’s really difficult to even estimate. It depends on many things, including:

  • The number of files/images on the site
  • How often they're referenced in various code paths
  • How many variants of each file there are
  • How often assets are replaced

My main concern actually isn’t the size of the cache – each entry is a simple string, so is only a handful of bytes – but more how that cache is handled. Some hosts (particularly shared hosting platforms) have restricted numbers of inodes, so if the underlying cache implementation creates one tiny file per file/variant then we risk eating those limits quickly. We can’t code around unknown theoretical limits on shared hosting environments of course, but it’s something to be mindful of anyway.

Sticking them all in one big array that we then save into the cache risks the opposite problem: if a site has millions of files/variants, that one array could be a big memory hog.

One to think about...

@lozcalver lozcalver marked this pull request as draft January 28, 2026 16:56
@lozcalver
Copy link
Copy Markdown
Member Author

Closing this because I can’t think of a solution I’m happy with

@lozcalver lozcalver closed this Feb 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants