
Conversation

@kdichev (Contributor) commented on Jul 27, 2025

This PR adds two simple in-memory caches to speed up the HTML stitching process:

- `sliceCache`: stores the raw HTML of each slice after it's read from disk
- `stitchedSliceCache`: stores the fully-stitched version of each slice to avoid re-processing

With 5,000+ pages being stitched, we were re-reading and re-stitching the same slices many times. Most of our slices are reused heavily, so caching makes a big difference.

On local benchmarks with 5,000 pages, each embedding 8 slices, this reduced stitching time from ~300s to ~80s.
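
Roughly, the caching looks like this (a minimal sketch, not the exact PR diff; `stitchSlice` is a hypothetical stand-in for the real stitching routine):

```ts
import { readFile } from "node:fs/promises"

const sliceCache = new Map<string, string>()
const stitchedSliceCache = new Map<string, string>()

async function getSliceHtml(sliceHtmlPath: string): Promise<string> {
  const cached = sliceCache.get(sliceHtmlPath)
  if (cached !== undefined) return cached

  // Hit the disk only the first time a given slice is requested
  const html = await readFile(sliceHtmlPath, "utf-8")
  sliceCache.set(sliceHtmlPath, html)
  return html
}

async function getStitchedSliceHtml(
  sliceHtmlPath: string,
  stitchSlice: (html: string) => string
): Promise<string> {
  const cached = stitchedSliceCache.get(sliceHtmlPath)
  if (cached !== undefined) return cached

  // Stitch once, then reuse the result for every page that embeds this slice
  const stitched = stitchSlice(await getSliceHtml(sliceHtmlPath))
  stitchedSliceCache.set(sliceHtmlPath, stitched)
  return stitched
}
```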

Next steps:
Introducing workers improves cold build speeds from ~300s to ~35s.

@gatsbot (bot) added the "status: triage needed" label on Jul 27, 2025
@kdichev (Contributor, Author) commented on Jul 27, 2025

From my perspective, I’m working with a fairly large Gatsby site, around 5,000 active pages (originally 25,000, but I’ve removed outdated content). Each page includes 8 unique slices.

Before making changes, the slice processing step alone took about 300 seconds, which was surprisingly close to the time required for optimizing 6,000 images. I reviewed the code and introduced some performance improvements that brought the slice processing time down from ~300s to ~80s.

I also tested whether the regex logic was a bottleneck, but replacing it with a custom parser didn’t yield any speed gains. However, moving the slice queue from fastq to worker threads brought cold build time down further to 35s, and hot builds (with slice changes) to around 50s.

I noticed that my machine’s build resources weren’t being fully utilized before, but after introducing workers, usage hit full capacity.
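
For reference, a rough sketch of the worker-based version (not part of this PR; the worker file name and chunking are illustrative only):

```ts
import { Worker } from "node:worker_threads"
import * as os from "node:os"

function stitchPagesInWorkers(pagePaths: Array<string>): Promise<void[]> {
  const workerCount = Math.max(1, os.cpus().length - 1)
  const chunkSize = Math.ceil(pagePaths.length / workerCount)

  const tasks = Array.from({ length: workerCount }, (_, i) => {
    const chunk = pagePaths.slice(i * chunkSize, (i + 1) * chunkSize)
    return new Promise<void>((resolve, reject) => {
      // "./stitch-worker.js" would read, stitch and write its chunk of pages
      const worker = new Worker("./stitch-worker.js", { workerData: { chunk } })
      worker.once("error", reject)
      worker.once("exit", code =>
        code === 0 ? resolve() : reject(new Error(`worker exited with ${code}`))
      )
    })
  })

  return Promise.all(tasks)
}
```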

This could be a solid further general improvement. Let me know if this is something you'd consider viable, and I'll be happy to add it.

@serhalp added the "topic: performance" label on Aug 4, 2025
@serhalp added the "status: needs core review" and "topic: core" labels and removed the "status: triage needed" label on Aug 4, 2025
@pieh (Contributor) commented on Aug 7, 2025

This is overall a reasonable change. The main thing I worry about here is that the cached content is strongly held in memory, and with the setup as-is that might result in out-of-memory errors that didn't happen before, given a sufficiently large number of slice variants and/or sufficiently large slice content.

As this is an optimization attempt and the source of the data is still on disk, I think some kind of lru-cache OR wrapping the content in a WeakRef would be advised, to protect against unbounded growth of strongly referenced content that would prevent the allocated memory from ever being reclaimed under memory pressure.
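
For example, a rough sketch of the WeakRef variant (illustrative only, assuming the caller falls back to re-reading the file when the entry has been collected; an lru-cache with a max size would work similarly):

```ts
// Strong keys, weakly held values: V8 may reclaim cached slice HTML under
// memory pressure, and we simply re-read it from disk on the next miss.
const sliceCache = new Map<string, WeakRef<{ html: string }>>()

function getCachedSliceHtml(sliceHtmlPath: string): string | undefined {
  const entry = sliceCache.get(sliceHtmlPath)?.deref()
  if (!entry) {
    // Value was garbage-collected (or never cached); caller re-reads the file
    sliceCache.delete(sliceHtmlPath)
    return undefined
  }
  return entry.html
}

function cacheSliceHtml(sliceHtmlPath: string, html: string): void {
  // Strings can't be WeakRef targets directly, so wrap them in an object
  sliceCache.set(sliceHtmlPath, new WeakRef({ html }))
}
```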

@kdichev (Contributor, Author) commented on Aug 13, 2025

@pieh This is reasonable feedback, thanks for noting the memory issue! I honestly didn’t think about that at all. I’ve got a fairly powerful machine, so I guess it hasn’t been high on my list of concerns, heh. I’ll work on the suggestions and see how it goes. If there’s already something similar in the repo, and it’s not a burden, I’d appreciate a link so I can take inspiration and stay aligned with accepted practices here.
