
Transactional cache API, missing some options#471

Merged
cceckman-at-fastly merged 26 commits into main from cceckman/cache-transact
May 6, 2025
Conversation

@cceckman-at-fastly
Contributor

Start implementing the transactional side of the core cache API.

Transactions begin with a transaction_lookup or transaction_lookup_async. When the CacheBusyHandle is awaited, it either returns a result immediately; returns an obligation indicating "you should fulfill this"; or blocks until the obligated party abandons or fulfills their obligation.

NB: the implementation of cancel here matches a recent change to Fastly's Compute Platform. Previously, `close(BusyHandle)` would block until the obligation was fulfilled or granted to the closer; now, `close` doesn't block. This shouldn't matter much in practice, beyond making e.g. timeouts to origin fail in better ways.

@cceckman-at-fastly force-pushed the cceckman/cache-transact branch 3 times, most recently from a0ce2c7 to a0186ee April 23, 2025 15:41
@cceckman-at-fastly changed the title from "Initial transactional cache API" to "Transactional cache API, missing some options" Apr 23, 2025
Base automatically changed from cceckman/cache-headers to main April 24, 2025 15:39
As I look into it more, I don't think generation IDs or the inner lock
are needed for the transactional API. The outer lock should suffice.
`watch::Sender` sets us up to have an asynchronous transactional API.
The non-transactional `lookup` and `insert` can use the `Sender` side,
acting just like a read/write mutex.

But now, `insert` will automatically
ping any waiters when the modification is done... and those waiters can
wait asynchronously.
The request headers are needed for insert...but for a transactional
insert, they carry along from the original request's headers.

Make `request_headers` an explicit additional argument for `insert`
rather than part of `WriteOptions`, so we can use `WriteOptions` for
other sorts of inserts.
An entry can be:
- Present but not obligated (e.g. fresh)
- Obligated but not present (request-collapsing lookup)
- Neither obligated nor present (request-collapsing lookup that failed
  to complete)
- Obligated and present (stale-while-revalidate, request-collapsing
  revalidation)

Make these two separate fields of CacheValue. Also, CacheValue doesn't
need to be Arc<>'d; remove a layer of indirection.
Note that this implements a recently-changed behavior in the compute
platform. Previously, a `PendingTransaction::drop` would *block* on whichever
request got the `GoGet` obligation; more recently (and in Viceroy),
the `PendingTransaction::drop` does not block on any other transaction.
Before returning from `transaction_lookup_async`, we attempt to retrieve
data + an obligation from the `CacheKeyObject`, but without blocking for
any outstanding obligations. Only if this fails do we generate a
background task to wait for other obligations to complete.

The net result is that `BusyHandle::pending` will in some cases
immediately be `true` upon return of the `BusyHandle`, e.g. when the
cache needs filling.
Contributor

@aturon left a comment


Here's another batch of comments / questions.


// Done dealing with other obligations; now we just deal with results that we have.
// Pick a fresh result if we now have it:
// fresh or stale.
Contributor


I think this comment line is a leftover?

Contributor Author


Not leftover so much as unclear, given that we don't actually deal with stale entries at this point.

This also made me note a weirdness in the logic: we short-circuit on obligations without returning data. In 2bf2139 I reordered these blocks to:

  1. Scan for fresh data
  2. Scan for stale data that can be revalidated (for now, "just" existing obligations)
  3. Generate a new obligation

which I think will lend itself better to SWR.

response_object.transactional = TransactionState::Present(object);
response_object.generation += 1;
if !cache_key_objects.vary_rules.contains(meta.vary_rule()) {
// Insert at the front, run through the rules in order, so we tend towards fresher
Contributor


Should we also move the rule to the front if it already exists?

Contributor Author

@cceckman-at-fastly May 5, 2025


Yes, definitely, oversight on my part. Done in fc09643.

I also realize this could be a `Vec` instead of a `VecDeque`, since we're only ever working at one end. But I find it easier to read as `push_front` and `iter` rather than `push` and `iter().rev()` - I'd definitely mess up "use the reverse iterator" at some point.

return false;
}

// Finally, generate an obligation based on the most recent vary rule
Contributor


Have we confirmed this is the behavior in production? I'm thinking of a scenario like:

  • Two requests come in for the same URL, but with different headers
  • Cache contains:
    • Vary rules: [["header1"], ["header2"]]
    • Stale responses keyed as: "header1: foo" and "header2: bar"
  • First request has header1: foo and header2: bar; we mark the header1: foo object as obligated
  • Second request has no header1, but has header2: bar
    • Thus, we generate a second obligation, even though the one for the second request may well apply.

We could reasonably expect these two requests to be collapsed, instead.

Contributor Author

@cceckman-at-fastly May 5, 2025


Good thought! I'll add a test for this case, double-check some of the cache code, and circle back.

Contributor Author


Phrasing this slightly differently as I work through it:

If our heuristic for "which variant will apply to the next request" is "what is the last Vary rule received", we would not expect these requests to be collapsed - because they have distinct header1 values. We'd be erring on the side of "less latency, more requests" by not collapsing them (i.e. not blocking the second request on the fulfillment of the first).

(Per your other comment, my implementation isn't actually using the "most recent received" heuristic, but that's somewhat separate.)

Contributor Author

@cceckman-at-fastly May 5, 2025


Wrote a test in 04d9b0e that matches your example (header1 and header2 are swapped, but I'm pretty sure it's behaviorally the same). I used the cceckman/outside-test branch to run it on the compute platform.

This implementation matches the compute platform behavior: the requests are not collapsed.

Presumably the cache code authors of ECP had the "most recent Vary is right" heuristic in mind as well? I can see it getting pretty bad otherwise, if an old `Vary: Accept` rule trumps a later `Vary: Authorization` and causes ~all requests to serialize.

Contributor


Thanks for checking this out -- that makes good sense to me! 🙏

@cceckman-at-fastly requested a review from aturon May 5, 2025 16:10
Contributor

@aturon left a comment


Fantastic work! Very happy to see this land ✨

@cceckman-at-fastly merged commit ae7eb16 into main May 6, 2025
15 checks passed
@cceckman-at-fastly deleted the cceckman/cache-transact branch May 6, 2025 12:13
