-
Notifications
You must be signed in to change notification settings - Fork 7
PG-1963: Implement caching for HTTP requests #22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit implements a basic cache using the dshash infrasturcture in PostgreSQL: we simply cache the HTTP results based on the Cache-Control property sent by the server. The cache itself is relatively simple, most of the impelementation is a should-be-safe C++ wrapper around dshash, and infrastructure to handle pg_try properly everywhere. No proper tests added yet, as Keycloak decided to don't allow caching for issuer and jwks requests at all, so the cache is not used in our rspec tests. We likely have to add a more complex setup there with an nginx proxy in between.
19ff6ce to
8455d7a
Compare
Adding everything to a single execution confuses it and it starts reporting nonsense errors because of that
AndersAstrand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good in general! However I think pg_suppress_errors needs to go. It seems dangerous to me.
We should probably not require the user to put us in shared_preload_libraries just to be able to use a shmem hook. This is not performance critical, so using DSM instead should be fine.
Our handling of Cache-Control is extremely primitive here. Have you verified with any of the providers we support that it's enough to care about max-age, or is it a guess? :)
src/dshash_wrapper.hpp
Outdated
| static void init_shared_memory(const char* shmem_name, const char* tranche_name) { | ||
| pg_try([&]() { | ||
| bool found = false; | ||
| auto* state = static_cast<shared_state*>(ShmemInitStruct(shmem_name, shared_mem_size(), &found)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there really any reason to do anything else than sizeof(shared_state) for size here? ShmemAlloc() does CACHELINEALIGN(size) anyways which is much bigger than MAXALIGN().
Does some documentation that I've missed say that we're supposed to align this size?
If ShmemInitStruct() requires that we have pre requested the memory (and thus that we need to be in shared_preload_libraries, maybe we should just use GetNamedDSMSegment() instead?
src/dshash_wrapper.hpp
Outdated
| } | ||
|
|
||
| ~dshash() { | ||
| pg_suppress_errors( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.... I wonder what the consequences of just eating errors from here is.
I understand we can't throw exceptions from a destructor of course, but having to do this probably means that we should have an explicit detach function rather than relying on a destructor.
src/dshash_wrapper.hpp
Outdated
| entry_ref& operator=(entry_ref&& other) noexcept { | ||
| if (this != &other) { | ||
| if (entry_ && owner_ && owner_->table_) { | ||
| pg_suppress_errors([&]() { dshash_release_lock(owner_->table_, entry_); }, "entry_ref move assignment"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppressing errors like this seems very fishy to me. We should probably do something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What else? This is one of the places where you normally wouldn't even add error handling; you assume that it works. But I used the convention of "adding a pg_try/catch block for every pg call", so here we are. I don't think any error will happen here, but if something does, at least we log it. Other than calling terminate() or a fatal error and intentionally crashing, there's not much we could do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone calls elog(ERROR) they expect error cleanup to happen. just suppressing that seems like a recipe for leaks.
Maybe use a regular function that can throw exceptions instead of overloading the operator?
src/pg_oidc_validator.cpp
Outdated
|
|
||
| if (!process_shared_preload_libraries_in_progress) { | ||
| elog(WARNING, | ||
| "pg_oidc_validator should be included in shared_preload_libraries, otherwise certain features like caching " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is for the shmem startup hook. But do we really need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I completely forgot that we can now do this entirely dynamically.
src/pg_oidc_validator.cpp
Outdated
| prev_shmem_request_hook(); | ||
| } | ||
|
|
||
| RequestAddinShmemSpace(pg::http_cache::shared_mem_size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this? We only request extra memory for a tiny struct? Most of our shared memory use is for the actual cache anyways I mean.
All of our usage was related to pg methods that should never throw errors, so now instead we make this explicit with the naming.
Unfortunately this needs an ugly hack with a static local variable because we can't pass arguments to the init callback, but at least now we don't need shared preload libraries
This commit implements a basic cache using the dshash infrasturcture in PostgreSQL: we simply cache the HTTP results based on the Cache-Control property sent by the server.
The cache itself is relatively simple, most of the impelementation is a should-be-safe C++ wrapper around dshash, and infrastructure to handle pg_try properly everywhere.
No proper tests added yet, as Keycloak decided to don't allow caching for issuer and jwks requests at all, so the cache is not used in our rspec tests. We likely have to add a more complex setup there with an nginx proxy in between.