Add Git index commit batching #12593
Conversation
I couldn't figure out how to get this to work with `diesel_async` pipelining: no matter what combination of ways to capture the connection I could cook up, something always broke. So, instead, we'll get a shittier, harder-to-use API: you have to create the `GitIndexSyncQueueItem` record explicitly, then call `enqueue` normally on the `SyncToGitIndex` unit struct. Not ideal. The other option here would be to add a new function or meta-job that does all the plumbing for both indices.
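Concretely, call sites end up looking something like this; the `GitIndexSyncQueueItem` constructor and the `enqueue` signature here are assumptions for the sketch, not the exact API:

```rust
// Sketch only: `new`, `insert`, and the `enqueue` signature are
// assumptions, not the exact crates.io API.
GitIndexSyncQueueItem::new(krate.id)
    .insert(&mut conn)
    .await?;

SyncToGitIndex.enqueue(&mut conn).await?;
```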
This part is indeed a bit tricky. From what I can tell, this is caused by us using the query pipelining mode of Diesel (aka `diesel_async`).

The alternative AFAICT would be to rewrite […].

It's a bit unfortunate that Diesel is making this so hard due to the fact that it requires mutable connections, which fundamentally conflicts with sharing them between multiple queries, but fixing this is obviously not trivial 😅
```sql
    LIMIT
        $1
)
DELETE FROM git_index_sync_queue USING batch
```
Just a note: using a Postgres table as a queue like this has one problem: if our handler code fails, the entries are still removed from the queue. Also, multiple workers aren't supported.

If these things are an issue, you could do it the same way we do it in docs.rs:

- start a transaction
- `SELECT [...] LIMIT 100 FOR UPDATE SKIP LOCKED`
- do your work
- `DELETE` the entries
- commit the transaction

Through this:

- other workers just skip over the locked records, and
- in case of an error, your transaction is rolled back, and the records are unlocked again.
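A minimal sketch of that docs.rs-style pattern with `diesel_async`; the table shape and every name here are assumptions for illustration, not the actual crates.io schema or code:

```rust
use diesel::prelude::*;
use diesel_async::{AsyncConnection, AsyncPgConnection, RunQueryDsl};
use scoped_futures::ScopedFutureExt;

// Assumed schema, for illustration only.
diesel::table! {
    git_index_sync_queue (id) {
        id -> Int8,
        crate_id -> Int4,
    }
}

async fn drain_next_batch(conn: &mut AsyncPgConnection) -> QueryResult<()> {
    conn.transaction(|conn| {
        async move {
            // Lock up to 100 rows. Concurrent workers skip over rows
            // that are already locked instead of blocking on them.
            let batch: Vec<(i64, i32)> = git_index_sync_queue::table
                .select((git_index_sync_queue::id, git_index_sync_queue::crate_id))
                .order(git_index_sync_queue::id)
                .limit(100)
                .for_update()
                .skip_locked()
                .load(conn)
                .await?;

            // ... do the Git index work for `batch` here; if it fails,
            // the transaction rolls back and the rows stay queued ...

            // Only delete the entries once the work has succeeded.
            let ids: Vec<i64> = batch.iter().map(|&(id, _)| id).collect();
            diesel::delete(
                git_index_sync_queue::table.filter(git_index_sync_queue::id.eq_any(ids)),
            )
            .execute(conn)
            .await?;

            Ok(())
        }
        .scope_boxed()
    })
    .await
}
```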
Note to reviewers: I'm going to rebase this down to a single commit before merging, but there's a specific change I want feedback on, so I've left my messy local commit history on the branch for now just in case it's helpful.
Also, sorry for the braindump below, but I've paged this in and out of my memory at least three times in the last two months, and I need a shortcut to page it back in that's not me muttering "what the fuck" under my breath over the course of an hour or two while I hammer rust-analyzer and my laptop fan runs at 100% each time I actually have a couple of spare hours to pick this back up.
This PR implements batching for Git index updates in an attempt to prevent the causes of the incidents that resulted in #12281. This is implemented similarly to CloudFront CDN invalidations: namely, by using a new `git_index_sync_queue` table to gather the crates that require updates, and then by having the job action those changes in batches (currently of 100 crates), which means we only need to fetch and push the repo once per batch. The job itself now doesn't have any parameters, and is represented as a simple unit struct in Rust, since it will simply grab the next batch of crates from the new table when it runs.

So far, so good.
There's one aspect of this I don't like, which is the API around this. There are several places in the codebase where we need to trigger Git index updates, which now means that we need to insert a new record into `git_index_sync_queue`, then enqueue the `SyncToGitIndex` job. This feels potentially error-prone, since both of these things have to happen for the sync to work correctly.

Initially, I implemented a function that looked like this:
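Approximately the following; the exact names and signatures in this reconstructed sketch are assumptions:

```rust
use diesel_async::AsyncPgConnection;

// Reconstructed sketch; names and signatures are approximations,
// not the exact code from the branch.
pub async fn enqueue_git_index_sync(
    conn: &mut AsyncPgConnection,
    crate_id: i32,
) -> anyhow::Result<()> {
    // Both steps have to happen for the sync to work correctly:
    // first record the crate in the queue table...
    GitIndexSyncQueueItem::new(crate_id).insert(conn).await?;
    // ...then enqueue the parameterless job that drains the queue.
    SyncToGitIndex.enqueue(conn).await?;
    Ok(())
}
```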
I naïvely assumed that we could then use this as a drop-in replacement for the `<SyncToGitIndex as BackgroundJob>::enqueue()` calls that we have in various places within `tokio::try_join!()` macros to take advantage of `AsyncPgConnection` pipelining.

I was, obviously, wrong, and have come to the conclusion that I fundamentally don't understand something about Diesel and/or Tokio and/or mutable reference semantics in Rust.
Things I've tried:
- Calling this function directly within `tokio::try_join!()`, which fails because `conn` can't be reused multiple times; see the sketch after this list. (Which is fine, but why does it work with the code we have right now?)
- A wrapper struct that implements `BackgroundJob` just so it can override `enqueue` and therefore look like a normal job, which fails because the expected `BoxFuture` is tied to the lifetime of the struct and not the connection, which is obviously what we need to be able to insert a record into `git_index_sync_queue`.
- Various other contortions (mostly variations on `&mut AsyncPgConnection` not being able to be used in multiple functions within the `try_join!()`).
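To make the first failure concrete, here's a minimal self-contained sketch (the helper names are placeholders, not real code from this branch) of why the `try_join!()` form is rejected and what we fall back to:

```rust
use diesel_async::AsyncPgConnection;

// Placeholder async operations (hypothetical names); each one needs
// mutable access to the connection, like real diesel_async queries.
async fn insert_queue_item(_conn: &mut AsyncPgConnection) -> anyhow::Result<()> {
    Ok(())
}
async fn enqueue_job(_conn: &mut AsyncPgConnection) -> anyhow::Result<()> {
    Ok(())
}

async fn trigger_sync(conn: &mut AsyncPgConnection) -> anyhow::Result<()> {
    // The drop-in replacement would have looked like this, but both
    // futures capture `&mut conn` for their entire lifetime:
    //
    //     tokio::try_join!(insert_queue_item(conn), enqueue_job(conn))?;
    //     // error[E0499]: cannot borrow `*conn` as mutable more than
    //     // once at a time
    //
    // so the two steps have to run sequentially instead, giving up
    // whatever pipelining try_join!() would have bought us:
    insert_queue_item(conn).await?;
    enqueue_job(conn).await?;
    Ok(())
}
```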
So this is what we get for now. Maybe it's fine, and my docblock note on `SyncToGitIndex` is good enough, and I should stop worrying about it. If so, we can just review and merge this, test it some more in staging, and deploy it as per normal.

But the whole thing feels fragile, and I'm missing something, and I don't like it.
Anyway, in theory, this fixes #12281.