safekeeper: lift decoding and interpretation of WAL to the safekeeper #9746
6941 tests run: 6633 passed, 0 failed, 308 skipped (full report)
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results.
685df74 at 2024-11-25T17:12:43.488Z
I've done a high-level pass, flushing some comments. Looks good overall.
I need to do another pass to examine the details more closely, will do that when it's ready for final review.
## Problem
We want to serialize interpreted records to send them over the wire from safekeeper to pageserver.
## Summary of changes
Make `InterpretedWalRecord` ser/de. This is a temporary change to get the bulk of the lift merged in #9746. For going to prod, we don't want to use bincode since we can't evolve the schema. Questions on serialization will be tackled separately.
LGTM, provided we replace the wire protocol encoding before production.
Let's also enable this protocol in a few tests, to get some rudimentary coverage.
Nice, LGTM! A few minor comments.
Also would be great to remove duplication between send_wal.rs and wal_reader_stream.rs (obviously ok separately).
…pret-wal Conflicts: pageserver/src/config.rs
…#9821)
## Problem
#9746 lifted decoding and interpretation of WAL to the safekeeper. This reduced the ingested amount on the pageservers by around 10x for a tenant with 8 shards, but doubled the ingested amount for single-sharded tenants. Also, #9746 uses bincode, which doesn't support schema evolution. Technically the schema can be evolved, but it's very cumbersome.
## Summary of changes
This patch set addresses both problems by adding protobuf support for the interpreted WAL records and adding compression support. Compressed protobuf reduced the ingested amount by 100x on the 32 shards `test_sharded_ingest` case (compared to non-interpreted proto). For the 1 shard case the reduction is 5x. Sister change to `rust-postgres` is [here](neondatabase/rust-postgres#33).
## Links
Related: #9336
Epic: #9329
Problem
For any given tenant shard, pageservers receive all of the tenant's WAL from the safekeeper.
This soft-blocks us from using larger shard counts due to bandwidth concerns and CPU overhead of filtering
out the records.
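To make the bandwidth concern concrete, here is a toy sketch comparing the bytes sent when every shard receives the full stream against the bytes sent when the sender filters first. The `Record` type, the `key % shard_count` ownership rule, and all function names are assumptions made up for illustration; they are not the real sharding scheme or any type from this PR, and real WAL may contain records that concern more than one shard.

```rust
/// Toy WAL record. `key` stands in for whatever decides shard ownership.
struct Record {
    key: u64,
    payload_len: usize,
}

/// Hypothetical ownership rule, NOT the real sharding scheme.
fn owning_shard(rec: &Record, shard_count: u64) -> u64 {
    rec.key % shard_count
}

/// Bytes sent when every shard receives the full stream and filters locally.
fn bytes_sent_unfiltered(wal: &[Record], shard_count: u64) -> usize {
    let total: usize = wal.iter().map(|r| r.payload_len).sum();
    total * shard_count as usize
}

/// Bytes sent when the sender filters, so each record only goes to its owner.
fn bytes_sent_filtered(wal: &[Record], shard_count: u64) -> usize {
    (0..shard_count)
        .map(|shard| {
            wal.iter()
                .filter(|r| owning_shard(r, shard_count) == shard)
                .map(|r| r.payload_len)
                .sum::<usize>()
        })
        .sum()
}

fn main() {
    // 1000 toy records of 100 bytes each, spread evenly over the keys.
    let wal: Vec<Record> = (0..1000)
        .map(|key| Record { key, payload_len: 100 })
        .collect();
    let shards = 8;
    println!("unfiltered: {} bytes", bytes_sent_unfiltered(&wal, shards));
    println!("filtered:   {} bytes", bytes_sent_filtered(&wal, shards));
}
```

In this toy model the unfiltered cost grows linearly with the shard count while the filtered cost stays at the size of the WAL itself, which is the scaling problem described above.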
Summary of changes
This PR lifts the decoding and interpretation of WAL from the pageserver into the safekeeper.
A customised PG replication protocol is used where instead of sending raw WAL, the safekeeper sends
filtered, interpreted records. The receiver drives the protocol selection, so, on the pageserver side, usage
of the new protocol is gated by a new pageserver config: `wal_receiver_protocol`.
More granularly, the changes are:
before sending them over
to what we already have for raw wal (minus decoding and interpreting).
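As a rough illustration of receiver-driven protocol selection, the sketch below parses a `wal_receiver_protocol` value and lets a toy sender choose what to put on the wire. Only the config name comes from this PR. The enum names, the accepted strings `"vanilla"` and `"interpreted"`, and the `interpret` helper are hypothetical placeholders, not the actual types or config values.

```rust
use std::str::FromStr;

/// Hypothetical mirror of the `wal_receiver_protocol` setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WalReceiverProtocol {
    /// Raw WAL, decoded and interpreted by the receiver.
    Vanilla,
    /// Filtered, interpreted records produced by the sender.
    Interpreted,
}

impl FromStr for WalReceiverProtocol {
    type Err = String;

    // The accepted strings are assumptions for this sketch.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "vanilla" => Ok(Self::Vanilla),
            "interpreted" => Ok(Self::Interpreted),
            other => Err(format!("unknown wal_receiver_protocol: {other}")),
        }
    }
}

/// Toy wire message; the real protocol carries far more structure.
#[derive(Debug, PartialEq, Eq)]
enum WireMessage {
    RawWal(Vec<u8>),
    InterpretedRecords(Vec<String>),
}

/// Placeholder for decoding and interpretation: one fake record per byte.
fn interpret(raw: &[u8]) -> Vec<String> {
    raw.iter().map(|b| format!("record({b})")).collect()
}

/// The sender honours whichever protocol the receiver asked for.
fn build_message(requested: WalReceiverProtocol, raw: &[u8]) -> WireMessage {
    match requested {
        WalReceiverProtocol::Vanilla => WireMessage::RawWal(raw.to_vec()),
        WalReceiverProtocol::Interpreted => {
            WireMessage::InterpretedRecords(interpret(raw))
        }
    }
}

fn main() {
    let requested: WalReceiverProtocol =
        "interpreted".parse().expect("valid protocol name");
    println!("{:?}", build_message(requested, &[1, 2]));
}
```

Keeping the choice on the receiver side means the new path can be enabled per pageserver through config, while the sender keeps serving raw WAL to receivers that do not opt in.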
Notes
Links
Related: #9336
Epic: #9329