-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Improve serialization of translog operations #134243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve serialization of translog operations #134243
Conversation
… recycler_byte_improvements
… recycler_byte_improvements
… recycler_byte_improvements
… recycler_byte_improvements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, left more comments.
server/src/main/java/org/elasticsearch/index/translog/TranslogHeaderWriter.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/translog/TranslogHeaderWriter.java
Outdated
Show resolved
Hide resolved
|
||
int variableLengthSize = (int) (buffer.position() - variableLengthStart); | ||
int sizeOfOperation = FIXED_INDEX_HEADER_SIZE - Integer.BYTES + variableLengthSize + sourceLength + Integer.BYTES; | ||
ByteUtils.writeIntBE(sizeOfOperation, bytes, off); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we assert that it is identical to a slow header, perhaps rarely?? You probably have tests for it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do have tests in TranslogHeaderWriterTests
which runs 30 checks each run randomizing to one page and cross-page.
buffer.writeString(indexOperation.id()); | ||
|
||
int sourceLength = indexOperation.source().length(); | ||
buffer.writeVInt(indexOperation.source().length()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to check but am guessing we may not need this. I am ok to keep it though, but think a comment on why we have it is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I'm not sure what you mean. We serialize it here to calculate the checksum. I actually don't love using a vInt here. It's more historical from how we serialize BytesReferences in stream outputs.
Fixed width values would be nicer but we already have variable length values for id and routing so I did not feel like it was worth it to change this.
out.writeLong(seqNo); | ||
out.writeLong(primaryTerm); | ||
} else { | ||
out.writeLong(version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this and writeSlowIndexHeader
share code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I combined these.
server/src/main/java/org/elasticsearch/index/translog/TranslogWriter.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/index/translog/TranslogTests.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Can we run a relevant nightly test using the new buildkite ability to run nightly tests on PRs? Both to verify performance and ensure it does not expose anything.
server/src/test/java/org/elasticsearch/index/translog/TranslogTests.java
Outdated
Show resolved
Hide resolved
Buildkite benchmark this with tsdb-metricsgen-270m please |
Buildkite benchmark this with http_logs-1n-1g please |
When I ran the http logs it deleted the tsdb metrics gen comment. I also ran that: ⏳ Build in-progress Buildkite Build Result: https://esbench-metrics.kb.us-east-2.aws.elastic-cloud.com:9243/app/r/s/lwXvc |
💚 Build Succeeded
This build ran two http_logs-1n-1g benchmarks to evaluate performance impact of this PR. History |
Buildkite benchmark this with nyc_taxis-1n-8g please |
NYC Taxis 💚 Build Succeeded
This build ran two nyc_taxis-1n-8g benchmarks to evaluate performance impact of this PR. History |
Buildkite benchmark this with elastic-logs please |
💚 Build Succeeded
This build ran two elastic-logs benchmarks to evaluate performance impact of this PR. History |
Logs |
Currently translog operations are serialized in a way that is relatively
inefficient. This commit improves performance by re-ordering the
operations to allow most of the header to be serialized without bounds
checks. Additionally, instead of incrementally calculating the checksum
we do a single pass at the end. Finally, we no longer copy the source
twice. Instead serialize it directly into the translog writer.