
Conversation

niuyueyang1996

@k8s-ci-robot

Hi @niuyueyang1996. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@serathius
Member

serathius commented Jul 28, 2025

Any performance improvement should be confirmed using benchmarks. I'm concerned here that switching to SharedBufReadTxnMode will reduce write throughput. We need data to confirm impact of this change on all operations.

@niuyueyang1996
Author

This optimization is limited to functions using newHeader, such as lease and auth operations. It will improve write performance for these types of requests.
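
For context, the call path here is roughly newHeader → s.KV().Rev() → (*readView).Rev() → kv.Read(ConcurrentReadTxMode, ...). A paraphrased sketch of newHeader (simplified from etcd's server code; exact field and method names may differ slightly across versions):

func newHeader(s *EtcdServer) *pb.ResponseHeader {
	return &pb.ResponseHeader{
		ClusterId: uint64(s.Cluster().ID()),
		MemberId:  uint64(s.MemberID()),
		Revision:  s.KV().Rev(), // -> (*readView).Rev() -> kv.Read(...)
		RaftTerm:  s.Term(),
	}
}

Every lease/auth response header pays one Read() call just to fetch the current revision, which is why the read-tx mode used there matters.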

@ahrtr ahrtr self-requested a review July 30, 2025 08:09
@ahrtr
Member

ahrtr commented Jul 30, 2025

/ok-to-test


codecov bot commented Jul 30, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 69.15%. Comparing base (472662f) to head (9564aff).
⚠️ Report is 69 commits behind head on main.

⚠️ Current head 9564aff differs from pull request most recent head c6002af

Please upload reports for the commit c6002af to get more accurate results.

Files with missing lines          Patch %   Lines
server/storage/mvcc/kv_view.go    50.00%    1 Missing ⚠️

Additional details and impacted files

Files with missing lines          Coverage Δ
server/storage/mvcc/kv_view.go    80.00%    <50.00%> (ø)

... and 50 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #20411      +/-   ##
==========================================
- Coverage   70.09%   69.15%   -0.95%     
==========================================
  Files         399      416      +17     
  Lines       34099    34707     +608     
==========================================
+ Hits        23902    24000      +98     
- Misses       8814     9311     +497     
- Partials     1383     1396      +13     

Continue to review full report in Codecov by Sentry.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 472662f...c6002af.


@niuyueyang1996 niuyueyang1996 force-pushed the main branch 2 times, most recently from d7c41e2 to 2184a5d Compare July 30, 2025 09:01
@ahrtr
Member

ahrtr commented Aug 3, 2025

Thanks for the contribution! The benchmark result looks impressive.

Actually I think we should use SharedBufReadTxMode for both FirstRev and Rev below. They are just super short-lived transactions whose only purpose is to get a rev or compactRev. There is no reason to use ConcurrentReadTxMode at all and copy the whole read buffer.

Can you please run the benchmark tests (i.e. txn-put, range, etc.)?

func (rv *readView) FirstRev() int64 {
	tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.FirstRev()
}

func (rv *readView) Rev() int64 {
	tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.Rev()
}
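
For reference, a minimal sketch of the suggested change (the same two methods with only the mode argument switched, which is what this PR ends up doing):

func (rv *readView) FirstRev() int64 {
	// SharedBufReadTxMode reads the shared buffer under a read lock
	// instead of copying the whole read buffer up front.
	tr := rv.kv.Read(SharedBufReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.FirstRev()
}

func (rv *readView) Rev() int64 {
	tr := rv.kv.Read(SharedBufReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.Rev()
}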

@niuyueyang1996

This comment was marked as outdated.

@ahrtr
Member

ahrtr commented Aug 4, 2025

Sorry, probably I did not say it clearly in #20411 (comment). I meant changing both FirstRev and Rev to use SharedBufReadTxMode, and afterwards running the benchmarks (txn-put, range).

Comment on lines 24 to 41
type readViewRTxMode struct {
	KV
	rTxMode ReadTxMode
}

func ReadViewRTxMode(rv KV, rTxMode ReadTxMode) *readViewRTxMode {
	return &readViewRTxMode{
		KV:      rv,
		rTxMode: rTxMode,
	}
}

func (rv *readViewRTxMode) Rev() int64 {
	tr := rv.KV.Read(rv.rTxMode, traceutil.TODO())
	defer tr.End()
	return tr.Rev()
}

Member

We don't need this change after using SharedBufReadTxMode for both FirstRev and Rev.

Member

+1

@niuyueyang1996
Author

OK, I'll make the changes.

@serathius
Member

serathius commented Aug 4, 2025

Note, I'm not worried about the performance impact on LeaseGrant or Put by themselves, but about how, after this change, LeaseGrant requests will impact Put performance. Measuring that would require a mix of LeaseGrant and Put requests, which is not covered by any benchmark.

After some thought, I think this could actually cause a speedup in this scenario, because it is cheaper to just read Rev than to copy the buffer, so the read lock is held for a shorter time. ConcurrentReadTxMode only makes sense when the cost of copying the response is higher than the cost of copying the buffer. For Rev and FirstRev it should always be better to use SharedBufReadTxMode, as copying an int64 is always faster than copying the write buffer.
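
To make the trade-off concrete, here is a simplified model of the two modes (illustrative types only, not etcd's actual backend API): ConcurrentReadTxMode pays an O(buffer) copy up front so later accesses never contend with writers, while SharedBufReadTxMode skips the copy and briefly holds the read lock per access.

package main

import (
	"fmt"
	"sync"
)

// writeBuffer is a simplified stand-in for etcd's backend write buffer.
type writeBuffer struct {
	mu      sync.RWMutex
	pending map[string][]byte
	rev     int64
}

// revConcurrentMode models ConcurrentReadTxMode: copy the whole buffer
// under the read lock so later reads never block writers.
func (b *writeBuffer) revConcurrentMode() int64 {
	b.mu.RLock()
	cp := make(map[string][]byte, len(b.pending)) // O(buffer) copy
	for k, v := range b.pending {
		cp[k] = v
	}
	rev := b.rev
	b.mu.RUnlock()
	_ = cp // a real read txn would serve later reads from this copy
	return rev
}

// revSharedMode models SharedBufReadTxMode: no copy; the lock is held
// only while reading a single int64, so writers are barely blocked.
func (b *writeBuffer) revSharedMode() int64 {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.rev
}

func main() {
	b := &writeBuffer{pending: map[string][]byte{"k": []byte("v")}, rev: 42}
	fmt.Println(b.revConcurrentMode(), b.revSharedMode()) // 42 42
}

For Rev() and FirstRev() the per-access work is a single int64 read, so the up-front copy can never pay for itself.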

@ahrtr
Member

ahrtr commented Aug 5, 2025

@niuyueyang1996 can you provide benchmark results before and after the PR so that we know how much performance improvement we've achieved with this PR? thx

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996

This comment was marked as outdated.

@ahrtr
Member

ahrtr commented Aug 7, 2025

thx for the report.

  • We need to double-confirm and understand the performance degradation of range.
  • Can you also run txn-mixed?

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996
Author

Benchmark observations:

Single-key write scenarios:
Range query performance exhibits a marginal degradation. This is likely attributable to the sparse buffer data volume, where copying from the buffer proves more efficient than direct lock acquisition.

Concurrent multi-key write scenarios:
Both read and write operations demonstrate measurable performance gains, presumably because the substantial buffer data density makes direct lock acquisition outperform buffer copying.

@serathius
Member

Sorry if I'm asking too much, but it would really help if you could combine the results into a table and highlight the differences.
For example #17563

@niuyueyang1996
Author

niuyueyang1996 commented Aug 8, 2025

benchmark --conns 100 --clients 1000 txn-mixed

with_pr  value_size  key-space-size  total    write Requests/sec  write latency 99%ile  read Requests/sec  read latency 99%ile
before   8           1               5000000  40486.6901          0.0206 secs           40494.2291         0.0257 secs
after    8           1               5000000  39931.5350          0.0225 secs           39909.2895         0.0268 secs
before   256         1               5000000  38915.3091          0.0221 secs           38951.3557         0.0273 secs
after    256         1               5000000  38255.4904          0.0231 secs           38208.6825         0.0277 secs
before   256         10000           100000   1937.6933           0.2137 secs           1930.8202          1.1106 secs
after    256         10000           100000   2022.1316           0.2029 secs           2007.3767          1.0831 secs

@niuyueyang1996
Author

func         with_pr  value_size  key-space-size  total    Requests/sec  latency 99%ile
lease-grant  before   -           -               1000000  39802.2602    0.0450 secs
lease-grant  after    -           -               1000000  74498.9989    0.0308 secs
range 1 key  before   256         1               5000000  96561.9716    0.0419 secs
range 1 key  after    256         1               5000000  86895.1024    0.0491 secs
txn-put      before   256         1               5000000  74339.9964    0.0251 secs
txn-put      after    256         1               5000000  75058.3729    0.0249 secs
txn-put      before   256         100000          5000000  67207.0894    0.0272 secs
txn-put      after    256         100000          5000000  67809.7672    0.0273 secs

@serathius
Member

Results look good to me. One thing: I don't think the range 1 key test brings much value; it's a pretty unrealistic scenario that might represent the worst-case degradation.

@niuyueyang1996
Author

Agreed. For Kubernetes-like multi-key usage scenarios, we should adopt txn-mixed benchmarking. The benchmark results in this PR suggest potential performance improvements.

@ahrtr
Member

ahrtr commented Aug 8, 2025

Overall looks good to me.

Single-key write scenarios:
Range query performance exhibits a marginal degradation. This is likely attributable to the sparse buffer data volume, where copying from the buffer proves more efficient than direct lock acquisition.

This PR shouldn't have any impact on range requests at all. (It's easy to verify: just add logs like the ones below; these functions aren't called for range requests at all.) The performance difference might be caused by OS factors, e.g. caching. Do you always get the same results? Changing the order (i.e. trying "after" first, then "before" later) might give the opposite result.

diff --git a/server/storage/mvcc/kv_view.go b/server/storage/mvcc/kv_view.go
index 56260e759..9ba349c5d 100644
--- a/server/storage/mvcc/kv_view.go
+++ b/server/storage/mvcc/kv_view.go
@@ -16,6 +16,7 @@ package mvcc
 
 import (
        "context"
+       "fmt"
 
        "go.etcd.io/etcd/pkg/v3/traceutil"
        "go.etcd.io/etcd/server/v3/lease"
@@ -24,12 +25,14 @@ import (
 type readView struct{ kv KV }
 
 func (rv *readView) FirstRev() int64 {
+       fmt.Println("################## (rv *readView) FirstRev()")
        tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
        defer tr.End()
        return tr.FirstRev()
 }
 
 func (rv *readView) Rev() int64 {
+       fmt.Println("################## (rv *readView) Rev()")
        tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
        defer tr.End()
        return tr.Rev()

Concurrent multi-key write scenarios:
Both read and write operations demonstrate measurable performance gains, presumably because the substantial buffer data density makes direct lock acquisition outperform buffer copying.

Similarly, this PR shouldn't have any impact on write or TXN. Your test results also confirm this.

@niuyueyang1996
Author

You're absolutely correct. To validate consistency, I rebuilt the binaries twice on bare-metal servers and conducted controlled comparisons. The observed variance remains within the margin of error for standard benchmark fluctuations.

@ahrtr
Member

ahrtr commented Aug 8, 2025

Please also add a changelog entry, something like the one below (either in this PR or in a separate PR), under https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.7.md#etcd-server, thx

Improves performance of lease and user/role operations (up to 2x) by updating `(*readView) Rev()` to use `SharedBufReadTxMode`

It'd be great if you could also verify the performance improvement on user/role operations. But the benchmark tool doesn't support users/roles, so you would need to improve the tool first.

@ahrtr
Member

ahrtr commented Aug 8, 2025

It'd be great if you could also verify the performance improvement on user/role operations. But the benchmark tool doesn't support users/roles, so you would need to improve the tool first.

It's optional (it's unlikely that users frequently create/remove roles and users in a production environment). It's OK to take care of this separately.

@ahrtr ahrtr changed the title newHeader use SharedBufReadTxMode to get rv. Use SharedBufReadTxMode for (*readView) Rev() and (*readView) FirstRev() Aug 8, 2025
@ahrtr ahrtr left a comment

LGTM

thx

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, niuyueyang1996, serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr
Member

ahrtr commented Aug 11, 2025

pls squash the commits, thx

@serathius serathius merged commit b94310b into etcd-io:main Aug 11, 2025
30 checks passed