
Conversation

niuyueyang1996

@k8s-ci-robot

Hi @niuyueyang1996. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@serathius
Member

serathius commented Jul 28, 2025

Any performance improvement should be confirmed using benchmarks. I'm concerned here that switching to SharedBufReadTxnMode will reduce write throughput. We need data to confirm impact of this change on all operations.

@niuyueyang1996
Author

This optimization is limited to functions using newHeader, such as lease and auth operations. It will improve write performance for these types of requests.
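
For context, the call path here is roughly newHeader → s.KV().Rev() → (*readView).Rev() → kv.Read(ConcurrentReadTxMode, ...). A paraphrased sketch of newHeader (simplified from etcd's server code; exact field and method names may differ slightly across versions):

func newHeader(s *EtcdServer) *pb.ResponseHeader {
	return &pb.ResponseHeader{
		ClusterId: uint64(s.Cluster().ID()),
		MemberId:  uint64(s.MemberID()),
		Revision:  s.KV().Rev(), // -> (*readView).Rev() -> kv.Read(...)
		RaftTerm:  s.Term(),
	}
}

Every lease/auth response header pays one Read() call just to fetch the current revision, which is why the read-tx mode used there matters.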

@ahrtr ahrtr self-requested a review July 30, 2025 08:09
@ahrtr
Member

ahrtr commented Jul 30, 2025

/ok-to-test


codecov bot commented Jul 30, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 69.15%. Comparing base (472662f) to head (9564aff).
⚠️ Report is 69 commits behind head on main.

⚠️ Current head 9564aff differs from pull request most recent head c6002af

Please upload reports for the commit c6002af to get more accurate results.

Files with missing lines          Patch %   Lines
server/storage/mvcc/kv_view.go    50.00%    1 Missing ⚠️

Additional details and impacted files

Files with missing lines          Coverage Δ
server/storage/mvcc/kv_view.go    80.00%    <50.00%> (ø)

... and 50 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #20411      +/-   ##
==========================================
- Coverage   70.09%   69.15%   -0.95%     
==========================================
  Files         399      416      +17     
  Lines       34099    34707     +608     
==========================================
+ Hits        23902    24000      +98     
- Misses       8814     9311     +497     
- Partials     1383     1396      +13     

Continue to review full report in Codecov by Sentry.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 472662f...c6002af.


@niuyueyang1996 niuyueyang1996 force-pushed the main branch 2 times, most recently from d7c41e2 to 2184a5d Compare July 30, 2025 09:01
@ahrtr
Member

ahrtr commented Aug 3, 2025

Thanks for the contribution! The benchmark result looks impressive.

Actually I think we should use SharedBufReadTxMode for both FirstRev and Rev below. They are just super short-lived transactions whose only purpose is to get a rev or compactRev. There is no reason to use ConcurrentReadTxMode at all and copy the whole read buffer.

Can you please run the benchmark tests (i.e. txn-put, range, etc.)?

func (rv *readView) FirstRev() int64 {
	tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.FirstRev()
}

func (rv *readView) Rev() int64 {
	tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.Rev()
}
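
For reference, a minimal sketch of the suggested change (the same two methods with only the mode argument switched, which is what this PR ends up doing):

func (rv *readView) FirstRev() int64 {
	// SharedBufReadTxMode reads the shared buffer under a read lock
	// instead of copying the whole read buffer up front.
	tr := rv.kv.Read(SharedBufReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.FirstRev()
}

func (rv *readView) Rev() int64 {
	tr := rv.kv.Read(SharedBufReadTxMode, traceutil.TODO())
	defer tr.End()
	return tr.Rev()
}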

@niuyueyang1996

This comment was marked as outdated.

@ahrtr
Member

ahrtr commented Aug 4, 2025

Sorry, probably I did not say it clearly in #20411 (comment). I meant changing both FirstRev and Rev to use SharedBufReadTxMode, and afterwards running the benchmarks (txn-put, range).

Comment on lines 24 to 41
type readViewRTxMode struct {
	KV
	rTxMode ReadTxMode
}

func ReadViewRTxMode(rv KV, rTxMode ReadTxMode) *readViewRTxMode {
	return &readViewRTxMode{
		KV:      rv,
		rTxMode: rTxMode,
	}
}

func (rv *readViewRTxMode) Rev() int64 {
	tr := rv.KV.Read(rv.rTxMode, traceutil.TODO())
	defer tr.End()
	return tr.Rev()
}

Member

We don't need this change after using SharedBufReadTxMode for both FirstRev and Rev.

Member

+1

@niuyueyang1996
Author

OK, I'll make the changes.

@serathius
Member

serathius commented Aug 4, 2025

Note, I'm not worried about the performance impact on LeaseGrant or Put by themselves, but about how, after this change, LeaseGrant requests will impact Put performance. Measuring that would require a mix of LeaseGrant and Put requests, which is not covered by any benchmark.

After some thought, I think this could actually cause a speedup in this scenario, because it is cheaper to just read Rev than to copy the buffer, so the read lock is held for a shorter time. ConcurrentReadTxMode only makes sense when the cost of copying the response is higher than the cost of copying the buffer. For Rev and FirstRev it should always be better to use SharedBufReadTxMode, as copying an int64 is always faster than copying the write buffer.
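
To make the trade-off concrete, here is a simplified model of the two modes (illustrative types only, not etcd's actual backend API): ConcurrentReadTxMode pays an O(buffer) copy up front so later accesses never contend with writers, while SharedBufReadTxMode skips the copy and briefly holds the read lock per access.

package main

import (
	"fmt"
	"sync"
)

// writeBuffer is a simplified stand-in for etcd's backend write buffer.
type writeBuffer struct {
	mu      sync.RWMutex
	pending map[string][]byte
	rev     int64
}

// revConcurrentMode models ConcurrentReadTxMode: copy the whole buffer
// under the read lock so later reads never block writers.
func (b *writeBuffer) revConcurrentMode() int64 {
	b.mu.RLock()
	cp := make(map[string][]byte, len(b.pending)) // O(buffer) copy
	for k, v := range b.pending {
		cp[k] = v
	}
	rev := b.rev
	b.mu.RUnlock()
	_ = cp // a real read txn would serve later reads from this copy
	return rev
}

// revSharedMode models SharedBufReadTxMode: no copy; the lock is held
// only while reading a single int64, so writers are barely blocked.
func (b *writeBuffer) revSharedMode() int64 {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.rev
}

func main() {
	b := &writeBuffer{pending: map[string][]byte{"k": []byte("v")}, rev: 42}
	fmt.Println(b.revConcurrentMode(), b.revSharedMode()) // 42 42
}

For Rev() and FirstRev() the per-access work is a single int64 read, so the up-front copy can never pay for itself.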

@ahrtr
Member

ahrtr commented Aug 5, 2025

@niuyueyang1996 can you provide benchmark results before and after the PR so that we know how much performance improvement we've achieved with this PR? thx

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996

This comment was marked as outdated.

@ahrtr
Member

ahrtr commented Aug 7, 2025

thx for the report.

  • We need to double-confirm and understand the performance degradation of range.
  • Can you also run txn-mixed?

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996

This comment was marked as outdated.

@niuyueyang1996
Author

Benchmark observations:

Single-key write scenarios:
Range query performance exhibits a marginal degradation. This is likely attributable to the sparse buffer data volume, where copying from the buffer proves more efficient than direct lock acquisition.

Concurrent multi-key write scenarios:
Both read and write operations demonstrate measurable performance gains, presumably because the substantial buffer data density makes direct lock acquisition outperform buffer copying.

@serathius
Member

Sorry if I'm asking too much, but it would really help if you could combine the results into a table and highlight the differences.
For example #17563

@niuyueyang1996
Author

niuyueyang1996 commented Aug 8, 2025

benchmark --conns 100 --clients 1000 txn-mixed

with_pr  value_size  key-space-size  total    write Requests/sec  write latency 99%ile  read Requests/sec  read latency 99%ile
before   8           1               5000000  40486.6901          0.0206 secs           40494.2291         0.0257 secs
after    8           1               5000000  39931.5350          0.0225 secs           39909.2895         0.0268 secs
before   256         1               5000000  38915.3091          0.0221 secs           38951.3557         0.0273 secs
after    256         1               5000000  38255.4904          0.0231 secs           38208.6825         0.0277 secs
before   256         10000           100000   1937.6933           0.2137 secs           1930.8202          1.1106 secs
after    256         10000           100000   2022.1316           0.2029 secs           2007.3767          1.0831 secs

@niuyueyang1996
Author

func         with_pr  value_size  key-space-size  total    Requests/sec  latency 99%ile
lease-grant  before   -           -               1000000  39802.2602    0.0450 secs
lease-grant  after    -           -               1000000  74498.9989    0.0308 secs
range 1 key  before   256         1               5000000  96561.9716    0.0419 secs
range 1 key  after    256         1               5000000  86895.1024    0.0491 secs
txn-put      before   256         1               5000000  74339.9964    0.0251 secs
txn-put      after    256         1               5000000  75058.3729    0.0249 secs
txn-put      before   256         100000          5000000  67207.0894    0.0272 secs
txn-put      after    256         100000          5000000  67809.7672    0.0273 secs

@serathius
Member

Results look good to me. One thing: I don't think the range 1 key test brings much value; it's a pretty unrealistic scenario that might represent the worst-case degradation.

@niuyueyang1996
Author

Agreed. For Kubernetes-like multi-key usage scenarios, we should adopt txn-mixed benchmarking. The benchmark results in this PR suggest potential performance improvements.

@ahrtr
Member

ahrtr commented Aug 8, 2025

Overall looks good to me.

Single-key write scenarios:
Range query performance exhibits a marginal degradation. This is likely attributable to the sparse buffer data volume, where copying from the buffer proves more efficient than direct lock acquisition.

This PR shouldn't have any impact on range requests at all. (It's easy to verify: just add logs like the ones below; these functions aren't called for range requests at all.) The performance difference might be caused by OS factors, e.g. caching. Do you always get the same results? Changing the order (i.e. trying "after" first, then "before" later) might give the opposite result.

diff --git a/server/storage/mvcc/kv_view.go b/server/storage/mvcc/kv_view.go
index 56260e759..9ba349c5d 100644
--- a/server/storage/mvcc/kv_view.go
+++ b/server/storage/mvcc/kv_view.go
@@ -16,6 +16,7 @@ package mvcc
 
 import (
        "context"
+       "fmt"
 
        "go.etcd.io/etcd/pkg/v3/traceutil"
        "go.etcd.io/etcd/server/v3/lease"
@@ -24,12 +25,14 @@ import (
 type readView struct{ kv KV }
 
 func (rv *readView) FirstRev() int64 {
+       fmt.Println("################## (rv *readView) FirstRev()")
        tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
        defer tr.End()
        return tr.FirstRev()
 }
 
 func (rv *readView) Rev() int64 {
+       fmt.Println("################## (rv *readView) Rev()")
        tr := rv.kv.Read(ConcurrentReadTxMode, traceutil.TODO())
        defer tr.End()
        return tr.Rev()

Concurrent multi-key write scenarios:
Both read and write operations demonstrate measurable performance gains, presumably because the substantial buffer data density makes direct lock acquisition outperform buffer copying.

Similarly, this PR shouldn't have any impact on write or TXN. Your test results also confirm this.

@niuyueyang1996
Author

You're absolutely correct. To validate consistency, I rebuilt the binaries twice on bare-metal servers and conducted controlled comparisons. The observed variance remains within the margin of error for standard benchmark fluctuations.

@ahrtr
Member

ahrtr commented Aug 8, 2025

Please also add a changelog entry, something like the one below (either in this PR or in a separate PR), under https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.7.md#etcd-server, thx

Improves performance of lease and user/role operations (up to 2x) by updating `(*readView) Rev()` to use `SharedBufReadTxMode`

It'd be great if you could also verify the performance improvement on user/role operations. But the benchmark tool doesn't support users/roles, so you would need to improve the tool first.

@ahrtr
Member

ahrtr commented Aug 8, 2025

It'd be great if you could also verify the performance improvement on user/role operations. But the benchmark tool doesn't support users/roles, so you would need to improve the tool first.

It's optional (it's unlikely that users frequently create/remove roles and users in a production environment). It's OK to take care of this separately.

@ahrtr ahrtr changed the title newHeader use SharedBufReadTxMode to get rv. Use SharedBufReadTxMode for (*readView) Rev() and (*readView) FirstRev() Aug 8, 2025
@ahrtr ahrtr left a comment

LGTM

thx

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, niuyueyang1996, serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr
Member

ahrtr commented Aug 11, 2025

pls squash the commits, thx

@serathius serathius merged commit b94310b into etcd-io:main Aug 11, 2025
30 checks passed