Skip to content

Conversation

@cuiweixie
Copy link
Contributor

@cuiweixie cuiweixie commented Nov 30, 2025

goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/common/bitutil
cpu: Apple M1 Pro
             │    old.txt    │               new.txt               │
             │    sec/op     │   sec/op     vs base                │
FastOR1KB-10    415.95n ± 0%   20.38n ± 2%  -95.10% (p=0.000 n=10)
FastOR2KB-10    825.95n ± 1%   35.55n ± 3%  -95.70% (p=0.000 n=10)
FastOR4KB-10   1697.00n ± 4%   66.98n ± 6%  -96.05% (p=0.000 n=10)
geomean          835.4n        36.47n       -95.63%

             │   old.txt    │               new.txt               │
             │     B/op     │    B/op     vs base                 │
FastOR1KB-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
FastOR2KB-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
FastOR4KB-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                   ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

             │   old.txt    │               new.txt               │
             │  allocs/op   │ allocs/op   vs base                 │
FastOR1KB-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
FastOR2KB-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
FastOR4KB-10   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                   ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

@fjl
Copy link
Contributor

fjl commented Dec 2, 2025

Something to note: the implementation here was copied from crypto/subtle, and it has some preconditions which are missing here. Importantly, the assembly code assumes the input slices do not have partial overlap. I don't know exactly why this is required, but we need to be careful about using assembly with arbitrary memory, since it can lead to memory corruption issues.

@cuiweixie
Copy link
Contributor Author

cuiweixie commented Dec 2, 2025

@fjl Let me explain:
try to run this test on master code base:

func TestXORBytesInexactOverlap(t *testing.T) {
	// Error case 1: dst and x have inexact overlap
	data := []byte{0x01, 0x02, 0x03, 0x04, 0x05}
	y := []byte{0xFF, 0xFF, 0xFF}
	fmt.Println("before test data:", data)

	result := make([]byte, 3)
	XORBytes(result, data[0:3], y)

	XORBytes(data[2:5], data[0:3], y)

	fmt.Println("expected data:", result)
	fmt.Println("overlap caused error, got data:", data[2:5]) 
}

result in:

=== RUN   TestXORBytesInexactOverlap
before test data: [1 2 3 4 5]
expected data: [254 253 252]
overlap caused error, got data: [254 253 1]
--- PASS: TestXORBytesInexactOverlap (0.00s)
PASS

overlap will cause some corrupt. even in current code base, I don't know, But i think event these assembly pr do not being accept. may be I can send other pr to try to do something about overlap issue.

@cuiweixie
Copy link
Contributor Author

cuiweixie commented Dec 2, 2025

@fjl here is the ORBytes case with master codebase, same as this pr:

func TestXORBytesInexactOverlap(t *testing.T) {
	// Error case 1: dst and x have inexact overlap
	data := []byte{0x01, 0x02, 0x03, 0x04, 0x05}
	y := []byte{0x0, 0x0, 0x0}
	fmt.Println("before test data:", data)

	result := make([]byte, 3)
	ORBytes(result, data[0:3], y)

	ORBytes(data[2:5], data[0:3], y)

	fmt.Println("expected data:", result)
	fmt.Println("overlap caused error, got data:", data[2:5])
}
before test data: [1 2 3 4 5]
expected data: [1 2 3]
overlap caused error, got data: [1 2 1]

@cuiweixie
Copy link
Contributor Author

cuiweixie commented Dec 2, 2025

The pr try to keep acting like the not asm version before this pr.

@fjl
Copy link
Contributor

fjl commented Dec 3, 2025

I still think it's bad. At least we should document the requirements more clearly. And possibly there should be checks on it. When using subtle.XORBytes, it will panic if the slices overlap.

@cuiweixie
Copy link
Contributor Author

I still think it's bad. At least we should document the requirements more clearly. And possibly there should be checks on it. When using subtle.XORBytes, it will panic if the slices overlap.

Add both doc and check. pleast take a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants