Conversation
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
|
recheck |
81cb3e2 to
46f5de9
Compare
0625d27 to
2dbda56
Compare
1b4e159 to
8a24b0e
Compare
|
After this comment I will make this PR ready to be pulled in without stopping any development on Tuweni v1 as I'm leaving the current I will squash all changes so far and will create an entire new package |
8a24b0e to
1c641ff
Compare
|
recheck |
cc29c50 to
dff599a
Compare
macfarla
left a comment
There was a problem hiding this comment.
do we need to add a consensys copyright statement? Are both v1 and v2 tests running?
bytes/src/main/java/org/apache/tuweni/v2/bytes/ArrayWrappingBytes.java
Outdated
Show resolved
Hide resolved
bytes/src/main/java/org/apache/tuweni/v2/bytes/ArrayWrappingBytes.java
Outdated
Show resolved
Hide resolved
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
bytes/src/jmh/java/org/benchmark/BytesMegamorphicBenchmarkV1.java
Outdated
Show resolved
Hide resolved
bytes/src/jmh/java/org/benchmark/BytesMegamorphicBenchmarkV2.java
Outdated
Show resolved
Hide resolved
| if (offset == 0 && length == bytes.length) { | ||
| return bytes; | ||
| } | ||
| return Arrays.copyOfRange(bytes, offset, offset + length); |
There was a problem hiding this comment.
I understand to toArray() is removed, correct ?
There was a problem hiding this comment.
Yes, I'm not convinced it's needed. At least not needed for tuweni or discovery so far
|
|
||
| @Override | ||
| protected void and(byte[] bytesArray, int offset, int length) { | ||
| Utils.and(this.bytes, this.offset, bytesArray, offset, length); |
There was a problem hiding this comment.
It would be nice to have jmh benchmarks on and other binary operations below. I think it will show a pretty good improvement. This is not a blocking comment, but a nice to have.
There was a problem hiding this comment.
let's do that in another PR
| * <p>This class may be used to create more types that represent bytes, but need a different name | ||
| * for business logic. | ||
| */ | ||
| public class DelegatingBytes extends Bytes { |
There was a problem hiding this comment.
shiftLeft and shiftRight doesn't exist anymore in this implementation, is that intended ?
There was a problem hiding this comment.
yes, I have moved all to the MutableBytes. Now can do multiple operations on Bytes and avoid copying and we are explicit that we are creating copies
| protected void xor(byte[] bytesArray, int offset, int length) { | ||
| Utils.xor(this.bytes, this.offset, bytesArray, offset, length); | ||
| } | ||
|
|
There was a problem hiding this comment.
Does it make sens to add shiftLeft and shiftRight here ?
| import java.util.Random; | ||
|
|
||
| /** A {@link Bytes} value that is guaranteed to contain exactly 32 bytes. */ | ||
| public final class Bytes32 extends DelegatingBytes { |
There was a problem hiding this comment.
I suggest to rename this class DelegatingBytes32.
There was a problem hiding this comment.
Why though? That's an implementation detail
| import java.util.Random; | ||
|
|
||
| /** A {@link Bytes} value that is guaranteed to contain exactly 48 bytes. */ | ||
| public final class Bytes48 extends DelegatingBytes { |
There was a problem hiding this comment.
I suggest renaming this class DelegatingBytes48.
| return bytes48; | ||
| } | ||
| checkArgument(slice.size() == SIZE, "Expected %s bytes but got %s", SIZE, slice.size()); | ||
| return new Bytes48(slice.getImpl()); |
There was a problem hiding this comment.
I see, because of DelegatingBytes. I wonder if is not simpler to have another overloaded method with DelegatingBytes as a parameter and delete getImpl() in Bytes.java.
There was a problem hiding this comment.
Correct, because of DelegatingBytes so we need to get straight access to the implementation to avoid stacked method call that just loops.
Not sure I understand the suggestion - do you mean casting Bytes to DelegatingBytes ?
bytes/src/main/java/org/apache/tuweni/v2/bytes/ConcatenatedBytes.java
Outdated
Show resolved
Hide resolved
| for (Bytes value : values) { | ||
| int size = value.size(); | ||
| try { | ||
| totalSize = Math.addExact(totalSize, size); |
There was a problem hiding this comment.
Instead of throwing and then catching the exception, I suggest replacing with the code of Math.addExact :
int result = totalSize + size;
// HD 2-12 Overflow iff both arguments have the opposite sign of the result
if (((totalSize ^ result) & (size ^ result)) < 0) {
throw new IllegalArgumentException(
"Combined length of values is too long (> Integer.MAX_VALUE)");
}
totalSize = result;
There was a problem hiding this comment.
but are you sure the method is not going to be inlined anyway? I would suggest creating a benchmark to prove that
|
I can confirm that the V2 implementations has better performance for megamorphic use case and similar performances for the non megamorphic one |
| public static Bytes keccak256(Bytes input) { | ||
| try { | ||
| return (Bytes32) digestUsingAlgorithm(input, KECCAK_256); | ||
| return digestUsingAlgorithm(input, KECCAK_256); |
There was a problem hiding this comment.
In this case, we're loosing the information that the returned Bytes from keccak256 is 32 bytes. I wonder if there're edge cases to this, as Bytes can store more than 32 bytes.
There was a problem hiding this comment.
I agree you loose the developer friendliness. Though for performance might be justified. Anyway, we are leaving both versions of the code in anyways so we have time to decide after this PR, IMO
| public static Bytes sha3_256(Bytes input) { | ||
| try { | ||
| return (Bytes32) digestUsingAlgorithm(input, SHA3_256); | ||
| return digestUsingAlgorithm(input, SHA3_256); |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
| result[i] = this.ints[i] & other; | ||
| } | ||
| return new UInt256(result); | ||
| } |
There was a problem hiding this comment.
Bug: Bitwise Operation Errors in UInt Classes
The and, or, and xor methods in both UInt256 and UInt384 classes contain a byte indexing error when operating on Bytes or Bytes32 parameters. The byte offset calculation incorrectly uses the ints array index (i) instead of the bytes parameter index (j), leading to incorrect bitwise results and potential IndexOutOfBoundsException.
Additional Locations (4)
units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt384.java#L457-L464units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt384.java#L487-L494units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt384.java#L519-L526units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt256.java#L637-L644
| fun create(keyPair: SECP256K1.KeyPair, now: Long): Packet { | ||
| val expiration = expirationFor(now) | ||
| val sigHash = createSignature( | ||
| PacketType.ENRRESPONSE, |
There was a problem hiding this comment.
| try { | ||
| SHA256Hash.Hash result = SHA256Hash.hash(shaInput); | ||
| try { | ||
| return SHA256Hash.hash(shaInput).bytes(); |
There was a problem hiding this comment.
Bug: Double Hashing Causes Memory Leak
The sha2_256 methods (both byte[] and Bytes overloads) compute the hash twice when Sodium is available. The first hash object is stored and destroyed, but a new hash object is created and returned from a second computation, which is then leaked.
Additional Locations (1)
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
| checkElementIndex(offset, bytes.length); | ||
| } | ||
| checkLength(bytes, offset); | ||
| return new ArrayWrappingBytes(bytes, offset, bytes.length); |
There was a problem hiding this comment.
Bug: Array Wrapper Overruns Bounds: Incorrect Length
The fromArray method creates an ArrayWrappingBytes with bytes.length as the length parameter, which attempts to wrap more bytes than available when offset > 0. This causes the wrapper to extend beyond the array bounds. The length should be SIZE (32 bytes) to correctly wrap exactly 32 bytes starting from the offset, matching the method's documented behavior and the validation performed by checkLength.
| @Override | ||
| public void appendTo(ByteBuffer byteBuffer) { | ||
| byteBuffer.put(this.byteBuffer); | ||
| } |
There was a problem hiding this comment.
Bug: Append Ignores Slice Boundaries
The appendTo method puts the entire byteBuffer into the target buffer, ignoring the offset and size() of this wrapper. When a ByteBufferWrappingBytes represents a slice (constructed with offset and length), this appends the wrong bytes. The method transfers from the buffer's current position/limit rather than the intended slice range defined by offset and size().
ad67e19 to
a4def2a
Compare
| private static final int SIZE = 32; | ||
|
|
||
| /** A {@code Bytes32} containing all zero bytes */ | ||
| public static final Bytes ZERO = fromByte((byte) 0); |
There was a problem hiding this comment.
Bug: Type Mismatch: ZERO Constant Threatens Integrity
The ZERO constant is declared as type Bytes instead of Bytes32, causing a type inconsistency. The javadoc states it's a Bytes32 containing all zero bytes, but the actual type doesn't match. This breaks type safety and could cause issues when code expects a Bytes32 instance but receives a generic Bytes reference.
| public static final int SIZE = 48; | ||
|
|
||
| /** A {@code Bytes48} containing all zero bytes */ | ||
| public static final Bytes ZERO = fromByte((byte) 0); |
There was a problem hiding this comment.
Bug: ZERO Constant: Type Inconsistency Threatens Safety
The ZERO constant is declared as type Bytes instead of Bytes48, causing a type inconsistency. The javadoc states it's a Bytes48 containing all zero bytes, but the actual type doesn't match. This breaks type safety and could cause issues when code expects a Bytes48 instance but receives a generic Bytes reference.
| byte[] newBytesArray = new byte[otherSize]; | ||
| System.arraycopy(bytesArray, 0, newBytesArray, otherSize - size, size); | ||
| bytesArray = newBytesArray; | ||
| size = otherSize; |
There was a problem hiding this comment.
Bug: Immutability Contract Violated: Stale Hashcodes
The and, or, and xor methods mutate the size field after object construction when resizing the internal array. Since size is inherited from the immutable parent Bytes class and used for hashcode caching, this mutation breaks the immutability contract and can cause cached hashcodes to become stale, leading to incorrect behavior in hash-based collections.
| * | ||
| * @return The number of bytes this value represents. | ||
| */ | ||
| public int size() { |
| return delegate.trimLeadingZeros(); | ||
| } | ||
|
|
||
| @Override |
There was a problem hiding this comment.
Bug: DelegatingBytes missing trimTrailingZeros override
The DelegatingBytes class overrides trimLeadingZeros() to delegate to its delegate (line 206-208), but is missing an override for trimTrailingZeros(). This means calls to trimTrailingZeros() will use the base Bytes implementation instead of delegating to the wrapped delegate, breaking the delegation pattern and potentially causing inconsistent behavior compared to other delegated methods.
This is a collapsed view of the previous old commits: * 0929d61 Merge remote-tracking branch 'upstream/main' into flatten-Bytes-class-hierarchy * | 6d79491 Miscelaneous fixes * | a4def2a Fix memleak in Hash and bug in devp2p Packet creation * | 10e6e81 fix bug in and/or/xor operations with UInt256/UInt384 * | 578e6bb Move bound checks for Bytes implementation out of constructors * | b9c6b05 Merge branch 'main' into flatten-Bytes-class-hierarchy * 4100d8b review feedback * afba2b9 change megamorphic benchmarks to slice and optimize size() * b2066fc JMH integration fixes * e40d796 remove InterfaceCall and VirtualCall benchmarks * dff599a fix ConnectTwoServersTest - missed package changes and port bindings * 5aba562 add missing package-info.java files * 1c641ff fix failure in DefaultDiscoveryV5ServiceTest about reusing ports * 279acd6 bring in old tuweni v1 after move * ace7910 move all other touched files to org.apache.tuweni.v2 package * 65f217b mv org.apache.tuweni.bytes.v2 to org.apache.tuweni.v2.bytes * ebaa8bd Implement Tuweni v2 Signed-off-by: Luis Pinto <luis.pinto@consensys.net>
Signed-off-by: Luis Pinto <luis.pinto@consensys.net>
This is for easy reviewing by making the PR much shorter. Signed-off-by: Luis Pinto <luis.pinto@consensys.net>
cddbcca to
a646271
Compare
|
I reverted changes made to other projects than bytes in order to make it more reviewable. Will issue separate PR's for other projects |
| @Override | ||
| public void appendTo(ByteBuffer byteBuffer) { | ||
| byteBuffer.put(this.byteBuffer); | ||
| } |
There was a problem hiding this comment.
Bug: appendTo ignores offset and size in ByteBuffer wrapper
The appendTo(ByteBuffer) method in ByteBufferWrappingBytes ignores the offset and size fields, putting the entire source ByteBuffer instead of just the wrapped slice. All other methods in this class (like get, getInt, getLong, and, or, xor, equals, computeHashcode, toArray) correctly use offset + i when accessing data. However, appendTo just calls byteBuffer.put(this.byteBuffer) which will write from the source buffer's position to its limit, disregarding the stored offset entirely. This contrasts with BufferWrappingBytes and ByteBufWrappingBytes which slice their buffers in the constructor, and with ArrayWrappingBytes which correctly passes offset and size() to the put operation.
| @Override | ||
| public byte get(int i) { | ||
| return this.value; | ||
| } |
There was a problem hiding this comment.
Bug: ConstantBytesValue.get missing required bounds check
The get(int i) method in ConstantBytesValue returns the constant value without checking bounds, violating the documented contract of the abstract Bytes.get(int i) method which specifies it should throw IndexOutOfBoundsException if i < 0 or i >= size(). All other implementations (ArrayWrappingBytes, ConcatenatedBytes, etc.) either explicitly call checkElementIndex or rely on the underlying buffer to throw. This means Bytes.repeat((byte) 0, 5).get(100) would incorrectly return 0 instead of throwing an exception.
| } | ||
| } | ||
|
|
||
| public static void checkElementIndex(int index, int size) { |
There was a problem hiding this comment.
could be rewritten using checkArgument like for checkLength with a formatted string highlighting the arugments
|
|
||
| static final int MAX_UNSIGNED_SHORT = (1 << 16) - 1; | ||
| static final long MAX_UNSIGNED_INT = (1L << 32) - 1; | ||
| static final long MAX_UNSIGNED_LONG = Long.MAX_VALUE; |
There was a problem hiding this comment.
the name is misleading, because the value is max signed long.
Moreover, it is not used internally, so could be removed since not public
| @Override | ||
| public Bytes slice(int offset, int length) { | ||
| checkArgument(length >= 0, "Invalid negative length"); | ||
| if (size() > 0) { |
There was a problem hiding this comment.
should we allow 0 sized constant bytes in the first place?
| if (this.size() != other.size()) { | ||
| return false; | ||
| } | ||
|
|
There was a problem hiding this comment.
IDK if it is worth to add another check that if the other is of the same type then just compare the single value
| @Override | ||
| protected int computeHashcode() { | ||
| int result = 1; | ||
| for (int i = 0; i < size(); i++) { |
There was a problem hiding this comment.
is there really the need to cycle to have a good hash code?
| return EMPTY; | ||
| } | ||
| if (count == values.size()) { | ||
| return new ConcatenatedBytes(values.toArray(new Bytes[0]), totalSize); |
There was a problem hiding this comment.
using new Bytes[count] you can save one allocation
| return new ConcatenatedBytes(concatenated, totalSize); | ||
| } | ||
|
|
||
| static Bytes create(List<Bytes> values) { |
There was a problem hiding this comment.
it seems there are no unit tests for this method
| * A Bytes value with just one constant value throughout. Ideal to avoid allocating large byte | ||
| * arrays filled with the same byte. | ||
| */ | ||
| class ConstantBytesValue extends Bytes { |
| } | ||
|
|
||
| @Override | ||
| public Bytes getImpl() { |
| return new Bytes32(fromArray(bytes, offset)); | ||
| } | ||
|
|
||
| public static Bytes fromArray(byte[] bytes, int offset) { |
There was a problem hiding this comment.
Should not this return a Bytes32?
Same for similar methods in the class
| return new Bytes48(fromByte(b)); | ||
| } | ||
|
|
||
| public static Bytes fromByte(byte b) { |
There was a problem hiding this comment.
same as for Bytes32 should not these methos all return Bytes48?
| * @param buffer The buffer to wrap. | ||
| * @return A {@link Bytes} value. | ||
| */ | ||
| public static Bytes wrapBuffer(Buffer buffer) { |
There was a problem hiding this comment.
just wondering if it is cleaner to remove the type suffix for Buffer and ByteBuf variants and just always use wrap for all the following methods
| * @return a value filled with a fixed byte | ||
| */ | ||
| public static Bytes repeat(byte b, int size) { | ||
| return new ConstantBytesValue(b, size); |
| } | ||
| } | ||
|
|
||
| /** |
| return toFastHex(false); | ||
| } | ||
|
|
||
| public String toEllipsisHexString() { |
| this.byteBuf = byteBuf; | ||
| } | ||
|
|
||
| ByteBufWrappingBytes(ByteBuf byteBuf, int offset, int length) { |
There was a problem hiding this comment.
what about all the checks that are present in v1?
| * Create a new mutable bytes value. | ||
| * | ||
| * @param size The size of the returned value. | ||
| * @return A {@link MutableBytes} value. |
| */ | ||
| public static MutableBytes fromArray(byte[] value) { | ||
| checkNotNull(value); | ||
| if (value.length == 0) { |
There was a problem hiding this comment.
curious of why this check here and non in the constructor
| * @return This mutable bytes instance. | ||
| */ | ||
| public MutableBytes reverse() { | ||
| byte[] reverse = new byte[size]; |
There was a problem hiding this comment.
reverse should be possible w/o allocating a new array
| * A value made of bytes. | ||
| * | ||
| * <p>This class makes no thread-safety guarantee, and a {@link Bytes} value is generally not thread | ||
| * safe. However, specific implementations may be thread-safe, e.g., MutableBytes. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
| @Override | ||
| public byte get(int i) { | ||
| return this.value; | ||
| } |
There was a problem hiding this comment.
Missing bounds check in ConstantBytesValue.get() method
High Severity
The get(int i) method in ConstantBytesValue returns this.value for any index without validating that the index is within bounds. According to the Bytes.get() contract (which specifies @throws IndexOutOfBoundsException if i < 0 or i >= size()), invalid indices should throw an exception. Other implementations like ArrayWrappingBytes.get() correctly call checkElementIndex(i, size()) before returning. This bug allows silent access with any index value, which could mask errors in calling code.
| @Override | ||
| public byte get(int i) { | ||
| return byteBuffer.get(offset + i); | ||
| } |
There was a problem hiding this comment.
Missing bounds validation in ByteBufferWrappingBytes accessor methods
Medium Severity
The get(int i), getInt(int i), and getLong(int i) methods don't validate that the index is within the slice bounds (0 <= i < size()). They only rely on the underlying ByteBuffer bounds check which validates against the full buffer capacity, not the wrapper's offset and size. This allows reading data outside the intended slice when offset > 0. For example, with offset=10 and size=5, calling get(50) would access buffer position 60 instead of throwing. The getInt and getLong overrides also bypass the base class's bounds checking logic.
Additional Locations (2)
| @Override | ||
| public void appendTo(ByteBuffer byteBuffer) { | ||
| byteBuffer.put(this.byteBuffer); | ||
| } |
There was a problem hiding this comment.
ByteBufferWrappingBytes.appendTo ignores offset and size
Medium Severity
The appendTo(ByteBuffer) method uses byteBuffer.put(this.byteBuffer) which reads from the source buffer's position() to limit(), completely ignoring the wrapper's offset and size() fields. When wrapping a slice (where offset != 0 or size() != buffer.limit()), this will write incorrect bytes. Compare with ArrayWrappingBytes.appendTo() which correctly calls byteBuffer.put(bytes, offset, size()) to respect the slice boundaries.
| */ | ||
| public static Bytes repeat(byte b, int size) { | ||
| return new ConstantBytesValue(b, size); | ||
| } |
There was a problem hiding this comment.
Missing negative size validation in Bytes.repeat factory method
Medium Severity
The Bytes.repeat(byte b, int size) method creates a ConstantBytesValue without validating that size is non-negative. Other factory methods like wrap, wrapBuffer, wrapByteBuf, and wrapByteBuffer all include checkArgument(size >= 0, "Invalid negative length"). Calling Bytes.repeat((byte)0, -1) creates an object with size() == -1, which leads to incorrect behavior: isZero() returns true without checking any bytes, isEmpty() returns false, and toArrayUnsafe() throws NegativeArraySizeException. This creates an invalid object state that causes confusing errors downstream.
| int totalSize = 0; | ||
|
|
||
| for (Bytes value : values) { | ||
| int size = value.size(); |
There was a problem hiding this comment.
Inconsistent null handling between wrap method overloads
Medium Severity
The ConcatenatedBytes.create(List<Bytes>) method calls value.size() on line 84 without checking if value is null, causing NPE for lists containing null elements. However, the varargs overload create(Bytes...) explicitly handles nulls by checking if (value == null) { continue; } on lines 33-36. This inconsistency means Bytes.wrap(List<Bytes>) crashes with null elements while Bytes.wrap(Bytes...) handles them gracefully. Users could encounter unexpected NPEs when using the List-based API.
PR description
This PR simplifies the Tuweni class hierarchy of the
bytesproject and attempts to get some performance gains by reducing memory churn, specially for copying byte arrays and Bytes instances. Bytes have now a much flatter class hierarchy that will hopefully improve inlining and reduce time onitablelookups which was identified as one of the problems in Besu for low performance. The implementation ofBytesused at runtime is now highly biased towardsArrayWrappingByteswhich makes it easier for JIT to inline and optimize.Also mutable operations wil be moved into
MutableBytesexclusively to show intention in copies. This avoids copies on sequences of logic operations, e.g.MutableBytes.create(10).or(Bytes.fromHex("0x11")).shiftRight(2)no further copies occur after initial instance creation.Other
Bytessubtypes should be immutable from now on to boost performance.Note
Introduces a new, flatter
org.apache.tuweni.v2.bytesAPI and adds benchmarking support.Bytes,MutableBytes,ArrayWrappingBytes,ConcatenatedBytes,ConstantBytesValue,DelegatingBytes,Utils, size-specificBytes32/Bytes48, and wrappers forByteBuffer/ByteBuf/Vert.xBufferMutableBytes; otherBytesimplementations are immutablesrc/jmh), with config, async-profiler support, and adds two benchmarks (BytesMegamorphicBenchmarkV1/V2)assertj-coretest dependencyWritten by Cursor Bugbot for commit a646271. This will update automatically on new commits. Configure here.