Skip to content

Conversation

Amxx
Copy link
Collaborator

@Amxx Amxx commented Jun 20, 2025

Fixes #5760
Includes (depends on) #5733

This probably deserves some assembly. It feels like the decoding could be optimized a lot. Note that base58 is notoriously loop (and memory ops) heavy.

PR Checklist

  • Tests
  • Documentation
  • Changeset entry (run npx changeset add)

Summary by CodeRabbit

  • New Features

    • On-chain Base58 encoding/decoding utility added.
    • Bytes utilities enhanced with in-place splice variants and helpers to count leading/consecutive bytes.
  • Documentation

    • Added Base58 to docs and clarified Base64 descriptions.
  • Tests

    • Added comprehensive Base58 tests (round-trip, vectors, invalid input).
    • Expanded Bytes tests to cover new helpers and splice parity with slicing.
    • Updated Base64 tests with improved assertion style.

@Amxx Amxx requested a review from a team as a code owner June 20, 2025 21:16
Copy link

changeset-bot bot commented Jun 20, 2025

🦋 Changeset detected

Latest commit: 8098fb2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
openzeppelin-solidity Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR


bytes internal constant _TABLE = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
bytes internal constant _LOOKUP_TABLE =
hex"000102030405060708ffffffffffffff090a0b0c0d0e0f10ff1112131415ff161718191a1b1c1d1e1f20ffffffffffff2122232425262728292a2bff2c2d2e2f30313233343536373839";
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For anyone curious, here is how you build these lookup tables:

const { ethers } = require("ethers");

const max = (...values) => values.slice(1).reduce((x, y) => (x > y ? x : y), values.at(0));
const min = (...values) => values.slice(1).reduce((x, y) => (x < y ? x : y), values.at(0));

const buildLookup = (...tables) => {
    const bTables = tables.map(table => Array.from(ethers.toUtf8Bytes(table)));
    const MINIMUM = min(...bTables.flatMap(x => x));
    const MAXIMUM = max(...bTables.flatMap(x => x));
    const lookup = Uint8Array.from(Array.from({ length: MAXIMUM - MINIMUM + 1 }).map((_, i) => bTables.map(table => table.indexOf(i + MINIMUM)).find(x => x != -1) ?? 0xff));
    const valid = tables.every(table => Object.entries(table).every(([ i, c]) => i == lookup.at(c.charCodeAt(0) - MINIMUM)));
    return valid ? { tables, lookup: ethers.hexlify(lookup), MINIMUM, MAXIMUM } : undefined;
}

console.log(buildLookup(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", // base64
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_", // base64url
));

console.log(buildLookup(
    "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz", // base58
));

for (uint256 i = 0; i < data.length; ++i) {
// get b58 char
uint8 chr = _mload8i(data, i);
require(chr > 48 && chr < 123, InvalidBase56Digit(chr));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

48 and 123 are derived from the minimum and maximum values taken by b58 chars, see https://github.com/OpenZeppelin/openzeppelin-contracts/pull/5762/files#r2160061084

@Amxx Amxx force-pushed the feature/base58 branch from dbee56c to d4a573b Compare June 22, 2025 13:10
@Amxx Amxx force-pushed the feature/base58 branch from d4a573b to a25bd11 Compare June 22, 2025 13:13
@Amxx Amxx force-pushed the feature/base58 branch from f47355a to 855a1c6 Compare June 25, 2025 08:22
@Amxx Amxx added this to the 5.5 milestone Jun 27, 2025
}

// Start the output offset by an over-estimate of the length.
let outputLengthEstim := add(inputLeadingZeros, div(mul(sub(inputLength, inputLeadingZeros), 8351), 6115))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI:
8351 / 6115 = 1.365658217497956
is an estimation of
Math.log2(256) / Math.log2(58) = 1.365658237309761

Copy link

coderabbitai bot commented Aug 27, 2025

Walkthrough

Adds an on-chain Base58 library (encode/decode + InvalidBase58Digit), tests, and docs; extends Bytes with countLeading, countConsecutive, and in-place splice variants; minor Base64 doc wording edits and test assertion style changes; imports Base58 into a mock and adds changesets.

Changes

Cohort / File(s) Summary
Base58 library + docs + mock
contracts/utils/Base58.sol, contracts/mocks/Stateless.sol, contracts/utils/README.adoc
New Base58 library with _encode/_decode and public encode/decode wrappers; adds InvalidBase58Digit(bytes1) error; Stateless.sol imports Base58.sol; README updated to reference Base58.
Base58 tests
test/utils/Base58.t.sol, test/utils/Base58.test.js
Adds Solidity and JS tests for round-trip encode/decode, test vectors, and invalid-character revert behavior; Hardhat fixture deploys $Base58.
Bytes utilities + tests
contracts/utils/Bytes.sol, test/utils/Bytes.test.js
Adds countLeading, countConsecutive, and two in-place splice variants; tests updated to exercise $countConsecutive and assert parity between slice and new splice.
Base64 docs & tests
contracts/utils/Base64.sol, test/utils/Base64.test.js
Non-functional doc-comment wording edits in Base64; JS tests renamed and switched to promise-style assertions (chai-as-promised).
Changesets
.changeset/afraid-chicken-attack.md, .changeset/loose-lamps-bake.md, .changeset/thirty-pugs-pick.md
New changesets announcing Base58 addition and Bytes utility/splice additions and minor version bumps.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Caller as Caller Contract / Test
  participant Base58 as Base58 Library
  Note over Caller,Base58: Base58 encode/decode flow

  User->>Caller: request encode(bytes)
  Caller->>Base58: encode(input)
  Base58-->>Caller: string b58
  Caller-->>User: b58

  User->>Caller: request decode(string)
  Caller->>Base58: decode(b58)
  alt valid input
    Base58-->>Caller: bytes output
    Caller-->>User: bytes
  else invalid digit
    Base58-->>Caller: revert InvalidBase58Digit(bytes1)
    Caller-->>User: error
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Assessment against linked issues

Objective Addressed Explanation
Add a Base58 library with encoding/decoding support [#5760]

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
Add countLeading/countConsecutive and splice functions (contracts/utils/Bytes.sol) Bytes utilities are unrelated to the Base58 request (#5760).
Tests for Bytes utilities (test/utils/Bytes.test.js) New tests validate Bytes changes; outside the Base58 objective.
Import Base58.sol into contracts/mocks/Stateless.sol (contracts/mocks/Stateless.sol) Import into a mock contract is integration/test convenience, not required by the Base58 feature request.
Base64 doc wording edits and test assertion style changes (contracts/utils/Base64.sol, test/utils/Base64.test.js) Documentation and assertion-style tweaks unrelated to Base58 objective.

Poem

I nibble keys beneath the moon,
Base58 hops in—an encoded tune.
Bytes splice, counts line up true,
Tests sing back what I encode anew.
Carrots for commits, a rabbit's cue. 🥕🐇


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8652d20 and 8098fb2.

📒 Files selected for processing (1)
  • test/utils/Base58.t.sol (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/utils/Base58.t.sol
✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

for {
// first section is possibly smaller than 31 bytes
let i := mod(inputLength, 31)
// unfold first loop, with a different shift.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// unfold first loop, with a different shift.
// If inputLength is not a multiple of 31, unfold the first loop:
// Load the first i bytes from the input, right align them, and store them in scratch (at ptr) in a dedicated 32 bytes space.

mstore(0x3f, "Zabcdefghijkmnopqrstuvwxyz")

// Encoding the "input" part of the result.
// `output` point the the left part of the encoded string. we start from scratch, which means we have
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// `output` point the the left part of the encoded string. we start from scratch, which means we have
// `output` point the the left part of the encoded string. we start from the `scratch` pointer, which means we have

break
}

// base 58 arithmetic on the 248bits limbs
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// base 58 arithmetic on the 248bits limbs
// base 58 arithmetic on the 248bits limbs:
// we compute the remainder modulo 58 of the value encoded in the limbs,
// all while dividing the value encoded in the limbs by 58

} lt(i, ptr) {
i := add(i, 0x20)
} {
let acc := add(shl(248, carry), mload(i))
Copy link
Collaborator Author

@Amxx Amxx Aug 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let acc := add(shl(248, carry), mload(i))
// Load the next limb and add the carry (limbs are 248bits objects)
let acc := add(shl(248, carry), mload(i))

i := add(i, 0x20)
} {
let acc := add(shl(248, carry), mload(i))
mstore(i, div(acc, 58))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mstore(i, div(acc, 58))
// store the updated limb (divided by 58)
mstore(i, div(acc, 58))

} {
let acc := add(shl(248, carry), mload(i))
mstore(i, div(acc, 58))
carry := mod(acc, 58)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
carry := mod(acc, 58)
// carry the remainder (modulo 58)
carry := mod(acc, 58)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
contracts/utils/Base58.sol (1)

139-139: Fix typo in comment

"Be leave" should be "We leave".

-            // This is going to be our "scratch" workspace. Be leave enough room on the left to store length + encoded input.
+            // This is going to be our "scratch" workspace. We leave enough room on the left to store length + encoded output.
🧹 Nitpick comments (12)
test/utils/Base64.test.js (2)

29-29: Use latin1 to avoid environment-dependent behavior for non-ASCII strings.

'où' with 'ascii' can be implementation-dependent. Prefer 'latin1' for deterministic byte values.

-        const buffer = Buffer.from(input, 'ascii');
+        const buffer = Buffer.from(input, 'latin1');
-        const buffer = Buffer.from(input, 'ascii');
+        const buffer = Buffer.from(input, 'latin1');

Also applies to: 45-45


30-31: Avoid duplicate calls in expectations to reduce test time/noise.

Each pair calls the contract twice. Capture once, assert twice.

-        await expect(this.mock.$encode(buffer)).to.eventually.equal(ethers.encodeBase64(buffer));
-        await expect(this.mock.$encode(buffer)).to.eventually.equal(expected);
+        const actual = await this.mock.$encode(buffer);
+        expect(actual).to.equal(ethers.encodeBase64(buffer));
+        expect(actual).to.equal(expected);
-        await expect(this.mock.$encodeURL(buffer)).to.eventually.equal(base64toBase64Url(ethers.encodeBase64(buffer)));
-        await expect(this.mock.$encodeURL(buffer)).to.eventually.equal(expected);
+        const actual = await this.mock.$encodeURL(buffer);
+        expect(actual).to.equal(base64toBase64Url(ethers.encodeBase64(buffer)));
+        expect(actual).to.equal(expected);
-    await expect(mock.encode(buffer31)).to.eventually.equal(ethers.encodeBase64(buffer31));
-    await expect(mock.encode(buffer32)).to.eventually.equal(ethers.encodeBase64(buffer32));
+    expect(await mock.encode(buffer31)).to.equal(ethers.encodeBase64(buffer31));
+    expect(await mock.encode(buffer32)).to.equal(ethers.encodeBase64(buffer32));

Also applies to: 46-47, 56-57

contracts/utils/Bytes.sol (1)

124-151: In-place splice looks correct; overlap-safe via MCOPY.

Sanitization and length update are sound. Consider clarifying in NatSpec that the returned reference aliases the input buffer.

contracts/utils/Base58.sol (1)

125-126: Inconsistent memory safety annotation

The assembly block uses a different style of memory safety annotation compared to line 32. For consistency, use assembly ("memory-safe") instead of the Solidity comment.

-        /// @solidity memory-safe-assembly
-        assembly {
+        assembly ("memory-safe") {
.changeset/loose-lamps-bake.md (1)

5-5: Grammar suggestion

Consider rewording for better clarity.

-`Base58`: Add a library for encoding and decoding bytes buffers into base58 strings.
+`Base58`: Add a library for encoding and decoding byte buffers to/from base58 strings.
test/utils/Base58.t.sol (2)

21-23: Optional: bound fuzz input length to avoid heavy runs.

Base58 decode is loop/memory-heavy; large fuzz cases can slow CI. Drop pure to use vm.assume.

-    function testEncodeDecode(bytes memory input) external pure {
-        assertEq(Base58.decode(Base58.encode(input)), input);
-    }
+    function testEncodeDecode(bytes memory input) external {
+        vm.assume(input.length <= 384); // align with JS suite upper size
+        assertEq(Base58.decode(Base58.encode(input)), input);
+    }

24-24: Optional: add revert test for invalid digit (Foundry).

Mirrors the JS test for InvalidBase58Digit.

 }
+
+    function testDecodeInvalidDigit() external {
+        // Expect revert with the offending byte 'I'
+        vm.expectRevert(abi.encodeWithSelector(Base58.InvalidBase58Digit.selector, bytes1("I")));
+        Base58.decode("VYRWKpIpnN7");
+    }
test/utils/Base58.test.js (5)

17-19: Nit: fix comment grammar.

-      // this checks are very slow when running coverage, causing CI to timeout.
+      // these checks are very slow when running coverage, causing CI to timeout.

23-29: Optional: simplify assertions to avoid relying on eventually.

Using expect(await ...) reduces dependency on chai-as-promised.

-            await expect(this.mock.$encode(hex)).to.eventually.equal(b58);
-            await expect(this.mock.$decode(b58)).to.eventually.equal(hex);
+            expect(await this.mock.$encode(hex)).to.equal(b58);
+            expect(await this.mock.$decode(b58)).to.equal(hex);

36-43: Optional: add minimal-leading-zeros vector (0x00 -> 1).

Strengthens the “leading zeros -> '1's” invariant.

       for (const { raw, b58 } of [
         { raw: 'Hello World!', b58: '2NEpo7TZRRrLZSi2U' },
+        { raw: '0x00', b58: '1' },
         {
           raw: 'The quick brown fox jumps over the lazy dog.',
           b58: 'USm3fpXnKG5EUBx2ndxBDMPVciP5hGey2Jh4NDv6gmeo1LkMeiKrLJUUBk6Z',
         },
         { raw: '0x0000287fb4cd', b58: '11233QC4' },
       ])

48-50: Optional: same assertion simplification here.

-          await expect(this.mock.$encode(hex)).to.eventually.equal(b58);
-          await expect(this.mock.$decode(b58)).to.eventually.equal(hex);
+          expect(await this.mock.$encode(hex)).to.equal(b58);
+          expect(await this.mock.$decode(b58)).to.equal(hex);

54-61: Broaden invalid-character coverage.

Common confusions include 'O', 'l', and '0' in addition to 'I'. Expanding cases hardens the suite.

-      for (const chr of ['I', '-', '~'])
+      for (const chr of ['I', 'O', 'l', '0', '-', '~'])
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b84db20 and 2696cd8.

📒 Files selected for processing (12)
  • .changeset/afraid-chicken-attack.md (1 hunks)
  • .changeset/loose-lamps-bake.md (1 hunks)
  • .changeset/thirty-pugs-pick.md (1 hunks)
  • contracts/mocks/Stateless.sol (1 hunks)
  • contracts/utils/Base58.sol (1 hunks)
  • contracts/utils/Base64.sol (1 hunks)
  • contracts/utils/Bytes.sol (2 hunks)
  • contracts/utils/README.adoc (2 hunks)
  • test/utils/Base58.t.sol (1 hunks)
  • test/utils/Base58.test.js (1 hunks)
  • test/utils/Base64.test.js (4 hunks)
  • test/utils/Bytes.test.js (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
test/utils/Base64.test.js (3)
test/utils/Base58.test.js (1)
  • mock (6-6)
test/utils/Bytes.test.js (1)
  • mock (6-6)
test/utils/Strings.test.js (1)
  • mock (7-7)
test/utils/Base58.test.js (2)
test/utils/Base64.test.js (2)
  • mock (10-10)
  • mock (52-52)
test/utils/Bytes.test.js (1)
  • mock (6-6)
🪛 LanguageTool
.changeset/loose-lamps-bake.md

[grammar] ~5-~5: There might be a mistake here.
Context: ...oding bytes buffers into base58 strings.

(QB_NEW_EN)

🔇 Additional comments (14)
.changeset/thirty-pugs-pick.md (1)

5-6: Changeset entry reads well; scope matches code changes.

Accurately documents the addition of countLeading and countConsecutive. No action needed.

contracts/utils/Base64.sol (1)

11-12: Docstring touch-ups are correct and consistent.

Terminology fixes (Base64/Base64Url) look good. No behavioral impact.

Also applies to: 18-19, 25-28

contracts/utils/README.adoc (2)

27-27: Nice addition.

Clear one-line description for Base58 fits the surrounding list.


109-110: Index entries added in the right section.

Cross-links render correctly in docs builds.

contracts/utils/Base58.sol (4)

40-41: Well-documented mathematical estimation

Good documentation of the length estimation formula. The factor 8351/6115 ≈ 1.3657 correctly approximates log₂(256)/log₂(58).


66-67: Efficient encoding table storage

Good optimization storing the Base58 alphabet directly in memory for quick access during encoding.


197-197: Ignore incorrect critique on calldatacopy usage

The calldatacopy(sub(output, inputLeadingZeros), calldatasize(), inputLeadingZeros) call is in fact intentionally copying past the end of calldata to write zero bytes into memory. In EVM semantics, out-of-bounds reads in calldatacopy produce zeros, making this a valid and gas-efficient way to fill memory with zeros. No change is needed here.

Likely an incorrect or invalid review comment.


142-145: Base58 decoding table is correct as implemented

The validation script confirms that every Base58 character maps to its intended value and all non-Base58 characters correctly map to 0xff. No discrepancies were found.

– File: contracts/utils/Base58.sol
– Lines: 142–145

No further changes required.

contracts/mocks/Stateless.sol (1)

10-10: LGTM! Import correctly positioned

The Base58 import is appropriately placed in alphabetical order within the utils imports section.

.changeset/afraid-chicken-attack.md (1)

5-5: LGTM! Clear changeset description

The changeset correctly documents the addition of in-place splice variants to the Bytes library.

test/utils/Bytes.test.js (2)

59-86: Well-structured test coverage for countConsecutive

The test suite comprehensively covers edge cases including empty buffers, no occurrences, single/multiple occurrences, and out-of-bounds offsets.


88-118: Good validation of slice/splice parity

The tests effectively verify that the new splice functions return the same results as the existing slice functions, ensuring backward compatibility.

test/utils/Base58.t.sol (1)

13-19: LGTM: good coverage of leading-zero edge cases.

Round-trip on all-zeros and near-zeros looks correct.

test/utils/Base58.test.js (1)

5-8: LGTM: fixture/deployment pattern is consistent with existing suites.

No issues.

Comment on lines +71 to +92
/**
* @dev Count number of occurrences of `search` at the beginning of `buffer`.
*/
function countLeading(bytes memory buffer, bytes1 search) internal pure returns (uint256) {
return countConsecutive(buffer, 0, search);
}

/**
* @dev Count number of occurrences of `search` in `buffer`, starting from position `offset`.
*/
function countConsecutive(bytes memory buffer, uint256 offset, bytes1 search) internal pure returns (uint256 i) {
uint256 length = Math.saturatingSub(buffer.length, offset);
assembly ("memory-safe") {
for {
let ptr := add(add(buffer, 0x20), offset)
i := 0
} and(iszero(shr(248, xor(mload(add(ptr, i)), search))), lt(i, length)) {
i := add(i, 1)
} {}
}
}

Copy link

@coderabbitai coderabbitai bot Aug 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Bug: incorrect byte comparison and potential OOB read in countConsecutive loop condition.

  • xor(mload(add(ptr, i)), search) xors the least-significant byte of the 32-byte word, but you later shr(248, ...) to examine the most-significant byte; this never compares against search and effectively counts leading zero bytes instead.
  • Using and(...) in the for condition evaluates both operands; at i == length this performs an out-of-bounds mload.

Fix by reading only when i < length and comparing the correct byte.

-        assembly ("memory-safe") {
-            for {
-                let ptr := add(add(buffer, 0x20), offset)
-                i := 0
-            } and(iszero(shr(248, xor(mload(add(ptr, i)), search))), lt(i, length)) {
-                i := add(i, 1)
-            } {}
-        }
+        assembly ("memory-safe") {
+            let ptr := add(add(buffer, 0x20), offset)
+            // Iterate while in-bounds; break on first non-match.
+            for { i := 0 } lt(i, length) { i := add(i, 1) } {
+                if iszero(eq(byte(0, mload(add(ptr, i))), search))) { break }
+            }
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
* @dev Count number of occurrences of `search` at the beginning of `buffer`.
*/
function countLeading(bytes memory buffer, bytes1 search) internal pure returns (uint256) {
return countConsecutive(buffer, 0, search);
}
/**
* @dev Count number of occurrences of `search` in `buffer`, starting from position `offset`.
*/
function countConsecutive(bytes memory buffer, uint256 offset, bytes1 search) internal pure returns (uint256 i) {
uint256 length = Math.saturatingSub(buffer.length, offset);
assembly ("memory-safe") {
for {
let ptr := add(add(buffer, 0x20), offset)
i := 0
} and(iszero(shr(248, xor(mload(add(ptr, i)), search))), lt(i, length)) {
i := add(i, 1)
} {}
}
}
/**
* @dev Count number of occurrences of `search` in `buffer`, starting from position `offset`.
*/
function countConsecutive(bytes memory buffer, uint256 offset, bytes1 search) internal pure returns (uint256 i) {
uint256 length = Math.saturatingSub(buffer.length, offset);
assembly ("memory-safe") {
let ptr := add(add(buffer, 0x20), offset)
// Iterate while in-bounds; break on first non-match.
for { i := 0 } lt(i, length) { i := add(i, 1) } {
if iszero(eq(byte(0, mload(add(ptr, i))), search)) { break }
}
}
}
🤖 Prompt for AI Agents
In contracts/utils/Bytes.sol around lines 71 to 92, the inline assembly loop
both compares the wrong byte and can mload out-of-bounds when i == length;
update the for-condition to check i < length before doing any mload, and change
the byte comparison to extract and compare the correct byte (mask the loaded
word to 0xff or shift so the byte under test is compared to search) instead of
xor+shr that looks at the opposite byte; ensure you only perform mload after the
bounds check so no OOB loads occur.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: replacing iszero(shr(248, xor(mload(add(ptr, i)), search))) with iszero(eq(byte(0, mload(add(ptr, i))), search))) implies that the data in search is in the leftmost 8 bits as opposed to the rightmost 8 bits. Need to check (this is a bytes1, so maybe its the correct approach)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

You're absolutely correct! Here's the analysis:

  1. bytes1 values are left-aligned in Solidity - they occupy the leftmost (most significant) byte of a 32-byte word
  2. byte(0, x) extracts the leftmost byte from the word
  3. mload(add(ptr, i)) loads 32 bytes where the byte at position i becomes the leftmost byte

So my suggested fix byte(0, mload(add(ptr, i))) == search is correct - both operands have their data in the leftmost position.

The original code's shr(248, ...) was also targeting the leftmost byte (shifting it to the rightmost position), which confirms that the intention was always to compare the leftmost bytes.

However, I notice there's a syntax error in my suggestion - missing closing parenthesis. Here's the corrected version:

assembly ("memory-safe") {
    let ptr := add(add(buffer, 0x20), offset)
    // Iterate while in-bounds; break on first non-match.
    for { i := 0 } lt(i, length) { i := add(i, 1) } {
        if iszero(eq(byte(0, mload(add(ptr, i))), search)) { break }
    }
}

The key insight is that both byte(0, mload(add(ptr, i))) and search (bytes1) have their significant data in the leftmost 8 bits, making the comparison correct.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
contracts/utils/Base58.sol (1)

32-38: Make the leading-zero loop condition explicit (readability and safety)

The current condition lt(byte(...), lt(inputLeadingZeros, inputLength)) is clever but cryptic. Replace with a direct conjunction to avoid maintenance hazards.

-            for {} lt(byte(0, mload(add(add(input, 0x20), inputLeadingZeros))), lt(inputLeadingZeros, inputLength)) {} {
+            for {} and(
+                iszero(byte(0, mload(add(add(input, 0x20), inputLeadingZeros)))),
+                lt(inputLeadingZeros, inputLength)
+            ) {} {
                 inputLeadingZeros := add(inputLeadingZeros, 1)
             }
🧹 Nitpick comments (4)
contracts/utils/Base58.sol (4)

196-198: Use explicit conjunction when trimming compacted leading zeros (readability)

Same pattern as above; use and(iszero(...), lt(...)) for clarity.

-            for {} lt(byte(0, mload(output)), lt(output, scratch)) {} {
+            for {} and(iszero(byte(0, mload(output))), lt(output, scratch)) {} {
                 output := add(output, 1)
             }

73-83: Nit: fix typos in comments

Minor typos reduce readability.

-            // - `output` point the the left part of the encoded string. We start from scratch, which means we have
+            // - `output` points to the left part of the encoded string. We start from scratch, which means we have
             //   outputLengthEstim bytes to work with before hitting the FMP
             for {
                 let data := scratch
                 output := scratch
             } 1 {} {
-                // move past the fisrt (highest) zero limbs.
+                // Move past the first (highest) zero limbs.

43-45: Nit: clarify scratch workspace comment

Grammar/clarity.

-            // This is going to be our "scratch" workspace. We leave enough room after FMP to later store length + encoded output.
+            // This is our "scratch" workspace. We leave enough room after the free memory pointer (FMP) to later store length + encoded output.

143-145: Nit: fix typo in decode scratch comment

-            // This is going to be our "scratch" workspace. Be leave enough room on the left to store length + encoded input.
+            // This is our "scratch" workspace. We leave enough room on the left to store length + decoded output.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1736f38 and 8652d20.

📒 Files selected for processing (1)
  • contracts/utils/Base58.sol (1 hunks)
🔇 Additional comments (5)
contracts/utils/Base58.sol (5)

64-68: Sanity check: table addressing matches mload(carry) trick

The encoding table layout at 0x1f/0x3f correctly ensures mload(carry)'s low byte is the target character. No action needed.

Also applies to: 100-103


146-168: Decode lookup table and bitmask validation look correct

  • The 3-word table enables byte(0, mload(c)) to fetch the base58 index efficiently.
  • The bitmask gate on shl(c, 1) rejects invalid chars and reverts with InvalidBase58Digit.

LGTM.


200-205: Efficient zero-prefix fill via calldatacopy is fine

Using calldatacopy(..., calldatasize(), n) to write n zero bytes is a known gas optimization; safe here.


28-31: API surface and empty-input fast paths look good

encode/decode wrappers and early returns for empty inputs are appropriate.

Also applies to: 123-128


105-116: Bug: prefixing leading '1's overwrites output; write to the left of output

The current loop writes 32 bytes at output, clobbering the first encoded character(s). You want to fill to the left of output in 32-byte chunks.

Apply:

-            for {
-                let i := 0
-            } lt(i, inputLeadingZeros) {} {
-                i := add(i, 0x20)
-                mstore(sub(output, i), "11111111111111111111111111111111")
-            }
+            for { let i := 0 } lt(i, inputLeadingZeros) { i := add(i, 0x20) } {
+                // Write 32 '1' bytes ending at output - i.
+                mstore(sub(output, add(i, 0x20)), 0x3131313131313131313131313131313131313131313131313131313131313131)
+            }

No other changes needed: the subsequent pointer shift by add(inputLeadingZeros, 0x20) correctly accounts for both the written prefix and the length slot.

Likely an incorrect or invalid review comment.

Comment on lines +91 to +99
let i := content
} lt(i, ptr) {
i := add(i, 0x20)
} {
let acc := add(shl(248, carry), mload(i))
mstore(i, div(acc, 58))
carry := mod(acc, 58)
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical: undefined identifier "content" in encode loop causes compilation failure

The inner loop initializes i := content, but content is never defined. This prevents compilation and should be data (the current highest non-empty limb).

Apply:

-                for {
-                    let i := content
-                } lt(i, ptr) {
+                for {
+                    let i := data
+                } lt(i, ptr) {
                     i := add(i, 0x20)
                 } {
                     let acc := add(shl(248, carry), mload(i))
                     mstore(i, div(acc, 58))
                     carry := mod(acc, 58)
                 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let i := content
} lt(i, ptr) {
i := add(i, 0x20)
} {
let acc := add(shl(248, carry), mload(i))
mstore(i, div(acc, 58))
carry := mod(acc, 58)
}
// Inner loop: divide each 248-bit limb by 58, carrying the remainder
for {
let i := data
} lt(i, ptr) {
i := add(i, 0x20)
} {
let acc := add(shl(248, carry), mload(i))
mstore(i, div(acc, 58))
carry := mod(acc, 58)
}
🤖 Prompt for AI Agents
In contracts/utils/Base58.sol around lines 91 to 99, the inner loop initializes
i := content but there is no identifier named content; change the initialization
to i := data (the current highest non-empty limb) so the loop references the
defined variable, leaving the rest of the loop logic intact; ensure data is in
scope at this point and rebuild to confirm compilation succeeds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider adding a Base58 library
2 participants