Skip to content

Conversation

@tim-barry
Copy link
Contributor

@tim-barry tim-barry commented Sep 12, 2025

Updates the consensus follower only to use Pebble database as a backend.
Also includes some minor refactors and cleanup.

Summary by CodeRabbit

  • New Features

    • Switched consensus storage engine for improved reliability and performance.
    • Added graceful shutdown to ensure the database closes cleanly.
  • Bug Fixes

    • Rosetta transaction preprocessing now validates responses and fails fast on invalid data.
    • Clearer error messages when decoding seed-node keys.
  • Improvements

    • Enhanced database logging context for better observability.
  • Chores

    • Updated dependencies to compatible releases.
    • Makefile formatting cleanups (no functional changes).

@coderabbitai
Copy link

coderabbitai bot commented Sep 12, 2025

Walkthrough

Replaced Badger consensus storage with Pebble in state initialization, updated protocol DB wiring, added Pebble logging context and graceful shutdown handler, and adjusted seed-node key decode logs. Standardized Rosetta network_identifier usage and added preprocess response validation. Updated dependencies in go.mod; Makefile received indentation-only formatting updates.

Changes

Cohort / File(s) Summary
Consensus storage backend migration
state/state.go
Replaced Badger with Pebble: updated imports, created zerolog pebble context logger, opened DB via pebble.SafeOpen, converted to protocol DB via pebbleimpl.ToDB(db), added process.SetExitHandler to close DB on shutdown, and revised seed-node key decode error messages. Removed Badger-specific wiring/options.
Rosetta helpers and validation
rosetta_handler.py
Added module-level network_identifier = {"blockchain":"flow","network":"localnet"} and refactored helper calls to use it. preprocess_transaction now requires "options" in Rosetta response and raises RuntimeError(result) if missing. No public signatures changed.
Dependency updates
go.mod
Moved github.com/dgraph-io/badger/v2 to indirect; upgraded onflow/flow-go to v0.43.0-rc.2 and flow/protobuf/go/flow to v0.4.15; bumped Flow core contracts/templates and Flow NFT versions to newer releases; removed go-ds-badger2 indirect.
Build script formatting
Makefile
Whitespace/indentation normalization across targets and ensured trailing newline; no behavioral changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Node
  participant Pebble as Pebble DB
  participant Logger
  participant ProtocolDB

  rect rgb(235, 245, 255)
    Note right of Node: Startup
    Node->>Logger: Create zerolog context ("pebbledb":"consensusFollower")
    Node->>Pebble: SafeOpen(path, Logger)
    Pebble-->>Node: db handle or error
    Node->>ProtocolDB: pebbleimpl.ToDB(db)
  end

  rect rgb(245, 235, 255)
    Note right of Node: Shutdown
    Node-->>Node: process.SetExitHandler(...)
    Node->>Pebble: Close()
    Pebble-->>Node: close status (log on error)
  end
Loading
sequenceDiagram
  autonumber
  participant Client
  participant Rosetta as Rosetta Handler
  participant RosettaAPI as Rosetta API

  rect rgb(235, 245, 255)
    Note over Rosetta: Centralized network_identifier
    Client->>Rosetta: preprocess_transaction(req)
    Rosetta->>RosettaAPI: /construction/preprocess (network_identifier)
    RosettaAPI-->>Rosetta: result
    alt result has "options"
      Rosetta-->>Client: options
    else missing "options"
      Rosetta-->>Client: RuntimeError(result)
    end
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • zhangchiqing
  • jordanschalm
  • peterargue

Poem

A pebble I found in the data stream,
Smooth as consensus, swift as a dream.
Badger sleeps while new logs gleam—
Network set, options checked, we team.
I thump and ship the code with vim and beam. 🐇💾

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Update Rosetta to use new storage layer and Pebble backend" accurately and concisely describes the primary change in the diff—replacing the Badger-backed consensus DB with a Pebble-backed storage layer and accompanying refactors (see state/state.go and go.mod updates). It is specific, focused on the main change, and contains no noisy file lists or vague terms, so a reviewer scanning PR history will understand the intent.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch tim/badger-to-pebble

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (10)
cache/cache.go (4)

51-53: Clarify concurrency around DropAll or guard it.

Calling DropAll on the underlying Badger DB while reads/writes go through s.db may race with active readers/writers. Either:

  • Serialize DropAll with a Store-level mutex that also guards InterceptUnary writes, or
  • Document and enforce that DropAll is only called during quiescent periods.

Would you like a small mutex-based guard patch?


89-96: Tighten log messages for cache lookup errors.

These errors occur during cache key hashing/lookup, not “decoding.” Adjust wording for accurate diagnostics.

Apply:

-		log.Error().Msgf("Failed to encode the gRPC request for caching: %s", err)
+		log.Error().Msgf("Failed to encode the gRPC request for cache key: %s", err)
@@
-		log.Error().Msgf("Failed to hash the gRPC request for caching: %s", err)
+		log.Error().Msgf("Failed to hash the gRPC request for cache key: %s", err)
@@
-		log.Error().Msgf("Got unexpected error when decoding gRPC response for caching: %s", err)
+		log.Error().Msgf("Unexpected error when retrieving cached gRPC response: %s", err)

Also applies to: 123-125


66-79: Metrics attributes not updated after mutation of attrs.

You append error_response to attrs after creating mOpt, but don’t rebuild mOpt; the metric tags won’t include error_response.

Apply:

-	mOpt := metric.WithAttributeSet(attrSet)
+	mOpt := metric.WithAttributeSet(attrSet)
@@
-		attrs = append(attrs, trace.Bool("error_response", true))
+		attrs = append(attrs, trace.Bool("error_response", true))
+		// refresh metrics option with updated attributes
+		mOpt = metric.WithAttributeSet(attribute.NewSet(attrs...))

Also applies to: 131-135


149-154: Consider adding a small TTL-based guard for hot keys.

If the same request is hammered concurrently, multiple misses can force redundant invocations before the first write lands. A short-lived negative/positive memoization (or singleflight) would reduce thundering herd.

Want a singleflight-based patch?

state/state.go (2)

449-459: Fix typos in user-facing logs (“Acccess”).

Minor, but noisy in ops.

Apply:

-							"Unverifiable seal found in block %x at height %d: got seal for block %x via Acccess API, found seal for block %x via consensus",
+							"Unverifiable seal found in block %x at height %d: got seal for block %x via Access API, found seal for block %x via consensus",
@@
-							"Unverifiable execution result for block %x found in block %x at height %d: got %x via Acccess API, got %x via consensus",
+							"Unverifiable execution result for block %x found in block %x at height %d: got %x via Access API, got %x via consensus",

739-753: Check HTTP status codes in download.

Currently treats any response as success; handle non-2xx explicitly.

Apply:

-		resp, err := httpClient.Do(req)
+		resp, err := httpClient.Do(req)
 		if err != nil {
 			log.Error().Msgf("Failed to download %s: %s", src, err)
 			time.Sleep(time.Second)
 			continue
 		}
+		if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+			_ = resp.Body.Close()
+			log.Error().Msgf("Failed to download %s: unexpected HTTP status %d", src, resp.StatusCode)
+			time.Sleep(time.Second)
+			continue
+		}
indexdb/indexdb.go (4)

282-292: Return (false, nil) on not-found in HasBalance.

Use a not-found to indicate absence rather than an error; improves API ergonomics.

Apply:

-	_, err := seeker.SeekLE(key[:9], key)
-	if err != nil {
+	_, err := seeker.SeekLE(key[:9], key)
+	if err != nil {
+		if err == storage.ErrNotFound {
+			return false, nil
+		}
 		return false, fmt.Errorf(
 			"indexdb: failed to check balance existence for account %x at height %d: %s",
 			acct, height, err,
 		)
-	}
-	return true, nil
+	}
+	return true, nil

683-696: Don’t swallow unexpected seek errors in balanceByHeight.

Treat non-not-found errors as real errors; only default to 0 on not-found.

Apply:

-	latestAsOfHeight, err := s.db.Reader().NewSeeker().SeekLE(key[:9], key)
-	if err == nil {
+	latestAsOfHeight, err := s.db.Reader().NewSeeker().SeekLE(key[:9], key)
+	if err == nil {
 		value, closer, err := s.db.Reader().Get(latestAsOfHeight)
 		if err != nil {
 			return 0, fmt.Errorf(
 				"indexdb: failed to get balance for account %x at height %d: %s",
 				acct, height, err,
 			)
 		}
 		defer closer.Close()
 		balance = binary.BigEndian.Uint64(value)
-	}
+	} else if err != storage.ErrNotFound {
+		return 0, fmt.Errorf("indexdb: failed to seek balance for account %x at height %d: %s", acct, height, err)
+	}

719-727: Guard against nil genesis in withParentInfo.

If s.genesis hasn’t been populated yet, this dereferences nil. Prefer using s.Genesis() or check for nil.

Proposed:

-	if height != s.genesis.Height {
+	gen := s.genesis
+	if gen == nil {
+		gen = s.Genesis()
+		if gen == nil {
+			return nil, fmt.Errorf("indexdb: genesis not initialized")
+		}
+	}
+	if height != gen.Height {

591-602: Stream deletions in ResetTo to reduce memory.

Collecting and sorting all keys can balloon memory on large datasets. Consider iterating and deleting in batches (fixed-size buffer) without a full slice, or commit multiple smaller batches.

I can draft a streaming-delete version if helpful.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71bd8c2 and f08396e.

📒 Files selected for processing (3)
  • cache/cache.go (7 hunks)
  • indexdb/indexdb.go (22 hunks)
  • state/state.go (25 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
state/state.go (4)
config/config.go (3)
  • Consensus (91-97)
  • Chain (23-42)
  • Contracts (82-87)
log/log.go (3)
  • Error (67-69)
  • Warn (94-96)
  • Info (84-86)
cache/cache.go (1)
  • Store (50-53)
indexdb/indexdb.go (1)
  • Store (66-71)
indexdb/indexdb.go (3)
log/log.go (3)
  • Errorf (72-74)
  • Info (84-86)
  • Error (67-69)
cache/cache.go (1)
  • Store (50-53)
process/process.go (1)
  • SetExitHandler (41-46)
cache/cache.go (4)
indexdb/indexdb.go (1)
  • Store (66-71)
log/log.go (3)
  • Error (67-69)
  • Info (84-86)
  • Badger (21-23)
trace/trace.go (2)
  • EndSpanOk (87-90)
  • NewSpan (175-177)
process/process.go (1)
  • SetExitHandler (41-46)
🪛 GitHub Check: test
cache/cache.go

[failure] 6-6:
"errors" imported and not used


[failure] 185-185:
cannot use db (variable of type *"github.com/dgraph-io/badger/v3".DB) as *"github.com/dgraph-io/badger/v2".DB value in argument to badgerimpl.ToDB

🪛 GitHub Actions: CI
cache/cache.go

[error] 6-6: go test -v github.com/onflow/rosetta/state/... failed: cache/cache.go:6:2: 'errors' imported and not used

🔇 Additional comments (2)
cache/cache.go (1)

173-187: Log consistency and cleanup on close look good.

Using zerolog and setting the Badger logger via rosettalog.Badger is consistent with the rest of the PR.

indexdb/indexdb.go (1)

737-751: Good Pebble wiring and graceful shutdown.

SafeOpen + exit handler mirrors cache/index patterns; using pebbleimpl.ToDB is consistent.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
state/state.go (2)

641-652: Consensus Pebble DB: SafeOpen + exit handler looks good.

Nice—this ensures WAL flush and clean shutdown.


668-673: Use the seed node public key in error logs (don’t log raw bytes).

Finish aligning the message/arg to the public key string for consistency and clearer diagnostics.

-			log.Fatalf("Failed to decode the seed node key %q: %s", rawkey, err)
+			log.Fatalf("Failed to decode the seed node public key %q: %s", node.PublicKey, err)
🧹 Nitpick comments (4)
rosetta_handler.py (2)

14-17: Parameterize the network identifier instead of hard-coding "localnet".

Hard-coding increases the chance of mismatches when pointing the CLI at testnet/mainnet nodes.

Apply:

- network_identifier = {
-     "blockchain": "flow",
-     "network": "localnet",
- }
+ network_identifier = {
+     "blockchain": "flow",
+     "network": os.getenv("ROSETTA_NETWORK", "localnet"),
+ }

Also add:

- import json
+ import json
+ import os

40-43: Strengthen preprocess error handling for Rosetta errors.

Server errors come back as JSON; surface the message/code for easier debugging.

-    result = request_router(target_url, data)
-    if "options" not in result:
-        raise RuntimeError(result)
+    result = request_router(target_url, data)
+    if "options" not in result or result.get("options") is None:
+        msg = result.get("message") or result
+        code = result.get("code")
+        detail = f"{code}: {msg}" if code else f"{msg}"
+        raise RuntimeError(f"preprocess failed: {detail}")
go.mod (1)

265-266: Clean up comment clutter on dependency lines.

The extra historical versions in the comment are noisy; keep a single “// indirect”.

-	github.com/onflow/flow-core-contracts/lib/go/contracts v1.7.4-0.20250911124549-394bdee30b81 // indirect; v1.2.4-0.20230703193002-53362441b57d // indirect
-	github.com/onflow/flow-core-contracts/lib/go/templates v1.7.2-0.20250911124549-394bdee30b81 // indirect; v1.2.3 // indirect
+	github.com/onflow/flow-core-contracts/lib/go/contracts v1.7.4-0.20250911124549-394bdee30b81 // indirect
+	github.com/onflow/flow-core-contracts/lib/go/templates v1.7.2-0.20250911124549-394bdee30b81 // indirect
state/state.go (1)

641-643: Optional: enrich Pebble logger context with DB path.

Helps correlate logs across nodes.

-	dbLog := zerolog.With().Str("pebbledb", "consensusFollower").Logger()
+	dbLog := zerolog.With().
+		Str("pebbledb", "consensusFollower").
+		Str("path", dbDir).
+		Logger()
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f08396e and c4c43a5.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • Makefile (4 hunks)
  • go.mod (3 hunks)
  • rosetta_handler.py (6 hunks)
  • state/state.go (3 hunks)
✅ Files skipped from review due to trivial changes (1)
  • Makefile
🧰 Additional context used
🧬 Code graph analysis (1)
state/state.go (2)
log/log.go (3)
  • Fatalf (78-81)
  • Infof (89-91)
  • Errorf (72-74)
process/process.go (1)
  • SetExitHandler (41-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (3)
rosetta_handler.py (1)

14-17: Centralizing network_identifier usage — LGTM; verification blocked

  • Nice consolidation; reduces duplication and drift.
  • Verification failed: ripgrep returned "No files were searched", so I cannot confirm there are no remaining inlined network identifiers.

Run and paste outputs from the repo root:

rg -nP --hidden -uu -C0 '"network_identifier"\s*:\s*\{' --glob '!**/vendor/**'
git grep -n -- '"network_identifier"\s*:\s*\{'

Also applies to: 31-31, 50-50, 60-60, 73-73, 101-101

go.mod (2)

16-17: Flow-Go upgrades: confirm API compatibility of follower/storage.
go mod download succeeded and the module graph resolves github.com/onflow/flow-go@v0.43.0-rc.1 and github.com/onflow/flow/protobuf/go/flow@v0.4.15. follower/storage/pebble imports appear in state/state.go and state/convert.go. Run go build ./... (or CI) to confirm these imports compile; watch for mixed protobuf versions in the graph (v0.4.15, v0.4.11, v0.4.3) which can cause generated-protobuf type conflicts.


51-51: Confirm intended badger version split: both cache and indexdb import v3; v2 is only indirect.

  • Findings: indexdb/indexdb.go and cache/cache.go import github.com/dgraph-io/badger/v3. go mod graph shows github.com/dgraph-io/badger/v2@v2.2007.4 is pulled in indirectly (via github.com/coinbase/rosetta-sdk-go and github.com/onflow/flow-go).
  • Action: if storage-layer must stay on v2, change indexdb to the v2 import and adjust code; otherwise accept both but ensure strict separation (no shared DB files or cross-version type usage).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
go.mod (2)

16-17: Release-candidate deps for Flow: confirm stability or pin to a stable tag.

github.com/onflow/flow-go v0.43.0-rc.2 and github.com/onflow/flow/protobuf/go/flow v0.4.15 look fine for testing but RCs can shift APIs. Prefer stable if available, or gate rollout.

If a stable v0.43.x exists now, consider:

-	github.com/onflow/flow-go v0.43.0-rc.2
+	github.com/onflow/flow-go v0.43.0

Otherwise, confirm you’ve smoke-tested Rosetta sync against mainnet/testnet with this RC.


265-266: Clean up confusing inline comment with a second version tag.

These lines include an extra version reference in the comment; it’s harmless but noisy and can confuse reviewers.

Apply:

-	github.com/onflow/flow-core-contracts/lib/go/contracts v1.8.0 // indirect; v1.2.4-0.20230703193002-53362441b57d // indirect
+	github.com/onflow/flow-core-contracts/lib/go/contracts v1.8.0 // indirect
-	github.com/onflow/flow-core-contracts/lib/go/templates v1.8.0 // indirect; v1.2.3 // indirect
+	github.com/onflow/flow-core-contracts/lib/go/templates v1.8.0 // indirect
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c4c43a5 and 8207d57.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (1)
  • go.mod (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (4)
go.mod (4)

72-72: LGTM: Upgrading Flow NFT templates to v1.3.0 aligns with the Pebble migration.


269-269: LGTM: Flow NFT contracts bumped to v1.3.0.


51-51: Badger v3 is still imported — do NOT remove it

indexdb/indexdb.go:16 and cache/cache.go:9 import github.com/dgraph-io/badger/v3; removing the v3 require will break the code. If migrating to Pebble, refactor those imports first; run go mod graph to find which dependency pulls badger/v2 and address it accordingly.

Likely an incorrect or invalid review comment.


3-3: Fix go directive in go.mod — format is valid; ensure compatibility

The go directive can include patch versions (e.g. go 1.23.0); the issue is compatibility — Go toolchains refuse to use modules that declare a newer Go version, so set the go line to a version supported by your CI and that satisfies flow-go v0.43.0-rc.2. (tip.golang.org)

File: go.mod (line 3)

Likely an incorrect or invalid review comment.

Copy link
Member

@zhangchiqing zhangchiqing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

@tim-barry tim-barry merged commit 87d6a7e into main Sep 16, 2025
1 check passed
@tim-barry tim-barry deleted the tim/badger-to-pebble branch September 16, 2025 10:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants