fix(test): regenerate corrupted db fixtures and fix grpc server startup race #437

Merged
kariy merged 4 commits into main from fix/corrupted-db-fixtures
Feb 26, 2026


@kariy kariy commented Feb 25, 2026

The katana-grpc tests in CI have been consistently failing. The root cause turned out to be two separate bugs, one masking the other.

Bug 1: Corrupted database fixtures

The generate_migration_db binary that produces the spawn_and_move.tar.gz and simple.tar.gz fixture archives was creating the database via Db::in_memory(), which uses SyncMode::UtterlyNoSync, a mode that does not guarantee committed data is flushed to disk. The binary then archived the database files while the MDBX environment was still open and the node was still running, capturing an inconsistent on-disk state. This manifested as:

thread 'test_get_state_update' panicked at crates/utils/src/node.rs:99:63:
failed to open database: failed to open db environment: MDBX_CORRUPTED: Database is corrupted

The fix switches generate_migration_db to use a persistent database directory with SyncMode::Durable, stops the node before archiving, and excludes the non-portable mdbx.lck lock file from archives. The fixture archives have been regenerated on Linux x86_64 inside the ghcr.io/dojoengine/katana-dev:latest Docker container.
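As a rough illustration of the "exclude mdbx.lck" step, here is a minimal sketch using only the Rust standard library. It is not the code in generate_migration_db: the helper name `files_to_archive` is hypothetical, and it assumes the database directory is flat and that the node has already been stopped so the files on disk are consistent.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Name of the MDBX lock file. It is tied to the machine and process that
/// created it, so it should not be shipped inside a fixture archive.
const LOCK_FILE: &str = "mdbx.lck";

/// Hypothetical helper (not from the PR): list the regular files in `db_dir`
/// that belong in a fixture archive, skipping the lock file.
/// Assumes the database environment has been closed before this is called.
fn files_to_archive(db_dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(db_dir)? {
        let path = entry?.path();
        let is_lock = path.file_name() == Some(OsStr::new(LOCK_FILE));
        if path.is_file() && !is_lock {
            files.push(path);
        }
    }
    // Sort so the archive contents are listed in a deterministic order.
    files.sort();
    Ok(files)
}
```

The returned paths would then be handed to whatever tool builds the .tar.gz; the sketch only covers the filtering.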

Two regression tests (open_spawn_and_move_db_fixture, open_simple_db_fixture) verify the fixtures can be opened without corruption:

# Against OLD fixtures (from main):
        FAIL    katana-utils node::tests::open_simple_db_fixture
        FAIL    katana-utils node::tests::open_spawn_and_move_db_fixture
        fixture database is corrupted: MDBX_CORRUPTED: Database is corrupted

# Against NEW fixtures (regenerated with the fix):
        PASS    katana-utils node::tests::open_simple_db_fixture
        PASS    katana-utils node::tests::open_spawn_and_move_db_fixture

Bug 2: gRPC server startup race condition

Once the database corruption was fixed, the gRPC tests still failed with Connection refused. The gRPC server's start() method spawned serve_with_shutdown on a background task and returned immediately, before the TCP port was bound, so the test's GrpcClient::connect() could fire before the server was listening. The bug had always been latent, but the MDBX_CORRUPTED error masked it by preventing the node from starting at all.

The fix binds the TcpListener eagerly before spawning the server task, then passes it via serve_with_incoming_shutdown. This guarantees the server is accepting connections when start() returns and correctly resolves port 0 to the actual assigned port — matching how the RPC server already works.
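The bind-before-spawn pattern can be sketched with the standard library alone. This is an illustration under stated assumptions, not the katana implementation: it uses std::net and a plain thread in place of the async runtime and serve_with_incoming_shutdown, and the `start` function here is a hypothetical stand-in for the server's start() method.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};
use std::thread;

/// Hypothetical stand-in for a server `start()`: bind first, then spawn.
fn start(addr: &str) -> io::Result<SocketAddr> {
    // Bind eagerly. Once this returns, the OS queues incoming connections
    // even if the accept loop below has not been scheduled yet.
    let listener = TcpListener::bind(addr)?;

    // Resolve port 0 to the port the OS actually assigned.
    let local_addr = listener.local_addr()?;

    // Hand the already-bound listener to the background task.
    thread::spawn(move || {
        for stream in listener.incoming() {
            // A real server would serve the connection; this sketch drops it.
            drop(stream);
        }
    });

    Ok(local_addr)
}
```

Because the listener is bound before `start` returns, a client that connects to the returned address right away does not race the background task. Had the bind happened inside the spawned closure, that connect could be refused.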

kariy and others added 3 commits February 25, 2026 17:01
Use a persistent database (SyncMode::Durable) instead of in-memory
(SyncMode::UtterlyNoSync) for generating test DB snapshots, and
properly stop the node before archiving.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The gRPC server was spawning `serve_with_shutdown` on a background task
and returning immediately, creating a race condition where clients could
attempt to connect before the server had bound its port. This was masked
by the MDBX_CORRUPTED error but became visible once the database fixtures
were fixed.

Bind the TCP listener eagerly before spawning the server task, then pass
the already-bound listener via `serve_with_incoming_shutdown`. This
guarantees the server is accepting connections when `start()` returns and
also correctly resolves port 0 to the actual assigned port.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kariy kariy changed the title from "fix(test): regenerate corrupted database fixtures" to "fix(test): regenerate corrupted db fixtures and fix grpc server startup race" Feb 26, 2026
@kariy kariy merged commit 8b71ea9 into main Feb 26, 2026
8 of 9 checks passed
@kariy kariy deleted the fix/corrupted-db-fixtures branch February 26, 2026 21:18