
irmin-pack: error unknown inode key #2276

@metanivek

Description


Tezos is experiencing a periodic issue on its testnet (it has not been observed on mainnet):

```
Sep 25 20:38:10.105:   Error:
Sep 25 20:38:10.105:     {"Direct":["CoUnviMUq75Qy4i43Y18zSYf9TTTrCyrMHRyif4hWQLwA3EjD5kz",22250291624,360]}: unknown inode key (find_value)
Sep 25 20:38:10.105:
Sep 25 20:38:10.106: worker crashed [validator-peer]:
Sep 25 20:38:10.106:   Error:
Sep 25 20:38:10.106:     {"Direct":["CoUnviMUq75Qy4i43Y18zSYf9TTTrCyrMHRyif4hWQLwA3EjD5kz",22250291624,360]}: unknown inode key (find_value)
Sep 25 20:38:10.106:
Sep 25 20:38:10.120: fetching branch of 3 blocks from peer idtw8Zm1g1bUJEPyqDP6tC8EvjFv4u
Sep 25 20:38:10.120: validation of block BLEzAQFH6kzPu3Q9wKpCK68UscAhfMQwGZHLrK7e3NSuzdwaHDr
Sep 25 20:38:10.120:   failed but precheck succeeded,
Sep 25 20:38:10.120:   Request pushed on 2023-09-25T20:38:10.120-00:00, treated in 4.988us, completed in 77.456us:
Sep 25 20:38:10.120:   {"Direct":["CoUnviMUq75Qy4i43Y18zSYf9TTTrCyrMHRyif4hWQLwA3EjD5kz",22250291624,360]}: unknown inode key (find_value)
Sep 25 20:38:10.120: unexpected error while fetching headers:
Sep 25 20:38:10.120:   {"Direct":["CoUnviMUq75Qy4i43Y18zSYf9TTTrCyrMHRyif4hWQLwA3EjD5kz",22250291624,360]}: unknown inode key (find_value)
Sep 25 20:38:10.122: request
Sep 25 20:38:10.122:   New branch BLm8Dx1TwWBN7DA2YfiULePTR7z5BaVAFDnvcqvLshe3GySoAeZ, locator length 423475
Sep 25 20:38:10.122:   failed
Sep 25 20:38:10.122:   (Request pushed on 2023-09-25T20:38:10.119-00:00, treated in 253us, completed in 2.519ms):
```
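Each failure logs the same serialized direct key. A small sketch of extracting the hash, pack-file offset, and length from such a log line, assuming only the JSON shape visible in the logs above (the variable names are illustrative, not irmin-pack API):

```python
import json

# The failing key exactly as it appears in the logs: a "Direct" key
# carrying (hash, offset in the pack file, length in bytes).
line = '{"Direct":["CoUnviMUq75Qy4i43Y18zSYf9TTTrCyrMHRyif4hWQLwA3EjD5kz",22250291624,360]}'

key = json.loads(line)
kind, (hash_, offset, length) = next(iter(key.items()))
print(kind, hash_, offset, length)
```

This confirms the object the node failed to find sits at offset 22250291624 and spans 360 bytes.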

We do not yet have reproduction steps, but here is what we know:

  1. Has only been observed on testnet
  2. Happens roughly at the same time across nodes that are run by the foundation
  3. These nodes regularly export snapshots for network backup
  4. It is unclear whether the error is coming from the read-write instance or the read-only instance
  5. Running irmin 3.7.2

When this occurred earlier this year, we guessed it might be related to GC and read-only instances not properly clearing the LRU. This was fixed in #2200.

We think there could be a bug in reading data that has not yet been flushed to disk, since the append-only file does not allow reading beyond the last flushed offset. In principle this should never happen: higher-level code in pack_store prevents it via the staging hash table (although that code could use some inspection). Having the node's control file from the time of the error would help us disprove this idea, since we could compare the last flushed offset recorded in the control file against the offset in the logged key.
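The check described above reduces to a single comparison. A minimal sketch, assuming the control file's last flushed offset were available (`key_is_beyond_flushed` is a hypothetical helper, not irmin-pack API):

```python
def key_is_beyond_flushed(key_offset: int, key_length: int,
                          flushed_offset: int) -> bool:
    """Return True if the object referenced by the key lies (at least
    partly) past the last offset flushed to the append-only pack file.
    `flushed_offset` would be read from the control file."""
    return key_offset + key_length > flushed_offset

# With the logged key (offset 22250291624, length 360): if the control
# file's flushed offset were at or below the key's offset, the read
# would target unflushed data and could explain the error.
print(key_is_beyond_flushed(22250291624, 360, 22250291624))  # True
print(key_is_beyond_flushed(22250291624, 360, 22250292000))  # False
```

If the control file showed a flushed offset well past 22250291984 (offset + length), this hypothesis would be ruled out.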

Labels: tezos-support (Support for bugs related to Tezos)