Secondary instance iteration stops prematurely (~26k keys) on FSx/XFS despite 300M+ keys in Primary #14429

@theolivenbaum

Description

We're running a Primary/Secondary setup on AWS FSx (XFS). The secondary instance opens the database successfully but fails to iterate through the full dataset. In one column family containing more than 300M keys, iteration on the secondary sees only about 26,000 keys and then stops. No errors are reported in the RocksDB logs.

Environment

  • RocksDB Version: 10.4.2
  • OS: Amazon Linux / Ubuntu (AWS EC2)
  • File System: AWS FSx (XFS)

Expected behavior

The secondary instance should be able to iterate through all 300M+ keys present in the SST files, provided it has caught up with the manifest.

Actual behavior

The iterator terminates early. The LOG file shows no IO Error or Corruption. However, the following warning is present:
[WARN] [table/block_based/block_based_table_reader.cc:909] At least one SST file opened without unique ID to verify: 4511745.sst

Additional context

The Primary instance works correctly and can see all data. We suspect an issue with how the Secondary instance refreshes its view of the SST files on the FSx/XFS file system. We tried adding a call to catch up with the primary before iterating, but that didn't fix the issue.

Is it possible the Secondary is failing to see new SST files or is encountering a silent issue when loading the manifest?
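
One check that may help narrow this down: when a RocksDB iterator's `Valid()` returns false, that alone does not distinguish "end of data" from "stopped on an error", so the iterator's `status()` has to be inspected after the loop. Below is a minimal sketch of such a scan. `ScanResult`, `CountKeys`, `FakeStatus`, and `FakeIterator` are hypothetical names introduced here for illustration. The fake types only mimic the `SeekToFirst`/`Valid`/`Next`/`status` shape of `rocksdb::Iterator` so the sketch compiles without RocksDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Outcome of one full scan: number of keys seen, and whether the
// iterator stopped cleanly or because of an error.
struct ScanResult {
  std::uint64_t keys = 0;
  bool ok = true;
  std::string error;
};

// Accepts any iterator exposing SeekToFirst/Valid/Next/status, which is
// the shape of rocksdb::Iterator. status() is checked after the loop
// because Valid() == false is returned both at end of data and on error.
template <typename Iter>
ScanResult CountKeys(Iter& it) {
  ScanResult r;
  for (it.SeekToFirst(); it.Valid(); it.Next()) {
    ++r.keys;
  }
  const auto s = it.status();
  r.ok = s.ok();
  if (!r.ok) r.error = s.ToString();
  return r;
}

// Hypothetical stand-ins, used only to exercise CountKeys without RocksDB.
struct FakeStatus {
  std::string msg;  // empty means OK
  bool ok() const { return msg.empty(); }
  std::string ToString() const { return ok() ? "OK" : msg; }
};

class FakeIterator {
 public:
  // fail_at: position at which the iterator becomes invalid with an
  // error. Pass a value >= keys.size() for a clean scan.
  FakeIterator(std::vector<std::string> keys, std::size_t fail_at)
      : keys_(std::move(keys)), fail_at_(fail_at) {}

  void SeekToFirst() { pos_ = 0; }
  bool Valid() const { return pos_ < keys_.size() && pos_ < fail_at_; }
  void Next() {
    assert(Valid());
    ++pos_;
  }
  const std::string& key() const {
    assert(Valid());
    return keys_[pos_];
  }
  // Reports an error only if the scan stopped before the last key.
  FakeStatus status() const {
    if (fail_at_ < keys_.size() && pos_ >= fail_at_) {
      return FakeStatus{"IO error: simulated"};
    }
    return FakeStatus{};
  }

 private:
  std::vector<std::string> keys_;
  std::size_t fail_at_;
  std::size_t pos_ = 0;
};
```

Against a real secondary instance, the same function could presumably be called as `CountKeys(*it)`, where `it` is the iterator returned by `NewIterator()` for the affected column family after `TryCatchUpWithPrimary()`. If `ok` is true while the count is still around 26k, the early stop is probably not an iterator-level error, which would point back at the secondary's view of the manifest and SST files.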
