
Improve crash handling #35

@Kleissner

I read the code and documentation and wanted to ask whether there is a specific reason why you discard the old index files and always recreate them. It sounds like a dangerous and expensive default, especially in production environments.

From the documentation: "In the event of a crash caused by a power loss or an operating system failure, Pogreb discards the index and replays the WAL building a new index from scratch. Segments are iterated from the oldest to the newest and items are inserted into the index."
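To make the cost of this concrete, here is a minimal sketch of what an oldest-to-newest replay looks like. It is not Pogreb's actual code: the `record`, `segment`, and `location` types and the `rebuildIndex` function are hypothetical, and the index is simplified to an in-memory map from key to record location.

```go
package main

import "fmt"

// record is a hypothetical WAL entry; only the key and its offset matter here.
type record struct {
	key    string
	offset int64
}

// segment is a hypothetical WAL segment holding records in write order.
type segment struct {
	id      int
	records []record
}

// location says where the latest version of a key lives.
type location struct {
	segmentID int
	offset    int64
}

// rebuildIndex replays every record of every segment into a fresh index.
// It assumes segs is already sorted from oldest to newest, so a later
// write to the same key overrides the earlier one.
func rebuildIndex(segs []segment) map[string]location {
	index := make(map[string]location)
	for _, seg := range segs {
		for _, r := range seg.records {
			index[r.key] = location{segmentID: seg.id, offset: r.offset}
		}
	}
	return index
}

func main() {
	segs := []segment{
		{id: 1, records: []record{{key: "a", offset: 0}, {key: "b", offset: 64}}},
		{id: 2, records: []record{{key: "a", offset: 0}}},
	}
	index := rebuildIndex(segs)
	fmt.Println(len(index), index["a"].segmentID, index["b"].segmentID)
}
```

The point is that the work is proportional to the total number of records ever written, not to the amount of data that was in flight at the time of the crash.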

My use case is to store billions of key-value pairs. If I read the code correctly, any time the process crashes for any reason, the lock file will be detected and cause Pogreb to discard the index files (*.pix). My current estimated indexing time is 8 days, and the index will likely be hundreds of GB. Having any reboot or crash trigger a reindex of hundreds of GB and days of work doesn't make sense. Possible solutions:

  1. Add a new Options.ReindexOnCrash that lets the user specify whether Pogreb should try to re-open the existing index (false) or immediately reindex everything (true); or instead:
  2. Introduce Options.AutoReindexCorruptDatabase, which triggers a reindex only if openIndex returns an error. The lock file would no longer be used for crash detection, and Pogreb would always try to open the existing database first.

I believe the second option makes the most sense. After a crash, most if not all users assume the database will just pick up where it left off, especially in production environments.
