I read the code and documentation and wanted to ask whether there is a specific reason for discarding the old index files and always recreating them. It sounds like a dangerous and expensive default, especially for production environments.
> In the event of a crash caused by a power loss or an operating system failure, Pogreb discards the index and replays the WAL building a new index from scratch. Segments are iterated from the oldest to the newest and items are inserted into the index.
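For reference, this is how I understand that replay, as a minimal Go sketch. The `record`, `segment`, and `location` types and the `rebuildIndex` function are hypothetical names of mine, not Pogreb's actual types or on-disk format, and the sketch assumes the segments are already decoded and sorted oldest first.

```go
package main

// record is a hypothetical WAL entry; Pogreb's real format differs.
type record struct {
	key     string
	deleted bool
}

// segment is a hypothetical, already-decoded WAL segment.
type segment struct {
	id      int
	records []record
}

// location says where the newest version of a key lives.
type location struct {
	segmentID int
	offset    int
}

// rebuildIndex replays segments from oldest to newest, so a later
// write or delete of a key overrides an earlier one.
// It assumes segs is sorted by age, oldest first.
func rebuildIndex(segs []segment) map[string]location {
	index := make(map[string]location)
	for _, seg := range segs {
		for off, rec := range seg.records {
			if rec.deleted {
				delete(index, rec.key)
				continue
			}
			index[rec.key] = location{segmentID: seg.id, offset: off}
		}
	}
	return index
}
```

Because every record in every segment is visited, the cost of a rebuild grows with the total amount of data written, not with the amount that changed since the last clean shutdown.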
My use case is to store billions of key-value pairs. If I read the code correctly, any time the process crashes for any reason, the lock file will be detected and cause Pogreb to discard the index files (*.pix). My current estimate for indexing is 8 days and likely hundreds of GB. Having any reboot or crash trigger a reindex of hundreds of GB and days of work doesn't make sense. Possible solutions:
- New `Options.ReindexOnCrash` to allow the user to specify whether (on false) it should try to re-open, or (on true) immediately reindex everything; or instead:
- Introduce `Options.AutoReindexCorruptDatabase`, which triggers a reindex only in case `openIndex` returns an error. The lock file will be disregarded for crash detection and it will always try to open the existing database.
I believe the second option makes the most sense. After a crash, most if not all users assume the database will just pick up where it left off, especially in production environments.
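To make the difference between the two proposals concrete, here is a rough sketch of the decision each one would make when opening a database. Everything in it is hypothetical: `RecoveryOptions`, `recoveryAction`, and both functions are illustration only, and neither option exists in Pogreb today. Returning a failure when the index cannot be opened and the option is off is my own assumption about what should happen in that case.

```go
package main

// RecoveryOptions is hypothetical; neither field exists in Pogreb today.
type RecoveryOptions struct {
	// ReindexOnCrash is proposal 1.
	ReindexOnCrash bool
	// AutoReindexCorruptDatabase is proposal 2.
	AutoReindexCorruptDatabase bool
}

type recoveryAction int

const (
	actionUseExistingIndex recoveryAction = iota
	actionReindex
	actionFail
)

// decideReindexOnCrash models proposal 1: the lock file remains the
// crash signal, and the option decides whether a crash forces a rebuild.
func decideReindexOnCrash(opts RecoveryOptions, lockFileFound bool) recoveryAction {
	if lockFileFound && opts.ReindexOnCrash {
		return actionReindex
	}
	return actionUseExistingIndex
}

// decideAutoReindex models proposal 2: the lock file is ignored and only
// a failure to open the existing index (openErr != nil) can trigger a rebuild.
func decideAutoReindex(opts RecoveryOptions, openErr error) recoveryAction {
	if openErr == nil {
		return actionUseExistingIndex
	}
	if opts.AutoReindexCorruptDatabase {
		return actionReindex
	}
	// Assumption: without the option, surface the error to the caller.
	return actionFail
}
```

With the second function, a crash that leaves the index readable costs nothing extra at startup, and the multi-day rebuild only happens when the index genuinely cannot be opened.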