Description
Problem
Externalized transactions (>32KB) are found in Aerospike with no corresponding file on disk. The Aerospike record references an external file that no longer exists, causing read failures when the transaction is accessed.
Root Cause
The deletion ordering in flushCleanupBatches (stores/utxo/aerospike/pruner/pruner_service.go:1338-1361) deletes external files before Aerospike records:
    // Phase 3: External file deletion (runs FIRST)
    if len(externalFiles) > 0 {
        if err := s.executeBatchExternalFileDeletions(ctx, externalFiles); err != nil {
            return err // Safe: files failed, records untouched
        }
    }

    // Phase 2b: Aerospike record deletion (runs SECOND)
    if len(deletions) > 0 {
        if err := s.executeBatchDeletions(ctx, deletions); err != nil {
            return err // PROBLEM: files already gone, records remain
        }
    }

If file deletion succeeds but the subsequent Aerospike batch delete fails (cluster overload, network partition, timeout), the records remain pointing to files that no longer exist.
While this should self-heal on the next pruner run, persistent Aerospike errors create a lasting inconsistency.
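The failure mode can be reproduced in a toy model. The sketch below is not the pruner code: `toyStore`, `prune`, and `danglingRecords` are hypothetical names, and two in-memory maps stand in for Aerospike and the external file store. It injects a failure into the record-deletion step and runs the two steps in either order, so the resulting state can be compared.

```go
package main

import "errors"

// toyStore is a hypothetical stand-in for the real stores: each record key
// references an external file of the same name.
type toyStore struct {
	records map[string]bool
	files   map[string]bool
}

func newToyStore(keys ...string) *toyStore {
	s := &toyStore{records: map[string]bool{}, files: map[string]bool{}}
	for _, k := range keys {
		s.records[k] = true
		s.files[k] = true
	}
	return s
}

var errRecordDelete = errors.New("simulated record batch delete failure")

// prune removes files and records for keys. filesFirst selects the ordering;
// failRecords makes the record-deletion step fail before deleting anything.
func (s *toyStore) prune(keys []string, filesFirst, failRecords bool) error {
	deleteFiles := func() error {
		for _, k := range keys {
			delete(s.files, k)
		}
		return nil
	}
	deleteRecords := func() error {
		if failRecords {
			return errRecordDelete
		}
		for _, k := range keys {
			delete(s.records, k)
		}
		return nil
	}

	steps := []func() error{deleteRecords, deleteFiles}
	if filesFirst {
		steps = []func() error{deleteFiles, deleteRecords}
	}
	for _, step := range steps {
		if err := step(); err != nil {
			return err // later steps never run
		}
	}
	return nil
}

// danglingRecords counts records whose external file no longer exists.
func (s *toyStore) danglingRecords() int {
	n := 0
	for k := range s.records {
		if !s.files[k] {
			n++
		}
	}
	return n
}
```

With `filesFirst=true` and `failRecords=true`, every pruned key ends up as a dangling record. With `filesFirst=false` the same failure leaves both maps untouched.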
Contributing factors
- Context cancellation mid-batch (pruner_service.go:1559-1566): executeBatchExternalFileDeletions checks ctx.Done() between each file deletion. If the context is cancelled (shutdown, timeout) after some files are deleted but before all are processed, flushCleanupBatches returns early and Aerospike records are never deleted for the already-removed files.
- TTL expiration mode (pruner_service.go:1484-1491): When utxoSetTTL=true, records aren't deleted; they get TTL=1s. The record lingers until Aerospike evicts it, creating a window where the file is gone but the record still resolves in queries.
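The context-cancellation case can be illustrated with a minimal sketch. It assumes only what is described above: a loop that checks `ctx.Done()` between file deletions. The function names and the `remove` callback are hypothetical, not the pruner's actual signatures.

```go
package main

import "context"

// deleteFilesUntilCancelled mimics a deletion loop that checks ctx.Done()
// before each file. It returns the files removed before it stopped.
func deleteFilesUntilCancelled(ctx context.Context, files []string, remove func(string) error) ([]string, error) {
	deleted := make([]string, 0, len(files))
	for _, f := range files {
		select {
		case <-ctx.Done():
			// Files in `deleted` are already gone; their records are not.
			return deleted, ctx.Err()
		default:
		}
		if err := remove(f); err != nil {
			return deleted, err
		}
		deleted = append(deleted, f)
	}
	return deleted, nil
}

// cancelAfterN runs the loop and cancels the context once n files have been
// removed, simulating a shutdown or timeout arriving mid-batch.
func cancelAfterN(n int, files []string) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	removed := 0
	remove := func(string) error {
		removed++
		if removed == n {
			cancel()
		}
		return nil
	}
	return deleteFilesUntilCancelled(ctx, files, remove)
}
```

Calling `cancelAfterN(2, ...)` on four files returns two deleted names together with a cancellation error. If the caller returns on that error, as described for flushCleanupBatches, the records for those two files are never touched.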
Proposed Fix
Reverse the ordering in flushCleanupBatches — delete Aerospike records first, then external files:
    // Delete Aerospike records FIRST
    if !s.settings.Pruner.SkipDeletions {
        if len(deletions) > 0 {
            if err := s.executeBatchDeletions(ctx, deletions); err != nil {
                return err // Records and files both still exist: consistent
            }
        }
    }

    // Delete external files SECOND
    if len(externalFiles) > 0 {
        if err := s.executeBatchExternalFileDeletions(ctx, externalFiles); err != nil {
            return err // Records gone, orphaned files on disk: harmless
        }
    }

Trade-off: If Aerospike deletion succeeds but file deletion fails, files are orphaned on disk (wasted space). This is strictly better than the current behavior where records point to missing files (read failures). Orphaned files can be cleaned up with a background scan.
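The background scan mentioned in the trade-off could look roughly like the following. This is a sketch under stated assumptions, not an existing component: it assumes external files live in a single flat directory and that the set of file names still referenced by Aerospike records can be obtained somehow (the `referenced` map here). `listFiles` and `orphanedFiles` are hypothetical helpers.

```go
package main

import (
	"os"
	"sort"
)

// listFiles returns the names of regular files directly inside dir.
func listFiles(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		if e.Type().IsRegular() {
			names = append(names, e.Name())
		}
	}
	return names, nil
}

// orphanedFiles returns, in sorted order, the on-disk names that no record
// references. referenced is assumed to hold the file names still pointed to
// by live records.
func orphanedFiles(onDisk []string, referenced map[string]bool) []string {
	orphans := []string{}
	for _, name := range onDisk {
		if !referenced[name] {
			orphans = append(orphans, name)
		}
	}
	sort.Strings(orphans)
	return orphans
}
```

A real scan would presumably also need to skip recently written files, so that a file whose record has not been created yet is not mistaken for an orphan.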
Impact
- Severity: Medium-high — causes read failures for affected transactions
- Frequency: Proportional to Aerospike error rate during pruning and shutdown frequency during pruner batch operations
- Services affected: Any service reading externalized transactions (asset server, block persister, validator)