Skip to content

Commit 76a7b8f

Browse files
Agustin Henzeagustinhenze
authored andcommitted
Fix concurrent publish operations causing missing package files
When multiple repository operations execute concurrently on shared pool directories, race conditions could cause .deb files to be deleted despite appearing in repository metadata, resulting in apt 404 errors. Three distinct but related race conditions were identified and fixed: 1. Package addition vs publish race: When packages are added to a local repository that is already published, the publish operation could read stale package references before the add transaction commits. Fixed by locking all published repositories that reference the local repo during package addition. 2. Pool file deletion race: When multiple published repositories share the same pool directory (same storage+prefix) and publish concurrently, cleanup operations could delete each other's newly created files. The cleanup in thread B would: - Query database for referenced files (not seeing thread A's uncommitted files) - Scan pool directory (seeing thread A's files) - Delete thread A's files as "orphaned" Fixed by implementing pool-sibling locking: acquire locks on ALL published repositories sharing the same storage and prefix before publish/cleanup. 3. Concurrent cleanup on same prefix: Multiple distributions publishing to the same prefix concurrently could have cleanup operations delete shared files. Fixed by: - Adding prefix-level locking to serialize cleanup operations - Removing ref subtraction that incorrectly marked shared files as orphaned - Forcing database reload before cleanup to see recent commits The existing task system serializes operations based on resource locks, preventing these race conditions when proper lock sets are acquired. Test coverage includes concurrent publish scenarios that reliably reproduced all three bugs before the fixes.
1 parent ba65daf commit 76a7b8f

File tree

5 files changed

+825
-23
lines changed

5 files changed

+825
-23
lines changed

api/publish.go

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -465,7 +465,18 @@ func apiPublishUpdateSwitch(c *gin.Context) {
465465
published.MultiDist = *b.MultiDist
466466
}
467467

468-
resources := []string{string(published.Key())}
468+
// Lock the pool prefix and all published repos sharing it to prevent cleanup race conditions.
469+
// Without prefix-level locking, concurrent publishes to different distributions under the
470+
// same prefix can race during cleanup, causing one operation to delete the other's files.
471+
poolLockKey := fmt.Sprintf("pool:%s:%s", storage, prefix)
472+
resources := []string{poolLockKey, string(published.Key())}
473+
poolSiblings := collection.ByStoragePrefix(storage, prefix)
474+
for _, sibling := range poolSiblings {
475+
if sibling.UUID != published.UUID {
476+
resources = append(resources, string(sibling.Key()))
477+
}
478+
}
479+
469480
taskName := fmt.Sprintf("Update published %s repository %s/%s", published.SourceKind, published.StoragePrefix(), published.Distribution)
470481
maybeRunTaskInBackground(c, taskName, resources, func(out aptly.Progress, _ *task.Detail) (*task.ProcessReturnValue, error) {
471482
err = collection.LoadComplete(published, collectionFactory)
@@ -1024,7 +1035,18 @@ func apiPublishUpdate(c *gin.Context) {
10241035
published.MultiDist = *b.MultiDist
10251036
}
10261037

1027-
resources := []string{string(published.Key())}
1038+
// Lock the pool prefix and all published repos sharing it to prevent cleanup race conditions.
1039+
// Without prefix-level locking, concurrent publishes to different distributions under the
1040+
// same prefix can race during cleanup, causing one operation to delete the other's files.
1041+
poolLockKey := fmt.Sprintf("pool:%s:%s", storage, prefix)
1042+
resources := []string{poolLockKey, string(published.Key())}
1043+
poolSiblings := collection.ByStoragePrefix(storage, prefix)
1044+
for _, sibling := range poolSiblings {
1045+
if sibling.UUID != published.UUID {
1046+
resources = append(resources, string(sibling.Key()))
1047+
}
1048+
}
1049+
10281050
taskName := fmt.Sprintf("Update published %s repository %s/%s", published.SourceKind, published.StoragePrefix(), published.Distribution)
10291051
maybeRunTaskInBackground(c, taskName, resources, func(out aptly.Progress, _ *task.Detail) (*task.ProcessReturnValue, error) {
10301052
result, err := published.Update(collectionFactory, out)

0 commit comments

Comments
 (0)