Skip to content

Conversation

JoeColeman95
Copy link
Contributor

@JoeColeman95 JoeColeman95 commented Sep 15, 2025

Adds configuratble Docker Garbage collection that addresses disk space management issues on both Linux and Windows instances with customization available. This change will default to the same behaviour as previously available, to prevent any unexpected changes for current users.

Key Features:

  • New CFN parameters allow customization of clean-up frequency and retention periods
  • Supported on both Linux and Windows
  • Allow flexibility over containers, images, networks and volumes
  • Replaces the old disk-based trigger system with time-based scheduling

CloudFormation Parameters Added:

  • DockerGCSchedule Configurable cleanup frequency (default: daily at 3 AM)
  • DockerGCPruneUntil Age threshold for cleanup (default: 1 hour)
  • DockerGCPruneImages Enable/disable image cleanup (default: true)
  • DockerGCPruneVolumes Enable/disable volume cleanup (default: false)

Fixes #465
Fixes #639

Stop hardcoding docker gc settings and let users pick what works
for them. Adds params for schedule, prune timing, and whether to
nuke images/volumes.
The time filter argument was removed from 'docker network prune' and 'docker volume prune' commands in both Linux and Windows cleanup scripts. This aligns the cleanup behavior and avoids passing unsupported or unnecessary filters to these commands.
@JoeColeman95 JoeColeman95 requested a review from a team as a code owner September 15, 2025 12:42
@JoeColeman95 JoeColeman95 dismissed petetomasik’s stale review September 15, 2025 15:21

Applied in a different script

scadu
scadu previously requested changes Sep 16, 2025
Copy link
Contributor

@scadu scadu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware that built-in Docker garbage collection is not available on Windows, but perhaps we could simplify Linux implementation and rely on Docker's garbage collection policies instead?

@JoeColeman95
Copy link
Contributor Author

I'm aware that built-in Docker garbage collection is not available on Windows, but perhaps we could simplify Linux implementation and rely on Docker's garbage collection policies instead?

@scadu this one is set up to trigger before a job runs too, so this can't be done with a policy as it's on-demand, it's an adaptation of what already existed, with the changes being made to enable customization. This triggers from the environment hook when the disk cleanup cannot reclaim enough space.

@JoeColeman95 JoeColeman95 requested a review from scadu September 17, 2025 08:36
@JoeColeman95 JoeColeman95 dismissed scadu’s stale review September 17, 2025 08:37

Doesn't fit in the to design

@JoeColeman95 JoeColeman95 requested a review from scadu September 17, 2025 10:51
Description: Systemd timer schedule for docker garbage collection (default is hourly)
Type: String
Default: "hourly"
AllowedPattern: "^(minutely|hourly|daily|weekly|monthly|yearly|\\*-\\*-\\* \\d{1,2}:\\d{2}:\\d{2}|\\d{1,2} \\d{1,2} \\* \\* \\*)$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script doesn't seem to handle cron expressions?

case "$DOCKER_GC_SCHEDULE" in
hourly | daily | weekly | monthly) ;;

/usr/local/bin/docker-gc:
exit-status: 0

systemctl is-enabled docker-low-disk-gc.timer:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add those back?

echo "$(date): Docker cleanup completed successfully"
EOF

sed -i "s/DOCKER_PRUNE_UNTIL_PLACEHOLDER/$DOCKER_GC_PRUNE_UNTIL/g" /usr/local/bin/docker-gc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sed -i "s/DOCKER_PRUNE_UNTIL_PLACEHOLDER/$DOCKER_GC_PRUNE_UNTIL/g" /usr/local/bin/docker-gc
DOCKER_GC_PRUNE_UNTIL="${DOCKER_GC_PRUNE_UNTIL:-4h}"
DOCKER_GC_PRUNE_IMAGES="${DOCKER_GC_PRUNE_IMAGES:-false}"
DOCKER_GC_PRUNE_VOLUMES="${DOCKER_GC_PRUNE_VOLUMES:-false}"

Do we need to sed those variables? Can we refer to vars instead?

$action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-ExecutionPolicy Bypass -File C:\buildkite-agent\bin\docker-gc.ps1 >> C:\buildkite-agent\elastic-stack.log 2>&1"
$settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -StartWhenAvailable

Write-Output "creating scheduled task"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but could be set up with retries like so:

Suggested change
Write-Output "creating scheduled task"
function Register-DockerGCTask {
param(
[string]$Schedule,
[int]$MaxRetries = 3
)
$retryCount = 0
while ($retryCount -lt $MaxRetries) {
try {
Register-ScheduledTask @taskParams -ErrorAction Stop
Write-Output "Task registered successfully"
return $true
}
catch {
$retryCount++
if ($retryCount -eq $MaxRetries) {
Write-Error "Failed to register task after $MaxRetries attempts: $_"
return $false
}
Write-Warning "Attempt $retryCount failed, retrying in 5 seconds..."
Start-Sleep -Seconds 5
}
}
}

# Check if this is a disk-space triggered cleanup or scheduled cleanup
FORCE_CLEANUP=${1:-""}
if [[ "$FORCE_CLEANUP" != "force" ]]; then
if /usr/local/bin/bk-check-disk-space.sh >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't disk space already checked?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a way of customizing DOCKER_PRUNE_UNTIL, DISK_MIN_AVAILABLE, and DISK_MIN_INODES for the cron job
3 participants