Cloud custodian can't count empty or unused S3 buckets #10491
Unanswered
amckeown-blc
asked this question in
AWS
Replies: 0 comments
A Cloud Custodian policy does not seem able to identify empty or unused S3 buckets based on CloudWatch metrics. We run an "unused after 90 days" policy for all our resources, and S3 is the one giving us trouble.
If we run a similar Python script on Lambda instead of a Cloud Custodian policy, it does pick out buckets that are empty or unused. The policy below returns just []:
{
  "execution-options": {},
  "policies": [
    {
      "name": "s3-unused-90days",
      "resource": "s3",
      "filters": [
        {
          "type": "value",
          "key": "Location.LocationConstraint",
          "op": "in",
          "value": [
            "eu-west-2",
            "ap-southeast-2",
            null
          ]
        },
        {
          "type": "metrics",
          "name": "BucketSizeBytes",
          "namespace": "AWS/S3",
          "statistics": "Average",
          "period": 86400,
          "days": 90,
          "value": 0,
          "op": "eq",
          "dimensions": {
            "StorageType": "StandardStorage"
          }
        },
        {
          "type": "metrics",
          "name": "NumberOfObjects",
          "namespace": "AWS/S3",
          "statistics": "Average",
          "period": 86400,
          "days": 90,
          "value": 0,
          "op": "eq",
          "dimensions": {
            "StorageType": "AllStorageTypes"
          }
        }
      ],
      "mode": {
        "type": "periodic",
        "schedule": "rate(1 day)",
        "role": "arn:aws:iam::*:role/CloudCustodianRole",
        "timeout": 900,
        "execution-options": {
          "output_dir": "s3://custodian-output-bucket-blc/",
          "output_format": "jsonlines"
        },
        "tags": {
          "custodian-info": "mode=periodic:version=0.9.48"
        }
      }
    }
  ]
}
But the Python works fine:
import boto3
import json
import gzip
import io
from datetime import datetime, timedelta
# Configuration
THRESHOLD_DAYS = 90
TARGET_REGIONS = ["eu-west-2", "ap-southeast-2"]
OUTPUT_BUCKET = "custodian-output-bucket-blc"
OUTPUT_PREFIX = "s3-unused-python"
s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")
def lambda_handler(event, context):
    now = datetime.utcnow()
    start = now - timedelta(days=THRESHOLD_DAYS)
Apparently
Cloud Custodian returns an empty list ([]) while your custom Python Lambda finds many unused S3 buckets because of a known limitation in how S3 CloudWatch metrics work:
AWS does not emit any datapoints for BucketSizeBytes or NumberOfObjects on truly empty buckets (or buckets with zero size/activity).
When Cloud Custodian's metrics filter queries CloudWatch and gets no datapoints, it treats this as "no match", so the bucket is excluded even though the absence of datapoints actually indicates an empty bucket.
Your Lambda explicitly falls back to list_objects_v2 (checking KeyCount) when metrics are unavailable or inaccessible, which correctly identifies empty buckets. Cloud Custodian does not do this fallback natively.
This behavior is documented in multiple open Cloud Custodian GitHub issues (e.g., #3798, #363, #1790) and has been a longstanding request — there is still no built-in "empty" or "has-objects" filter for S3 that reliably handles the no-datapoints case.
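The fallback described above could be sketched roughly as follows. This is not the Lambda from the question (which is truncated here) and not anything Cloud Custodian ships; the helper names `latest_object_count` and `is_bucket_empty` are hypothetical, and the sketch assumes the two clients are created in the bucket's own region, e.g. `boto3.client("s3", region_name=...)` and `boto3.client("cloudwatch", region_name=...)`.

```python
from datetime import datetime, timedelta, timezone


def latest_object_count(cloudwatch, bucket, days=90):
    """Return the newest NumberOfObjects datapoint, or None if CloudWatch has none."""
    end = datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/S3",
        MetricName="NumberOfObjects",
        Dimensions=[
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": "AllStorageTypes"},
        ],
        StartTime=end - timedelta(days=days),
        EndTime=end,
        Period=86400,  # S3 storage metrics are reported once per day
        Statistics=["Average"],
    )
    points = resp.get("Datapoints", [])
    if not points:
        # No datapoints at all: the case the metrics filter treats as "no match".
        return None
    return max(points, key=lambda p: p["Timestamp"])["Average"]


def is_bucket_empty(s3, cloudwatch, bucket, days=90):
    """Treat 'no datapoints' or a zero count as 'maybe empty', then confirm by listing."""
    count = latest_object_count(cloudwatch, bucket, days)
    if count is not None and count > 0:
        return False
    # Fallback: ask S3 for at most one key. KeyCount == 0 means no current objects.
    # Assumption: noncurrent versions in a versioned bucket are not considered here.
    resp = s3.list_objects_v2(Bucket=bucket, MaxKeys=1)
    return resp.get("KeyCount", 0) == 0
```

Listing with `MaxKeys=1` keeps the fallback to a single cheap request per bucket. Because the storage metrics lag by about a day, a bucket emptied very recently may still report a non-zero count and be skipped until the next run.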