
Account cleanup fails when AWS Security Incident Response is enabled (SIR EventBridge rules) #101

@chrisns

Description

When AWS Security Incident Response (SIR) is enabled on accounts in the organization, it creates two EventBridge rules in every account across every managed region:

  • SIRGuardDutyRule
  • SIRSecurityHubRule

These rules are managed by the SIR service (ManagedBy: triage.security-ir.amazonaws.com) and cannot be deleted via the normal EventBridge DeleteRule API. Each rule also has a target (TargetID: security-ir) that similarly cannot be removed.
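To see which rules in an account are service-managed before running cleanup, one option is to check the ManagedBy field on each rule. The sketch below is only an illustration: it works on a list of dicts shaped like the "Rules" list of an EventBridge ListRules response (for example from boto3's `events.list_rules()`, which is an assumption about how the data would be fetched). The function name and the second sample rule are hypothetical.

```python
def service_managed_rules(rules):
    """Return (name, manager) pairs for rules that carry a ManagedBy field."""
    return [(r["Name"], r["ManagedBy"]) for r in rules if r.get("ManagedBy")]


# Sample data shaped like the "Rules" list of a ListRules response.
sample_rules = [
    {
        "Name": "SIRGuardDutyRule",
        "State": "ENABLED",
        "ManagedBy": "triage.security-ir.amazonaws.com",
        "EventBusName": "default",
    },
    # Hypothetical user-created rule with no ManagedBy field.
    {"Name": "my-custom-rule", "State": "ENABLED", "EventBusName": "default"},
]

for name, manager in service_managed_rules(sample_rules):
    print(f"{name} is managed by {manager}")
```

Any rule reported here is a candidate for a filter entry in the nuke config, since aws-nuke will otherwise try to delete it.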

During account cleanup, aws-nuke attempts to delete these rules, fails, and keeps retrying until the CodeBuild job times out. As a result every cleanup fails, leaving accounts permanently stuck in CleanUp status and draining the entire account pool.

Impact

In our deployment, 7 out of 9 pool accounts were stuck in CleanUp simultaneously, making the sandbox completely unavailable. The step function retries the CodeBuild job multiple times, but each attempt hits the same infinite loop and times out.

Root cause

The default nuke config at source/infrastructure/lib/components/config/nuke-config.yaml filters CloudWatchEventsRule for Control Tower rules only:

CloudWatchEventsRule:
  - property: Name
    type: glob
    value: aws-controltower-*
  - property: Name
    type: contains
    value: AWSControlTower

It does not filter the SIR-managed rules, and there is no CloudWatchEventsTarget filter section at all.
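The gap can be checked with a small script. This is a rough approximation of how glob, contains, and exact filters on the Name property behave, not aws-nuke's actual implementation. The function name and the Control Tower rule name are hypothetical.

```python
from fnmatch import fnmatchcase

# Name filters copied from the default CloudWatchEventsRule section.
DEFAULT_FILTERS = [
    {"type": "glob", "value": "aws-controltower-*"},
    {"type": "contains", "value": "AWSControlTower"},
]


def is_filtered(name, filters):
    """Approximate glob / contains / exact matching on a resource name."""
    for f in filters:
        kind, value = f["type"], f["value"]
        if kind == "glob" and fnmatchcase(name, value):
            return True
        if kind == "contains" and value in name:
            return True
        if kind == "exact" and name == value:
            return True
    return False


# "aws-controltower-example-rule" is a made-up name for illustration.
for rule in ("aws-controltower-example-rule", "SIRGuardDutyRule", "SIRSecurityHubRule"):
    outcome = "filtered (kept)" if is_filtered(rule, DEFAULT_FILTERS) else "not filtered (deletion attempted)"
    print(f"{rule}: {outcome}")
```

Under these assumptions neither SIR rule matches a default filter, so both end up in the deletion set.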

Suggested fix

Add filters for the SIR rules and their targets to the default nuke config:

CloudWatchEventsRule:
  - property: Name
    type: glob
    value: aws-controltower-*
  - property: Name
    type: contains
    value: AWSControlTower
  - property: Name
    type: exact
    value: SIRGuardDutyRule # managed by AWS Security Incident Response
  - property: Name
    type: exact
    value: SIRSecurityHubRule # managed by AWS Security Incident Response
CloudWatchEventsTarget:
  - property: Name
    type: exact
    value: SIRGuardDutyRule # targets on SIR-managed rules
  - property: Name
    type: exact
    value: SIRSecurityHubRule # targets on SIR-managed rules

Evidence from cleanup logs

aws-nuke reports 4 failed / 4 waiting resources in a loop until CodeBuild times out:

{"component":"libnuke","failed":4,"finished":0,"level":"info","msg":"Removal requested: 4 waiting, 4 failed, 470 skipped, 0 finished","time":"2026-02-25T12:32:17Z","waiting":4}
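A stalled run like this could be detected from the log stream instead of waiting for the CodeBuild timeout. The following is a minimal sketch, assuming the summary lines keep the JSON shape shown above. The function name and the `min_repeats` threshold are hypothetical.

```python
import json


def is_stalled(log_lines, min_repeats=3):
    """True if the last `min_repeats` summary lines are identical and report failures."""
    summaries = []
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON output
        if isinstance(entry, dict) and str(entry.get("msg", "")).startswith("Removal requested"):
            summaries.append((entry.get("waiting"), entry.get("failed"), entry.get("finished")))
    recent = summaries[-min_repeats:]
    if len(recent) < min_repeats:
        return False
    failed = recent[0][1]
    # Stalled: the counters stopped changing and at least one resource is failing.
    return bool(failed) and all(s == recent[0] for s in recent)


summary = (
    '{"component":"libnuke","failed":4,"finished":0,"level":"info",'
    '"msg":"Removal requested: 4 waiting, 4 failed, 470 skipped, 0 finished",'
    '"time":"2026-02-25T12:32:17Z","waiting":4}'
)
print(is_stalled([summary] * 3))
```

A check along these lines could fail the build early, but it only reports the symptom. The filter change is still what removes the cause.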

The failing resources (in both us-east-1 and us-west-2):

{"level":"info","msg":"failed","name":"Rule: SIRGuardDutyRule","owner":"us-east-1","prop:Name":"SIRGuardDutyRule","state":"failed","state_code":6,"type":"CloudWatchEventsRule"}
{"level":"info","msg":"failed","name":"Rule: SIRSecurityHubRule","owner":"us-east-1","prop:Name":"SIRSecurityHubRule","state":"failed","state_code":6,"type":"CloudWatchEventsRule"}
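To find out which names need filter entries, the failed resources can be pulled out of the log and grouped by type. This sketch assumes each failure line carries the `state`, `type`, and `prop:Name` fields seen above. The function name is hypothetical.

```python
import json
from collections import defaultdict


def failed_resources(log_lines):
    """Map resource type -> sorted unique names of resources in the 'failed' state."""
    by_type = defaultdict(set)
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON output
        if isinstance(entry, dict) and entry.get("state") == "failed" and "prop:Name" in entry:
            by_type[entry.get("type", "unknown")].add(entry["prop:Name"])
    return {rtype: sorted(names) for rtype, names in by_type.items()}


log_lines = [
    '{"level":"info","msg":"failed","name":"Rule: SIRGuardDutyRule","owner":"us-east-1",'
    '"prop:Name":"SIRGuardDutyRule","state":"failed","state_code":6,"type":"CloudWatchEventsRule"}',
    '{"level":"info","msg":"failed","name":"Rule: SIRSecurityHubRule","owner":"us-east-1",'
    '"prop:Name":"SIRSecurityHubRule","state":"failed","state_code":6,"type":"CloudWatchEventsRule"}',
]

for rtype, names in failed_resources(log_lines).items():
    print(rtype, names)
```

Names are deduplicated, so a rule that fails in both us-east-1 and us-west-2 appears once, which matches how the name-based filters apply across regions.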

Confirming these are SIR-managed and undeletable:

{
  "Name": "SIRGuardDutyRule",
  "State": "ENABLED",
  "ManagedBy": "triage.security-ir.amazonaws.com",
  "EventBusName": "default"
}

Workaround

We added the filters above to the nuke config through AppConfig (as a new hosted configuration version), which resolved the issue without redeploying the solution.

Environment

  • Version: v1.1 (container image public.ecr.aws/aws-solutions/innovation-sandbox-on-aws-account-cleaner:v1.1)
  • Region: us-west-2
  • Modified from published version: No (only AppConfig nuke config override)

Labels

enhancement (New feature or request)
