[Infrastructure] Default Cost Safeguards & Hardening for Runner Environments #296

@igor-soldev

Hello ForgeMT team! 👋

Since one of the core promises of ForgeMT is "Cost Optimization" alongside "Enterprise-grade Security", I decided to run the Terraform modules through InfraScan, an open-source tool my team is developing to audit infrastructure-as-code against strict cloud security and FinOps best practices. We are building a community around secure-by-default public cloud infrastructure, so feel free to join us.

The codebase scored a very solid "B" (90%), which is impressive for a project of this scale. However, the scan highlighted a few systemic patterns in the modules where the out-of-the-box baseline could be tightened:

1. Missing AWS Budgets across modules (Cost Safety)
Throughout the modules/infra/* and modules/integrations/* directories, there are many discrete AWS provider configurations, but none of them deploy an aws_budgets_budget resource by default.

  • Impact: In a multi-tenant CI/CD environment, runaway runner auto-scaling (e.g., due to a broken workflow triggering infinite loops) can result in massive, unexpected EC2/EKS bills.
  • Suggestion: Providing a default or optional budget alert integrated into the Control Plane would act as a critical safety net for platform administrators.
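As a rough sketch of what an optional budget alert could look like (the variable names, budget name, default limit, and 80% threshold are placeholders I made up for illustration, not values taken from the ForgeMT modules):

```hcl
# Hypothetical inputs; none of these names exist in the current modules.
variable "enable_budget_alert" {
  type        = bool
  default     = false
  description = "Opt in to a monthly cost budget with an e-mail alert."
}

variable "monthly_budget_usd" {
  type        = number
  default     = 1000
  description = "Monthly cost limit in USD (placeholder default)."
}

variable "budget_alert_emails" {
  type        = list(string)
  description = "Addresses notified when the forecast crosses the threshold."
}

resource "aws_budgets_budget" "runners_monthly" {
  # Only created when the operator opts in.
  count = var.enable_budget_alert ? 1 : 0

  name         = "forge-runners-monthly" # placeholder name
  budget_type  = "COST"
  limit_amount = tostring(var.monthly_budget_usd)
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Alert when forecasted spend exceeds 80% of the limit.
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "FORECASTED"
    subscriber_email_addresses = var.budget_alert_emails
  }
}
```

Keeping the resource behind a `count` toggle would leave existing deployments unchanged unless an administrator explicitly enables it.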

2. S3 Lifecycle Policies for Logs (Cost Hygiene)
Buckets intended for short-term data (e.g., s3_short_term in modules/infra/storage/main.tf or billing reports) could benefit from an explicit aws_s3_bucket_lifecycle_configuration to abort incomplete multipart uploads and age out old logs automatically, preventing creeping storage costs at scale.
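A minimal sketch of such a lifecycle configuration is below. I am assuming the bucket ID can be passed in as a variable (`short_term_bucket_id` is a hypothetical name), and the 7-day and 30-day windows are illustrative, not a recommendation tuned to ForgeMT's retention needs:

```hcl
# Hypothetical input; wire this to however the storage module exposes the bucket.
variable "short_term_bucket_id" {
  type        = string
  description = "ID of the short-term bucket to attach lifecycle rules to."
}

resource "aws_s3_bucket_lifecycle_configuration" "short_term" {
  bucket = var.short_term_bucket_id

  # Clean up multipart uploads that were started but never completed.
  rule {
    id     = "abort-incomplete-multipart-uploads"
    status = "Enabled"

    filter {} # empty filter: applies to every object in the bucket

    abort_incomplete_multipart_upload {
      days_after_initiation = 7 # illustrative value
    }
  }

  # Age out old objects such as logs.
  rule {
    id     = "expire-old-objects"
    status = "Enabled"

    filter {}

    expiration {
      days = 30 # illustrative value
    }
  }
}
```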

3. IAM Privilege Escalation Boundaries (Security)
The scan flagged a few IAM policies (like execution_role_policy in cloud_formation/main.tf and packer_support_for_forge_runners in forge_subscription/policies.tf) that use wildcards (*) for actions.

  • Impact: While often necessary for bootstrapping modules (like CloudFormation or Packer), tightening these scopes where possible reduces the blast radius if a tenant manages to escape the runner sandbox.
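To illustrate the kind of tightening I mean, here is a generic pattern, not a drop-in replacement for either flagged policy: I have not enumerated the actions those modules actually need, so the action list, the `CreatedBy` tag key, and the `packer` tag value below are all assumptions.

```hcl
# Sketch: replace a wildcard action such as "ec2:*" with an explicit list,
# and restrict destructive actions to instances carrying a known tag.
data "aws_iam_policy_document" "scoped_instance_control" {
  statement {
    sid    = "ManageTaggedInstancesOnly"
    effect = "Allow"

    # Assumed action list; the real set depends on what the module does.
    actions = [
      "ec2:StopInstances",
      "ec2:TerminateInstances",
    ]

    resources = ["arn:aws:ec2:*:*:instance/*"]

    # Only instances tagged CreatedBy=packer (hypothetical tag) can be touched.
    condition {
      test     = "StringEquals"
      variable = "aws:ResourceTag/CreatedBy"
      values   = ["packer"]
    }
  }
}
```

The rendered JSON is available as `data.aws_iam_policy_document.scoped_instance_control.json` and can be attached wherever the wildcard policy is attached today.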

If you'd like to trace these findings down to the specific files and modules, the full interactive report is available here:

👉 View Full InfraScan Report for forge

(Full disclosure: the link above is generated by our tool, but I manually reviewed the findings to ensure they align with the scale and isolation requirements of a CI/CD platform.)

If you agree that these baseline improvements add value to the platform's FinOps and security posture, I'd be happy to submit a PR to help address the S3 lifecycles or budget defaults. Let me know what you think!
