Skip to content

Conversation

@rudransh-shrivastava
Copy link
Collaborator

@rudransh-shrivastava rudransh-shrivastava commented Dec 2, 2025

Resolves #2771

Proposed change

  • Add lifecycle for untagged images.
  • Use boto3 for ECS: load_data_task.
  • Remove use of AWS-managed IAM policies (AmazonEC*).

Note: PR depends on another PR, needs to be rebased. Relevant commits start at 8159861 (including)

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@github-actions github-actions bot added docs Improvements or additions to documentation backend backend-tests makefile ci labels Dec 2, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Summary by CodeRabbit

  • Infrastructure Improvements
    • Enhanced state protection with object lock retention policies and accidental deletion safeguards
    • Added automatic cleanup for untagged container images after 7 days
    • Improved bucket naming consistency and uniqueness
    • Reorganized access control policies for better security posture
    • Increased task resource allocations for improved performance

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds random suffixes to S3 bucket names for state, logs, and fixtures; implements prevent_destroy lifecycle rules on DynamoDB and S3 buckets; adds S3 Object Lock with 30-day GOVERNANCE retention on state bucket; introduces ECR lifecycle policy for auto-expiring untagged images; replaces AWS-managed IAM policies with custom policies for ECS task execution and EventBridge; adds configurable image_tag variable for ECS tasks; converts data-loading task from AWS CLI to boto3; renames bucket-related variables across modules; updates backend configuration documentation.

Changes

Cohort / File(s) Summary
Documentation Updates
infrastructure/README.md
Reformats notes from hyphenated to bolded style; adds procedural steps for creating and populating terraform.tfbackend file; updates terraform init commands to reference backend config file; adds guidance on copying state bucket name and clearing buckets/ECR before teardown.
Backend Provider & Lock Configuration
infrastructure/backend/.terraform.lock.hcl, infrastructure/backend/main.tf
Adds random provider (v3.7.2); introduces random_id suffix resource for bucket naming; adds prevent_destroy lifecycle rules to DynamoDB state_lock table and S3 buckets; enables S3 Object Lock with 30-day GOVERNANCE retention on state bucket.
Backend Staging Configuration
infrastructure/staging/backend.tf, infrastructure/staging/terraform.tfbackend.example
Removes inline backend config parameters (bucket, dynamodb_table, region) from backend block; adds example backend config file with static backend settings.
Storage Module Main Configuration
infrastructure/modules/storage/main.tf, infrastructure/modules/storage/variables.tf, infrastructure/modules/storage/outputs.tf
Adds random provider and random_id suffix resource; updates bucket names to include random suffix; renames fixtures_s3_bucket to fixtures_bucket_name and zappa_s3_bucket to zappa_bucket_name; adds fixtures_s3_bucket_name output.
Storage S3 Bucket Submodule
infrastructure/modules/storage/modules/s3-bucket/main.tf, infrastructure/modules/storage/modules/s3-bucket/variables.tf
Removes force_destroy attribute and replaces with lifecycle prevent_destroy = true; removes force_destroy variable; updates variable descriptions for consistency.
ECS Module Configuration
infrastructure/modules/ecs/main.tf, infrastructure/modules/ecs/variables.tf
Adds ECR lifecycle policy for untagged images; replaces AWS-managed IAM policy attachments with custom policies for ECS task execution and EventBridge; updates task command from AWS CLI to boto3 for S3 data loading; adds image_tag variable; increases load_data_task CPU and memory defaults; renames fixtures_s3_bucket parameter to fixtures_bucket_name.
Staging Variables & Configuration
infrastructure/staging/variables.tf, infrastructure/staging/main.tf, infrastructure/staging/terraform.tfvars.example
Removes force_destroy_bucket variable; renames fixtures_s3_bucket to fixtures_bucket_name (default "owasp-nest-fixtures") and zappa_s3_bucket to zappa_bucket_name; updates module input references to use renamed variables and new storage module outputs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • infrastructure/modules/ecs/main.tf — IAM policy replacements (custom policies for ECS task execution and EventBridge) require careful verification that permissions are correctly scoped; boto3-based S3 download logic needs validation for correctness and error handling.
  • infrastructure/backend/main.tf — S3 Object Lock configuration with GOVERNANCE retention and prevent_destroy rules need careful review to ensure accidental deletion is properly prevented and recovery procedures are understood.
  • infrastructure/modules/storage/main.tf — Random suffix introduction for bucket naming affects multiple resources and module inputs; ensure suffix consistency across all bucket references and that IAM policies reference correct bucket ARNs.
  • Variable renaming across modules — fixtures_s3_bucket → fixtures_bucket_name and zappa_s3_bucket → zappa_bucket_name changes affect module contracts; verify all call sites and outputs are consistently updated.

Possibly related issues

  • Improve ecs Module #2771 — PR directly addresses all checklist items: hardcoded latest tag replaced with configurable image_tag variable, ECR lifecycle policy added for untagged images, boto3-based load_data_task implemented, and AWS-managed IAM policies replaced with custom policies.
  • Improve Zappa/Terraform Migration #2715 — PR implements comprehensive infrastructure improvements (random suffix for bucket naming, prevent_destroy rules, S3 Object Lock, custom IAM policies) that align with this issue's objectives.

Possibly related PRs

  • Add Remote State Management #2699 — Both PRs extend Terraform remote state backend configuration by adding random suffixes, object-locking, and prevent_destroy lifecycle rules to state infrastructure.
  • Fix long-running ECS/Fargate Tasks #2620 — Both PRs modify ECS task definitions and load_data flow, including S3 data-loading implementation and command structure changes.
  • Add logging in Infrastructure #2743 — Both PRs modify storage S3 bucket implementation, including bucket naming behavior, force_destroy removal, and lifecycle configuration.

Suggested labels

backend

Suggested reviewers

  • kasya
  • arkid15r

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR includes substantial out-of-scope changes beyond the ECS module improvements, including backend bucket naming, S3 object lock, infrastructure README updates, variable renames across modules, and storage module modifications. Separate out-of-scope changes into distinct PRs. Keep this PR focused on ECS module improvements only: ECR lifecycle policy, boto3 integration, and IAM policy changes.
Title check ❓ Inconclusive The title is vague and does not clearly specify which improvements are being made to the ECS module. Use a more descriptive title that mentions the primary changes, such as 'Add ECR lifecycle policy and boto3 integration to ECS module' or similar.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description clearly relates to the changeset, outlining the three main objectives addressed by the PR.
Linked Issues check ✅ Passed The PR addresses all key requirements from issue #2771: ECR lifecycle policy for untagged images, boto3 integration for load_data_task, and removal of AWS-managed IAM policies.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rudransh-shrivastava rudransh-shrivastava changed the base branch from main to feature/nest-zappa-migration December 2, 2025 15:23
@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 2, 2025

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
infrastructure/README.md (3)

58-75: Clarify placeholder for state bucket in backend configuration file.

The new backend configuration file workflow is helpful, but users need explicit guidance on retrieving the state bucket name. Line 68 instructs updating the state bucket name from "the previous step," but users should know this comes from the backend output (e.g., terraform output from infrastructure/backend/). Consider adding a step to explicitly run cd infrastructure/backend && terraform output or similar to retrieve the bucket name.

Also, verify that -backend-config=terraform.tfbackend correctly resolves relative paths from the staging/ directory during terraform init.


68-68: Fix markdown formatting for consistency.

Lines 68 and 70 use *Note:* (emphasis), but other notes in the file use **Note:** (bold). Please standardize to bold.

Also applies to: 70-70


183-183: Clarify the <id> placeholder in fixture bucket path.

The placeholder owasp-nest-fixtures-<id> should be documented. Add a note that <id> is the random suffix output from the backend setup step, or provide a command to retrieve it dynamically (e.g., terraform output fixtures_s3_bucket_name).

infrastructure/staging/terraform.tfbackend.example (1)

1-3: Placeholder token format should match documentation convention.

The ${STATE_BUCKET_NAME} placeholder is clear, but verify it aligns with documentation. If users reference the README instructions, ensure the placeholder name matches what's documented (e.g., if README says "Copy the state bucket name from output as STATE_BUCKET_NAME", this should match).

Also, consider whether region = "us-east-2" should be a placeholder (e.g., ${AWS_REGION}) if users might deploy to different regions, or if it's intentionally hardcoded to the staging region default.

infrastructure/modules/storage/modules/s3-bucket/main.tf (1)

38-45: Prevent_destroy is a breaking change—consider making it optional per module instance.

Removing force_destroy and adding lifecycle.prevent_destroy = true is a significant safeguard, but it's also a breaking change for module consumers. All S3 buckets created via this module (fixtures, zappa) now require manual intervention (terraform state rm or prevent_destroy = false) before destruction.

This may conflict with automation workflows that expect to cleanly destroy test environments. Consider adding a prevent_destroy_bucket variable (defaulting to true for safety) so consumers can opt-out per instance if needed:

variable "prevent_destroy_bucket" {
  type    = bool
  default = true
}

lifecycle {
  prevent_destroy = var.prevent_destroy_bucket
}

Alternatively, document this as a breaking change in release notes and require explicit code changes from consumers to upgrade.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e219f12 and 8159861.

📒 Files selected for processing (15)
  • infrastructure/README.md (6 hunks)
  • infrastructure/backend/.terraform.lock.hcl (1 hunks)
  • infrastructure/backend/main.tf (4 hunks)
  • infrastructure/modules/ecs/main.tf (10 hunks)
  • infrastructure/modules/ecs/variables.tf (2 hunks)
  • infrastructure/modules/storage/main.tf (3 hunks)
  • infrastructure/modules/storage/modules/s3-bucket/main.tf (1 hunks)
  • infrastructure/modules/storage/modules/s3-bucket/variables.tf (2 hunks)
  • infrastructure/modules/storage/outputs.tf (1 hunks)
  • infrastructure/modules/storage/variables.tf (1 hunks)
  • infrastructure/staging/backend.tf (1 hunks)
  • infrastructure/staging/main.tf (2 hunks)
  • infrastructure/staging/terraform.tfbackend.example (1 hunks)
  • infrastructure/staging/terraform.tfvars.example (0 hunks)
  • infrastructure/staging/variables.tf (2 hunks)
💤 Files with no reviewable changes (1)
  • infrastructure/staging/terraform.tfvars.example
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-11-08T11:16:25.725Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.

Applied to files:

  • infrastructure/staging/backend.tf
  • infrastructure/staging/terraform.tfbackend.example
  • infrastructure/staging/main.tf
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • infrastructure/staging/backend.tf
  • infrastructure/README.md
  • infrastructure/staging/terraform.tfbackend.example
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/staging/backend.tf
  • infrastructure/staging/terraform.tfbackend.example
  • infrastructure/backend/.terraform.lock.hcl
  • infrastructure/backend/main.tf
  • infrastructure/modules/storage/main.tf
📚 Learning: 2025-10-17T15:25:53.713Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/database/main.tf:22-60
Timestamp: 2025-10-17T15:25:53.713Z
Learning: The infrastructure code in the `infrastructure/` directory is intended for quick testing purposes only, not production-grade deployment. Production-grade security hardening controls (such as IAM database authentication, deletion protection, Performance Insights KMS encryption) are not required for this testing infrastructure.

Applied to files:

  • infrastructure/README.md
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/README.md
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/README.md
  • infrastructure/staging/variables.tf
  • infrastructure/modules/storage/main.tf
📚 Learning: 2025-10-17T15:25:55.624Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/providers.tf:1-3
Timestamp: 2025-10-17T15:25:55.624Z
Learning: The infrastructure code in the OWASP/Nest repository (infrastructure/ directory) is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (19)
infrastructure/backend/.terraform.lock.hcl (1)

27-45: Random provider lock file entry looks good.

The exact version pinning (3.7.2) aligns with your infrastructure standards. Since this is auto-maintained by Terraform, no manual changes needed here.

infrastructure/backend/main.tf (3)

53-55: Random suffix for bucket naming is a solid approach.

The 4-byte random suffix (8 hex characters) effectively prevents S3 bucket name collisions in shared AWS accounts. Since Terraform stores the random_id resource in state, the suffix will persist across apply cycles.


69-94: Verify prevent_destroy lifecycle is operationally acceptable.

Adding prevent_destroy = true to state, logs, and state_lock resources is excellent for protecting critical infrastructure. However, this creates a strong safeguard that requires manual intervention (terraform state rm or setting prevent_destroy = false) before any destruction is possible.

Ensure your team is aware of this requirement and that the README guidance (line 210) is highlighted during deployment. This is especially important if team members are unfamiliar with Terraform lifecycle rules.


138-147: Verify S3 Object Lock retention policy meets operational requirements.

The 30-day GOVERNANCE retention is appropriate for testing infrastructure, and GOVERNANCE mode (rather than COMPLIANCE) allows flexibility if needed. However, confirm:

  1. Does 30 days align with your state backup retention policy?
  2. Is GOVERNANCE mode (deletable with s3:BypassGovernanceRetention permission) the right choice, or should this be COMPLIANCE (immutable)?
  3. Object Lock applies only to objects created after this configuration is applied—existing state versions may not be protected.
infrastructure/staging/backend.tf (1)

2-4: Backend configuration split is clean and follows Terraform best practices.

Separating code-level defaults (encrypt, key) from environment-specific config (bucket, region) in terraform.tfbackend makes it easy to parameterize across environments without code changes. Users must remember the -backend-config flag during terraform init, which is documented in the README.

infrastructure/modules/storage/variables.tf (1)

12-12: Variable renames are breaking changes—verify all consumers are updated.

The renames fixtures_s3_bucket → fixtures_bucket_name and zappa_s3_bucket → zappa_bucket_name are semantic improvements but constitute breaking changes for any code calling the storage module.

Verify that all consumers (e.g., infrastructure/staging/main.tf) have been updated to use the new variable names. If this PR includes multiple modules across the infrastructure, ensure consistency across the entire tree.

Also applies to: 22-22

infrastructure/modules/storage/outputs.tf (1)

11-19: Verify consistency in bucket output references.

Line 13 uses module.fixtures_bucket.bucket.id while line 18 uses module.zappa_bucket.bucket (no .id suffix). These should be consistent—either both should use .id or both should omit it, depending on what the s3-bucket module outputs. Check the s3-bucket module's outputs definition to confirm whether this inconsistency is intentional or requires correction.

infrastructure/staging/variables.tf (1)

84-88: Variable renames improve consistency and clarity.

Renaming from fixtures_s3_bucketfixtures_bucket_name and zappa_s3_bucketzappa_bucket_name removes redundant qualifiers and improves naming consistency across the infrastructure module. The force_destroy_bucket removal is a good safety improvement.

Also applies to: 138-142

infrastructure/modules/ecs/variables.tf (2)

56-66: Verify the resource requirement increase for load_data_task.

The CPU allocation doubles from 256 to 512, and memory doubles from 2048 to 4096. While this change may be necessary to support the boto3-based data loading approach or handle larger fixture datasets, the PR objectives don't explain the rationale. Please clarify whether this increase is:

  • Required for boto3 operations or expected workload characteristics
  • Temporary tuning or a permanent requirement
  • Impact on deployment costs

38-42: Image tag configuration supports PR objective.

The new image_tag variable with a "latest" default enables configurable Docker image versions for ECS tasks, addressing the hardcoded tag removal objective.

infrastructure/staging/main.tf (1)

54-66: Module wiring requires verification of storage module outputs.

The staging module correctly wires the renamed variables (fixtures_bucket_name, zappa_bucket_name) to the storage module, and passes module.storage.fixtures_s3_bucket_name to the ECS module. However, the storage module's outputs.tf file is not provided for review. Please verify:

  1. The output fixtures_s3_bucket_name exists in infrastructure/modules/storage/outputs.tf
  2. The output includes the random suffix (e.g., "${aws_s3_bucket.fixtures_bucket.id}" or equivalent)
  3. This ensures the ECS module receives the actual bucket name with the suffix for boto3 S3 operations

Also applies to: 107-115

infrastructure/modules/storage/modules/s3-bucket/variables.tf (1)

1-5: Documentation updates improve clarity.

The updated descriptions for abort_incomplete_multipart_upload_days and noncurrent_version_expiration_days are more explicit and aligned with AWS S3 lifecycle terminology.

Also applies to: 12-16

infrastructure/modules/storage/main.tf (1)

8-29: Random suffix implementation ensures bucket name uniqueness.

The random provider and random_id resource with 4-byte length provide sufficient entropy to prevent bucket naming collisions across deployments. The suffix is consistently applied to both fixtures and zappa bucket names, and the IAM policy ARN correctly references the suffixed bucket name.

infrastructure/modules/ecs/main.tf (6)

19-39: ECR lifecycle policy correctly implements untagged image cleanup.

The lifecycle policy automates removal of untagged images after 7 days, addressing the PR objective to remove hardcoded image tag reliance. The configuration is valid and non-destructive to tagged images.


85-118: Custom ECS task execution policy properly replaces AWS-managed policy.

The custom policy grants least-privilege permissions for ECR registry access and CloudWatch Logs, replacing the AWS-managed AmazonEC2ContainerRegistryPowerUser policy. The policy correctly scopes:

  • ecr:GetAuthorizationToken to "*" (required for registry authentication)
  • Image operations (BatchCheckLayerAvailability, BatchGetImage, GetDownloadUrlForLayer) to the ECR repository ARN
  • Log operations to the project-specific log group pattern

171-198: Custom EventBridge policy implements least-privilege design.

The policy grants EventBridge only the necessary permissions to run ECS tasks and pass roles. The ecs:RunTask action includes a cluster ARN condition, and iam:PassRole is scoped to only the required execution and task roles. This replaces reliance on overly-permissive AWS-managed policies.


295-325: boto3 S3 download correctly implements objective and enables S3 access.

The load_data_task replaces AWS CLI commands with a Python boto3 script, achieving the PR objective. The task correctly:

  • Sets task_role_arn to ecs_task_role (Line 324), enabling IAM-based S3 access
  • Uses var.fixtures_bucket_name (which includes the random suffix from the storage module output)
  • Relies on the attached fixtures_read_only_policy_arn for S3:GetObject permissions

Verification needed: Confirm that fixtures_read_only_policy (from the storage module) includes s3:GetObject permission on the suffixed bucket. This should be verified in infrastructure/modules/storage/main.tf lines 15-25 to ensure the boto3 download succeeds.


217-217: All ECS tasks consistently use configurable image_tag variable.

Every task module (sync_data_task, owasp_update_project_health_metrics_task, owasp_update_project_health_scores_task, migrate_task, load_data_task, index_data_task) correctly references var.image_tag, enabling configurable Docker image versions across all ECS tasks.

Also applies to: 246-246, 267-267, 287-287, 318-318, 338-338


323-325: Task role assignment is correctly scoped.

Only load_data_task includes task_role_arn to enable S3 bucket access for boto3 file downloads. Other tasks do not require IAM permissions beyond execution role capabilities, so omitting task_role_arn is appropriate.

@rudransh-shrivastava rudransh-shrivastava marked this pull request as ready for review December 2, 2025 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve ecs Module

1 participant