Skip to content

Conversation

@rudransh-shrivastava
Copy link
Collaborator

@rudransh-shrivastava rudransh-shrivastava commented Dec 2, 2025

Resolves #2771

Proposed change

  • Add lifecycle for untagged images.
  • Use boto3 for ECS: load_data_task.
  • Remove use of AWS-managed IAM policies (AmazonEC*).

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@github-actions github-actions bot added docs Improvements or additions to documentation backend backend-tests makefile ci labels Dec 2, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Summary by CodeRabbit

  • Infrastructure & Deployment
    • Enabled container image versioning for more consistent deployments across environments
    • Increased data loading task resources (CPU/memory) for improved performance
    • Implemented automatic cleanup of untagged images after 7 days

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

This PR implements infrastructure improvements to the ECS module: adds ECR lifecycle policy for untagged images, replaces AWS-managed IAM policies with custom policies, introduces image tagging support via new variable, updates task image URLs to use tags, and modifies load_data_task to use boto3. Storage module exports fixtures bucket name; staging references it accordingly.

Changes

Cohort / File(s) Summary
ECS Module IAM & ECR Management
infrastructure/modules/ecs/main.tf
Adds ECR lifecycle policy expiring untagged images after 7 days; replaces AWS-managed ECS task execution policy with custom IAM policy and attachment; replaces AWS-managed EventBridge policy with custom IAM policy allowing scoped RunTask and PassRole permissions; updates 6 task image URLs to include ${var.image_tag} parameter; modifies load_data_task data loading to use boto3 instead of AWS CLI
ECS Module Configuration
infrastructure/modules/ecs/variables.tf
Adds new image_tag variable (string, default "latest"); increases load_data_task_cpu default from 256 to 512; increases load_data_task_memory default from 2048 to 4096
Storage Module Outputs
infrastructure/modules/storage/outputs.tf
Adds new output fixtures_s3_bucket_name exposing module.fixtures_bucket.bucket.id
Staging Configuration
infrastructure/staging/main.tf
Updates ECS module input fixtures_bucket_name to reference module.storage.fixtures_s3_bucket_name instead of var.fixtures_bucket_name

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • IAM policy logic: Verify custom ECS task execution and EventBridge policies grant only necessary permissions with correct ARN scoping
  • Resource allocation: Confirm load_data_task CPU/memory increase (256→512 and 2048→4096) aligns with workload requirements
  • boto3 migration: Review inline script syntax and error handling for data loading via boto3 instead of AWS CLI
  • Image tagging: Ensure all 6 task definitions correctly construct image URLs with tag parameter and default behavior

Possibly related issues

  • Improve ecs Module #2771: Directly addresses all four requested improvements—image tag hardcoding, ECR lifecycle policy for untagged images, boto3 for load_data_task, and removal of AWS-managed IAM policies.

Possibly related PRs

Suggested reviewers

  • kasya
  • arkid15r

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Improve ecs Module' is generic and partially related to the changeset but lacks specificity about the main improvements (lifecycle policies, boto3 integration, IAM policy changes). Consider a more specific title like 'Add ECR lifecycle policy, implement boto3 for load_data_task, and replace AWS-managed IAM policies' to better convey the key changes.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed All requirements from issue #2771 are addressed: ECR lifecycle policy added for untagged images, image_tag variable replaces hardcoded 'latest', custom IAM policies replace AWS-managed ones, and boto3 is used for load_data_task.
Out of Scope Changes check ✅ Passed Changes are well-scoped to linked issue #2771 objectives. Updates to load_data_task CPU/memory and fixtures_s3_bucket_name variable pass-through are supporting changes for the main objectives.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The pull request description clearly describes the proposed changes: adding ECR lifecycle policy for untagged images, implementing boto3 for load_data_task, and removing AWS-managed IAM policies.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rudransh-shrivastava rudransh-shrivastava changed the base branch from main to feature/nest-zappa-migration December 2, 2025 15:23
@rudransh-shrivastava rudransh-shrivastava force-pushed the feature/nest-zappa-migration-ecs-improvement branch from 96cfa89 to 8159861 Compare December 2, 2025 15:24
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
infrastructure/README.md (3)

58-75: Clarify placeholder for state bucket in backend configuration file.

The new backend configuration file workflow is helpful, but users need explicit guidance on retrieving the state bucket name. Line 68 instructs updating the state bucket name from "the previous step," but users should know this comes from the backend output (e.g., terraform output from infrastructure/backend/). Consider adding a step to explicitly run cd infrastructure/backend && terraform output or similar to retrieve the bucket name.

Also, verify that -backend-config=terraform.tfbackend correctly resolves relative paths from the staging/ directory during terraform init.


68-68: Fix markdown formatting for consistency.

Lines 68 and 70 use *Note:* (emphasis), but other notes in the file use **Note:** (bold). Please standardize to bold.

Also applies to: 70-70


183-183: Clarify the <id> placeholder in fixture bucket path.

The placeholder owasp-nest-fixtures-<id> should be documented. Add a note that <id> is the random suffix output from the backend setup step, or provide a command to retrieve it dynamically (e.g., terraform output fixtures_s3_bucket_name).

infrastructure/staging/terraform.tfbackend.example (1)

1-3: Placeholder token format should match documentation convention.

The ${STATE_BUCKET_NAME} placeholder is clear, but verify it aligns with documentation. If users reference the README instructions, ensure the placeholder name matches what's documented (e.g., if README says "Copy the state bucket name from output as STATE_BUCKET_NAME", this should match).

Also, consider whether region = "us-east-2" should be a placeholder (e.g., ${AWS_REGION}) if users might deploy to different regions, or if it's intentionally hardcoded to the staging region default.

infrastructure/modules/storage/modules/s3-bucket/main.tf (1)

38-45: Prevent_destroy is a breaking change—consider making it optional per module instance.

Removing force_destroy and adding lifecycle.prevent_destroy = true is a significant safeguard, but it's also a breaking change for module consumers. All S3 buckets created via this module (fixtures, zappa) now require manual intervention (terraform state rm or prevent_destroy = false) before destruction.

This may conflict with automation workflows that expect to cleanly destroy test environments. Consider adding a prevent_destroy_bucket variable (defaulting to true for safety) so consumers can opt-out per instance if needed:

variable "prevent_destroy_bucket" {
  type    = bool
  default = true
}

lifecycle {
  prevent_destroy = var.prevent_destroy_bucket
}

Alternatively, document this as a breaking change in release notes and require explicit code changes from consumers to upgrade.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e219f12 and 8159861.

📒 Files selected for processing (15)
  • infrastructure/README.md (6 hunks)
  • infrastructure/backend/.terraform.lock.hcl (1 hunks)
  • infrastructure/backend/main.tf (4 hunks)
  • infrastructure/modules/ecs/main.tf (10 hunks)
  • infrastructure/modules/ecs/variables.tf (2 hunks)
  • infrastructure/modules/storage/main.tf (3 hunks)
  • infrastructure/modules/storage/modules/s3-bucket/main.tf (1 hunks)
  • infrastructure/modules/storage/modules/s3-bucket/variables.tf (2 hunks)
  • infrastructure/modules/storage/outputs.tf (1 hunks)
  • infrastructure/modules/storage/variables.tf (1 hunks)
  • infrastructure/staging/backend.tf (1 hunks)
  • infrastructure/staging/main.tf (2 hunks)
  • infrastructure/staging/terraform.tfbackend.example (1 hunks)
  • infrastructure/staging/terraform.tfvars.example (0 hunks)
  • infrastructure/staging/variables.tf (2 hunks)
💤 Files with no reviewable changes (1)
  • infrastructure/staging/terraform.tfvars.example
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-11-08T11:16:25.725Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2551
File: infrastructure/modules/parameters/main.tf:1-191
Timestamp: 2025-11-08T11:16:25.725Z
Learning: The parameters module in infrastructure/modules/parameters/ is currently configured for staging environment only. The `configuration` and `settings_module` variables default to "Staging" and "settings.staging" respectively, and users can update parameter values via the AWS Parameter Store console. The lifecycle.ignore_changes blocks on these parameters support manual console updates without Terraform reverting them.

Applied to files:

  • infrastructure/staging/backend.tf
  • infrastructure/staging/terraform.tfbackend.example
  • infrastructure/staging/main.tf
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.

Applied to files:

  • infrastructure/staging/backend.tf
  • infrastructure/README.md
  • infrastructure/staging/terraform.tfbackend.example
📚 Learning: 2025-11-27T19:38:23.956Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2743
File: infrastructure/modules/storage/main.tf:1-9
Timestamp: 2025-11-27T19:38:23.956Z
Learning: In the OWASP/Nest infrastructure code (infrastructure/ directory), the team prefers exact version pinning for Terraform (e.g., "1.14.0") and AWS provider (e.g., "6.22.0") over semantic versioning constraints (e.g., "~> 1.14" or "~> 6.22"). This is a deliberate choice for maximum reproducibility and explicit control over version updates in their testing environment.

Applied to files:

  • infrastructure/staging/backend.tf
  • infrastructure/staging/terraform.tfbackend.example
  • infrastructure/backend/.terraform.lock.hcl
  • infrastructure/backend/main.tf
  • infrastructure/modules/storage/main.tf
📚 Learning: 2025-10-17T15:25:53.713Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/database/main.tf:22-60
Timestamp: 2025-10-17T15:25:53.713Z
Learning: The infrastructure code in the `infrastructure/` directory is intended for quick testing purposes only, not production-grade deployment. Production-grade security hardening controls (such as IAM database authentication, deletion protection, Performance Insights KMS encryption) are not required for this testing infrastructure.

Applied to files:

  • infrastructure/README.md
📚 Learning: 2025-10-17T15:25:34.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/modules/cache/main.tf:30-30
Timestamp: 2025-10-17T15:25:34.963Z
Learning: The infrastructure/Terraform code in the OWASP Nest repository under the `infrastructure/` directory is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/README.md
📚 Learning: 2025-10-23T19:22:23.811Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/main.tf:0-0
Timestamp: 2025-10-23T19:22:23.811Z
Learning: In Zappa-based serverless deployments, Lambda functions and IAM execution roles are managed by Zappa at application deployment time (via `zappa deploy`/`zappa update`), not via Terraform. Terraform provisions the supporting infrastructure (VPC, RDS, S3, security groups, RDS Proxy, Secrets Manager), while Zappa handles the Lambda orchestration layer.

Applied to files:

  • infrastructure/README.md
  • infrastructure/staging/variables.tf
  • infrastructure/modules/storage/main.tf
📚 Learning: 2025-10-17T15:25:55.624Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/providers.tf:1-3
Timestamp: 2025-10-17T15:25:55.624Z
Learning: The infrastructure code in the OWASP/Nest repository (infrastructure/ directory) is intended for quick testing purposes only, not for production deployment.

Applied to files:

  • infrastructure/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
🔇 Additional comments (19)
infrastructure/backend/.terraform.lock.hcl (1)

27-45: Random provider lock file entry looks good.

The exact version pinning (3.7.2) aligns with your infrastructure standards. Since this is auto-maintained by Terraform, no manual changes needed here.

infrastructure/backend/main.tf (3)

53-55: Random suffix for bucket naming is a solid approach.

The 4-byte random suffix (8 hex characters) effectively prevents S3 bucket name collisions in shared AWS accounts. Since Terraform stores the random_id resource in state, the suffix will persist across apply cycles.


69-94: Verify prevent_destroy lifecycle is operationally acceptable.

Adding prevent_destroy = true to state, logs, and state_lock resources is excellent for protecting critical infrastructure. However, this creates a strong safeguard that requires manual intervention (terraform state rm or setting prevent_destroy = false) before any destruction is possible.

Ensure your team is aware of this requirement and that the README guidance (line 210) is highlighted during deployment. This is especially important if team members are unfamiliar with Terraform lifecycle rules.


138-147: Verify S3 Object Lock retention policy meets operational requirements.

The 30-day GOVERNANCE retention is appropriate for testing infrastructure, and GOVERNANCE mode (rather than COMPLIANCE) allows flexibility if needed. However, confirm:

  1. Does 30 days align with your state backup retention policy?
  2. Is GOVERNANCE mode (deletable with s3:BypassGovernanceRetention permission) the right choice, or should this be COMPLIANCE (immutable)?
  3. Object Lock applies only to objects created after this configuration is applied—existing state versions may not be protected.
infrastructure/staging/backend.tf (1)

2-4: Backend configuration split is clean and follows Terraform best practices.

Separating code-level defaults (encrypt, key) from environment-specific config (bucket, region) in terraform.tfbackend makes it easy to parameterize across environments without code changes. Users must remember the -backend-config flag during terraform init, which is documented in the README.

infrastructure/modules/storage/variables.tf (1)

12-12: Variable renames are breaking changes—verify all consumers are updated.

The renames fixtures_s3_bucket → fixtures_bucket_name and zappa_s3_bucket → zappa_bucket_name are semantic improvements but constitute breaking changes for any code calling the storage module.

Verify that all consumers (e.g., infrastructure/staging/main.tf) have been updated to use the new variable names. If this PR includes multiple modules across the infrastructure, ensure consistency across the entire tree.

Also applies to: 22-22

infrastructure/modules/storage/outputs.tf (1)

11-19: Verify consistency in bucket output references.

Line 13 uses module.fixtures_bucket.bucket.id while line 18 uses module.zappa_bucket.bucket (no .id suffix). These should be consistent—either both should use .id or both should omit it, depending on what the s3-bucket module outputs. Check the s3-bucket module's outputs definition to confirm whether this inconsistency is intentional or requires correction.

infrastructure/staging/variables.tf (1)

84-88: Variable renames improve consistency and clarity.

Renaming from fixtures_s3_bucketfixtures_bucket_name and zappa_s3_bucketzappa_bucket_name removes redundant qualifiers and improves naming consistency across the infrastructure module. The force_destroy_bucket removal is a good safety improvement.

Also applies to: 138-142

infrastructure/modules/ecs/variables.tf (2)

56-66: Verify the resource requirement increase for load_data_task.

The CPU allocation doubles from 256 to 512, and memory doubles from 2048 to 4096. While this change may be necessary to support the boto3-based data loading approach or handle larger fixture datasets, the PR objectives don't explain the rationale. Please clarify whether this increase is:

  • Required for boto3 operations or expected workload characteristics
  • Temporary tuning or a permanent requirement
  • Impact on deployment costs

38-42: Image tag configuration supports PR objective.

The new image_tag variable with a "latest" default enables configurable Docker image versions for ECS tasks, addressing the hardcoded tag removal objective.

infrastructure/staging/main.tf (1)

54-66: Module wiring requires verification of storage module outputs.

The staging module correctly wires the renamed variables (fixtures_bucket_name, zappa_bucket_name) to the storage module, and passes module.storage.fixtures_s3_bucket_name to the ECS module. However, the storage module's outputs.tf file is not provided for review. Please verify:

  1. The output fixtures_s3_bucket_name exists in infrastructure/modules/storage/outputs.tf
  2. The output includes the random suffix (e.g., "${aws_s3_bucket.fixtures_bucket.id}" or equivalent)
  3. This ensures the ECS module receives the actual bucket name with the suffix for boto3 S3 operations

Also applies to: 107-115

infrastructure/modules/storage/modules/s3-bucket/variables.tf (1)

1-5: Documentation updates improve clarity.

The updated descriptions for abort_incomplete_multipart_upload_days and noncurrent_version_expiration_days are more explicit and aligned with AWS S3 lifecycle terminology.

Also applies to: 12-16

infrastructure/modules/storage/main.tf (1)

8-29: Random suffix implementation ensures bucket name uniqueness.

The random provider and random_id resource with 4-byte length provide sufficient entropy to prevent bucket naming collisions across deployments. The suffix is consistently applied to both fixtures and zappa bucket names, and the IAM policy ARN correctly references the suffixed bucket name.

infrastructure/modules/ecs/main.tf (6)

19-39: ECR lifecycle policy correctly implements untagged image cleanup.

The lifecycle policy automates removal of untagged images after 7 days, addressing the PR objective to remove hardcoded image tag reliance. The configuration is valid and non-destructive to tagged images.


85-118: Custom ECS task execution policy properly replaces AWS-managed policy.

The custom policy grants least-privilege permissions for ECR registry access and CloudWatch Logs, replacing the AWS-managed AmazonEC2ContainerRegistryPowerUser policy. The policy correctly scopes:

  • ecr:GetAuthorizationToken to "*" (required for registry authentication)
  • Image operations (BatchCheckLayerAvailability, BatchGetImage, GetDownloadUrlForLayer) to the ECR repository ARN
  • Log operations to the project-specific log group pattern

171-198: Custom EventBridge policy implements least-privilege design.

The policy grants EventBridge only the necessary permissions to run ECS tasks and pass roles. The ecs:RunTask action includes a cluster ARN condition, and iam:PassRole is scoped to only the required execution and task roles. This replaces reliance on overly-permissive AWS-managed policies.


295-325: boto3 S3 download correctly implements objective and enables S3 access.

The load_data_task replaces AWS CLI commands with a Python boto3 script, achieving the PR objective. The task correctly:

  • Sets task_role_arn to ecs_task_role (Line 324), enabling IAM-based S3 access
  • Uses var.fixtures_bucket_name (which includes the random suffix from the storage module output)
  • Relies on the attached fixtures_read_only_policy_arn for S3:GetObject permissions

Verification needed: Confirm that fixtures_read_only_policy (from the storage module) includes s3:GetObject permission on the suffixed bucket. This should be verified in infrastructure/modules/storage/main.tf lines 15-25 to ensure the boto3 download succeeds.


217-217: All ECS tasks consistently use configurable image_tag variable.

Every task module (sync_data_task, owasp_update_project_health_metrics_task, owasp_update_project_health_scores_task, migrate_task, load_data_task, index_data_task) correctly references var.image_tag, enabling configurable Docker image versions across all ECS tasks.

Also applies to: 246-246, 267-267, 287-287, 318-318, 338-338


323-325: Task role assignment is correctly scoped.

Only load_data_task includes task_role_arn to enable S3 bucket access for boto3 file downloads. Other tasks do not require IAM permissions beyond execution role capabilities, so omitting task_role_arn is appropriate.

@rudransh-shrivastava rudransh-shrivastava marked this pull request as ready for review December 2, 2025 15:35
@rudransh-shrivastava rudransh-shrivastava force-pushed the feature/nest-zappa-migration-ecs-improvement branch from 8159861 to da6cb9a Compare December 3, 2025 13:53
@github-actions github-actions bot removed the docs Improvements or additions to documentation label Dec 3, 2025
@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 3, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve ecs Module

1 participant