
Conversation

@gberenice (Member) commented Aug 28, 2025

what

  • Adds support for specifying the VPC and subnets by name, in addition to the existing ID-based configuration.
  • New variables: vpc_name and subnet_names for name-based resource selection.
  • Data sources: adds aws_vpc and aws_subnet lookups, with for_each over the subnet names.
  • Precedence logic: names take precedence over IDs when both are provided (see the sketch below).
  • Validation: terraform_data preconditions halt execution if neither names nor IDs are provided.
  • Warnings: non-blocking check blocks warn when both names and IDs are specified.
  • Tests: adds focused tests for the VPC/subnet input logic.
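A minimal sketch of the precedence wiring described above (variable and local names follow this PR; the exact bodies here are illustrative, not the merged code):

variable "vpc_name" {
  type        = string
  description = "The Name tag of the VPC to run in. If set, takes precedence over vpc_id."
  default     = ""
}

variable "subnet_names" {
  type        = list(string)
  description = "The Name tags of the subnets to run in. If set, take precedence over subnet_ids."
  default     = []
}

locals {
  # Prefer the name-based lookups; fall back to the ID inputs.
  vpc_id     = length(var.vpc_name) > 0 ? data.aws_vpc.selected[0].id : var.vpc_id
  subnet_ids = length(var.subnet_names) > 0 ? [for s in data.aws_subnet.selected : s.id] : var.subnet_ids
}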

Also,

  • Bumps the S3 bucket module and removes arguments that match the module defaults.
  • Addresses some Trivy warnings.

why

  • Root modules no longer need to define extra data sources to look up VPC and subnet IDs (see the usage sketch below).
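A hedged example of the resulting caller experience (registry source per this repo; the version constraint, label inputs, and Name values are placeholders):

module "ssm_agent" {
  source  = "masterpointio/ssm-agent/aws"
  version = "~> 1.7"

  namespace = "mp"
  stage     = "dev"
  name      = "ssm-agent"

  # Networking is resolved by Name tag inside the module -- no aws_vpc/aws_subnets
  # data sources needed in the root module.
  vpc_name     = "mp-dev-vpc"
  subnet_names = ["mp-dev-private-a", "mp-dev-private-b"]
}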

references

  • N/A

Summary by CodeRabbit

  • New Features

    • Select the VPC and subnets by name (Name tag), with names automatically taking precedence over IDs.
    • VPC/subnet inputs are now optional; the module resolves them from names or IDs.
    • Validation requires at least one of names or IDs; warnings flag mixed name/ID inputs.
  • Improvements

    • Upgraded logging bucket to stronger, KMS-based encryption.
  • Documentation

    • Added a complete example showing ID-based, name-based, and mixed configurations.
    • New example outputs for instance profile ARNs.
  • Tests

    • Added tests confirming name-based precedence and input warnings.

@gberenice gberenice requested a review from a team as a code owner August 28, 2025 17:19
@gberenice gberenice requested a review from Gowiem August 28, 2025 17:19
@coderabbitai bot commented Aug 28, 2025

Walkthrough

  • Adds name-based data sources, aws_vpc.selected and aws_subnet.selected, to resolve IDs from Name tags (sketched after this list).
  • Introduces locals for vpc_id and subnet_ids with precedence for names over IDs.
  • Adds validation: terraform_data vpc_subnet_validation preconditions and check vpc_subnet_warnings for mutually exclusive inputs.
  • Updates inputs: new vpc_name and subnet_names; makes vpc_id, subnet_ids, permissions_boundary optional.
  • Updates resources to use locals; adds description to egress rule.
  • Upgrades logs_bucket module to v4.10.0 with KMS-based encryption and updated label module inputs.
  • Adjusts Trivy ignore annotation format.
  • Adds example (with-names) with three module usages, variables, and outputs.
  • Adds test verifying name precedence and warning behavior.
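A rough sketch of those data sources (the filters and count/for_each guards are illustrative):

data "aws_vpc" "selected" {
  count = length(var.vpc_name) > 0 ? 1 : 0

  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}

data "aws_subnet" "selected" {
  for_each = toset(var.subnet_names)

  filter {
    name   = "tag:Name"
    values = [each.value]
  }
}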

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • oycyc


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (6)
variables.tf (2)

13-17: Clarify precedence in description and align with null defaults

If you switch to null defaults, consider updating the description to “If set, takes precedence over vpc_id” and ensure locals use try()/coalesce() with null checks (e.g., var.vpc_name != null).


19-22: Mirror precedence wording and null handling for subnet_names

Same note as vpc_name; align wording and locals to check for null rather than length() > 0 on an empty default.
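A sketch of the null-default variant suggested in these two nitpicks (illustrative; the PR as merged keeps the empty-string/empty-list convention):

variable "vpc_name" {
  type        = string
  description = "The Name tag of the VPC. If set, takes precedence over vpc_id."
  default     = null
}

locals {
  # Null-aware precedence; data.aws_subnet.selected would then need
  # for_each = toset(coalesce(var.subnet_names, [])) to tolerate a null input.
  vpc_id     = var.vpc_name != null ? data.aws_vpc.selected[0].id : var.vpc_id
  subnet_ids = var.subnet_names != null ? [for s in data.aws_subnet.selected : s.id] : var.subnet_ids
}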

data.tf (1)

5-12: Guard against non-unique VPC Name matches

Using data.aws_vpc with a Name tag filter can fail if multiple VPCs share the same Name. Consider adding an additional discriminating filter (e.g., isDefault or cidr-block) or clearly document the uniqueness requirement. Alternatively, switch to data.aws_vpcs and assert a single match.
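One way to assert a single match, using the plural aws_vpcs data source plus a precondition (a sketch, not part of this PR):

data "aws_vpcs" "by_name" {
  count = length(var.vpc_name) > 0 ? 1 : 0

  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}

resource "terraform_data" "vpc_name_uniqueness" {
  count = length(var.vpc_name) > 0 ? 1 : 0

  lifecycle {
    precondition {
      condition     = length(data.aws_vpcs.by_name[0].ids) == 1
      error_message = "Expected exactly one VPC matching vpc_name; refine the filter or ensure Name tags are unique."
    }
  }
}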

main.tf (3)

17-20: Name precedence locals look good; consider uniqueness safeguards

The locals correctly prioritize names over IDs. Given the potential for duplicate Names, ensure the subnet data source is VPC-scoped (see data.tf suggestion) to prevent ambiguous resolution.
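For the VPC-scoping safeguard, a possible shape (illustrative; the aws_subnet data source accepts a vpc_id argument alongside filters):

data "aws_subnet" "selected" {
  for_each = toset(var.subnet_names)

  # Scope the Name lookup to the resolved VPC to avoid cross-VPC duplicates.
  vpc_id = local.vpc_id

  filter {
    name   = "tag:Name"
    values = [each.value]
  }
}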


50-60: Warn checks read well; consider an extra assert for subnet resolution count

Optional: add an assert to detect partial resolution when using names (helps surface typos early).

Apply:

 check "vpc_subnet_warnings" {
   assert {
     condition     = !(length(var.vpc_name) > 0 && length(var.vpc_id) > 0)
     error_message = "Both vpc_name and vpc_id are provided. When vpc_name is specified, vpc_id will be ignored."
   }

   assert {
     condition     = !(length(var.subnet_names) > 0 && length(var.subnet_ids) > 0)
     error_message = "Both subnet_names and subnet_ids are provided. When subnet_names are specified, subnet_ids will be ignored."
   }
+  # When names are used, ensure all of them resolved to IDs
+  assert {
+    condition     = length(var.subnet_names) == 0 || length(local.subnet_ids) == length(distinct(var.subnet_names))
+    error_message = "One or more subnet_names did not resolve to unique subnets."
+  }
 }

221-230: S3 bucket module upgrade + KMS SSE

Good upgrade and KMS configuration. Consider also enforcing KMS-only uploads at the bucket policy level if clients might attempt unencrypted uploads.

I can draft a minimal bucket policy statement to enforce aws:kms on put-object if helpful.
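A hedged sketch of such a policy (the bucket_arn output and the attachment path, e.g. the module's source_policy_documents input, are assumptions about the CloudPosse module version in use):

data "aws_iam_policy_document" "require_kms_uploads" {
  # Deny uploads that explicitly request a non-KMS algorithm.
  statement {
    sid       = "DenyIncorrectEncryptionHeader"
    effect    = "Deny"
    actions   = ["s3:PutObject"]
    resources = ["${module.logs_bucket.bucket_arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "StringNotEquals"
      variable = "s3:x-amz-server-side-encryption"
      values   = ["aws:kms"]
    }
  }

  # Deny uploads that omit the encryption header entirely.
  statement {
    sid       = "DenyUnencryptedUploads"
    effect    = "Deny"
    actions   = ["s3:PutObject"]
    resources = ["${module.logs_bucket.bucket_arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "Null"
      variable = "s3:x-amz-server-side-encryption"
      values   = ["true"]
    }
  }
}

Note that with bucket-default SSE-KMS enabled, objects are encrypted even when the header is omitted, so the second statement is stricter than necessary; keep it only if clients must be forced to send the header.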

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between b45ebcd and 2c0ec7b.

📒 Files selected for processing (7)
  • data.tf (3 hunks)
  • examples/with-names/main.tf (1 hunks)
  • examples/with-names/outputs.tf (1 hunks)
  • examples/with-names/variables.tf (1 hunks)
  • main.tf (5 hunks)
  • tests/vpc-subnet-locals-only.tftest.hcl (1 hunks)
  • variables.tf (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.tf

⚙️ CodeRabbit configuration file

**/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.

Files:

  • examples/with-names/variables.tf
  • examples/with-names/outputs.tf
  • examples/with-names/main.tf
  • variables.tf
  • data.tf
  • main.tf
🧠 Learnings (5)
📚 Learning: 2024-10-23T18:20:57.022Z
Learnt from: oycyc
PR: masterpointio/terraform-aws-ssm-agent#28
File: tests/unit.tftest.hcl:34-71
Timestamp: 2024-10-23T18:20:57.022Z
Learning: In the `terraform-aws-ssm-agent` module, the user prefers not to include additional assertions for network and security configurations (e.g., root volume encryption, network interface configurations, and security group associations) in the `verify_launch_template` unit test in `tests/unit.tftest.hcl`.

Applied to files:

  • tests/vpc-subnet-locals-only.tftest.hcl
📚 Learning: 2024-10-29T00:06:05.693Z
Learnt from: Gowiem
PR: masterpointio/terraform-spacelift-automation#3
File: modules/spacelift-automation/main.tf:221-227
Timestamp: 2024-10-29T00:06:05.693Z
Learning: In the Terraform module `modules/spacelift-automation/main.tf`, when `var.aws_integration_id` is a required variable, avoid suggesting to make the `spacelift_aws_integration_attachment` resource conditional based on whether `aws_integration_id` is provided.

Applied to files:

  • variables.tf
📚 Learning: 2024-11-21T13:30:01.588Z
Learnt from: gberenice
PR: masterpointio/terraform-aws-tailscale#41
File: main.tf:0-0
Timestamp: 2024-11-21T13:30:01.588Z
Learning: In this Terraform module (`main.tf`), read permissions (`ssm:GetParameter`) for SSM parameters are managed by the SSM Agent module (`masterpointio/ssm-agent/aws`), so adding `ssm:GetParameter` permissions in the custom `ssm_policy` module is unnecessary.

Applied to files:

  • data.tf
  • main.tf
📚 Learning: 2025-01-30T16:39:53.021Z
Learnt from: oycyc
PR: masterpointio/terraform-spacelift-events-collector-audit-trail#2
File: s3.tf:2-14
Timestamp: 2025-01-30T16:39:53.021Z
Learning: The CloudPosse S3 bucket module (cloudposse/s3-bucket/aws) comes with secure defaults including versioning, private ACL, blocked public access, and server-side encryption. These don't need to be explicitly specified unless overriding the defaults.

Applied to files:

  • main.tf
📚 Learning: 2025-01-30T16:39:53.021Z
Learnt from: oycyc
PR: masterpointio/terraform-spacelift-events-collector-audit-trail#2
File: s3.tf:2-14
Timestamp: 2025-01-30T16:39:53.021Z
Learning: The CloudPosse S3 bucket module (cloudposse/s3-bucket/aws) comes with secure defaults: versioning is enabled by default (versioning_enabled=true), ACL is private by default (acl="private"), and public access is blocked by default (block_public_acls=true, block_public_policy=true, ignore_public_acls=true). These don't need to be explicitly specified unless overriding the defaults.

Applied to files:

  • main.tf
🔇 Additional comments (11)
examples/with-names/variables.tf (1)

1-29: Example inputs look good

Sensible defaults for a demo. Region/AZs consistent with us-east-1; ipv6_enabled wiring is clear.

examples/with-names/main.tf (2)

50-50: Verify correct VPC Name tag output

  • Check whether cloudposse/vpc/aws v2.1.0 exposes a Name tag output (e.g. vpc_name, name_tag, or via tags["Name"]).
  • If no suitable output exists, pass the desired Name explicitly (via a variable) or derive it using the same label settings as the VPC module.

25-29: Verify igw_id input type manually
I wasn’t able to locate the igw_id variable declaration in the v2.3.0 module code—please confirm in the module’s documentation whether igw_id expects a string or list and adjust the call (e.g. remove [] if it should be a string).

tests/vpc-subnet-locals-only.tftest.hcl (1)

57-88: Solid, focused precedence test

Good use of mocks and expect_failures to assert warning behavior. This keeps unit scope tight.

examples/with-names/outputs.tf (1)

1-14: Outputs are clear and aligned with examples

Names and descriptions map 1:1 with the three example module usages.

data.tf (2)

56-56: Annotation style update looks good

Migrating to trivy:ignore:AVD-AWS-0344 is correct.


68-80: AssumeRole policy LGTM

Minimal, correct trust policy for EC2.

main.tf (4)

34-47: Preconditions are solid

Good use of terraform_data preconditions to fail fast when neither name nor ID inputs are provided.


130-134: SG now correctly uses resolved VPC ID

Switch to local.vpc_id is correct and matches the new precedence logic.


136-146: Egress rule description and ignore rationale make sense

The added description helps audits; the trivy ignore is justified for SSM egress.


354-355: ASG subnets now sourced from resolved IDs

Using local.subnet_ids aligns with the new inputs and reduces surprises. LGTM.

data.tf Outdated
Comment on lines 83 to 86
data "aws_s3_bucket" "logs_bucket" {
count = var.session_logging_enabled ? 1 : 0
bucket = try(coalesce(var.session_logging_bucket_name, module.logs_bucket.bucket_id), "")
}

🛠️ Refactor suggestion

Tighten session logging IAM policy to least privilege and remove data source dependency

  • Avoid “*” where feasible: scope CloudWatch Logs actions to the specific log group, S3 GetEncryptionConfiguration to the target bucket, and kms:GenerateDataKey to the KMS key.
  • Optional: Construct S3 bucket ARN directly to drop the aws_s3_bucket data source (reduces IAM read perms and planning overhead).

Apply:

- data "aws_s3_bucket" "logs_bucket" {
-  count  = var.session_logging_enabled ? 1 : 0
-  bucket = try(coalesce(var.session_logging_bucket_name, module.logs_bucket.bucket_id), "")
- }
+ # Optional: remove this data source entirely by constructing the ARN directly.
+ # S3 bucket ARN format: arn:aws:s3:::<bucket-name>
   statement {
     sid    = "SSMAgentSessionAllowS3Logging"
     effect = "Allow"
     actions = [
       "s3:PutObject"
     ]
-    resources = ["${join("", data.aws_s3_bucket.logs_bucket.*.arn)}/*"]
+    # If keeping the data source:
+    resources = formatlist("%s/*", data.aws_s3_bucket.logs_bucket[*].arn)
+    # If removing the data source, use:
+    # resources = ["arn:aws:s3:::${local.session_logging_bucket_name}/*"]
   }

   statement {
     sid    = "SSMAgentSessionAllowCloudWatchLogging"
     effect = "Allow"
     actions = [
       "logs:CreateLogStream",
       "logs:PutLogEvents",
       "logs:DescribeLogGroups",
       "logs:DescribeLogStreams"
     ]
-    resources = ["*"]
+    resources = [
+      "arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:${module.logs_label.id}:*"
+    ]
   }

   statement {
     sid    = "SSMAgentSessionAllowGetEncryptionConfig"
     effect = "Allow"
     actions = [
       "s3:GetEncryptionConfiguration"
     ]
-    resources = ["*"]
+    # Scope to the session logs bucket only
+    resources = data.aws_s3_bucket.logs_bucket[*].arn
+    # Or: ["arn:aws:s3:::${local.session_logging_bucket_name}"]
   }

   statement {
     sid    = "SSMAgentSessionAllowKMSDataKey"
     effect = "Allow"
     actions = [
       "kms:GenerateDataKey"
     ]
-    resources = ["*"]
+    resources = compact([local.session_logging_kms_key_arn])
   }

Also applies to: 88-130

🤖 Prompt for AI Agents
In data.tf around lines 83-86 (and similarly 88-130), the session logging IAM
policy is overly permissive and relies on the aws_s3_bucket data source; tighten
permissions by scoping actions to exact ARNs and remove the data source
dependency: replace "*" resources with the specific CloudWatch Logs log group
ARN
(arn:aws:logs:${var.region}:${data.aws_caller_identity.current.account_id}:log-group:${var.session_log_group_name}:*),
the S3 bucket ARN (arn:aws:s3:::${var.session_logging_bucket_name != "" ?
var.session_logging_bucket_name : module.logs_bucket.bucket_id}) for
s3:GetEncryptionConfiguration, and the KMS key ARN for kms:GenerateDataKey;
optionally delete the data "aws_s3_bucket" resource and compute the bucket ARN
directly from the input/module output, then update all IAM policy resource
blocks to reference those concrete ARNs rather than wildcards so permissions are
least-privilege.

Comment on lines +58 to +59
depends_on = [module.subnets]
}

💡 Verification agent

❓ Verification inconclusive

depends_on here won’t fix plan-time data lookups

Data sources inside the child module resolve during planning; they won’t “wait” for module.subnets. This example will often fail on first plan/apply when the subnets don’t yet exist. Document this limitation or prefer IDs when creating the subnets in the same run.

-  depends_on = [module.subnets]
+  # Note: Name-based lookups plan-time resolve and can fail on first apply if subnets are created in the same run.
+  # Consider using IDs when creating subnets, or run in two stages (create subnets, then apply this module with names).

Document plan-time data lookup limitation and prefer IDs or two-stage applies
Terraform data sources inside modules resolve during the planning phase, so a depends_on on module.subnets won’t delay those lookups until after subnet creation—this can cause failures on the first apply when subnets don’t yet exist. Update the example to either:

  • Use subnet IDs directly in this module (so the lookup happens against created resources)
  • Or split into two runs: first apply the subnets module, then apply the module relying on name-based lookups.
🤖 Prompt for AI Agents
In examples/with-names/main.tf around lines 58-59, the current depends_on =
[module.subnets] comment is misleading because Terraform data sources inside
modules are evaluated at plan time and won’t wait for the subnets to be created;
update the example to either (A) accept and pass the subnet IDs into this module
so the module uses explicit IDs (avoiding plan-time lookups), or (B) document
and demonstrate a two-stage apply workflow (first terraform apply for the
subnets module, then apply for the dependent module) and remove/clarify the
depends_on that implies data lookups will be delayed until after creation.
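A sketch of option (A), passing IDs straight through so no plan-time name lookup occurs (output names follow cloudposse/vpc and cloudposse/dynamic-subnets conventions and are assumptions for this example):

module "ssm_agent_by_id" {
  source = "../../"

  # Passing IDs means the module's name-based data sources have nothing to look up,
  # so this can be created in the same apply as the VPC/subnet modules.
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.subnets.private_subnet_ids
}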

Comment on lines +74 to +75
depends_on = [module.subnets]
}

🛠️ Refactor suggestion

Same plan-time lookup caveat for mixed example

Remove depends_on and add a note, or switch the example to consume existing infra names.

-  depends_on = [module.subnets]
+  # See note above regarding plan-time name lookups when subnets are created in the same run.
📝 Committable suggestion


Suggested change
-  depends_on = [module.subnets]
-}
+  # See note above regarding plan-time name lookups when subnets are created in the same run.
+}
🤖 Prompt for AI Agents
In examples/with-names/main.tf around lines 74-75, the explicit depends_on =
[module.subnets] introduces a plan-time lookup caveat; remove the depends_on
line and either (a) add a short comment/note in the example explaining the
caveat and recommending consuming named existing infra (or using data sources)
instead of forcing depends_on, or (b) modify the example to consume existing
infra names/IDs (e.g., accept variables or use data sources for subnets) so the
example no longer needs depends_on. Ensure the file reflects the chosen approach
and update any adjacent README/example text to explain the behavior.

variables.tf Outdated
variable "vpc_id" {
type = string
description = "The ID of the VPC which the EC2 Instance will run in."
default = ""

🛠️ Refactor suggestion

Prefer null default for optional string inputs

Use null instead of empty string to clearly indicate “unset” and simplify precedence checks.

-  default     = ""
+  default     = null
📝 Committable suggestion


Suggested change
-  default     = ""
+  default     = null
🤖 Prompt for AI Agents
In variables.tf around line 4, the variable default is set to an empty string
which should be null to represent an unset optional string; change the default
from "" to null and adjust any downstream code or conditionals that check for
empty string to instead handle null (e.g., use terraform's coalescing or null
checks) so precedence and optional handling behave correctly.

main.tf Outdated
Comment on lines 34 to 61
# Validation using terraform_data to halt execution if requirements aren't met
resource "terraform_data" "vpc_subnet_validation" {
  lifecycle {
    precondition {
      condition     = length(var.vpc_name) > 0 || length(var.vpc_id) > 0
      error_message = "Either vpc_name or vpc_id must be provided."
    }

    precondition {
      condition     = length(var.subnet_names) > 0 || length(var.subnet_ids) > 0
      error_message = "Either subnet_names or subnet_ids must be provided."
    }
  }
}

# Warning checks for VPC and subnet configuration (non-blocking)
check "vpc_subnet_warnings" {
  assert {
    condition     = !(length(var.vpc_name) > 0 && length(var.vpc_id) > 0)
    error_message = "Both vpc_name and vpc_id are provided. When vpc_name is specified, vpc_id will be ignored."
  }

  assert {
    condition     = !(length(var.subnet_names) > 0 && length(var.subnet_ids) > 0)
    error_message = "Both subnet_names and subnet_ids are provided. When subnet_names are specified, subnet_ids will be ignored."
  }
}

Contributor

I'm wondering, does this belong in the main.tf? Is there another file that this might best live in?

Member Author

Move to validations.tf for clarity 👍

@oycyc (Contributor) left a comment

Great improvements! CodeRabbit had some good suggestions worth reviewing.

@gberenice gberenice force-pushed the feature/fetch-ids-by-resource-name branch from 532c216 to 9537e93 Compare August 29, 2025 09:40
@gberenice gberenice force-pushed the feature/fetch-ids-by-resource-name branch from 9537e93 to 21f6285 Compare August 29, 2025 09:45
@oycyc (Contributor) left a comment

Awesome

@gberenice gberenice merged commit 8a3d43c into main Aug 29, 2025
5 checks passed
@gberenice gberenice deleted the feature/fetch-ids-by-resource-name branch August 29, 2025 16:14
gberenice pushed a commit that referenced this pull request Aug 29, 2025
🤖 I have created a release *beep* *boop*
---

## [1.7.0](v1.6.0...v1.7.0) (2025-08-29)


### Features

* add VPC and subnet name support ([#54](#54)) ([8a3d43c](8a3d43c))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: masterpointbot[bot] <177651640+masterpointbot[bot]@users.noreply.github.com>