Skip to content

Conversation

@digitalghost-dev
Copy link
Owner

@digitalghost-dev digitalghost-dev commented Oct 19, 2025

Summary by CodeRabbit

  • New Features

    • Added a scheduled pricing pipeline and related pricing orchestration.
  • Bug Fixes

    • Improved Pokémon egg group display names for clearer output.
  • Documentation

    • Expanded AWS infrastructure guide with EC2 content; updated README badges and Docker examples; docs build now runs as non-root.
  • Chores

    • Bumped application version to v1.7.2 across build and packaging configs; updated Terraform provider lockfiles and infrastructure metadata.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 19, 2025

Walkthrough

Bumps version strings to v1.7.2 across build/packaging, adds Terraform configs and lockfiles for VPC/EC2/RDS, introduces pricing assets/job/schedule and merges them into returned Dagster definitions, updates Pokémon egg-group display mapping and corresponding golden tests, hardens docs Dockerfile, and refreshes docs and README badges.

Changes

Cohort / File(s) Summary
Version & Packaging
\.github/workflows/ci.yml`, `.goreleaser.yml`, `Dockerfile`, `nfpm.yaml`, `testdata/main_latest_flag.golden``
Bumps VERSION_NUMBER/ldflags/version/metadata from v1.7.1 (and testdata v1.7.0→v1.7.1) to v1.7.2 across CI, goreleaser, Docker build, and packaging metadata.
Terraform — EC2
card_data/infrastructure/aws/ec2/.terraform.lock.hcl, card_data/infrastructure/aws/ec2/provider.tf, card_data/infrastructure/aws/ec2/instance.tf, card_data/infrastructure/aws/ec2/outputs.tf
Adds provider lockfile and Terraform Cloud/provider config; introduces aws_instance resource for dagster-webserver and an output exposing its id.
Terraform — VPC
card_data/infrastructure/aws/vpc/.terraform.lock.hcl, card_data/infrastructure/aws/vpc/provider.tf, card_data/infrastructure/aws/vpc/vpc.tf, card_data/infrastructure/aws/vpc/outputs.tf
Adds provider lockfile and Terraform Cloud/provider config; introduces an aws_vpc resource (10.0.0.0/20, DNS enabled, tags) and an output for the VPC id.
Terraform — RDS
card_data/infrastructure/aws/rds/provider.tf, card_data/infrastructure/aws/rds/variables.tf, card_data/infrastructure/aws/rds/db_instance.tf, card_data/infrastructure/aws/rds/db_subnet_group.tf
Renames public variable kms-arn-keykms_arn_key, updates references, appends an extra security group to RDS instance, adjusts subnet_ids list and workspace name to rds.
Pipelines / Pricing Schedule
card_data/pipelines/definitions.py, card_data/pipelines/defs/load/load_pricing_data.py, card_data/pipelines/defs/transformation/transform_data.py, card_data/pipelines/defs/load/load_pricing_data.py
Adds imports for pricing assets, introduces pricing_pipeline_job and price_schedule (cron "31 21 * * *", America/Los_Angeles), creates defs_pricing, merges it into defs() return value; load_pricing_data now accepts a Polars DataFrame parameter and removes local dataframe creation; renames dbt asset function to dbt_load_pricing_data.
dbt project & sources
card_data/pipelines/poke_cli_dbt/dbt_project.yml, card_data/pipelines/poke_cli_dbt/models/sources.yml
Bumps dbt project version 1.7.0 → 1.7.2, adds materialized: table under models.poke_cli_dbt, and attaches Dagster asset_key metadata to the pricing_data source.
Pokémon CLI mapping & tests
cmd/pokemon/pokemon.go, testdata/*.golden (pokemon_defense.golden, pokemon_defense_ability_immunities.golden, pokemon_image.golden, pokemon_image_flag_non-valid_size.golden, pokemon_no_flags_dual_type.golden, pokemon_regional_form.golden, pokemon_stats.golden)
Introduces curated egg-group display name mapping (e.g., indeterminateAmorphous, groundField, no-eggsUndiscovered) with fallback capitalization; updates multiple golden test files to match new display values.
Docs & README
README.md, docs/Infrastructure_Guide/aws.md, docs/Infrastructure_Guide/dbt.md
Adds centered CodeRabbit PRs badge block, updates Docker image tag references and image-size badge to v1.7.2, adds EC2 subsection and example commands, and updates dbt link to HTTPS and small RDS text change.
Docs Docker hardening
docs/Dockerfile
Switches to non-root docsuser (UID/GID), uses multi-stage build copying site with ownership, reduces layer size with --no-cache-dir, and serves site as an unprivileged user on port 8080.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Scheduler as Dagster Scheduler
    participant PriceSchedule as price_schedule
    participant PricingJob as pricing_pipeline_job
    participant BuildDF as build_dataframe (asset)
    participant LoadPricing as load_pricing_data (asset)

    Scheduler->>PriceSchedule: cron trigger (31 21 * * *)
    PriceSchedule->>PricingJob: trigger job
    PricingJob->>BuildDF: materialize build_dataframe
    PricingJob->>LoadPricing: provide build_pricing_dataframe → run load_pricing_data
    LoadPricing-->>PricingJob: write complete / emit events
    Note right of PricingJob: defs_pricing merged into defs()\nvia dg.Definitions.merge(...)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • 1.7.1 #189 — Overlaps on Pokémon egg-group display changes in cmd/pokemon/pokemon.go; likely touches the same CLI mapping logic.

Poem

🐰 A little hop to bump the version bright,
VPCs and EC2 take their place in sight.
Pricing wakes at night by cron’s soft chime,
Egg groups renamed to make the output rhyme.
Docs now served safe by a careful, non-root bite.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title Check ❓ Inconclusive The PR title "1.7.2" is a bare version number that lacks meaningful context about the changeset. While version bumps from v1.7.1 to v1.7.2 do appear across multiple files (.github/workflows/ci.yml, .goreleaser.yml, Dockerfile, nfpm.yaml), the PR contains far more substantial changes including new AWS infrastructure provisioning (EC2, VPC, RDS), significant Dagster pipeline refactoring, egg group name mapping fixes, and documentation updates. The title uses only a non-descriptive version identifier without conveying what the PR actually accomplishes or what a teammate scanning the history should understand about these changes. Consider revising the title to capture the primary scope of changes. For example, a title like "Release v1.7.2: Add infrastructure provisioning and pipeline orchestration" or "v1.7.2 release with EC2/VPC infrastructure and pricing pipeline" would better communicate the main objectives and help reviewers and future maintainers understand the PR's purpose at a glance.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 1.7.2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Oct 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
cmd/pokemon/pokemon.go 87.50% <100.00%> (+0.79%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (5)
docs/Infrastructure_Guide/aws.md (2)

164-171: Reduce repetitive sentence structure (line 165+).

Multiple consecutive sentences begin with "Under", creating repetitive phrasing. Consider rewording for variety.

Example refactoring:

 6. Under **Key Pair (login)**, select a key pair or create a new one. If a new one is created, check for the `.pem` file in the downloads folder.
 7. Under **Network Settings**:
    * Select the VPC created earlier.
    * Switch to a _public_ subnet to allow connection to the virtual machine.
    * Enable _Auto-assign public IP_.
    * For the _Firewall_, select the default security group that should've been created when setting up the VPC.
-8. Under **Configure Storage**, leave as default.
-9. Under **Advanced Details**, lease as default.
+8. For **Configure Storage**, leave as default.
+9. For **Advanced Details**, leave as default.

208-221: Use consistent code block style (MD046).

Line 208 uses a fenced code block (```) within what appears to be an indented context. Ensure consistency with the rest of the document's code block style.

card_data/infrastructure/aws/ec2/outputs.tf (1)

1-3: Simplify the output value using direct attribute reference.

The string interpolation syntax is unnecessary and less idiomatic in modern Terraform. Simplify by removing the string interpolation.

-output "aws_instance_tfer--i-01dbf82e21c0da38f_dagster-webserver_id" {
-  value = "${aws_instance.tfer--i-01dbf82e21c0da38f_dagster-webserver.id}"
+output "aws_instance_tfer--i-01dbf82e21c0da38f_dagster-webserver_id" {
+  value = aws_instance.tfer--i-01dbf82e21c0da38f_dagster-webserver.id
 }
card_data/infrastructure/aws/ec2/instance.tf (1)

14-20: Remove redundant tags_all block.

The tags_all attribute is managed by the Terraform AWS provider and should not be explicitly set. It automatically merges tags with any provider-level default tags.

card_data/pipelines/definitions.py (1)

14-15: Reorder definitions for better readability.

defs_pricing is referenced on line 15 but defined later at line 37. While this works due to late binding, defining defs_pricing before the defs() function would improve code clarity.

Apply this diff to reorder the definitions:

+price_schedule = dg.ScheduleDefinition(
+    name="price_schedule",
+    cron_schedule="10 10 * * *",
+    target=[build_dataframe, load_pricing_data],
+    execution_timezone="America/Los_Angeles",
+)
+
+defs_pricing = dg.Definitions(
+    schedules=[price_schedule],
+)
+
 @definitions
 def defs():
     folder_defs = load_from_defs_folder(project_root=Path(__file__).parent.parent)
     return dg.Definitions.merge(folder_defs, defs_pricing)
 
 dbt_project_directory = Path(__file__).absolute().parent / "poke_cli_dbt"
-# ... rest of file
-
-price_schedule = dg.ScheduleDefinition(
-    name="price_schedule",
-    cron_schedule="10 10 * * *",
-    target=[build_dataframe, load_pricing_data],
-    execution_timezone="America/Los_Angeles",
-)
-
-defs_pricing = dg.Definitions(
-    schedules=[price_schedule],
-)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a3555a and 463f283.

📒 Files selected for processing (31)
  • .github/workflows/ci.yml (1 hunks)
  • .goreleaser.yml (1 hunks)
  • Dockerfile (1 hunks)
  • README.md (2 hunks)
  • card_data/infrastructure/aws/ec2/.terraform.lock.hcl (1 hunks)
  • card_data/infrastructure/aws/ec2/instance.tf (1 hunks)
  • card_data/infrastructure/aws/ec2/outputs.tf (1 hunks)
  • card_data/infrastructure/aws/ec2/provider.tf (1 hunks)
  • card_data/infrastructure/aws/rds/db_instance.tf (3 hunks)
  • card_data/infrastructure/aws/rds/db_subnet_group.tf (1 hunks)
  • card_data/infrastructure/aws/rds/provider.tf (1 hunks)
  • card_data/infrastructure/aws/rds/variables.tf (1 hunks)
  • card_data/infrastructure/aws/vpc/.terraform.lock.hcl (1 hunks)
  • card_data/infrastructure/aws/vpc/outputs.tf (1 hunks)
  • card_data/infrastructure/aws/vpc/provider.tf (1 hunks)
  • card_data/infrastructure/aws/vpc/vpc.tf (1 hunks)
  • card_data/pipelines/definitions.py (2 hunks)
  • card_data/pipelines/poke_cli_dbt/dbt_project.yml (1 hunks)
  • cmd/pokemon/pokemon.go (1 hunks)
  • docs/Dockerfile (1 hunks)
  • docs/Infrastructure_Guide/aws.md (2 hunks)
  • docs/Infrastructure_Guide/dbt.md (1 hunks)
  • nfpm.yaml (1 hunks)
  • testdata/main_latest_flag.golden (1 hunks)
  • testdata/pokemon_defense.golden (1 hunks)
  • testdata/pokemon_defense_ability_immunities.golden (1 hunks)
  • testdata/pokemon_image.golden (1 hunks)
  • testdata/pokemon_image_flag_non-valid_size.golden (1 hunks)
  • testdata/pokemon_no_flags_dual_type.golden (1 hunks)
  • testdata/pokemon_regional_form.golden (1 hunks)
  • testdata/pokemon_stats.golden (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
card_data/pipelines/definitions.py (2)
card_data/pipelines/defs/extract/extract_pricing_data.py (1)
  • build_dataframe (94-110)
card_data/pipelines/defs/load/load_pricing_data.py (1)
  • load_pricing_data (14-26)
🪛 Checkov (3.2.334)
card_data/infrastructure/aws/ec2/instance.tf

[medium] 1-21: Ensure that detailed monitoring is enabled for EC2 instances

(CKV_AWS_126)


[high] 1-21: Ensure Instance Metadata Service Version 1 is not enabled

(CKV_AWS_79)

🪛 LanguageTool
docs/Infrastructure_Guide/aws.md

[grammar] ~150-~150: Ensure spelling is correct
Context: ...Costs**, make any changes if necessary, they click on create database. --- ## EC...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~165-~165: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Instance Type*, choose t2.small. 6. Under Key Pair (login), select a key pair...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.18.1)
docs/Infrastructure_Guide/aws.md

155-155: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


208-208: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)

🔇 Additional comments (33)
card_data/infrastructure/aws/rds/provider.tf (1)

9-12: Verify Terraform Cloud workspace migration strategy.

The workspace name changed from "poke-cli" to "rds". This separates RDS state into its own workspace, which aligns with the broader multi-module setup (ec2, vpc, rds each in separate workspaces). Ensure that:

  1. Any existing state in the "poke-cli" workspace has been properly migrated or backed up.
  2. Team members and CI/CD pipelines are aware of this workspace change.
  3. The Terraform Cloud organization has this workspace created.
card_data/infrastructure/aws/rds/db_subnet_group.tf (1)

5-8: Verify the impact of reduced subnet count on RDS availability.

The subnet list was reduced from 4 to 2 subnets. Ensure that:

  1. The remaining subnets span multiple availability zones (for multi-AZ RDS resilience).
  2. The subnet group still supports the RDS deployment strategy (particularly if multi-AZ is enabled in db_instance.tf).
  3. There is no unintended impact on RDS failover or recovery capabilities.
docs/Infrastructure_Guide/aws.md (1)

159-229: EC2 documentation is well-structured and comprehensive.

The new EC2 section provides clear, step-by-step instructions for launching instances, connecting, and configuration. The integration with RDS setup and Dagster deployment is logical and helpful.

card_data/infrastructure/aws/ec2/provider.tf (1)

1-20: Terraform provider configuration is properly structured.

The EC2 provider configuration follows the established pattern across RDS and VPC modules with consistent versioning (~> 6.7.0) and workspace organization.

card_data/infrastructure/aws/vpc/.terraform.lock.hcl (1)

1-25: Terraform lockfile is properly auto-generated.

This lockfile ensures reproducible Terraform provider versions and verification across environments. The provider version 6.7.0 matches the version constraint in provider.tf.

card_data/infrastructure/aws/rds/db_instance.tf (3)

23-23: Verify variable reference consistency with variables.tf.

Lines 23 and 35 reference var.kms_arn_key, which appears to be renamed from kms-arn-key (hyphen to underscore). Ensure that card_data/infrastructure/aws/rds/variables.tf defines this variable with the new name kms_arn_key.

Also applies to: 35-35


53-53: Verify the addition of the second security group.

Line 53 adds a second security group (sg-09ff0b46e3dd7a843) to the RDS instance. Clarify:

  1. The purpose of this additional security group (e.g., EC2-to-RDS connectivity, additional restrictions).
  2. Whether this aligns with the infrastructure changes in the VPC and EC2 modules.
  3. Any security policy implications (e.g., broader ingress rules, compliance).

1-54: RDS instance configuration is well-maintained.

The database configuration remains consistent with high-availability standards (encryption, monitoring, backups). The variable and security group updates enhance the infrastructure setup.

card_data/infrastructure/aws/ec2/.terraform.lock.hcl (1)

1-25: Terraform lockfile is properly auto-generated.

This lockfile ensures reproducible EC2 provider downloads with verified hashes. The provider version 6.7.0 is consistent across all modules (VPC, RDS, EC2).

card_data/infrastructure/aws/vpc/outputs.tf (1)

1-3: VPC output is properly defined.

The output correctly exposes the VPC ID for use by other modules (RDS, EC2). The naming follows the Terraform-generated resource pattern.

card_data/infrastructure/aws/rds/variables.tf (1)

1-5: Variable naming convention properly updated.

The renaming from kebab-case to snake_case aligns with Terraform conventions. Ensure all references (such as in db_instance.tf) have been updated to use var.kms_arn_key.

card_data/infrastructure/aws/vpc/provider.tf (1)

1-20: Terraform Cloud and provider configuration looks good.

The provider setup is well-structured with proper workspace organization and provider version pinning. Ensure the Terraform Cloud organization (digitalghost-dev) and workspace (poke-cli/vpc) are properly configured before applying.

testdata/pokemon_no_flags_dual_type.golden (1)

12-12: Test data aligns with the new egg group mapping.

This change correctly reflects the updated display name mapping from "no-eggs" to "Undiscovered" implemented in cmd/pokemon/pokemon.go (line 110).

testdata/pokemon_regional_form.golden (1)

12-12: Test data aligns with the new egg group mapping.

This change correctly reflects the updated display name mapping from "plant" to "Grass" implemented in cmd/pokemon/pokemon.go (line 109).

nfpm.yaml (1)

4-4: Version bump is correct and consistent.

The version update from v1.7.1 to v1.7.2 aligns with the version bumps in .github/workflows/ci.yml (line 33), .goreleaser.yml (line 17), and README.md.

.github/workflows/ci.yml (1)

33-33: Version bump is correct and consistent.

The VERSION_NUMBER update from v1.7.1 to v1.7.2 correctly propagates throughout the CI workflow for package building, Docker tagging, and SBOM generation. This aligns with version updates in nfpm.yaml (line 4) and .goreleaser.yml (line 17).

.goreleaser.yml (1)

17-17: Version bump is correct and consistent.

The ldflags version update from v1.7.1 to v1.7.2 ensures the binary reports the correct version at runtime. This aligns with version updates in .github/workflows/ci.yml (line 33) and nfpm.yaml (line 4).

docs/Infrastructure_Guide/dbt.md (1)

15-15: Good security practice updating the link to HTTPS.

Changing the external link from HTTP to HTTPS is a security improvement that ensures encrypted communication when users follow the documentation link.

cmd/pokemon/pokemon.go (1)

105-118: Clean implementation of egg group name mapping.

The mapping approach correctly standardizes egg group display names to match official Pokémon terminology while maintaining backward compatibility through the title-case fallback for unmapped names. The logic is straightforward and maintainable.

README.md (4)

5-5: Docker badge updated to reflect the new version.

The image size badge correctly references v1.7.2, consistent with the version bumps in .github/workflows/ci.yml (line 33) and other configuration files.


8-10: CodeRabbit badge addition enhances project visibility.

The centered badge block for CodeRabbit PR metrics is properly formatted and follows the existing badge styling pattern in the README.


97-97: Docker usage example updated to the new version.

The docker run command correctly references the v1.7.2 image tag, ensuring users pull the latest version.


101-101: Docker usage example updated to the new version.

The interactive docker run command with entrypoint correctly references the v1.7.2 image tag, maintaining consistency with the other usage examples.

Dockerfile (1)

11-11: LGTM! Version bump is consistent.

The version update from v1.7.1 to v1.7.2 aligns with the PR objectives.

card_data/pipelines/poke_cli_dbt/dbt_project.yml (2)

2-2: Verify the version skip from 1.7.0 to 1.7.2.

The version jumps from '1.7.0' directly to '1.7.2', skipping '1.7.1'. Ensure this is intentional and that no intermediate version update was missed.


21-21: Verify the materialization strategy change.

The addition of +materialized: table changes how dbt models are persisted. Ensure this change is intentional and that it has been tested, as it may impact database storage and query performance.

docs/Dockerfile (1)

4-5: Excellent security hardening!

The introduction of a non-root user (docsuser) with explicit UID/GID in both build and runtime stages significantly improves the security posture of the documentation container. The use of --no-cache-dir for pip installations and proper ownership management via --chown flags are best practices.

Also applies to: 9-9, 11-12, 14-14, 16-16, 24-25, 29-29, 31-31

testdata/main_latest_flag.golden (1)

4-4: Verify the version mismatch in test data.

The test data shows v1.7.1 as the latest available version, while the PR bumps the actual version to v1.7.2. If this test validates version checking against a remote source, this discrepancy is expected. However, confirm that this golden file will be updated after the v1.7.2 release or if it should reflect the current version.

testdata/pokemon_image_flag_non-valid_size.golden (1)

12-12: LGTM! Egg group display mapping update.

The change from "Ground" to "Field" is consistent with the egg group display name mapping implemented across multiple test files in this PR.

testdata/pokemon_stats.golden (1)

12-12: LGTM! Improved egg group display name.

The change from "Humanshape" to "Human-Like" improves readability and aligns with the systematic egg group display name updates in this PR.

testdata/pokemon_defense_ability_immunities.golden (1)

12-12: LGTM! More descriptive egg group naming.

The change from "Indeterminate" to "Amorphous" is more descriptive and consistent with the egg group display name mapping updates across this PR.

testdata/pokemon_defense.golden (1)

12-12: LGTM! Egg group display update with alphabetical ordering.

The change from "Dragon, Indeterminate" to "Amorphous, Dragon" reflects both the egg group name mapping ("Indeterminate" → "Amorphous") and alphabetical reordering. This is consistent with the systematic egg group display improvements across the PR.

testdata/pokemon_image.golden (1)

13-13: Egg group mapping verified and confirmed correct.

The golden file update is accurate. The implementation in cmd/pokemon/pokemon.go contains an explicit mapping ("ground": "Field") that transforms the API's egg group value to the display name. The change from "Ground" to "Field" correctly reflects this modernized egg group naming logic and aligns with the codebase.

@digitalghost-dev digitalghost-dev linked an issue Oct 19, 2025 that may be closed by this pull request
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/Infrastructure_Guide/aws.md (1)

154-155: Fix Markdown heading formatting (MD036).

The subtitle uses emphasis instead of a proper Markdown heading structure. This was flagged in the previous review and should be resolved.

Apply one of the following fixes:

Option 1 (preferred): Use a proper subheading:

 ## EC2
-_Elastic Compute Cloud_
+
+### Elastic Compute Cloud

Option 2: Move to a single-line heading format:

-## EC2
-_Elastic Compute Cloud_
+## EC2 - Elastic Compute Cloud
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 463f283 and 488d96e.

📒 Files selected for processing (1)
  • docs/Infrastructure_Guide/aws.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/Infrastructure_Guide/aws.md

[style] ~165-~165: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Instance Type*, choose t2.small. 6. Under Key Pair (login), select a key pair...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.18.1)
docs/Infrastructure_Guide/aws.md

155-155: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


208-208: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)

🔇 Additional comments (1)
docs/Infrastructure_Guide/aws.md (1)

150-150: Grammar fix approved.

The "they click" → "then click" correction in step 13 is correct and improves readability.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 488d96e and 812c234.

📒 Files selected for processing (2)
  • card_data/pipelines/definitions.py (2 hunks)
  • card_data/pipelines/defs/load/load_pricing_data.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
card_data/pipelines/defs/load/load_pricing_data.py (1)
card_data/pipelines/utils/secret_retriever.py (1)
  • fetch_secret (8-18)
card_data/pipelines/definitions.py (2)
card_data/pipelines/defs/extract/extract_pricing_data.py (1)
  • build_dataframe (94-110)
card_data/pipelines/defs/load/load_pricing_data.py (1)
  • load_pricing_data (14-25)
🔇 Additional comments (5)
card_data/pipelines/defs/load/load_pricing_data.py (1)

2-2: LGTM: Polars import added for type annotations.

The import is necessary for the pl.DataFrame type annotation in the function signature.

card_data/pipelines/definitions.py (4)

30-34: Job definition looks correct, pending asset verification.

The pricing_pipeline_job is properly structured using define_asset_job with an asset selection. This addresses the previous review feedback about targeting a job instead of plain functions.

However, ensure that build_dataframe is actually a Dagster asset (see verification script in the previous comment).


36-41: LGTM: Schedule properly targets the job.

The schedule definition correctly targets pricing_pipeline_job instead of plain functions, addressing the critical issue raised in previous reviews. The schedule will run daily at 10:10 AM Pacific time.


43-47: LGTM: Pricing definitions properly structured.

The defs_pricing object correctly includes all necessary components (assets, jobs, schedules), addressing the previous review feedback about including the assets in the definitions.


8-9: The review comment is incorrect—build_dataframe is properly decorated as a Dagster asset.

The verification shows that build_dataframe is decorated with @dg.asset(kinds={"API", "Polars", "Pydantic"}, name="build_pricing_dataframe") on line 93 of card_data/pipelines/defs/extract/extract_pricing_data.py. The decorator is present, so the AssetSelection.assets() call and assets list will work correctly at runtime.

Likely an incorrect or invalid review comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
card_data/pipelines/defs/load/load_pricing_data.py (1)

14-14: Dagster input mismatch: parameter name must be build_dataframe.

Auto-dependency resolution won’t wire from the build_dataframe asset. Rename param and usage.

Apply:

- def load_pricing_data(build_pricing_dataframe: pl.DataFrame) -> None:
+ def load_pricing_data(build_dataframe: pl.DataFrame) -> None:
@@
-        build_pricing_dataframe.write_database(
+        build_dataframe.write_database(
             table_name=table_name, connection=database_url, if_table_exists="replace"
         )

Also applies to: 19-21

🧹 Nitpick comments (4)
card_data/pipelines/poke_cli_dbt/models/sources.yml (1)

99-101: Avoid duplicate asset_key mapping (pick one source of truth).

This meta maps to the same AssetKey as the CustomDbtTranslator. Keep either the YAML meta or the translator override to prevent future drift; both is redundant.

card_data/pipelines/defs/transformation/transform_data.py (2)

13-23: Single mapping location for DBT source -> Dagster AssetKey.

You map sources here and also via YAML meta. Keep only one (prefer translator) to avoid divergence.


31-36: Clarify scope: function name vs behavior.

The asset is named “dbt_load_pricing_data” but runs dbt build for the entire project. Either rename to reflect scope or restrict selection.

Example rename:

-@dbt_assets(
+@dbt_assets(
     manifest=DBT_PROJECT_PATH / "target" / "manifest.json",
     dagster_dbt_translator=CustomDbtTranslator()
 )
-def dbt_load_pricing_data(context: dg.AssetExecutionContext, dbt: DbtCliResource):
+def dbt_project_assets(context: dg.AssetExecutionContext, dbt: DbtCliResource):
     ...
-dbt_resource = DbtCliResource(project_dir=DBT_PROJECT_PATH)
-defs = dg.Definitions(
-    assets=[dbt_load_pricing_data],
+dbt_resource = DbtCliResource(project_dir=DBT_PROJECT_PATH)
+defs = dg.Definitions(
+    assets=[dbt_project_assets],
     resources={"dbt": dbt_resource}
 )

Or restrict to pricing-only selection via dbt CLI args (e.g., select flag).

Also applies to: 39-41

card_data/pipelines/defs/load/load_pricing_data.py (1)

15-21: Prefer explicit schema over dotted table name.

Use schema="staging" and plain table name to avoid driver-quoting issues.

-    table_name: str = "staging.pricing_data"
+    table_name: str = "pricing_data"
@@
-        build_dataframe.write_database(
-            table_name=table_name, connection=database_url, if_table_exists="replace"
-        )
+        build_dataframe.write_database(
+            table_name=table_name,
+            connection=database_url,
+            if_table_exists="replace",
+            schema="staging",
+        )

If you’re on polars 1.34.0+, this signature is supported; confirm locally. Based on learnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 812c234 and bd9cf97.

📒 Files selected for processing (4)
  • card_data/pipelines/definitions.py (1 hunks)
  • card_data/pipelines/defs/load/load_pricing_data.py (1 hunks)
  • card_data/pipelines/defs/transformation/transform_data.py (1 hunks)
  • card_data/pipelines/poke_cli_dbt/models/sources.yml (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
card_data/pipelines/defs/transformation/transform_data.py (1)
card_data/pipelines/definitions.py (1)
  • defs (12-15)
card_data/pipelines/defs/load/load_pricing_data.py (1)
card_data/pipelines/utils/secret_retriever.py (1)
  • fetch_secret (8-18)
card_data/pipelines/definitions.py (2)
card_data/pipelines/defs/extract/extract_pricing_data.py (1)
  • build_dataframe (94-110)
card_data/pipelines/defs/load/load_pricing_data.py (1)
  • load_pricing_data (14-25)
🔇 Additional comments (3)
card_data/pipelines/definitions.py (3)

13-16: Merging dbt folder defs with pricing defs: LGTM.


23-28: Schedule and defs registration: LGTM.

The schedule targets the job and assets are registered in defs_pricing.

Optionally validate timezone/Cron in staging before prod flip.

Also applies to: 30-34


18-21: Asset decorations verified—no issues found.

Both build_dataframe and load_pricing_data are properly decorated with @dg.asset. The job selection will correctly recognize these assets and build the dependency graph as intended.

@digitalghost-dev digitalghost-dev merged commit cb66d36 into main Oct 20, 2025
5 of 6 checks passed
@digitalghost-dev digitalghost-dev deleted the 1.7.2 branch October 20, 2025 04:47
@coderabbitai coderabbitai bot mentioned this pull request Nov 3, 2025
@coderabbitai coderabbitai bot mentioned this pull request Nov 25, 2025
@coderabbitai coderabbitai bot mentioned this pull request Dec 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Manully update Egg Group names to match modern games

2 participants