Initial S3 timeseries tests #108

Merged
abkfenris merged 5 commits into main from initial_s3_pipeline on Mar 4, 2026
abkfenris (Member) commented Dec 15, 2025

Adds some initial scaffolding for testing S3 timeseries pipelines.

This pipeline is likely to end up pretty complex in order to handle all the different ways that our data providers will provide us data, so we need to be able to make sure we don’t have major regressions in existing data processing when we add new capabilities.

This also adds Makefile commands to help with testing: run make test-all to run all of our tests, or make test-<thing> to test something specific.

Works on #101


This is part 1 of 3 in a stack made with GitButler:

codecov (bot) commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 79.57746% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.34%. Comparing base (3f64732) to head (5b284b9).
⚠️ Report is 29 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines                          Patch %   Lines
pipeline/s3_timeseries/pipeline.py                72.00%    14 Missing ⚠️
pipeline/s3_timeseries/tests/test_empire_met.py   80.82%    14 Missing ⚠️
common/test_utils/sensors.py                      83.33%    1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (50.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #108       +/-   ##
===========================================
+ Coverage   53.20%   76.34%   +23.13%     
===========================================
  Files          19       29       +10     
  Lines         374      816      +442     
  Branches       19       21        +2     
===========================================
+ Hits          199      623      +424     
- Misses        171      191       +20     
+ Partials        4        2        -2     
Flag            Coverage Δ
common          52.23% <0.00%> (-0.98%) ⬇️
hohonu          63.00% <50.00%> (?)
s3_timeseries   63.29% <79.57%> (?)

abkfenris force-pushed the initial_s3_pipeline branch from 37b39e5 to f020a50 on January 9, 2026 01:35
abkfenris added this to the Alpha milestone on Jan 16, 2026
abkfenris (Member, Author) commented

@cheryldmorse I didn't feel like I got this to a ready-to-review/land state before switching over to #109, so let me know if you have any questions, want to take it over, or want me to do anything specific before you run with #101.

abkfenris (Member, Author) commented Jan 16, 2026

Ok, now at least the tests are running enough to fail instead of failing because they weren't found.

xref #119

abkfenris force-pushed the initial_s3_pipeline branch from 20316dc to e48de07 on January 24, 2026 22:22
abkfenris marked this pull request as ready for review on January 24, 2026 22:27
Copilot AI left a comment

Pull request overview

This PR adds initial testing infrastructure for S3 timeseries pipelines, including test scaffolding, sensor implementation for detecting new S3 files, and Makefile commands to facilitate testing across the project.

Changes:

  • Adds comprehensive test infrastructure with pytest configuration, fixtures, and test data for Empire met station
  • Implements S3 sensor to automatically detect and process new files uploaded to S3 buckets
  • Migrates from pixi.toml to pyproject.toml for better Python ecosystem integration
  • Adds Makefile targets (test-all, test-s3-timeseries, etc.) for easier test execution
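
For readers unfamiliar with the pattern, a sensor that detects new files typically remembers a cursor and only reports objects newer than it. The block below is a minimal sketch of that idea in plain Python, not the code in pipeline.py. The names S3Object and find_new_keys, the example keys, and the choice of last-modified time as the cursor are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class S3Object:
    """Hypothetical stand-in for one entry of an S3 bucket listing."""

    key: str
    last_modified: datetime


def find_new_keys(
    objects: Iterable[S3Object], cursor: Optional[datetime]
) -> tuple[list[str], Optional[datetime]]:
    """Return keys modified after `cursor` (oldest first) and the advanced cursor."""
    fresh = sorted(
        (o for o in objects if cursor is None or o.last_modified > cursor),
        key=lambda o: (o.last_modified, o.key),
    )
    # Keep the old cursor when nothing new was found.
    new_cursor = fresh[-1].last_modified if fresh else cursor
    return [o.key for o in fresh], new_cursor


# Hypothetical listing; the key names are made up for this example.
listing = [
    S3Object("empire_met/2026-01-02.csv", datetime(2026, 1, 2, tzinfo=timezone.utc)),
    S3Object("empire_met/2026-01-01.csv", datetime(2026, 1, 1, tzinfo=timezone.utc)),
]
keys, cursor = find_new_keys(listing, cursor=None)   # first tick: both keys are new
again, _ = find_new_keys(listing, cursor=cursor)     # second tick: nothing new
```

One caveat of this sketch: the strict comparison can miss an object that shares its timestamp with the cursor but is listed on a later tick, so a real sensor may prefer to track the set of seen keys as well.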

Reviewed changes

Copilot reviewed 13 out of 16 changed files in this pull request and generated 15 comments.

File                                                            Description
pipeline/s3_timeseries/tests/test_empire_met.py                 New test file with fixtures and tests for sensor, daily, and monthly assets
pipeline/s3_timeseries/tests/test_data/empire_met/*.csv, *.nc   Snapshot test data files for Empire met station
pipeline/s3_timeseries/tests/conftest.py                        Pytest configuration adding --aws flag for integration tests
pipeline/s3_timeseries/pyproject.toml                           New project config replacing pixi.toml with dependencies and pytest settings
pipeline/s3_timeseries/pixi.toml                                Removed in favor of pyproject.toml
pipeline/s3_timeseries/pipeline.py                              Adds S3 sensor implementation and data processing improvements
pipeline/s3_timeseries/Dockerfile                               Updated to copy pyproject.toml and test files
common/test_utils/sensors.py                                    New utility for extracting sensors from Definitions
common/test_utils/__init__.py                                   Exports get_sensor_by_name utility
Makefile                                                        Adds test-common, test-hohonu, test-s3-timeseries, and test-all targets
README.md                                                       Updated testing documentation with new commands
.gitignore                                                      Adds junit.xml to ignored files
.github/workflows/pipeline_s3_timeseries.yml                    Enables test execution with coverage reporting in CI
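
The --aws flag mentioned for conftest.py is not shown on this page. A common way to add such an opt-in flag follows the pattern from the pytest documentation for skipping tests by command-line option. The sketch below assumes a marker named aws, which is a guess and may not match what the PR actually uses.

```python
import pytest


def pytest_addoption(parser):
    """Register an opt-in --aws flag that is off by default."""
    parser.addoption(
        "--aws",
        action="store_true",
        default=False,
        help="run integration tests that talk to real AWS services",
    )


def pytest_configure(config):
    # Declare the (assumed) marker so strict marker checking accepts it.
    config.addinivalue_line("markers", "aws: test needs real AWS access")


def pytest_collection_modifyitems(config, items):
    """Skip every test marked `aws` unless --aws was passed."""
    if config.getoption("--aws"):
        return
    skip_aws = pytest.mark.skip(reason="needs --aws option to run")
    for item in items:
        if "aws" in item.keywords:
            item.add_marker(skip_aws)
```

Under these assumptions, a test decorated with @pytest.mark.aws is skipped by a plain pytest run and only executes when pytest --aws is used, which keeps the default test run free of network and credential requirements.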

client = boto3.client(
    "s3",
    aws_access_key_id=s3_credentials.access_key_id,
    aws_secret_access_key=s3_credentials.secret_access_key,
Copilot AI Jan 25, 2026

The boto3 client is created without specifying a region_name parameter. This could lead to unexpected behavior or errors depending on the environment configuration. The S3FSResource fixture uses "us-east-1", but the sensor's boto3 client on lines 241-245 does not specify a region, which is inconsistent. Consider adding a region_name parameter to the boto3.client() call for consistency and to ensure the sensor works correctly in all environments.

Suggested change
-     aws_secret_access_key=s3_credentials.secret_access_key,
+     aws_secret_access_key=s3_credentials.secret_access_key,
+     region_name="us-east-1",
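
One way to keep the region consistent between the fixture and the sensor is to build the client arguments in a single helper. This is only a sketch of that idea: S3Credentials, s3_client_kwargs, and make_s3_client are hypothetical names, and the "us-east-1" default is taken from the suggestion above rather than from the project's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Credentials:
    """Hypothetical stand-in for the `s3_credentials` object in the diff."""

    access_key_id: str
    secret_access_key: str


def s3_client_kwargs(creds: S3Credentials, region_name: str = "us-east-1") -> dict:
    """Collect the boto3 client arguments in one place, region included."""
    return {
        "service_name": "s3",
        "aws_access_key_id": creds.access_key_id,
        "aws_secret_access_key": creds.secret_access_key,
        "region_name": region_name,
    }


def make_s3_client(creds: S3Credentials, region_name: str = "us-east-1"):
    """Create an S3 client that always has an explicit region."""
    # Imported lazily so s3_client_kwargs can be unit-tested without boto3.
    import boto3

    return boto3.client(**s3_client_kwargs(creds, region_name))


# Placeholder credentials, for illustration only.
kwargs = s3_client_kwargs(S3Credentials("EXAMPLE_KEY_ID", "EXAMPLE_SECRET"))
```

With a helper like this, the region is stated once and any caller that needs a different one passes region_name explicitly instead of relying on the environment.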


Copilot AI commented Jan 25, 2026

@abkfenris I've opened a new pull request, #122, to work on those changes. Once the pull request is ready, I'll request review from you.

abkfenris merged commit 09aa7b6 into main on Mar 4, 2026
13 of 16 checks passed
abkfenris deleted the initial_s3_pipeline branch on March 4, 2026 19:54

3 participants