Fix eval bug, need to do full hourly eval #28
Merged
This pull request introduces significant improvements to the evaluation and prediction workflow, focusing on enhanced logging for traceability, improved documentation and usage instructions for scripts, and added debugging output for metric computation. Additionally, a new utility script is provided for clearing static predictions in BigQuery. These changes collectively improve transparency, reproducibility, and reliability in the evaluation pipeline.
Logging and Debugging Enhancements
- Added detailed logging to the `get_static_evaluation` and `get_dynamic_evaluation` endpoints in `backend/app/main.py`, including start/end markers, the evaluation period, metric values, and error handling, for better traceability and debugging (see the sketch after this list).
- Added debugging output to `compute_metrics_for_period` in `src/gaca_ews/evaluation/storage.py` to print query parameters, per-horizon metrics, accumulation steps, and final overall metrics, aiding in diagnosing evaluation results.
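A minimal sketch of this endpoint-logging style, assuming the backend is FastAPI (suggested by the `backend/app/main.py` path but not confirmed here); the route path and the `compute_static_metrics` stub are hypothetical stand-ins, not the PR's actual code:

```python
# Sketch only: illustrates start/end markers, period logging, metric
# logging, and error handling on an evaluation endpoint.
import logging

from fastapi import FastAPI, HTTPException

logger = logging.getLogger(__name__)
app = FastAPI()


def compute_static_metrics(start: str, end: str) -> dict:
    # Hypothetical stand-in for the real metric computation.
    return {"rmse": 0.0, "mae": 0.0}


@app.get("/evaluation/static")  # assumed route path
def get_static_evaluation(start: str, end: str) -> dict:
    logger.info("=== get_static_evaluation start ===")
    logger.info("Evaluation period: %s to %s", start, end)
    try:
        metrics = compute_static_metrics(start, end)
        logger.info("Metric values: %s", metrics)
        return metrics
    except Exception:
        logger.exception("get_static_evaluation failed")
        raise HTTPException(status_code=500, detail="Evaluation failed")
    finally:
        logger.info("=== get_static_evaluation end ===")
```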
Script Improvements and New Utilities
- Added `scripts/clear_static_predictions.py` to safely delete static evaluation predictions from BigQuery for a specified period, with row counting, confirmation prompts, and clear user messaging (a sketch of this pattern follows the list).
- Expanded the usage instructions in `scripts/generate_historical_predictions.py`, emphasizing the importance of using hourly intervals (`--interval 1`) to avoid diurnal bias in evaluation metrics.
- Changed the default prediction interval in the CLI (`src/gaca_ews/cli/main.py`) and the script (`scripts/generate_historical_predictions.py`) from 24 hours to 1 hour, with updated help messages and docstrings to guide users toward best practices for unbiased evaluation.
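A sketch of the count-confirm-delete pattern the new utility describes. The table id, the `prediction_time` column, and the CLI flags are assumptions for illustration; only the `google-cloud-bigquery` calls reflect the real library API:

```python
# Sketch only: the actual scripts/clear_static_predictions.py may differ.
import argparse

from google.cloud import bigquery

TABLE = "my_project.my_dataset.static_predictions"  # assumed table id


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Delete static evaluation predictions for a period."
    )
    parser.add_argument("--start", required=True, help="Period start (ISO timestamp)")
    parser.add_argument("--end", required=True, help="Period end (ISO timestamp)")
    args = parser.parse_args()

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start", "TIMESTAMP", args.start),
            bigquery.ScalarQueryParameter("end", "TIMESTAMP", args.end),
        ]
    )
    where = "WHERE prediction_time BETWEEN @start AND @end"

    # Count affected rows first so the user can confirm before deleting.
    count_sql = f"SELECT COUNT(*) AS n FROM `{TABLE}` {where}"
    n = next(iter(client.query(count_sql, job_config=job_config).result())).n
    if n == 0:
        print("No rows in the given period; nothing to delete.")
        return

    if input(f"Delete {n} rows from {TABLE}? [y/N] ").strip().lower() != "y":
        print("Aborted.")
        return

    client.query(f"DELETE FROM `{TABLE}` {where}", job_config=job_config).result()
    print(f"Deleted {n} rows.")


if __name__ == "__main__":
    main()
```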