
Conversation


@joseph-sentry joseph-sentry commented Aug 1, 2025

Depends on #354.

  • remove the TA GQL read from the rollup function and its tests
  • remove the code that did TA GCS rollups from Timescale
    • since the API is reading directly from Timescale now, we don't need these rollups anymore, so I'm deleting the related code
  • remove the old CAs
  • update the retention policy on the new CAs
    • we don't need to store hourly CA data for all time, only more recent data, so we want to limit retention on the new CAs (see the sketch below)
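
For context, a retention policy on a TimescaleDB continuous aggregate can be added from a Django migration along these lines. This is only a sketch: the CA name matches the daily aggregate referenced elsewhere in this PR, but the 60-day interval, app label, and dependency are illustrative assumptions, not the values chosen here.

from django.db import migrations

class Migration(migrations.Migration):
    # app label and dependency are placeholders for illustration
    dependencies = [("ta_timeseries", "0001_initial")]

    operations = [
        migrations.RunSQL(
            # drop chunks of the continuous aggregate once they fall
            # outside the retention window
            sql=(
                "SELECT add_retention_policy("
                "'ta_timeseries_test_aggregate_daily', "
                "drop_after => INTERVAL '60 days', if_not_exists => true);"
            ),
            reverse_sql=(
                "SELECT remove_retention_policy("
                "'ta_timeseries_test_aggregate_daily', if_exists => true);"
            ),
        ),
    ]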

@joseph-sentry joseph-sentry requested a review from a team as a code owner August 1, 2025 21:21

codecov-notifications bot commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 85.63218% with 25 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
apps/codecov-api/utils/timescale/test_results.py 77.41% 14 Missing ⚠️
libs/shared/shared/django_apps/migration_utils.py 28.57% 5 Missing ⚠️
...pps/codecov-api/utils/timescale/timescale_utils.py 90.32% 3 Missing ⚠️
...ps/codecov-api/utils/timescale/flake_aggregates.py 91.66% 2 Missing ⚠️
...pps/codecov-api/utils/timescale/test_aggregates.py 96.29% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (85.63%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

📢 Thoughts on this report? Let us know!


codecov bot commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 85.63218% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.27%. Comparing base (5f7dee6) to head (f162514).
⚠️ Report is 13 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
apps/codecov-api/utils/timescale/test_results.py 77.41% 14 Missing ⚠️
libs/shared/shared/django_apps/migration_utils.py 28.57% 5 Missing ⚠️
...pps/codecov-api/utils/timescale/timescale_utils.py 90.32% 3 Missing ⚠️
...ps/codecov-api/utils/timescale/flake_aggregates.py 91.66% 2 Missing ⚠️
...pps/codecov-api/utils/timescale/test_aggregates.py 96.29% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #373      +/-   ##
==========================================
+ Coverage   94.17%   94.27%   +0.10%     
==========================================
  Files        1270     1275       +5     
  Lines       47088    46978     -110     
  Branches     1497     1497              
==========================================
- Hits        44344    44289      -55     
+ Misses       2438     2383      -55     
  Partials      306      306              
Flag Coverage Δ
apiunit 96.41% <86.57%> (+0.24%) ⬆️
sharedintegration 39.76% <8.69%> (-0.26%) ⬇️
sharedunit 88.90% <78.26%> (-0.04%) ⬇️
workerintegration 60.59% <100.00%> (-0.09%) ⬇️
workerunit 90.80% <100.00%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


codspeed-hq bot commented Aug 1, 2025

CodSpeed Performance Report

Merging #373 will not alter performance

Comparing joey/tidy (f162514) with main (5f7dee6)

Summary

✅ 9 untouched benchmarks

I made a mistake in a previous migration: I wanted to create an index using Timescale's transaction_per_chunk option, its alternative to CREATE INDEX CONCURRENTLY, and since we let Django manage the hypertables I created the index on the model, but Django's migration state was never updated.

So I'm introducing ts_create_index_on_managed_hypertable to avoid this situation in the future, plus a migration to fix the earlier mistake.
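
The real helper lives in libs/shared/shared/django_apps/migration_utils.py and may well differ; the sketch below only illustrates the idea: run the raw transaction_per_chunk index creation against the database while recording a matching AddIndex in Django's migration state so the two can't drift apart again. The function signature, table_name argument, and field handling are assumptions.

from django.db import migrations

def ts_create_index_on_managed_hypertable(model_name, table_name, index):
    """Sketch: create an index on a Timescale hypertable chunk by chunk
    while keeping Django's migration state in sync. The signature is
    assumed, not the real helper's."""
    columns = ", ".join(index.fields)  # assumes plain ascending fields
    return migrations.SeparateDatabaseAndState(
        database_operations=[
            migrations.RunSQL(
                # transaction_per_chunk is Timescale's alternative to
                # CREATE INDEX CONCURRENTLY; the migration using this
                # operation likely needs atomic = False
                sql=(
                    f"CREATE INDEX IF NOT EXISTS {index.name} "
                    f"ON {table_name} ({columns}) "
                    "WITH (timescaledb.transaction_per_chunk);"
                ),
                reverse_sql=f"DROP INDEX IF EXISTS {index.name};",
            ),
        ],
        state_operations=[
            # record the index in model state so later migrations and
            # makemigrations don't re-detect it
            migrations.AddIndex(model_name=model_name, index=index),
        ],
    )

A migration would then put the returned operation in its operations list, passing the same models.Index instance that the model declares in Meta.indexes.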
This contains the update to the application logic that uses the new CAs we defined in the previous commit.

We're also improving the filtering of skipped tests by using information from the CAs, which lets us avoid running more expensive queries on every API call.

We also use a window function to calculate slow tests instead of subqueries.
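
As an illustration of the window-function approach (not the exact query in this PR; the queryset, field names, and threshold are placeholders), slow tests can be ranked in a single pass like this:

from django.db.models import F, QuerySet, Window
from django.db.models.functions import PercentRank

SLOW_TEST_PERCENTILE = 0.95  # illustrative cutoff, not the value used here

def slow_tests(test_data: QuerySet) -> list:
    """Annotate each test with its duration percentile using one window
    pass instead of a correlated subquery per row (sketch)."""
    ranked = test_data.annotate(
        duration_percentile=Window(
            expression=PercentRank(),
            order_by=F("avg_duration").asc(),
        )
    )
    # filter client-side to keep the sketch simple
    return [t for t in ranked if t.duration_percentile >= SLOW_TEST_PERCENTILE]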

I also chose to refactor the code into modules for each type of data, plus a utils module for code reuse.

Since the API is reading directly from Timescale now, we don't need this anymore, so I'm deleting the related code.

We're also reading from the new CAs, so we can get rid of the old ones.
Comment on lines +241 to +249
match parameter:
case "failed_tests":
test_data = test_data.filter(total_fail_count__gt=0)
case "flaky_tests":
test_data = test_data.filter(total_flaky_fail_count__gt=0)
case "skipped_tests":
test_data = test_data.filter(last_outcome="skip")
case _:
pass

Potential bug: The get_test_results_queryset function applies a redundant filter for "failed_tests" that excludes tests with only flaky failures, leading to incomplete results.
  • Description: The get_test_results_queryset function first calls helper functions that correctly filter for failed tests using Q(total_fail_count__gt=0) | Q(total_flaky_fail_count__gt=0). However, it then applies a second, more restrictive filter total_fail_count__gt=0. This second filter incorrectly removes tests that only have flaky failures (total_flaky_fail_count > 0) but no regular failures (total_fail_count = 0). As a consequence, the "failed_tests" filter will return an incomplete set of results, potentially causing users to overlook certain types of test failures.

  • Suggested fix: Remove the redundant and incorrect filtering logic from the get_test_results_queryset function. The filtering for "failed_tests" should be handled exclusively by the helper functions (get_test_data_queryset_via_ca and get_test_data_queryset_via_testrun), which already implement the correct logic to include both regular and flaky failures.
    severity: 0.6, confidence: 0.95

Did we get this right? 👍 / 👎 to inform future reviews.
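
In other words, per the suggested fix above, only the helper's combined condition should decide what counts as failed. A rough sketch (not the PR's actual code) of that condition:

from django.db.models import Q, QuerySet

def filter_failed(test_data: QuerySet) -> QuerySet:
    # a test counts as failed if it has regular failures OR flaky
    # failures; no extra total_fail_count__gt=0 filter should be
    # re-applied downstream
    return test_data.filter(
        Q(total_fail_count__gt=0) | Q(total_flaky_fail_count__gt=0)
    )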

Comment on lines +348 to +351
FROM ta_timeseries_test_aggregate_daily
WHERE repo_id = %s
AND timestamp_bin >= %s
AND timestamp_bin < %s
AND bucket_daily > %s
AND bucket_daily <= %s

Potential bug: Raw SQL queries in get_flags_new and get_test_suites_new directly reference new database schema without error handling, risking crashes if migrations have not completed.
  • Description: The get_flags_new and get_test_suites_new functions use raw SQL queries that hardcode references to the new ta_timeseries_test_aggregate_daily table and bucket_daily column. These queries are executed via cursor.execute() without any try...except blocks. If this code is deployed before the corresponding database migrations have successfully run, the queries will fail with "table not found" or "column not found" errors. This will cause the GraphQL resolver to crash, resulting in an unhandled error being returned to the user for critical Test Analytics features.

  • Suggested fix: Wrap the cursor.execute() calls within a try...except block to catch potential DatabaseError exceptions. In the except block, log the error and return an empty result or raise a more specific, handled exception to prevent the entire GraphQL request from crashing. This ensures graceful degradation if the database schema is not yet updated.
    severity: 0.8, confidence: 0.98

Did we get this right? 👍 / 👎 to inform future reviews.
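
A minimal sketch of the guarded execution this comment suggests; the connection alias, select list, and logger are placeholders rather than code from this PR:

import logging

from django.db import DatabaseError, connections

log = logging.getLogger(__name__)

TIMESCALE_ALIAS = "ta_timeseries"  # placeholder connection alias

def fetch_daily_aggregates(repo_id, start, end):
    """Run the raw daily-aggregate query, degrading to an empty result if
    the new table or column does not exist yet (sketch of the fix)."""
    query = """
        SELECT bucket_daily
        FROM ta_timeseries_test_aggregate_daily
        WHERE repo_id = %s
          AND bucket_daily > %s
          AND bucket_daily <= %s
    """
    try:
        with connections[TIMESCALE_ALIAS].cursor() as cursor:
            cursor.execute(query, [repo_id, start, end])
            return cursor.fetchall()
    except DatabaseError:
        # migrations may not have created the daily CA yet; log and fail soft
        log.exception("ta_timeseries daily aggregate query failed")
        return []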


@michelletran-sentry michelletran-sentry left a comment


LGTM. Just make sure to test this before shipping.
