Commit 2a52d66

[ez][TD] Fix historical failure json generations (#7157)
Example failure: https://github.com/pytorch/test-infra/actions/runs/17611845434/job/50035433293

```
raise OperationalError(err_str) if retried else DatabaseError(err_str) from None
clickhouse_connect.driver.exceptions.DatabaseError: HTTPDriver for *** received ClickHouse error code 241
Error: Process completed with exit code 1.
```

When running in the console, it OOMs.

My solution is to filter the workflow tables so there are fewer things to scan and join.
1 parent 3536f9a commit 2a52d66

2 files changed, +4 -0 lines changed

tools/torchci/td/historical_class_failure_correlation.py

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@
 where
 t.file is not null
 and t.time_inserted > CURRENT_TIMESTAMP() - interval 90 day
+# Slightly more relaxed time window just in case
+and j.started_at > now() - interval 100 day
 """

tools/torchci/td/historical_file_failure_correlation.py

Lines changed: 2 additions & 0 deletions
@@ -19,6 +19,8 @@
 where
 t.metric_name = 'td_test_failure_stats_v2'
 and t.timestamp > CURRENT_TIMESTAMP() - interval 90 day
+# Slightly more relaxed time window just in case
+and w.created_at > now() - interval 100 day
 """
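The idea behind both hunks can be illustrated with a small, self-contained sketch. It is not the code from this commit: it uses Python's built-in `sqlite3` instead of ClickHouse, and the `tests` and `jobs` tables, the `age_days` column, and the sample rows are all hypothetical. It assumes a joined job row is never more than about 10 days older than its test row, which seems to be the intent of the "slightly more relaxed" 100-day window. Under that assumption, the extra predicate on the joined table should leave the result unchanged while giving the database fewer rows to scan and join.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tests (job_id integer, file text, age_days integer)")
    conn.execute("create table jobs (id integer, name text, age_days integer)")
    # age_days stands in for "now() - timestamp" to keep the example deterministic.
    conn.executemany(
        "insert into jobs values (?, ?, ?)",
        [(1, "linux", 5), (2, "win", 95), (3, "mac", 400)],
    )
    conn.executemany(
        "insert into tests values (?, ?, ?)",
        [(1, "test_a.py", 4), (2, "test_b.py", 89), (3, "test_c.py", 399)],
    )
    return conn


# Original shape of the query: only the test table is limited to 90 days.
BASE_QUERY = """
select t.file, j.name
from tests t
join jobs j on j.id = t.job_id
where
    t.file is not null
    and t.age_days < 90
"""

# Same query with an extra, slightly wider window on the joined table.
FILTERED_QUERY = BASE_QUERY + "    and j.age_days < 100\n"


def run(conn: sqlite3.Connection, query: str) -> list:
    """Run a query and return its rows in a stable order."""
    return sorted(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = build_db()
    print(run(conn, BASE_QUERY))
    print(run(conn, FILTERED_QUERY))
```

Both queries return the same rows here because the job with `age_days = 95` still falls inside the wider 100-day window. Whether the added predicate actually lowers memory use depends on how the database plans the join, so this sketch only shows that the result set is preserved.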
