-
Notifications
You must be signed in to change notification settings - Fork 102
[TD] Script to determine which reverts are caused by bad TD #6911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Skipped Deployment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
3c1b952
to
dbeace0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a script to analyze reverts and determine which were caused by bad TD (Test Dependency) exclusions, along with making the ClickHouse client thread-safe.
Key changes:
- Adds a comprehensive script to analyze Git commit history and correlate reverts with TD exclusions
- Removes the
@lru_cache
decorator fromget_clickhouse_client()
to enable thread-safe usage - Implements parallel processing using
ThreadPoolExecutor
for performance optimization
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
tools/torchci/td/get_reverts_caused_by_td.py | New script that analyzes Git history, queries ClickHouse for job failures, and determines if reverts were caused by bad TD exclusions |
tools/torchci/clickhouse.py | Removes @lru_cache decorator from get_clickhouse_client() to make it thread-safe for concurrent usage |
month_groups[month] = (0, 0) | ||
month_groups[month] = (month_groups[month][0] + 1, month_groups[month][1]) | ||
for commit in commits_reverted: | ||
month = commit.timestamp_of_merge // (30 * 24 * 60 * 60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line uses timestamp_of_merge
but the loop is iterating over commits_reverted
, not caused_by_bad_td
like the previous loop. This will incorrectly group all reverted commits by their merge timestamp instead of revert timestamp for the month calculation.
month = commit.timestamp_of_merge // (30 * 24 * 60 * 60) | |
month = commit.timestamp_of_revert // (30 * 24 * 60 * 60) |
Copilot uses AI. Check for mistakes.
failed_test=get_test_file(line), | ||
) | ||
) | ||
del futures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Explicit deletion of futures
is unnecessary as it will be garbage collected when it goes out of scope. This adds clutter without benefit.
del futures |
Copilot uses AI. Check for mistakes.
f"found last pr sha != alt, {commit.last_pr_sha} != {alt_last_pr_sha[0]}" | ||
) | ||
bad += 1 | ||
if commit.last_pr_sha is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If alt_last_pr_sha[0]
is empty string (when no matching commit is found), setting commit.last_pr_sha
to an empty string could cause issues in subsequent queries. Consider checking if alt_last_pr_sha[0]
is not empty before assignment.
if commit.last_pr_sha is None: | |
if commit.last_pr_sha is None and alt_last_pr_sha[0]: |
Copilot uses AI. Check for mistakes.
x.revert_commit_sha, | ||
x.merge_commit_sha, | ||
x.merge_commit_sha_prev, | ||
x.last_pr_sha, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] This commented-out line should either be removed if it's not needed or properly documented with a comment explaining why it's disabled.
x.last_pr_sha, | |
x.last_pr_sha, | |
# Disabled: revert_commit_sha_prev is not always available or needed for this analysis |
Copilot uses AI. Check for mistakes.
Honestly a pretty messy script
Determines a commit was caused by bad TD, basically:
Obviously not perfect, but spot checking it does seem to be ok
Counts are pretty volatile if granularity is too small (week granularity is very volatile), so I'm not sure if this can really be displayed. Maybe should make this periodic and still upload to clickhouse? idk
End of the output looks like this but theres a lot of extra output before it to help debug
Also make clickhouse.py client able to be used in thread pool executor (http client didn't like having 1 http client for multiple threads i think)