Commit 6ddeabd
[Benchmark] Generate benchmark record for job failure (#9247)
# Description
Compose a benchmark record when a benchmark job fails.
# Related Query File
https://github.com/pytorch/test-infra/blob/main/torchci/clickhouse_queries/oss_ci_benchmark_llms/query.sql
# Details
When a job fails at the git-job level, or a device fails during the
benchmark test, we return a benchmark record that indicates the failure,
so that the HUD UI can distinguish metrics that did not run from metrics
that ran with failures.
This PR may temporarily introduce `Unknown` values in fields of the HUD
execubench table. A follow-up will update the HUD UI to handle the special
benchmark value.
For both levels of failure, the metric name is "FAILURE_REPORT".
In HUD, we mainly use this special metric name to identify a failure. If
more information is needed, the level at which the job failed is recorded
in benchmark.extra_info.
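
As an illustration of how a consumer might apply this rule, here is a minimal Python sketch. The function name `classify_record` and the `"UNKNOWN"` fallback are hypothetical and not part of this PR; only the "FAILURE_REPORT" metric name and the `failure_type` field in benchmark.extra_info come from the description.

```python
FAILURE_METRIC_NAME = "FAILURE_REPORT"


def classify_record(record: dict) -> str:
    """Return "ok" for a normal record, otherwise the recorded failure level.

    Hypothetical helper: it only illustrates the lookup order described above.
    """
    # Step 1: the special metric name is the primary failure signal.
    metric_name = record.get("metric", {}).get("name")
    if metric_name != FAILURE_METRIC_NAME:
        return "ok"
    # Step 2: the failure level is read from benchmark.extra_info.
    extra_info = record.get("benchmark", {}).get("extra_info", {})
    # "UNKNOWN" is an assumed fallback for records without a failure_type.
    return extra_info.get("failure_type", "UNKNOWN")
```

Checking the metric name first keeps the common path cheap, and the extra_info lookup is only needed when the failure level matters.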
## Step Failure
When a failure is detected, we try to extract the model info from
git_job_name. The step fails if the model info cannot be extracted.
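
The extraction step could be sketched as follows. The job-name layout `<job> (<model>, <backend>, <device_pool>, ...)` is an assumption made for illustration, since the description does not specify the format. `ModelInfo` and `extract_model_info` are hypothetical names.

```python
import re
from typing import NamedTuple


class ModelInfo(NamedTuple):
    name: str
    backend: str
    device_pool_name: str


# Assumed layout: "<job> (<model>, <backend>, <device_pool>, ...)".
_JOB_NAME_RE = re.compile(r"\((?P<fields>[^()]*)\)\s*$")


def extract_model_info(git_job_name: str) -> ModelInfo:
    """Parse model info from a git job name, raising if it cannot be found."""
    match = _JOB_NAME_RE.search(git_job_name)
    fields = [f.strip() for f in match.group("fields").split(",")] if match else []
    # Mirror the behavior described above: fail the step when info is missing.
    if len(fields) < 3 or not all(fields[:3]):
        raise ValueError(f"cannot extract model info from {git_job_name!r}")
    return ModelInfo(*fields[:3])
```

Raising an exception here corresponds to failing the step rather than emitting a record with an unknown model.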
# Examples of failure benchmark records
## When a job fails at the device-job level
- device_name: taken from job_report.name, for instance `iPhone 15`
- device_os: job_report.os with the prefix "Android" or "iOS"; this should
match both the Android and iOS settings
- model.name: extracted from the git job name
- model.backend: extracted from the git job name
- metric.name: "FAILURE_REPORT"
```
{
  "benchmark": {
    "name": "ExecuTorch",
    "mode": "inference",
    "extra_info": {
      "app_type": "IOS_APP",
      "job_conclusion": "FAILED",
      "failure_type": "DEVICE_JOB",
      "job_report": "..."
    }
  },
  "model": {
    "name": "ic4",
    "type": "OSS model",
    "backend": "mps"
  },
  "metric": {
    "name": "FAILURE_REPORT",
    "benchmark_values": 0,
    "target_value": 0,
    "extra_info": {
      "method": ""
    }
  },
  "runners": [
    {
      "name": "iPhone 15",
      "type": "iOS 18.0"
    }
  ]
}
```
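
A record of this shape could be assembled with a small helper like the one below. This is a sketch rather than the code in this PR: `build_failure_record` and its parameters are hypothetical, and serializing `job_report` as a JSON string is an assumption based on the `"{}"` value shown for the git-job-level case.

```python
import json
from typing import Optional


def build_failure_record(
    *,
    failure_type: str,
    job_conclusion: str,
    app_type: str,
    model_name: str,
    backend: str,
    device_name: str,
    device_os: str,
    job_report: Optional[dict] = None,
) -> dict:
    """Build a failure benchmark record (hypothetical helper)."""
    return {
        "benchmark": {
            "name": "ExecuTorch",
            "mode": "inference",
            "extra_info": {
                "app_type": app_type,
                "job_conclusion": job_conclusion,
                "failure_type": failure_type,
                # Assumption: the report is stored as a JSON string.
                "job_report": json.dumps(job_report or {}),
            },
        },
        "model": {"name": model_name, "type": "OSS model", "backend": backend},
        "metric": {
            "name": "FAILURE_REPORT",
            "benchmark_values": 0,
            "target_value": 0,
            "extra_info": {"method": ""},
        },
        "runners": [{"name": device_name, "type": device_os}],
    }


# Usage with the values from the device-job-level example.
record = build_failure_record(
    failure_type="DEVICE_JOB",
    job_conclusion="FAILED",
    app_type="IOS_APP",
    model_name="ic4",
    backend="mps",
    device_name="iPhone 15",
    device_os="iOS 18.0",
)
```

Passing `failure_type` and the device fields as arguments lets the same helper serve both failure levels, with only the source of the device name and OS differing.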
## When a job fails at the git-job level (there are no job_reports)
This happens when a job fails before it runs the benchmark job.
- device_name: device_pool_name from the git job name, for example
samsung_galaxy_s22
- device_os: "Android" or "iOS"
- model.name: extracted from the git job name
- model.backend: extracted from the git job name
- metric.name: "FAILURE_REPORT"

The failure benchmark record looks like:
```
{
  "benchmark": {
    "name": "ExecuTorch",
    "mode": "inference",
    "extra_info": {
      "app_type": "IOS_APP",
      "job_conclusion": "FAILURE",
      "failure_type": "GIT_JOB",
      "job_report": "{}"
    }
  },
  "model": {
    "name": "ic4",
    "type": "OSS model",
    "backend": "mps"
  },
  "metric": {
    "name": "FAILURE_REPORT",
    ...
  },
  "runners": [
    {
      "name": "samsung_galaxy_s22",
      "type": "Android",
      ...
    }
  ]
}
```

1 parent 77752a4 commit 6ddeabd
5 files changed, +712 -47 lines changed:
- .ci/scripts
- .github
- scripts
- workflows