Fix reporting backends and dtype to benchmark results #6023
Conversation
          
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6023

Note: links to docs will display an error until the docs builds have completed.

❌ 2 new failures as of commit 22b823e with merge base b118d8e.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
    
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
    
I tested this out in #5982. Its regex takes the first word as the model name and the last as the dtype, with everything in between treated as the backend. It's not as good as having proper JSON output from the device, but I guess this will do for now.
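The parsing scheme described above can be sketched with a small regex. This is a hypothetical illustration, assuming an underscore-delimited name; it is not the actual implementation from #5982:

```python
import re

# Hypothetical "_"-delimited benchmark name: the first token is the model,
# the last token is the dtype, and everything in between is the backend.
PATTERN = re.compile(r"^(?P<model>[^_]+)_(?P<backend>.+)_(?P<dtype>[^_]+)$")

def parse_benchmark_name(name: str) -> dict:
    """Split a benchmark name into model, backend, and dtype fields."""
    m = PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized benchmark name: {name!r}")
    return m.groupdict()

print(parse_benchmark_name("tinyllama_xnnpack_fp32"))
# {'model': 'tinyllama', 'backend': 'xnnpack', 'dtype': 'fp32'}
```

Because the backend group stretches greedily up to the final underscore, a multi-token backend such as `xnnpack_q8` still lands entirely in the backend field.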
    
Also, I learned from https://github.com/pytorch/executorch/pull/5710/files#r1788458509 that changing the export name might cause unexpected failures because some names are hardcoded in the repo. It's a good idea to double-check them.
    
          
Yeah, I reverted the changes that renamed the exported artifacts directly and instead appended the dtype suffix in the test script. It works for now; we can clean it up later.
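The workaround described above, leaving the exported artifact name untouched and appending the dtype suffix only at reporting time, could be sketched like this. The function and names are hypothetical, not the actual test script:

```python
def reported_model_name(exported_name: str, dtype: str) -> str:
    # Append the dtype suffix only for reporting; the exported artifact
    # keeps its original (possibly hardcoded) name on disk, so nothing
    # else in the repo that refers to that name breaks.
    return f"{exported_name}_{dtype}"

print(reported_model_name("tinyllama_xnnpack", "fp32"))  # tinyllama_xnnpack_fp32
```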
    
@pytorchbot cherry-pick --onto release/0.4 -c fixnewfeature
    
Summary: A couple of minor fixes for reporting the benchmarking results:
- QNN models are not reporting "backend" and "dtype" info in benchmark_results.json (Android)
- The tinyllama model is not reporting "backend" and "dtype" info in benchmark_results.json (Android)
- Include the compute precision in the exported coreml model name
- Rename "llama2" to "tinyllama" to eliminate confusion (many people thought it was llama2-7b)

Pull Request resolved: #6023
Reviewed By: huydhn
Differential Revision: D64074262
Pulled By: guangy10
fbshipit-source-id: c6c53d004c4fb3ad410a792639af2c22a6978b67
(cherry picked from commit 012cba9)
Cherry picking #6023: the cherry-pick PR is at #6073, and it is recommended to link a fixnewfeature cherry-pick PR with an issue. The following tracker issues are updated. (Details for Dev Infra team: raised by workflow job.)
    
Fix reporting backends and dtype to benchmark results (#6023)
(cherry picked from commit 012cba9)
Co-authored-by: Guang Yang <[email protected]>