[CANN] Add the n_graph_splits performance metric to llama-bench. #12994
Conversation
        
          
examples/llama-bench/llama-bench.cpp (Outdated)
     */
    static std::string get_modelfile_name(const std::string & path_str) {
        namespace fs = std::filesystem;
        fs::path path = path_str;

Suggested change:
    - fs::path path = path_str;
    + std::filesystem::path path = path_str;
        
          
examples/llama-bench/llama-bench.cpp (Outdated)
     * @return Full name of the model.
     */
    static std::string get_modelfile_name(const std::string & path_str) {
        namespace fs = std::filesystem;

Suggested change:
    - namespace fs = std::filesystem;
done
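Putting the thread together, the helper under discussion can be sketched roughly as follows. This is a minimal reconstruction from the snippets above with the reviewer's suggestion applied; the return statement is an assumption from context, not the PR's exact code.

```cpp
#include <filesystem>
#include <string>

// Sketch of the discussed helper: return only the file name component of a
// model path. The body shown here is assumed, not verbatim from the PR.
static std::string get_modelfile_name(const std::string & path_str) {
    std::filesystem::path path = path_str;
    return path.filename().string();
}
```

With this, a run on `/models/llama-2-7b.Q4_0.gguf` would report just `llama-2-7b.Q4_0.gguf`, which keeps benchmark tables compact (and is why a reviewer below asks whether path vs. filename should be a parameter).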
        
          
examples/llama-bench/llama-bench.cpp (Outdated)
        gpu_info(get_gpu_info()) {

        model_filename = inst.model;

Suggested change:
    - model_filename = inst.model;
    + model_filename = get_modelfile_name(inst.model);
Add a parameter? Output path or filename.
done
It's better to add n_gpus to show how many GPUs are used for this test. Ignore this comment: gpu_info will already show the number of GPUs.
Force-pushed from 1279028 to 8e06ed1.
@slaren I'm confused about why this PR is failing so many CI checks in Windows environments. If you have any solution for this, please let me know, thanks. The error is basically something like this.
On Windows, DLL symbols are not exported by default; you need to use [...]. This information could perhaps be exposed in a public function added to [...].
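The export problem described above can be illustrated with the generic macro pattern that cross-platform C/C++ libraries use; a hedged sketch only, where MYLIB_API, the build flags, and the function name are hypothetical stand-ins, not ggml's actual symbols.

```cpp
// On Windows a DLL hides its symbols unless they are explicitly exported, so
// public declarations are usually wrapped in an export macro along these
// lines (names here are hypothetical, not the real ggml macros).
#if defined(_WIN32) && defined(MYLIB_SHARED)
#    ifdef MYLIB_BUILD                   // defined while compiling the DLL
#        define MYLIB_API __declspec(dllexport)
#    else                                // defined for code linking against it
#        define MYLIB_API __declspec(dllimport)
#    endif
#else
#    define MYLIB_API                    // no-op for static or non-Windows builds
#endif

// A public accessor exposing an internal counter (such as a graph-split
// count) would then be declared with the macro. Stub definition so the
// sketch is self-contained:
MYLIB_API int mylib_get_graph_splits(void) { return 0; }
```

Without such a macro on the declaration, the symbol is invisible to the importing binary and the Windows link step fails, which would match CI passing on Linux but failing on Windows.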
You are right. After studying the relevant materials: the reason we added n_graph_splits is to benchmark a large batch of models, or certain specific models, through scripts and generate result tables. With this metric included, we can see more clearly which models still have issues with hardware backend support; otherwise, we would have to extract the number of graph splits manually.
llama-bench is an excellent model benchmarking tool. This PR adds n_graph_splits (the graph split count) to the output of llama-bench. This metric gives a more direct view of how well a hardware backend supports a given model, helping developers improve backend support.
Here is a comparison of the results with `-o json`:

Before:

After: