CANN: add optional support for ACL Graph execution #15065

noemotiovon · 2025-08-04T07:02:11Z

This commit adds support for executing ggml computational graphs using Huawei's ACL graph mode via the USE_ACL_GRAPH flag. The support can be enabled at compile time using the CMake option:

-DUSE_ACL_GRAPH=ON

By default, ACL graph execution is disabled, and the fallback path uses node-by-node execution.

Key additions:

- CMake option to toggle graph mode
- Graph capture and execution logic using
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS is unset or invalid

This prepares the backend for performance improvements in repetitive graph execution scenarios on Ascend devices.
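The "tensor property matching" idea can be illustrated with a minimal sketch. It is not the code from this PR: `NodeProps` and `graph_update_required` are hypothetical names, and the set of compared fields (operation, output address, shape, strides, input addresses) is an assumption about what would have to stay identical for a captured graph to be replayed safely.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the tensor fields that matter when replaying a captured graph.
struct NodeProps {
    int op;                        // operation id
    const void* data;              // output buffer address
    std::array<int64_t, 4> ne;     // number of elements per dimension
    std::array<size_t, 4> nb;      // stride in bytes per dimension
    std::vector<const void*> src;  // input buffer addresses

    bool operator==(const NodeProps& o) const {
        return op == o.op && data == o.data && ne == o.ne && nb == o.nb && src == o.src;
    }
};

// Returns true when the cached properties differ from the current graph,
// meaning the captured graph has to be re-captured before it can be replayed.
// On a mismatch the cache is refreshed with the current properties.
bool graph_update_required(std::vector<NodeProps>& captured,
                           const std::vector<NodeProps>& current) {
    if (captured == current) {
        return false;  // same ops, shapes, and addresses: replay as is
    }
    captured = current;
    return true;
}
```

In this sketch the first call always reports that an update is required, because the cache starts empty. Later calls with unchanged properties report that the captured graph can be reused.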

noemotiovon · 2025-08-04T07:32:47Z

Model: Qwen2.5-0.5B

| | with GGML_OP_CPY | with GGML_OP_SET_ROWS |
|---|---|---|
| with ACL Graph | falls back to running without ACL Graph | 120 tok/s |
| without ACL Graph | 70 tok/s | 70 tok/s |

With ACL Graph test:

user
Building a website can be done in 10 steps:
assistant
Certainly! Here are 10 steps to help you build a website:

1. **Define Your Purpose and Goals**
   - Identify what you want your website to do.
   - Determine your audience and what they want.

2. **Choose the Right Platform**
   - Select a web development platform that suits your needs.
   - Consider factors like stability, security, and scalability.

3. **Set Up a Domain Name**
   - Choose a domain name that is easy to remember and reflects your website.
   - Ensure it is available and easy to register.

4. **Choose a Domain Hosting Provider**
   - Select a reliable domain hosting service that meets your needs.
   - Consider features like speed, security, and scalability.

5. **Choose a Content Management System (CMS)**
   - Select a CMS that fits your needs and the complexity of your website.
   - Choose based on features, ease of use, and community support.

6. **Design Your Website**
   - Choose a design that is visually appealing and easy to navigate.
   - Consider the style, color scheme, and layout.

7. **Choose a Content Writer**
   - Hire a skilled content writer to create the content.
   - Consider the writer's skills, experience, and style.

8. **Choose a Website Builder**
   - Select a website builder that suits your needs.
   - Choose based on features, ease of use, and scalability.

9. **Implement Your Website**
   - Use your chosen content management system and website builder to create your website.
   - Test the website thoroughly for errors and usability.

10. **Launch and Promote Your Website**
    - Launch your website and start promoting it.
    - Engage with your audience through social media, email marketing, and other channels.

11. **Monitor and Update**
    - Monitor the performance of your website.
    - Keep your website updated with new content and features.

12. **Refine and Improve**
    - Regularly review and refine your website content and design.
    - Stay updated with the latest trends and best practices in web development.

These steps should help you build a website that meets your needs and helps drive traffic to your site.

> 
llama_perf_sampler_print:    sampling time =     117.75 ms /   469 runs   (    0.25 ms per token,  3982.91 tokens per second)
llama_perf_context_print:        load time =    7527.43 ms
llama_perf_context_print: prompt eval time =      40.20 ms /    20 tokens (    2.01 ms per token,   497.49 tokens per second)
llama_perf_context_print:        eval time =    3754.24 ms /   448 runs   (    8.38 ms per token,   119.33 tokens per second)
llama_perf_context_print:       total time =    4803.95 ms /   468 tokens
llama_perf_context_print:    graphs reused =        433
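As a sanity check on the numbers above, the eval throughput in the log can be recomputed from the run count and the elapsed time, and the gain over the 70 tok/s baseline follows directly. The helper names below are hypothetical and the snippet is only an arithmetic sketch.

```cpp
#include <cassert>
#include <cmath>

// Tokens per second from a run count and an elapsed time in milliseconds.
double tokens_per_second(int runs, double elapsed_ms) {
    return runs * 1000.0 / elapsed_ms;
}

// Ratio of throughput with ACL Graph to throughput without it.
double speedup(double with_graph_tps, double without_graph_tps) {
    return with_graph_tps / without_graph_tps;
}
```

With the logged values (448 runs in 3754.24 ms), `tokens_per_second` gives about 119.33 tok/s, matching the log line. `speedup(120.0, 70.0)` is roughly 1.7.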

hipudding

Thank you for your contribution. Enabling acl_graph support greatly reduces NPU idle cycles, with particularly noticeable performance gains on models with smaller parameter sizes.
The code is basically fine; only a few minor changes are needed.

ggml/src/ggml-cann/CMakeLists.txt

ggml/src/ggml-cann/common.h

ggml/src/ggml-cann/ggml-cann.cpp

hipudding · 2025-08-05T07:34:34Z

ggml/src/ggml-cann/ggml-cann.cpp

-        if (ggml_is_empty(node) || node->op == GGML_OP_NONE) {
-            continue;
+#ifdef USE_CANN_GRAPH
+    bool use_cann_graph = true;


I think use_cann_graph is not necessary. You only need to set cann_graph_update_required = true, whether acl_graph is on or off.

I think use_cann_graph is still needed. Even when cann_graph is enabled, we may fall back to eager mode if LLAMA_SET_ROWS is not set. So we need a way to track that cann_graph is not being used in this case.
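The fallback described in this reply can be sketched as a small decision function. This is an illustration, not the PR's implementation: `set_rows_enabled` and `should_use_acl_graph` are hypothetical names, and the rule that LLAMA_SET_ROWS must hold a nonzero integer to count as valid is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Assumption: the variable counts as "set and valid" only if it parses
// completely as a nonzero integer.
bool set_rows_enabled(const char* value) {
    if (value == nullptr || *value == '\0') {
        return false;
    }
    char* end = nullptr;
    long parsed = std::strtol(value, &end, 10);
    return *end == '\0' && parsed != 0;
}

// Decides between graph mode and eager (node-by-node) execution.
// compiled_with_acl_graph mirrors the compile-time option; set_rows_value is
// the raw environment value, passed in so the logic is easy to test.
bool should_use_acl_graph(bool compiled_with_acl_graph, const char* set_rows_value) {
    if (!compiled_with_acl_graph) {
        return false;
    }
    if (!set_rows_enabled(set_rows_value)) {
        std::fprintf(stderr,
                     "ACL graph disabled: LLAMA_SET_ROWS is unset or invalid, "
                     "falling back to eager mode\n");
        return false;
    }
    return true;
}
```

A caller would pass `std::getenv("LLAMA_SET_ROWS")` as the second argument. Keeping the result in a separate flag, as the reply argues, lets the backend record that graph mode was compiled in but is not active for this run.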

This commit adds support for executing ggml computational graphs using Huawei's ACL graph mode via the USE_CANN_GRAPH flag. The support can be enabled at compile time using the CMake option:

-DUSE_CANN_GRAPH=ON

By default, ACL graph execution is **disabled**, and the fallback path uses node-by-node execution.

Key additions:

- CMake option to toggle graph mode
- Graph capture and execution logic using
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS is unset or invalid

This prepares the backend for performance improvements in repetitive graph execution scenarios on Ascend devices.

Signed-off-by: noemotiovon <[email protected]>

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Ascend NPU (issues specific to Ascend NPUs) labels Aug 4, 2025

noemotiovon force-pushed the sup_acl_graph branch from 07c34de to 794511d Compare August 4, 2025 07:18

hipudding self-requested a review August 4, 2025 09:08

noemotiovon mentioned this pull request Aug 5, 2025

[CANN]Support Acl Graph #13915

Closed

noemotiovon force-pushed the sup_acl_graph branch from 794511d to a8dc6ac Compare August 5, 2025 01:34

noemotiovon mentioned this pull request Aug 5, 2025

Llama.cpp acl graph integration cosdt/llama.cpp#18

Closed

hipudding reviewed Aug 5, 2025

hipudding changed the title ~~feat(cann): add optional support for ACL Graph execution~~ CANN: add optional support for ACL Graph execution Aug 5, 2025

noemotiovon added 3 commits August 6, 2025 02:41

Fix review comments

2022946

Signed-off-by: noemotiovon <[email protected]>

remane USE_CANN_GRAPH to USE_ACL_GRAPH

fed26f7

Signed-off-by: noemotiovon <[email protected]>

noemotiovon force-pushed the sup_acl_graph branch from 24e17d5 to fed26f7 Compare August 6, 2025 03:00

fix typo

f927644

Signed-off-by: noemotiovon <[email protected]>

hipudding approved these changes Aug 6, 2025

hipudding merged commit 2241453 into ggml-org:master Aug 6, 2025
48 checks passed
