Skip to content

Conversation

@noemotiovon
Copy link
Collaborator

Add support for running operators in eager mode while ACL graph compilation is enabled. This allows bypassing graph execution and directly submitting ops, which is useful for debugging and reducing graph build overhead in certain scenarios.

Make sure to read the contributing guidelines before submitting a PR

@github-actions github-actions bot added documentation Improvements or additions to documentation ggml changes relating to the ggml tensor library for machine learning Ascend NPU issues specific to Ascend NPUs labels Sep 1, 2025

Converting the matmul weight format from ND to NZ can significantly improve performance on the 310I DUO NPU.

### GGML_CANN_EAGER_MODE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better called GGML_CANN_ACL_GRAPH or some name like that. Default enable acl graph for perfermance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I think when compiling with ACL graph support, ACL graph should be enabled by default, and we use GGML_CANN_DISABLE_ACL_GRAPH to disable it. Would that be more appropriate?

@hipudding hipudding changed the title [CANN] Support eager execution mode under ACL graph compilation CANN: Support eager execution mode under ACL graph compilation Sep 2, 2025
Add support for running operators in eager mode while ACL graph
compilation is enabled. This allows bypassing graph execution
and directly submitting ops, which is useful for debugging and
reducing graph build overhead in certain scenarios.

Signed-off-by: noemotiovon <[email protected]>
Signed-off-by: noemotiovon <[email protected]>
Signed-off-by: noemotiovon <[email protected]>
@hipudding hipudding merged commit 2f85368 into ggml-org:master Sep 2, 2025
49 checks passed
walidbr pushed a commit to walidbr/llama.cpp that referenced this pull request Sep 7, 2025
…org#15712)

* [CANN] Support eager execution mode under ACL graph compilation

Add support for running operators in eager mode while ACL graph
compilation is enabled. This allows bypassing graph execution
and directly submitting ops, which is useful for debugging and
reducing graph build overhead in certain scenarios.

Signed-off-by: noemotiovon <[email protected]>

* fix typo

Signed-off-by: noemotiovon <[email protected]>

* rename to acl_graph_mode

Signed-off-by: noemotiovon <[email protected]>

---------

Signed-off-by: noemotiovon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ascend NPU issues specific to Ascend NPUs documentation Improvements or additions to documentation ggml changes relating to the ggml tensor library for machine learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants