@wangweixuan

Description

Motivation: This PR refactors the LRU cache that maps GGML graph properties to CANN graphs. It moves the graph-property code out of the lengthy `ggml-cann.cpp` file into the dedicated structures `ggml_graph_node_properties`, `ggml_cann_graph`, and `ggml_cann_graph_lru_cache`.

Change summary: This change aims to improve code clarity by separating concerns. There are no functional changes. The code has been formatted with clang-format.

Testing

  • Regression tests on inference using ACL graph.

Notes

This PR is based on a cherry-pick of the existing LRU cache implementation, ggml-org#15814. The original changes are in the last commit.

CC @noemotiovon

noemotiovon and others added 4 commits December 4, 2025 13:53
* CANN: implement LRU cache for ACL graphs in CANN backend

- Introduce ggml_cann_graph_lru_cache to store multiple ggml_cann_graph objects.
- Graphs are loaded on demand and evicted using LRU policy when capacity is exceeded.
- Updated push, move_to_front, and clear methods to manage cached graphs efficiently.
- Ensures reuse of graphs, reducing graph reconstruction overhead in CANN backend.
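
To illustrate the policy described in the commit message above, here is a minimal sketch of an LRU cache with `push`, `move_to_front`, and `clear`. It is not the code from this PR: the `cached_graph` type, its `key` field, and the `find` helper are hypothetical stand-ins, and the real `ggml_cann_graph` holds ACL graph state rather than an integer.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <utility>

// Hypothetical stand-in for a captured graph (assumption for this sketch).
struct cached_graph {
    int key;  // illustrative identity used for matching
};

// Minimal LRU cache: the most recently used entry sits at the front.
struct graph_lru_cache {
    using entry_list = std::list<std::unique_ptr<cached_graph>>;

    size_t     capacity;
    entry_list entries;

    explicit graph_lru_cache(size_t cap) : capacity(cap) { assert(cap > 0); }

    // Insert a new graph at the front; evict the least recently used if full.
    void push(std::unique_ptr<cached_graph> g) {
        if (entries.size() >= capacity) {
            entries.pop_back();
        }
        entries.push_front(std::move(g));
    }

    // Mark an existing entry as most recently used (O(1) relink, no copy).
    void move_to_front(entry_list::iterator it) {
        entries.splice(entries.begin(), entries, it);
    }

    // Hypothetical lookup: on a hit, refresh the entry and return it.
    cached_graph * find(int key) {
        for (auto it = entries.begin(); it != entries.end(); ++it) {
            if ((*it)->key == key) {
                move_to_front(it);
                return entries.front().get();
            }
        }
        return nullptr;
    }

    // Drop every cached graph.
    void clear() { entries.clear(); }
};
```

A `std::list` keeps eviction and reordering cheap, and a linear scan is a reasonable assumption here because the cache is expected to hold only a small number of graphs.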

* fix typo

* The LRU cache capacity can be configured via an env variable

Signed-off-by: noemotiovon <[email protected]>

* refactor ACL graph

* refactor && fix review comments

Signed-off-by: noemotiovon <[email protected]>

---------

Signed-off-by: noemotiovon <[email protected]>
* CANN: improve ACL graph matching

Record `ne` and `nb` information for src tensors and include them in the
graph matching check. This enhances the robustness of ACL graph matching
by preventing incorrect matches when src tensors share the same data
address but differ in shape or stride.
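
The matching rule described above can be sketched as follows. The struct and function names (`tensor_view`, `node_props`, `same_view`, `node_matches`) and the `MAX_SRC` limit are assumptions made for illustration; only the idea of comparing `ne` and `nb` alongside the data address comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int MAX_DIMS = 4;  // same value as GGML_MAX_DIMS
constexpr int MAX_SRC  = 2;  // illustrative limit for this sketch only

// Recorded view of a tensor: address, shape (ne), and byte strides (nb).
struct tensor_view {
    const void * data;
    int64_t      ne[MAX_DIMS];
    size_t       nb[MAX_DIMS];
};

// Hypothetical per-node record kept with a cached graph.
struct node_props {
    int         op;
    int         n_src;
    tensor_view dst;
    tensor_view src[MAX_SRC];
};

// Two views match only if address, shape, and strides all agree.
static bool same_view(const tensor_view & a, const tensor_view & b) {
    if (a.data != b.data) {
        return false;
    }
    for (int i = 0; i < MAX_DIMS; ++i) {
        if (a.ne[i] != b.ne[i] || a.nb[i] != b.nb[i]) {
            return false;
        }
    }
    return true;
}

// A node matches when the op, the destination, and every source agree.
static bool node_matches(const node_props & a, const node_props & b) {
    assert(a.n_src <= MAX_SRC && b.n_src <= MAX_SRC);
    if (a.op != b.op || a.n_src != b.n_src) {
        return false;
    }
    if (!same_view(a.dst, b.dst)) {
        return false;
    }
    for (int i = 0; i < a.n_src; ++i) {
        if (!same_view(a.src[i], b.src[i])) {
            return false;
        }
    }
    return true;
}
```

Comparing only the data address would treat a reshaped or re-strided view of the same buffer as identical, which is the false match this commit is meant to prevent.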

* CANN: add op_params match
* CANN: Refactor `evaluate_and_capture_cann_graph`

**Description of the problem**

* `matched_graph` is obtained even if graph mode is disabled.
* End of graph capture and graph replay are unnecessarily placed in different `if` blocks.

**Proposed solution**

* Obtain `matched_graph` only if graph mode is enabled.
* Place end of graph capture and graph replay inside the same `if` block.
* Unify graph-related comments.
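
The control flow proposed above might look roughly like the sketch below. It models the steps as log strings instead of calling any ACL API; `fake_graph`, the parameter names, and the exact ordering of steps are assumptions, not the code from this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a cached graph entry.
struct fake_graph {
    bool captured = false;
};

// Returns the sequence of steps taken, so the flow is easy to inspect.
std::vector<std::string> evaluate_and_capture(bool use_graph, bool update_required, fake_graph * cached) {
    std::vector<std::string> steps;
    fake_graph * matched_graph = nullptr;

    if (use_graph) {
        // The matched graph is obtained only when graph mode is enabled.
        assert(cached != nullptr);
        matched_graph = cached;
        if (update_required) {
            steps.push_back("begin_capture");
        }
    }

    // Nodes are executed eagerly, or executed once while being captured.
    if (!use_graph || update_required) {
        steps.push_back("run_nodes");
    }

    // End of capture and replay live in the same block.
    if (use_graph) {
        if (update_required) {
            steps.push_back("end_capture");
            matched_graph->captured = true;
        }
        steps.push_back("replay");
    }
    return steps;
}
```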

* Remove trailing whitespace
**Description of the problem**

`cann_graph_update_required` is redundantly defined and
initialized as `false` inside two mutually exclusive macro branches.

**Proposed solution**

Define it right before the macro so that it could serve both
branches.
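
A small sketch of this pattern, assuming a hypothetical macro name `USE_GRAPH_MODE` and a hypothetical `needs_update` helper (neither is taken from the PR):

```cpp
// Hypothetical feature switch; uncomment to enable the graph-mode branch.
// #define USE_GRAPH_MODE

bool needs_update(bool cache_hit) {
    // Defined once, before the preprocessor branches, so both can use it.
    bool cann_graph_update_required = false;

#ifdef USE_GRAPH_MODE
    // Graph mode: a cache miss means the graph must be captured again.
    cann_graph_update_required = !cache_hit;
#else
    // Eager mode: no graph to update; the flag keeps its default.
    (void) cache_hit;
#endif

    return cann_graph_update_required;
}
```

Declaring the flag once removes the duplicated definition and keeps the two branches from drifting apart.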
@wangweixuan wangweixuan force-pushed the acl_graph_cache_refactor branch from 1c8ddc8 to 79d3f3d Compare December 4, 2025 06:02
Move the graph property checking code into methods of LRU cache.

Signed-off-by: Wang Weixuan <[email protected]>
@wangweixuan wangweixuan force-pushed the acl_graph_cache_refactor branch from 79d3f3d to c406a52 Compare December 4, 2025 06:07
@noemotiovon
Owner

LGTM. My master branch is far behind upstream; could you contribute this directly to the upstream community?

@wangweixuan
Author

> LGTM. My master branch is far behind upstream; could you contribute this directly to the upstream community?

I have opened a new PR: ggml-org#17752
