@jan-service-account

Updates dev branch with latest release (b6360) from ggml-org/llama.cpp

hipudding and others added 9 commits September 2, 2025 14:05
…org#15712)

* [CANN] Support eager execution mode under ACL graph compilation

Add support for running operators in eager mode while ACL graph
compilation is enabled. This allows bypassing graph execution
and directly submitting ops, which is useful for debugging and
reducing graph build overhead in certain scenarios.

Signed-off-by: noemotiovon <[email protected]>

* fix typo

Signed-off-by: noemotiovon <[email protected]>

* rename to acl_graph_mode

Signed-off-by: noemotiovon <[email protected]>

---------

Signed-off-by: noemotiovon <[email protected]>
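
The difference between graph execution and eager submission described in the commit above can be sketched with a toy executor. This is only an illustration under stated assumptions: `ToyExecutor`, `submit`, `launch`, and the `graph_mode` flag are hypothetical names and do not correspond to the CANN backend or any ACL API.

```python
from typing import Callable, List


class ToyExecutor:
    """Hypothetical executor used only to contrast the two modes."""

    def __init__(self, graph_mode: bool) -> None:
        self.graph_mode = graph_mode
        self.captured: List[Callable[[], None]] = []  # ops deferred in graph mode
        self.log: List[str] = []                      # ops that actually ran

    def submit(self, name: str) -> None:
        def op() -> None:
            self.log.append(name)

        if self.graph_mode:
            # Graph mode: record the op now, run it later in launch().
            self.captured.append(op)
        else:
            # Eager mode: bypass capture and run the op immediately.
            op()

    def launch(self) -> None:
        # Replay every captured op in submission order, then reset.
        for op in self.captured:
            op()
        self.captured.clear()
```

In eager mode each op is visible in `log` right after `submit`, which is what makes step-by-step debugging easier; in graph mode nothing runs until `launch` is called.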
Previously, the slope tensor was set to fp16 to improve efficiency.
While this worked correctly in flash attention (FA), it caused precision issues in soft_max.
This change uses a different data type for each operator
to balance accuracy and performance.
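
As a rough illustration of why an fp16 slope can cost precision, the sketch below rounds per-head slopes to half precision and measures the resulting bias error at a given token distance. It assumes the slope tensor holds ALiBi-style slopes of the form 2^(-8(h+1)/n_head); that formula and the names `to_fp16`, `alibi_slope`, and `worst_bias_error` are illustrative assumptions, not code from the CANN backend.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def alibi_slope(head: int, n_head: int, max_bias: float = 8.0) -> float:
    """ALiBi-style slope for one head (assumed form, power-of-two head count)."""
    return 2.0 ** (-max_bias * (head + 1) / n_head)


def worst_bias_error(n_head: int, distance: int) -> float:
    """Largest absolute error in slope * distance caused by fp16 rounding."""
    worst = 0.0
    for h in range(n_head):
        exact = alibi_slope(h, n_head)
        rounded = to_fp16(exact)
        # The bias added to a logit scales with token distance,
        # so the rounding error in the slope scales with it too.
        worst = max(worst, abs(exact - rounded) * distance)
    return worst
```

For example, `worst_bias_error(32, 4096)` is nonzero because slopes such as 2^(-1/4) have no exact fp16 representation, and the error grows linearly with distance before soft_max exponentiates it.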
@jan-service-account jan-service-account merged commit 9e5345f into dev Sep 3, 2025
9 checks passed
@jan-service-account jan-service-account deleted the update-dev-from-master-2025-09-03-00-31 branch September 3, 2025 00:45

10 participants