Conversation

okaris commented Jun 24, 2025

llama : expose C API to get layer device type

Adds llama_model_dev_layer(model, il) to retrieve the backend device type (CPU, GPU, ACCEL) for a given layer. This allows consumers (e.g. Python bindings) to inspect layer placement without exposing internal structures.

Implements an int32_t-based interface matching the pattern of existing public API functions such as llama_model_n_layer. A usage sketch follows the notes below.

Notes:

  • Follows naming convention: llama_model_dev_layer
  • Pure C-compatible API (int32_t return)
  • No changes to third-party deps, no extra headers
  • Scoped, focused, standalone change
  • Adheres to existing formatting and structure
  • Useful for inspection/debugging/perf profiling via bindings
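
For illustration, a minimal sketch of the proposed declaration and a possible consumer. The declaration of llama_model_dev_layer is this PR's proposal rather than an existing llama.h symbol, and the mapping of return values to the ggml_backend_dev_type codes (GGML_BACKEND_DEVICE_TYPE_CPU / _GPU / _ACCEL) is an assumption:

```c
#include <stdio.h>

#include "llama.h"

// Proposed addition to llama.h (sketch of this PR's new function):
LLAMA_API int32_t llama_model_dev_layer(const struct llama_model * model, int32_t il);

// Print the backend device type chosen for each layer of a loaded model.
static void print_layer_placement(const struct llama_model * model) {
    const int32_t n_layer = llama_model_n_layer(model); // existing public API
    for (int32_t il = 0; il < n_layer; ++il) {
        // assumed to map to ggml_backend_dev_type:
        // GGML_BACKEND_DEVICE_TYPE_CPU / _GPU / _ACCEL
        printf("layer %3d -> device type %d\n", (int) il, (int) llama_model_dev_layer(model, il));
    }
}
```

Returning a plain int32_t rather than the enum keeps the function C-compatible and avoids pulling ggml-backend.h into llama.h, matching the int32_t pattern the description mentions.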

okaris force-pushed the master branch 4 times, most recently from 162ce47 to f13fa9b on June 27, 2025 06:52
okaris commented Sep 6, 2025

up

okaris commented Sep 6, 2025

@ngxson @slaren @danbev not sure who to tag

ggerganov (Member) commented:
I think this or a similar change was discussed at some point - it does not belong in the public API. Of the points you listed:

> Useful for inspection/debugging/perf profiling via bindings

This is the only meaningful one, and it means the function is just for debugging purposes, which makes it unsuitable for the library's public API. I won't merge this unless there is better reasoning for why such an API is needed.

okaris commented Sep 7, 2025

@ggerganov I understand the concern about keeping the public API minimal. My point is that there’s currently no way to confirm whether a model actually (fully) ended up on the chosen accelerator. Since that decision is user-driven, some form of feedback feels necessary. If a new function is not acceptable, could this information be returned from the load methods instead? I’m happy to adjust the contribution in whatever way fits best.
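
For illustration only, a hedged sketch of the kind of check this would enable. It assumes the proposed llama_model_dev_layer from this PR and that its return values correspond to ggml_backend_dev_type from ggml-backend.h; model_fully_on_gpu is a hypothetical helper:

```c
#include <stdbool.h>

#include "ggml-backend.h" // ggml_backend_dev_type, GGML_BACKEND_DEVICE_TYPE_GPU
#include "llama.h"

// Proposed in this PR; not yet part of llama.h:
LLAMA_API int32_t llama_model_dev_layer(const struct llama_model * model, int32_t il);

// Return true only if every layer was placed on a GPU device.
static bool model_fully_on_gpu(const struct llama_model * model) {
    const int32_t n_layer = llama_model_n_layer(model);
    for (int32_t il = 0; il < n_layer; ++il) {
        if (llama_model_dev_layer(model, il) != GGML_BACKEND_DEVICE_TYPE_GPU) {
            return false;
        }
    }
    return true;
}
```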
