Conversation

@kylo5aby (Contributor)

@ngxson (Collaborator) left a comment


IMO this will be quite redundant, because:

  • get_weights is only used when loading model tensors, so it's not worth optimizing
  • The total number of tensors is usually small (usually fewer than 1000?), so a linear search is fine in most cases

@slaren (Member) commented Oct 28, 2024

I think the goal of the change is good: the way we use this list of tensors is effectively $O(N^2)$, which may not be a problem right now, but it doesn't take much for an algorithm with this complexity to become a serious performance problem. However, duplicating the list of weights as a map is very error-prone and will make it harder to work with this code in the future. A more ambitious refactor that completely replaces the vector of weights with a map would be welcome.

@kylo5aby kylo5aby changed the title loader: use a map to find tensor by name from tensor weight loader: refactor tensor weights storage Oct 31, 2024
@kylo5aby (Contributor, Author)

updated. PTAL

@slaren (Member) commented Oct 31, 2024

The side effect of using an unordered_map was that it caused weights to appear in a random order when quantizing a model, which may be confusing. I changed it to a map and added a custom comparator that sorts the weights by layer. As a side benefit, this may reduce memory usage when using mmap, since it ensures that the tensors used by the CPU are in a contiguous block at the beginning of the file, which allows cleanly unmapping the weights offloaded to a GPU.

@slaren slaren merged commit ab3d71f into ggml-org:master Oct 31, 2024
52 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* loader: refactor tensor weights storage

* use sorted map, sort weights by layer

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024