Conversation

@kylo5aby (Contributor)

@ngxson (Collaborator) left a comment


IMO this will be quite redundant, because:

  • get_weights is only used when loading model tensors, so it's not worth optimizing
  • The total number of tensors is usually small (usually fewer than 1000?), so a linear search is fine in most cases

@slaren (Member) commented Oct 28, 2024

I think the goal of the change is good: the way we use this list of tensors is effectively $O(N^2)$, which may not be a problem right now, but it doesn't take much for an algorithm with this complexity to become a serious performance problem. However, duplicating the list of weights as a map is very error-prone and will make it harder to work with this code in the future. A more ambitious refactor that completely replaces the vector of weights with a map would be welcome.

@kylo5aby kylo5aby changed the title loader: use a map to find tensor by name from tensor weight loader: refactor tensor weights storage Oct 31, 2024
@kylo5aby (Contributor, Author)

updated. PTAL

@slaren (Member) commented Oct 31, 2024

The side effect of using an unordered_map was that it caused weights to appear in a random order when quantizing a model, which may be confusing. I changed it to a map and added a custom comparator that sorts the weights by layer. As a side benefit, this may reduce memory usage when using mmap, since it ensures that the tensors used by the CPU are in a contiguous block at the beginning of the file, which allows cleanly unmapping the weights offloaded to a GPU.

@slaren slaren merged commit ab3d71f into ggml-org:master Oct 31, 2024
52 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* loader: refactor tensor weights storage

* use sorted map, sort weights by layer

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024