Description
Note: This issue was copied from ggml-org#6311
Original Author: @asg017
Original Issue Number: ggml-org#6311
Created: 2024-03-26T02:03:02Z
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Feature Description
A llama_load_model_from_buffer() function should be added to llama.h/llama.cpp to complement llama_load_model_from_file(). Instead of loading a model from a file, it would read the model from a user-provided in-memory buffer.
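A possible declaration, mirroring the existing file-based loader. This is only a sketch: the parameter names and types here are my assumptions, not an agreed-upon API.

```c
// Hypothetical declaration (not in llama.h today), modeled on
// llama_load_model_from_file(const char *, struct llama_model_params).
struct llama_model * llama_load_model_from_buffer(
        const void * buffer,      // serialized GGUF model bytes
        size_t       buffer_len,  // length of `buffer` in bytes
        struct llama_model_params params);
```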
Motivation
I'm working on a tool that can load multiple llama models from different sources. Ideally, I'd like to store these models in a SQLite database and load them entirely from memory. However, since the only way to load llama models is with llama_load_model_from_file(), I'd need to serialize each one to disk first and pass in a path to that file. That's pretty wasteful, as the models are already in memory and don't need to be persisted to disk.
In my case, I'm working with small embedding models (tens to hundreds of MB), but I'm sure this could also be useful for larger models on machines with more memory.
Possible Implementation
Hmm, it looks like gguf_init_from_buffer() has been commented out of ggml.h. So maybe this will be more difficult than I thought?
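Until a buffer API exists, one Linux-only workaround is to expose the in-memory buffer as a pseudo-file path via memfd_create(2) and hand that path to llama_load_model_from_file(), avoiding any real disk I/O. A minimal sketch, assuming Linux with glibc >= 2.27; the helper name is illustrative:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

// Write `buf` into an anonymous in-memory file and return its fd, or -1
// on failure. A readable "/proc/self/fd/N" path is written into `path`.
// The caller must keep the fd open while the path is in use, then close it.
static int buffer_to_path(const void * buf, size_t len,
                          char * path, size_t path_len) {
    int fd = memfd_create("model.gguf", 0);
    if (fd < 0) return -1;
    if (write(fd, buf, len) != (ssize_t) len) {
        close(fd);
        return -1;
    }
    snprintf(path, path_len, "/proc/self/fd/%d", fd);
    return fd;
}
```

Intended use (hypothetical, since the model bytes would come from SQLite in my case): call buffer_to_path(model_bytes, model_size, path, sizeof(path)), pass `path` to llama_load_model_from_file(), then close the fd. Opening the /proc/self/fd path yields a fresh file description at offset 0, so the loader reads the buffer from the start.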