
changelog : libllama API #246

@jakexcosme


Note: This issue was copied from ggml-org#9289

Original Author: @ggerganov
Original Issue Number: ggml-org#9289
Created: 2024-09-03T06:48:45Z


Overview

This is a list of changes to the public interface of the llama library. Collaborators are encouraged to edit this post to reflect important API changes that get merged into the master branch.

If you are building a third-party project that relies on libllama, it is recommended to follow this issue and to check it before upgrading to a new version.
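
For example, a deliberate upgrade of a vendored copy might look like the sketch below, where `vendor/llama.cpp` is a hypothetical submodule path and the tags are build tags from the table that follows:

```sh
# fetch the release tags and review the libllama API changes
# between the currently pinned build and the upgrade target
git -C vendor/llama.cpp fetch --tags
git -C vendor/llama.cpp log --oneline b3614..b3681 -- include/llama.h

# pin the new build tag once the changes have been reviewed
git -C vendor/llama.cpp checkout b3681
```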

Recent API changes (most recent at the top)

| version | PR | description |
| --- | --- | --- |
| TBD | ggml-org#15665 | Remove `llama_sampler_init_softmax()` + dist sampler no longer implicitly sorts |
| b6239 | ggml-org#15472 | Remove `llama_kv_self_...` API |
| b6157 | ggml-org#15293 | Add `llama_state_seq_..._ext` API |
| b5913 | ggml-org#14363 | Update `llama_context_params` - add `bool kv_unified` |
| b5870 | ggml-org#14631 | Remove `enum llama_vocab_pre_type` |
| b5740 | ggml-org#13037 | Update `llama_model_quantize_params` |
| b5435 | ggml-org#13653 | Remove `llama_kv_cache_view_*` API |
| b5429 | ggml-org#13194 | Update `llama_context_params` - add `bool swa_full` |
| b5311 | ggml-org#13284 | Update `llama_context_params` - remove `logits_all` + rearrange flags |
| b5125 | ggml-org#12511 | Update `llama_model_quantize_params` |
| b5028 | ggml-org#11397 | Update `llama_model_params` |
| b4882 | ggml-org#12181 | Change `llama_kv_cache_...` -> `llama_kv_self_...` |
| b4599 | ggml-org#9639 | Add `llama_sampler_init_grammar_lazy` to support lazy grammars w/ trigger words & tokens |
| b4524 | ggml-org#11016 | Add `name` parameter to `llama_model_chat_template` (uses default template if `NULL`) |
| b4501 | ggml-org#11262 | Remove `rpc_servers` from `llama_model` and `llama_model_params` |
| b4464 | ggml-org#11110 | Add `llama_vocab` and rename various structs and calls |
| b4424 | ggml-org#11063 | Update `llama_model` API naming |
| b4357 | ggml-org#10784 | Remove `llama_model_get_tensor()` |
| b4337 | ggml-org#10803 | Change `llama_sampler_init_penalties()` |
| b4282 | ggml-org#10446 | Remove support for `Q4_0_N_M` model files in favor of automatic repacking of `Q4_0` |
| b4167 | ggml-org#10497 | Add `devices` to `llama_model_params` |
| b3988 | ggml-org#10071 | Remove Tail-Free sampling |
| b3948 | ggml-org#9897 | Deprecate softmax sampler and update dist sampler |
| b3943 | ggml-org#9745 | Remove `all_pos_0`, `all_pos_1`, `all_seq_id` from `llama_batch` |
| b3908 | ggml-org#9798 | Update FIM-related API |
| b3841 | ggml-org#9510 | Add `LLAMA_POOLING_TYPE_RANK` |
| b3774 | ggml-org#9512 | Add `llama_n_head()` |
| b3750 | ggml-org#9355 | Add `llama_perf` API + param to disable internal profiling |
| b3749 | ggml-org#9445 | Add `llama_sampler_chain_remove()` |
| b3681 | ggml-org#9294 | Major changes to the sampling API (see PR for more info) |
| b3651 | ggml-org#8980 | Add `LLAMA_VOCAB_TYPE_RWKV` enum value |
| b3644 | ggml-org#8672 | Add `llama_threadpool` API + change `uint32_t` -> `int32_t` |
| b3614 | ggml-org#8526 | Add `llama_model_is_recurrent` |
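
Several of the sampling rows above trace back to the b3681 rework (ggml-org#9294), which replaced the old per-call sampling functions with composable sampler chains. The sketch below shows the chain API as introduced there; later rows (the softmax deprecation, the dist sampler changes, the penalties change) touched this same area, so verify the names against the llama.h you actually build with:

```c
#include "llama.h"

// Build a top-k -> top-p -> temperature -> dist sampler chain
// (post-b3681 API; signatures may differ in newer builds).
static struct llama_sampler * make_chain(void) {
    struct llama_sampler_chain_params sparams = llama_sampler_chain_default_params();
    struct llama_sampler * chain = llama_sampler_chain_init(sparams);

    llama_sampler_chain_add(chain, llama_sampler_init_top_k(40));
    llama_sampler_chain_add(chain, llama_sampler_init_top_p(0.9f, 1));
    llama_sampler_chain_add(chain, llama_sampler_init_temp(0.8f));
    llama_sampler_chain_add(chain, llama_sampler_init_dist(LLAMA_DEFAULT_SEED));

    return chain;
}

// Usage: llama_token tok = llama_sampler_sample(chain, ctx, -1);
// Free with llama_sampler_free(chain) when done.
```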

For older changes, use:

```sh
git log --oneline -p b3614 -- include/llama.h
```

(For collaborators) To map a PR number to its build number:

```sh
git log --oneline | tail -r | nl
```
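
Note that `tail -r` is BSD/macOS-specific; on GNU/Linux the equivalent is `tac`:

```sh
git log --oneline | tac | nl
```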

Upcoming API changes

  • TBD
