Commit 3b4f6c0

llama : update comments for llama_decode/llama_encode
ggml-ci
1 parent 97b975d commit 3b4f6c0

1 file changed

include/llama.h

Lines changed: 7 additions & 2 deletions
@@ -924,14 +924,19 @@ extern "C" {
     // Frees a batch of tokens allocated with llama_batch_init()
     LLAMA_API void llama_batch_free(struct llama_batch batch);

-    // Processes a batch of tokens with the ecoder part of the encoder-decoder model.
-    // Stores the encoder output internally for later use by the decoder cross-attention layers.
+    // Process a batch of tokens.
+    // In contrast to llama_decode() - this call does not use KV cache.
+    // For encoder-decoder contexts, processes the batch using the encoder.
+    // Can store the encoder output internally for later use by the decoder's cross-attention layers.
     // 0 - success
     // < 0 - error. the KV cache state is restored to the state before this call
     LLAMA_API int32_t llama_encode(
             struct llama_context * ctx,
             struct llama_batch batch);

+    // Process a batch of tokens.
+    // Requires KV cache.
+    // For encoder-decoder contexts, processes the batch using the decoder.
     // Positive return values does not mean a fatal error, but rather a warning.
     // 0 - success
     // 1 - could not find a KV slot for the batch (try reducing the size of the batch or increase the context)
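
To make the updated contract concrete, below is a minimal sketch (not part of the commit) of the call order the new comments describe for an encoder-decoder model: the prompt goes through llama_encode(), which does not use the KV cache, and generation then proceeds through llama_decode(), which does. Only llama_batch_init(), llama_batch_free(), llama_encode(), and llama_decode() are API calls from llama.h; the helper name encode_then_decode and the decoder_start parameter are illustrative assumptions, and sampling is elided.

    #include "llama.h"

    // Hypothetical helper (not from the commit): run the prompt through the
    // encoder, then feed one decoder-start token through the decoder.
    static int32_t encode_then_decode(struct llama_context * ctx,
                                      const llama_token * prompt, int32_t n_prompt,
                                      llama_token decoder_start) {
        // Batch of prompt tokens, allocated/freed with the functions shown above.
        struct llama_batch batch = llama_batch_init(n_prompt, /*embd =*/ 0, /*n_seq_max =*/ 1);
        for (int32_t i = 0; i < n_prompt; i++) {
            batch.token   [i]    = prompt[i];
            batch.pos     [i]    = i;
            batch.n_seq_id[i]    = 1;
            batch.seq_id  [i][0] = 0;
            batch.logits  [i]    = 0; // no logits needed for the encoder pass
        }
        batch.n_tokens = n_prompt;

        // Encoder pass: does not use the KV cache; stores the encoder output
        // internally for the decoder's cross-attention layers.
        // Returns 0 on success, < 0 on error.
        int32_t ret = llama_encode(ctx, batch);

        if (ret == 0) {
            // Decoder pass: uses the KV cache. A return value > 0 is a warning
            // (e.g. 1 = no KV slot for the batch), < 0 is an error.
            batch.n_tokens       = 1;
            batch.token   [0]    = decoder_start;
            batch.pos     [0]    = 0;
            batch.n_seq_id[0]    = 1;
            batch.seq_id  [0][0] = 0;
            batch.logits  [0]    = 1; // request logits to sample the next token
            ret = llama_decode(ctx, batch);
        }

        llama_batch_free(batch);
        return ret;
    }

Passing the decoder start token in from the caller keeps the sketch independent of how a given model exposes it.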
