apply_chat_template method pessimizes perf at the interface layer, forcing 2 copies #641

@vlovich

Description

The function is defined as taking an Option<String> for tmpl, but it only ever uses the string by reference, since it immediately creates a CString copy internally in order to call the apply_chat_template function. Option<&str> would be better: it avoids forcing the caller to make a copy just to hand over an owned String.

Option<&CStr> would actually be best (a bare CStr is unsized, so it has to be passed by reference). That way the caller is responsible for allocating the CString once, and the chat template can then be generated repeatedly using one in-memory copy of the template across all completions.
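To make the difference concrete, here is a rough sketch of the three signature shapes. It is not the crate's actual code: ffi_apply is a hypothetical stand-in for the underlying C call (it only measures the template so the example builds without the real library), and apply_owned, apply_borrowed, apply_cstr and render_many are made-up names. The real method also takes the chat messages and returns the rendered prompt, which is left out here.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

// Hypothetical stand-in for the C function behind the binding. It returns the
// template length so the sketch runs without linking the real library.
fn ffi_apply(tmpl: *const c_char) -> usize {
    if tmpl.is_null() {
        return 0;
    }
    // SAFETY: every caller below passes a pointer to a live, NUL-terminated buffer.
    unsafe { CStr::from_ptr(tmpl) }.to_bytes().len()
}

// Shape described in the issue: the caller must hand over an owned String,
// yet the body only borrows it to build a CString copy.
fn apply_owned(tmpl: Option<String>) -> usize {
    let c = tmpl.map(|t| CString::new(t.as_str()).expect("template contains a NUL byte"));
    ffi_apply(c.as_ref().map_or(ptr::null(), |c| c.as_ptr()))
}

// First suggestion: borrow the template, so the only copy is the internal CString.
fn apply_borrowed(tmpl: Option<&str>) -> usize {
    let c = tmpl.map(|t| CString::new(t).expect("template contains a NUL byte"));
    ffi_apply(c.as_ref().map_or(ptr::null(), |c| c.as_ptr()))
}

// Second suggestion: the caller owns the CString, so nothing is copied per call.
fn apply_cstr(tmpl: Option<&CStr>) -> usize {
    ffi_apply(tmpl.map_or(ptr::null(), |c| c.as_ptr()))
}

// Caller side: allocate the CString once and reuse it for every completion.
fn render_many(template: &str, completions: usize) -> usize {
    let shared = CString::new(template).expect("template contains a NUL byte");
    (0..completions)
        .map(|_| apply_cstr(Some(shared.as_c_str())))
        .sum()
}
```

With the owned signature, a caller that wants to keep its template has to clone it on every call, and the body then copies it again. With the &CStr signature, render_many allocates exactly once no matter how many completions it runs.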

Given that templates can be quite large, this isn't nothing, although in the context of chat completion it's probably a minor overall speedup / memory reduction.
