
Integrating huggingface chat templates #281

Closed
SamGalanakis wants to merge 5 commits into eth-sri:main from SamGalanakis:main

Conversation

@SamGalanakis

Been having a lot of trouble with chat templates, especially when switching between models frequently. This is a very rough implementation of how we might integrate them into lmql using the existing jinja templates from huggingface. Essentially, you pass a jinja template just as you do for huggingface tokenizers, and when you use the existing lmql role tags the appropriate chat template is applied for you. Any feedback/ideas welcome.

Can test it with the below script:

import lmql
from transformers import AutoTokenizer

tokenizer_string = "HuggingFaceH4/zephyr-7b-beta"

lmql_model = lmql.model(
    "llama.cpp:/home/sam-dev/code/vectorizer/models/zephyr-7b-beta.Q5_K_M.gguf",
    endpoint="localhost:8080",
    tokenizer=tokenizer_string,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(tokenizer_string)

@lmql.query(model=lmql_model, name="lmql_chat", chat_template=tokenizer.chat_template)
def lmql_chat():
    '''argmax
        "{:system} You are a bot"
        "{:user} {await input('Write to bot: ')}"
        "{:assistant} [ANSWER]" where len(ANSWER) < 100
    '''

out = lmql_chat()
print(out.prompt)
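For context, a Hugging Face chat template is just a jinja template rendered over a list of role/content messages; the real mechanism is `tokenizer.apply_chat_template`. The sketch below hand-writes a zephyr-style format purely to illustrate the idea (the format and helper name are illustrative, not the model's actual template):

```python
# Conceptual sketch only: mimics what rendering a zephyr-style chat
# template produces. Real code should use tokenizer.apply_chat_template.
def apply_chat_template(messages):
    parts = []
    for message in messages:
        # each message becomes a role header followed by its content
        parts.append(f"<|{message['role']}|>\n{message['content']}</s>")
    # leave the prompt open for the assistant's completion
    parts.append("<|assistant|>")
    return "\n".join(parts)

prompt = apply_chat_template([
    {"role": "system", "content": "You are a bot"},
    {"role": "user", "content": "Hello"},
])
print(prompt)
```

This is what the PR automates: lmql's `{:system}`/`{:user}`/`{:assistant}` tags are mapped onto such a template so the correct role markers are emitted for whichever model's tokenizer is in use.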
    

@lbeurerkellner
Collaborator

Thanks for starting on this. I made some smaller changes based on your fork and pushed them to the branch chat-templates (https://github.com/eth-sri/lmql/tree/chat-templates). Essentially, with my additional changes, you no longer have to specify the chat template (although you still can); it is automatically inferred from the tokenizer/model used. Apart from this, I think this can almost be merged; there is just one change we have to make:

PromptInterpreter itself must not have any state like self.current_role or self.current_role_end, since it is stateless by design. This is required to enable branching decoders, where the interpreter tracks multiple execution branches (at different levels of progress) at a time.

Instead, all state in PromptInterpreter is encapsulated in class PromptState. Luckily this state is available when we call process_query_string, so we can just pass that in. When modifying these prompt states however, everything is immutable, so please have a look at how state is managed in advance() via updated and make sure we track current_role and current_role_end as part of this state.
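The immutable-update pattern described above can be sketched roughly as follows (the class and field names are illustrative of the idea, not LMQL's actual implementation):

```python
from dataclasses import dataclass, replace
from typing import Optional

# Illustrative sketch of an immutable prompt state; not LMQL's actual
# PromptState class, just the update pattern it follows.
@dataclass(frozen=True)
class PromptState:
    prompt: str = ""
    current_role: Optional[str] = None
    current_role_end: Optional[str] = None

    def updated(self, **changes):
        # returns a fresh state object; the original is never mutated,
        # so branching decoders can share ancestor states safely
        return replace(self, **changes)

state = PromptState(prompt="<|system|>\n")
branch_a = state.updated(current_role="user")
branch_b = state.updated(current_role="assistant")
```

Because every update returns a new object, two decoder branches can diverge from the same parent state without interfering, which is exactly why `current_role`/`current_role_end` must live in the state rather than on the interpreter.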

Let me know if this makes sense, otherwise I can also have another look.

Thanks a lot.

@SamGalanakis
Author

Thanks, yeah that makes sense, will work on the branch and let you know.

@lbeurerkellner
Collaborator

Closing this in favour of the other more advanced PR #293.
