Labels: backend (gpt4all-backend issues), chat (gpt4all-chat issues), enhancement (New feature or request)
Description
Feature Request
This is part of the solution for dealing with broken models that emit stop tokens not specified in their HF configuration (#2167), and it is also a commonly used feature in other similar apps.
From an earlier related draft by @cebtenzzre:
- some models output the wrong EOS token, so this is important
- special tokens show up as blank in output because we use llama_token_to_piece with special=False, so they aren't even considered for our current hardcoded reverse prompts
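
A minimal sketch of the rendering side of this, assuming a llama.cpp build whose `llama_token_to_piece` takes a trailing `special` flag (the exact signature varies between versions); with `special=true`, control tokens such as `<|im_end|>` decode to visible text and can be matched against stop sequences. `token_to_piece` is a hypothetical helper name:

```cpp
// Sketch only: the llama_token_to_piece signature is assumed from llama.cpp
// versions that expose a `special` flag.
#include <string>
#include <vector>
#include "llama.h"

static std::string token_to_piece(const llama_model * model, llama_token token) {
    std::vector<char> buf(64);
    // special=true: render control tokens (e.g. <|im_end|>) instead of an empty piece
    int n = llama_token_to_piece(model, token, buf.data(), (int) buf.size(), /*special=*/ true);
    if (n < 0) {
        // negative return means the buffer was too small; -n is the required size
        buf.resize(-n);
        n = llama_token_to_piece(model, token, buf.data(), (int) buf.size(), /*special=*/ true);
    }
    return std::string(buf.data(), n);
}
```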
Related things to fix:
- Don't loop forever when the model generates a special token other than EOS (High CPU usage in chat with Hermes 2 Pro Mistral 7B after generation has finished #2167); see the first sketch after this list
- Print special token content
- Honor both EOS tokens specified in the HF model's generation_config.json (requires llama.cpp modification); see the second sketch after this list
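
For the first item, a hedged sketch of the generation-loop check, assuming a llama.cpp version that provides `llama_token_is_eog()` (it covers EOS, EOT, and other end-of-generation control tokens); older builds would have to fall back to comparing against `llama_token_eos()`. `should_stop_generation` is a hypothetical name:

```cpp
// Sketch of a stop condition for the sampling loop; llama_token_is_eog() is assumed
// to be available in the linked llama.cpp.
#include "llama.h"

static bool should_stop_generation(const llama_model * model, llama_token tok) {
    // True for EOS, EOT, and other end-of-generation special tokens, so the loop
    // terminates even when the model emits a stop token other than its nominal EOS.
    return llama_token_is_eog(model, tok);
}
```

For the last item, a sketch of checking against more than one EOS id, assuming the ids were read from the HF generation_config.json (Llama 3, for example, ships `"eos_token_id": [128001, 128009]`); actually plumbing both ids through the GGUF metadata is the llama.cpp modification mentioned above:

```cpp
// Hypothetical holder for every token id that should end generation, populated from
// the model's generation_config.json (the field name and example ids below follow the
// HF format; llama.cpp does not read them today, hence the needed modification).
#include <unordered_set>
#include "llama.h"

struct StopTokens {
    std::unordered_set<llama_token> eos_ids;   // e.g. {128001, 128009} for Llama 3
    bool is_stop(llama_token tok) const { return eos_ids.count(tok) != 0; }
};

// Usage: StopTokens stops{{128001, 128009}}; if (stops.is_stop(tok)) break;
```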
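Combining the two sketches above, the loop would stop when either `llama_token_is_eog()` or the configured `StopTokens::is_stop()` fires, which handles both models with broken EOS metadata and models with multiple legitimate stop tokens.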