Update bos_token #4806

Closed
wants to merge 1 commit into from
10 changes: 9 additions & 1 deletion swift/llm/template/base.py
@@ -1039,8 +1039,16 @@ def _swift_encode(self, inputs: StdTemplateInputs):
idx = all_tokens.index(single_token[0])
bos_token = all_tokens[:idx]
sep_token = all_tokens[idx + 1:]
"""
Collaborator

auto_add_bos is False for Qwen models, so the encode logic for Qwen models never reaches this branch.

Contributor Author

@aacedar aacedar Jul 3, 2025

This code path is not exercised for Qwen models, because the QwenTemplateMeta class in qwen.py sets auto_add_bos = False. For DeepSeek models, however, it raises an exception.

swift/llm/template/template/qwen.py

@dataclass
class QwenTemplateMeta(ChatmlTemplateMeta):
    default_system: Optional[str] = DEFAULT_SYSTEM
    auto_add_bos: bool = False
    stop_words: List[Word] = field(default_factory=lambda: ['<|endoftext|>'])
    agent_template: str = 'hermes'

1. For Qwen models, tokenizer.encode emits no special tokens whether add_special_tokens is True or False. Special tokens such as <|im_start|> are added while the system/user/assistant messages are processed, so we must check whether bos_token is None or empty.
2. If bos_token is not empty, the current model needs a special token, and the old code has two errors:
2.1 bos_token = all_tokens[:idx] is a list, so it lands in res_context_list as a list element, and `prompts_text.append(''.join(res_context_list))` then raises an exception.
2.2 Elements of res_context_list should be text, not token ids. bos_token = all_tokens[:idx] is a list of token ids, which fails when tokenizer.encode() is later called on it.
Using self.tokenizer.bos_token is therefore the most reasonable and correct choice.
"""
if bos_token:
res_context_list.append(bos_token)
# res_context_list.append(bos_token)
res_context_list.append(self.tokenizer.bos_token)
res_context_types.append(ContextType.OTHER)

if self.template_meta.is_post_system or not system:
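The failure described in point 2.1 of the docstring can be reproduced without any tokenizer. The following is a minimal sketch, not the project's code: the token ids, the "<s>" string, and the helper join_contexts are all hypothetical stand-ins chosen for illustration. It only shows that a list of ids placed in the context list breaks ''.join, while a string such as the value of tokenizer.bos_token does not.

```python
# Hypothetical values: these ids and the "<s>" string are made up for illustration.
all_tokens = [1, 42, 2]   # stand-in for encoding one placeholder token WITH special tokens
single_token = [42]       # stand-in for encoding the same token WITHOUT special tokens

idx = all_tokens.index(single_token[0])
bos_ids = all_tokens[:idx]   # slice of token ids, i.e. a list, not text
bos_text = "<s>"             # stand-in for self.tokenizer.bos_token (a string)


def join_contexts(context_list):
    """Hypothetical helper mirroring ''.join(res_context_list)."""
    return "".join(context_list)


# Old behaviour: the id list is appended as one element, so the join fails.
try:
    join_contexts([bos_ids, "Hello"])
    old_error = None
except TypeError as exc:
    old_error = exc

# Proposed behaviour: the BOS string joins cleanly with the other text pieces.
new_text = join_contexts([bos_text, "Hello"])
```

Whether every code path in the template joins the context list as text is an assumption taken from the PR description; the sketch only demonstrates the type mismatch itself.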