Skip to content

Commit 5ec7117

Browse files
author
RageLtMan
committed
XML Matching Improvements and SpecialTokens
Reliance on single/guessed special tokens only goes so far when we use constrained outputs because masking-out a potential EOS token results in infinite generation: we have to account for all possible candidates in the mask which could normally end generation. Add SpecialTokens idiomatic extractor as a starting point for this work and utilize it to feed all EOS tokens to the grammar-building routines. Add the binary for this library element to examples/ for @guoqingbao and other developers to have rapid access to what the SpecialTokens struct actually extracts from any tokenizer.json provided in ARGV0 or from ./tokenizer.json if none are provided. Improve XML tool-sled generation. Remaining issue is potential of XML content within the XML envelope and no ability to mask possibly infinite strings as anything but infinite due to look-ahead and lazy regex tricks from interpreted languages not actually compiling to a finite mask. Use a simple matcher for now, enable env-override by the user while this gets sorted out (if possible) and critically enable the grammar generator to honor tool parser override at the CLI such that `--enforce-parser qwen` produces JSON-constrained schemas which the parser can then consume. XML finite masking tracked under: - guidance-ai/llguidance#306 Multiple EOS token concerns (handled in grammar) under: - guidance-ai/llguidance#304 - guidance-ai/llguidance#305
1 parent fd90c03 commit 5ec7117

File tree

7 files changed

+628
-469
lines changed

7 files changed

+628
-469
lines changed

src/core/engine.rs

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@ use crate::transfer::PdRole;
2020
use crate::transfer::Transfer;
2121
use crate::utils::chat_template::Message;
2222
use crate::utils::config::{EngineConfig, EosTokenId, ModelType, SamplingParams};
23+
use crate::utils::special_tokens::SpecialTokens;
2324
use crate::utils::guidance::{build_llg_factory, load_toktrie_from_path};
2425
use crate::utils::heartbeat::heartbeat_worker;
2526
use crate::utils::image::{get_image_config, ImageData, ImageProcessConfig};
@@ -100,6 +101,9 @@ pub struct LLMEngine {
100101
pub model_type: ModelType,
101102
pub tool_config: ToolConfig,
102103
pub img_cfg: Option<ImageProcessConfig>,
104+
/// SpecialTokens parsed once at engine initialization
105+
/// Contains EOS, BOS, and other special token IDs and their string representations
106+
pub special_tokens: Arc<SpecialTokens>,
103107
}
104108

105109
impl LLMEngine {
@@ -466,6 +470,9 @@ impl LLMEngine {
466470
"default".to_string()
467471
};
468472

473+
// Initialize SpecialTokens once at engine startup
474+
let special_tokens = Arc::new(SpecialTokens::new(&tokenizer));
475+
469476
let engine = Arc::new(RwLock::new(Self {
470477
runners,
471478
scheduler,
@@ -488,6 +495,7 @@ impl LLMEngine {
488495
tool_config,
489496
img_cfg,
490497
model_name,
498+
special_tokens,
491499
}));
492500
Self::start_engine(engine.clone());
493501
Ok(engine)

src/server/server.rs

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -525,17 +525,17 @@ pub async fn chat_completion(
525525
if constraint_grammars.is_empty() && !engine_config.enable_tool_grammar {
526526
crate::log_debug!("[llg] No constraint or tool grammar - not setting guidance");
527527
} else {
528-
// Get EOS token IDs from engine scheduler for building TEXT pattern with EOS bounding
528+
// Get SpecialTokens from engine for building TEXT pattern with EOS bounding
529529
let engine = data.engine.read();
530-
let eos_token_ids = engine.scheduler.eos_token_ids();
530+
let special_tokens = &engine.special_tokens;
531531
let llg_grammar = compose_grammars(
532532
constraint_grammars,
533533
tool_gram,
534534
has_tools,
535535
tool_choice_required,
536536
forced_tool_name.clone(),
537537
Some(max_tokens.clone()),
538-
eos_token_ids,
538+
special_tokens,
539539
);
540540
drop(engine); // Explicitly drop the lock guard
541541
let lark_string = get_lark_from_top_level_grammar(&llg_grammar);

src/tools/mod.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ pub struct ToolBuilder {
2323
}
2424

2525
impl ToolBuilder {
26-
fn new(name: String, description: String) -> Self {
26+
pub fn new(name: String, description: String) -> Self {
2727
Self {
2828
name,
2929
description,

0 commit comments

Comments
 (0)