
stop_regex not working as expected #1328

@luk400


The bug
Hi, I observed poor output quality and artifacts in a JSON generator that I built with guidance. I narrowed the problem down: generation does not end where I would expect it to when using the stop_regex argument. (I hope this is not just a misunderstanding on my part, but I do not think so.)

To Reproduce
I reduced it to the following minimal example.

For debugging, I added the following snippet at line 283 of guidance/models/_engine/_engine.py to print the model's top tokens:

top_indices = np.argsort(logits[-1, :])[-5:][::-1]  # Get top 5 indices
top_logits = logits[-1, :][top_indices]
top_tokens = [self.tokenizer.decode([idx]) for idx in top_indices]
print("Top 5 tokens:", list(zip(top_tokens, top_logits.tolist())))
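For anyone who wants to check the top-k logic outside the engine, here is a minimal standalone sketch of the same computation. The `vocab` list and the `logits` array are made-up stand-ins (assumptions) for `self.tokenizer.decode` and the engine's real logits, and `top_tokens` is a hypothetical helper name, not part of guidance.

```python
import numpy as np

# Hypothetical stand-ins for the tokenizer vocabulary and the engine's logits.
vocab = [b"test", b'",\n', b"word", b"}", b":", b" ", b"2"]
logits = np.array([
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    [5.0, 9.0, 3.0, 1.0, 2.0, 4.0, 0.5],
])

def top_tokens(logits, vocab, k=5):
    """Return the k highest-scoring (token, logit) pairs for the last position."""
    last = logits[-1, :]
    # argsort is ascending, so take the last k indices and reverse them.
    top_indices = np.argsort(last)[-k:][::-1]
    return [(vocab[i], float(last[i])) for i in top_indices]

print("Top 5 tokens:", top_tokens(logits, vocab))
```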

Then run the following example:

from guidance import system, user, assistant, gen
from guidance.models import Transformers

lm = Transformers("microsoft/Phi-4-mini-instruct")

with system():
    lm += "be helpful"

with user():
    lm += """Please just copy and reprint exactly the following json object, do not add anything else:
{
    word1: "test",
    word2: "test",
}
"""

with assistant():
    lm += '''
{
    word1: "'''.strip()  # here I want the model to continue; I expect it to complete this line with: test",

    lm += gen(stop_regex='"', max_tokens=10)

print(str(lm))

Since I set stop_regex to ", I expect generation to stop after ...test".
However, it doesn't. Here is the output of the above program:

Top 5 tokens: [(b' "', 54.646568298339844), (b' "",\n', 39.89239501953125), (b' "\n', 39.43955612182617), (b' \xe2\x80\x9c', 39.37717056274414), (b' \\"', 38.85851287841797)]
Top 5 tokens: [(b'test', 52.59314727783203), (b'word', 39.43867111206055), (b'text', 38.878150939941406), (b't', 37.80302429199219), (b'testing', 36.417659759521484)]
Top 5 tokens: [(b'",\n', 59.446231842041016), (b'",', 46.492584228515625), (b',\n', 45.68318557739258), (b'"\n', 42.614723205566406), (b',', 42.359291076660156)]
Top 5 tokens: [(b'   ', 41.6693000793457), (b'}\n', 35.06786346435547), (b'}', 34.63892364501953), (b'}\n\n', 33.73918533325195), (b' ', 33.1707763671875)]
Top 5 tokens: [(b' word', 44.99620819091797), (b' "', 35.925697326660156), (b' w', 33.24827194213867), (b' wo', 32.69148254394531), (b' world', 32.17226028442383)]
Top 5 tokens: [(b'2', 47.63304138183594), (b'1', 39.263710021972656), (b'3', 37.8875732421875), (b'4', 34.72578811645508), (b':', 34.53622055053711)]
Top 5 tokens: [(b':', 48.771244049072266), (b'":', 39.95098876953125), (b':"', 37.4881477355957), (b':\n', 36.4210319519043), (b':",', 33.71699523925781)]
Top 5 tokens: [(b' "', 47.51124572753906), (b' test', 39.59502410888672), (b' "\n', 37.369503021240234), (b' ', 36.77366256713867), (b' ""', 36.00505828857422)]
<|system|>be helpful<|end|><|user|>Please just copy and reprint exactly the following json object, do not add anything else:
{
    word1: "test",
    word2: "test",
}
<|end|><|assistant|>{
    word1: "test,
    word2:

As the debug output shows, the model correctly predicts ",\n as the top token after word1: "test, but for some reason that token is not used, and generation does not stop even though the token contains ". It only stops on the next line, when the top predicted token is ".

It also happens with greedy sampling (top_k=1), and when I use e.g. '.*".*' or ["] as the regex.
The only case where it does NOT happen is when I match the predicted token exactly, i.e. stop_regex='",\n', but that is not how the feature is intended to work, is it?

Edit: Interestingly, the problem also occurs when I specify stop_regex=['"', '",\n'], even though it doesn't happen when I specify stop_regex='",\n'
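To make the expectation concrete, here is a minimal sketch in plain Python of the stopping behavior I would expect. It is not guidance's actual implementation: `generate_until` is a hypothetical helper, the token list is a simplified stand-in loosely based on the debug output above, and dropping the matched stop text from the result is an assumption of this sketch.

```python
import re

def generate_until(tokens, stop_regex, max_tokens=10):
    """Concatenate tokens until stop_regex matches anywhere in the generated text.

    Returns the text before the first match (assumption: stop text is excluded).
    """
    pattern = re.compile(stop_regex)
    text = ""
    for token in tokens[:max_tokens]:
        text += token
        match = pattern.search(text)
        if match:
            # The match may start in the middle of a multi-character token.
            return text[:match.start()]
    return text

# Simplified, hypothetical token stream resembling the greedy predictions above.
tokens = ["test", '",\n', "    ", " word", "2", ":", ' "']

print(repr(generate_until(tokens, '"')))
print(repr(generate_until(tokens, '",\n')))
```

Under these assumptions both calls stop at the same place, because the regex is checked against the accumulated text rather than against whole tokens.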

System info:

  • Ubuntu 24.04.2
  • guidance: 0.2.4
