Description
Hi,
I think I've tried just about everything to get grammar to work properly, to no avail. I have read the docs and looked at the issues, but for some reason I can't click the checkboxes. docs/Examples/GrammarJsonResponse.md is outdated, by the way.
I'm using 0.25 with a Stateless executor. Here's a link to the adapter I wrote.
The same grammar that works on KoboldCpp always gives me an error when running LLamaSharp instead. A System.IndexOutOfRangeException: 'Index was outside the bounds of the array.' to be precise.
I'm probably doing something wrong, but I really don't see what. I tried with different Grammar Optimization values out of curiosity, but nope. Oh, and yeah, normal generation works perfectly fine.
Any help would be super welcome, because I managed to get structured output to work for both KoboldAPI and OpenAI-compatible backends. So it'd be very neat to get feature parity for LLamaSharp.
Reproduction Steps
So, example:
I have a BasicBitch class:

    public class BasicBitch
    {
        public string Response { get; set; } = string.Empty;
        public int Confidence { get; set; } = 0;
    }

The simplest possible JSON Schema -> GBNF conversion gives:

    root ::= "{" space Response-kv "," space Confidence-kv "}" space
    Response ::= string | null
    Response-kv ::= "\"Response\"" space ":" space Response
    Confidence-kv ::= "\"Confidence\"" space ":" space integer
    char ::= [^"\\\x7F\x00-\x1F] | [\\] (["\\bfnrt] | "u" [0-9a-fA-F]{4})
    integer ::= ("-"? integral-part) space
    integral-part ::= [0] | [1-9] [0-9]{0,15}
    null ::= "null" space
    number ::= ("-"? integral-part) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? space
    space ::= | " " | "\n"{1,2} [ \t]{0,20}
    string ::= "\"" char* "\"" space

Finally (ignore the implied data):

    var pipeline = new DefaultSamplingPipeline()
    {
        Temperature = (float)input.Temperature,
        TopP = (float)input.Top_p,
        TopK = input.Top_k,
        TypicalP = (float)input.Typical,
        MinP = (float)input.Min_p,
        RepeatPenalty = (float)input.Rep_pen,
        PenaltyCount = input.Rep_pen_range,
        // input.Grammar holds the grammar string shown above. I also tried the one you provide in your json.gbnf file and got the same error.
        Grammar = string.IsNullOrWhiteSpace(input.Grammar) ? null : new Grammar(input.Grammar, "root"),
        PenalizeNewline = false,
        PreventEOS = input.Bypass_eos,
        Seed = input.Sampler_seed <= 0 ? 0 : (uint)input.Sampler_seed
    };

    var sett = new InferenceParams()
    {
        MaxTokens = input.Max_length,
        AntiPrompts = input.Stop_sequence?.ToArray() ?? [],
        SamplingPipeline = pipeline
    };

    string response = string.Empty;
    await foreach (var text in Executor.InferAsync(input.Prompt, sett, token)) // <<<---- CRASHES BEFORE FIRST LOOP
    {
        if (token.IsCancellationRequested)
            break;
        response += text;
    }

I've tried about 10 different JSON schemas of varying levels of complexity, and none of them work (while they all do in KoboldCpp, so the GBNF conversion should be solid).
Environment & Configuration
- Operating system: Windows 11
- .NET runtime version: .NET 8
- LLamaSharp version: 0.25
- CUDA version (if you are using cuda backend): 12
- CPU & GPU device: RTX 3090 24GB
Known Workarounds
No response