-
Notifications
You must be signed in to change notification settings - Fork 82
Open
Labels
bugSomething isn't workingSomething isn't working
Description
Describe the bug
When processing a string containing curly quote characters they are converted to straight quotes which produces invalid Json when the ToJson Method is called
To Reproduce
Catalyst.Models.English.Register();
Storage.Current = new DiskStorage("catalyst-models");
var nlp = await Pipeline.ForAsync(Language.English);
var doc = new Document("The \u201cquick\u201d brown fox jumps over the lazy dog", Language.English);
nlp.ProcessSingle(doc);
Console.WriteLine(doc.ToJson());
This outputs the following to the console:
{"Language":"en","Length":45,"Value":"The "quick" brown fox jumps over the lazy dog","TokensData":[[{"Bounds":[0,2],"Tag":"DET"},{"Bounds":[4,4],"Tag":"PUNCT"},{"Bounds":[5,9],"Tag":"ADJ"},{"Bounds":[10,10],"Tag":"PUNCT"},{"Bounds":[12,16],"Tag":"ADJ"},{"Bounds":[18,20],"Tag":"NOUN"},{"Bounds":[22,26],"Tag":"VERB"},{"Bounds":[28,31],"Tag":"ADP"},{"Bounds":[33,35],"Tag":"DET"},{"Bounds":[37,40],"Tag":"ADJ"},{"Bounds":[42,44],"Tag":"NOUN"}]]}
Note the straight quotes around the word quick. This json cannot be parsed.
Expected behavior
The ToJson method should retain the unicode curly quotes in the Value property and produce valid json output.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working