src/_table_generation/table_generation.md
3 additions & 1 deletion
@@ -66,4 +66,6 @@ LITERAL_CAPTURE_INCREASE_LENGTH: True for t,r,u,e,f,a,l,s,n
 GRAMMAR_CAPTURE_ERROR_FLAG
 STRING_CAPTURE_ERROR_FLAG
 NUMERIC_CAPTURE_ERROR_FLAG
-LITERAL_CAPTURE_ERROR_FLAG
+LITERAL_CAPTURE_ERROR_FLAG
+
+PROCESS_RAW_TRANSCRIPT_TABLE: This table is used to post-process the raw transcript and add missing grammar tokens that were not captured during the initial scanning in build_transcript. Input: encoded_ascii of the last token in each entry (scan_mode + ascii character). Output: containing: token: The token type for this entry, new_grammar: Whether to add a missing grammar token, and scan_token: The type of grammar token to add (if needed), such as END_OBJECT_TOKEN }, or VALUE_SEPARATOR_TOKEN comma.
-// while this assert is in an unconstrained function, the out of bounds accesss `raw_transcript[transcript_ptr]` in build_transcript also generates failing constraints
+// while this assert is in an unconstrained function, the out of bounds access `raw_transcript[transcript_ptr]` in build_transcript also generates failing constraints
@@ -722,7 +730,8 @@ impl<let NumBytes: u32, let NumPackedFields: u32, let MaxNumTokens: u32, let Max
  * @brief Check for missing tokens that we could have missed in `build_transcript`
  * @details If we had a json string where a NUMERIC_TOKEN or LITERAL_TOKEN is directly succeeded by a VALUE_SEPARATOR_TOKEN, END_OBJECT_TOKEN, END_ARRAY_TOKEN,
  * we will have missed the latter token.
- * We pick these up via the lookup table PROCESS_RAW_TRANSCRIPT_TABLE
+ * We pick these up via the lookup table PROCESS_RAW_TRANSCRIPT_TABLE.
+ * The entries in self.raw_transcript currently look like false}, true], null, where the grammar tokens are counted as part of the token.
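
The post-processing step described in the PROCESS_RAW_TRANSCRIPT_TABLE entry and in the doc comment above can be illustrated with a small sketch. This is not the repository's Noir code: it is a plain Rust approximation, and every name in it (`Token`, `ScanMode`, `TableEntry`, `encode`, `build_table`, `process_raw_transcript`) is hypothetical. It also assumes that the table index packs the scan mode above a 7-bit ASCII value and that only numeric and literal entries can swallow a trailing grammar character; the real encoding and table layout may differ.

```rust
// Hypothetical token kinds; the real constants live in the repository.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Token {
    None,
    Numeric,
    Literal,
    ValueSeparator, // ','
    EndObject,      // '}'
    EndArray,       // ']'
}

// Hypothetical scan modes that can absorb a trailing grammar character.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ScanMode {
    Numeric = 0,
    Literal = 1,
}

// One table row: the entry's own token, whether a grammar token is missing,
// and which grammar token to add if so.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TableEntry {
    token: Token,
    new_grammar: bool,
    scan_token: Token,
}

// Assumed encoding: scan mode in the high part, 7-bit ASCII in the low part.
fn encode(mode: ScanMode, last: u8) -> usize {
    (mode as usize) * 128 + last as usize
}

// Precompute one row per (scan mode, ASCII character) pair.
fn build_table() -> Vec<TableEntry> {
    let empty = TableEntry { token: Token::None, new_grammar: false, scan_token: Token::None };
    let mut table = vec![empty; 2 * 128];
    let modes = [(ScanMode::Numeric, Token::Numeric), (ScanMode::Literal, Token::Literal)];
    for (mode, token) in modes {
        for ch in 0u8..128 {
            let scan_token = match ch {
                b',' => Token::ValueSeparator,
                b'}' => Token::EndObject,
                b']' => Token::EndArray,
                _ => Token::None,
            };
            table[encode(mode, ch)] = TableEntry {
                token,
                new_grammar: scan_token != Token::None,
                scan_token,
            };
        }
    }
    table
}

// Expand raw entries such as "false}" into [Literal, EndObject] by looking up
// the last character of each entry. Entries are assumed non-empty ASCII.
fn process_raw_transcript(raw: &[(ScanMode, &str)], table: &[TableEntry]) -> Vec<Token> {
    let mut out = Vec::new();
    for (mode, text) in raw {
        let last = *text.as_bytes().last().expect("entries are non-empty");
        let entry = table[encode(*mode, last)];
        out.push(entry.token);
        if entry.new_grammar {
            out.push(entry.scan_token);
        }
    }
    out
}
```

With this sketch, a raw entry such as `false}` in literal mode yields the literal token followed by an end-object token, while an entry with no trailing grammar character yields only its own token. Using a table lookup instead of branching on the character mirrors the approach named in the doc comment, although the sketch makes no claim about how the Noir implementation constrains the lookup.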