fix(snowflake/parser): strict Parse rejects trailing tokens + 11 constructs the silent drop was hiding #303
Merged
…s (best-effort unchanged)

parseSingle returned as soon as one statement parsed, silently ignoring every token after it: SELECT * FFROM users / SELECT 1 2 3 parsed with ZERO errors as a bare statement prefix (flagged by the bytebase #20567 cloud review; Diagnose must reject these).

Strict path (Parse / new ParseStrict / parseSingle): after a statement, the next token must be ; or end-of-segment; anything else emits "syntax error at or near <token>" at the stray token and recovers to the next statement boundary (per-statement reporting across segments). ParseBestEffort keeps the historical prefix tolerance for completion / partial-input callers. diagnostics.Analyze now runs ParseStrict so every offending statement diagnoses with accurate line/col.

Running the 657-file corpus closure in strict mode exposed 26 files that "parsed clean" only via the silent drop. All were real grammar gaps, now fixed:

- bare FROM VALUES (r1), (r2) row source (14 files)
- multi-line single-quoted strings (Snowflake-legal; lexer was tearing them at the first newline; one-line rule removed everywhere)
- file:// no longer opens a // line comment (:// guard), so PUT file:// @stage/; no longer swallows the following statement
- INNER DIRECTED JOIN / NATURAL INNER JOIN / LEFT-RIGHT-FULL [OUTER] DIRECTED JOIN keyword orders
- SHOW ... IN FAILOVER|REPLICATION GROUP <name> scopes
- DROP FUNCTION/PROCEDURE overload signatures f(int, VARCHAR)
- CREATE TABLE <name> IF NOT EXISTS (postfix placement)
- CREATE DATABASE ... FROM SHARE provider.share
- IDENTIFIER(<expr>) object names (USE WAREHOUSE / UNDROP TABLE / ...)
- SHOW GRANTS TO <class> ROLE <instance>!<role> (new tokBang lexing)
- ALTER VIEW comma-separated MODIFY COLUMN action lists and the DROP ROW ACCESS POLICY p1, ADD ROW ACCESS POLICY p2 ON (...) combo

The one remaining file (create-dynamic-table/example_12) is a docs typo (missing comma) that the strict check now CORRECTLY rejects; it is skip-listed as MALFORMED.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
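
The strict rule described in the commit message (after a statement, the next token must be `;` or end-of-segment, otherwise report the stray token and resume at the next boundary) can be sketched roughly as follows. This is only an illustration under stated assumptions: `Token`, `Diag`, and `checkTrailing` are hypothetical names, not the parser's actual types or functions, and the real recovery logic is likely more involved.

```go
package main

import "fmt"

// Token is a hypothetical, simplified lexer token (not the real parser type).
type Token struct {
	Text      string
	Line, Col int
}

// Diag is a hypothetical diagnostic carrying the message and position.
type Diag struct {
	Msg       string
	Line, Col int
}

// checkTrailing is called with pos pointing just past a parsed statement.
// It returns the index where the next statement starts and, if the token
// at pos is neither ";" nor end-of-segment, a diagnostic at that token.
func checkTrailing(toks []Token, pos int) (int, *Diag) {
	if pos >= len(toks) {
		return pos, nil // end of segment: statement fully consumed
	}
	if toks[pos].Text == ";" {
		return pos + 1, nil // clean statement terminator
	}
	stray := toks[pos]
	d := &Diag{
		Msg:  "syntax error at or near " + stray.Text,
		Line: stray.Line,
		Col:  stray.Col,
	}
	// Recover: skip to the next statement boundary so that later
	// statements are still checked and reported individually.
	for pos < len(toks) && toks[pos].Text != ";" {
		pos++
	}
	if pos < len(toks) {
		pos++ // consume the ";"
	}
	return pos, d
}

func main() {
	// Tokens for "SELECT 1 2 3; SELECT 4"; assume the statement parser
	// accepted "SELECT 1" and stopped at index 2.
	toks := []Token{
		{"SELECT", 1, 1}, {"1", 1, 8}, {"2", 1, 10}, {"3", 1, 12}, {";", 1, 13},
		{"SELECT", 1, 15}, {"4", 1, 22},
	}
	next, d := checkTrailing(toks, 2)
	fmt.Println(next, d.Msg, d.Line, d.Col)
}
```

A best-effort caller in this sketch would simply skip `checkTrailing`, which mirrors the prefix tolerance that ParseBestEffort is said to keep for completion and partial-input use.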
Strict Parse now rejects unconsumed trailing tokens (`SELECT * FFROM users`, `SELECT 1 2 3` → syntax error at the stray token; per-statement recovery; new ParseStrict; ParseBestEffort tolerance unchanged for completion). diagnostics.Analyze uses the strict path, as requested by the bytebase #20567 cloud review.

Running the corpus in strict mode exposed 26 files relying on the silent drop. The 11 missing constructs were implemented rather than exempted: bare FROM VALUES rows; multi-line single-quoted strings (parser lexer one-line rule removed, now agrees with Split); `file://` vs `//`-comment guard; DIRECTED/NATURAL-INNER JOIN orders; SHOW ... IN FAILOVER|REPLICATION GROUP; DROP FUNCTION/PROCEDURE overload signatures; postfix IF NOT EXISTS; CREATE DATABASE FROM SHARE; IDENTIFIER($var) object names; SHOW GRANTS TO class!role (new tokBang); ALTER VIEW column-action lists. 1 corpus file is a genuine docs typo → MALFORMED skip.

Corpus: 650 clean strict + 7 categorized skips; full suite green. Verified on fresh checkout of 04040d7.

🤖 Generated with Claude Code
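
To illustrate the `file://` vs `//`-comment guard, here is a minimal sketch of a comment stripper that refuses to open a line comment when `//` directly follows a colon. It assumes the guard is simply "previous byte is `:`"; the function name `stripLineComments` and the sample path are hypothetical, and the sketch ignores string literals and `--` comments, which a real lexer would have to handle.

```go
package main

import (
	"fmt"
	"strings"
)

// stripLineComments removes "//" line comments from src, except when the
// "//" is immediately preceded by ':' (as in "file://"), in which case the
// two slashes are kept as ordinary text. Hypothetical helper for illustration.
func stripLineComments(src string) string {
	var out strings.Builder
	for i := 0; i < len(src); {
		if strings.HasPrefix(src[i:], "//") {
			if i > 0 && src[i-1] == ':' {
				// "://" guard: part of a URI scheme, not a comment.
				// Consume both slashes so a third '/' cannot pair up
				// with the second one and open a comment.
				out.WriteString("//")
				i += 2
				continue
			}
			// Real comment: drop everything up to (not including) the newline.
			for i < len(src) && src[i] != '\n' {
				i++
			}
			continue
		}
		out.WriteByte(src[i])
		i++
	}
	return out.String()
}

func main() {
	src := "PUT file:///tmp/a.csv @stage;\nSELECT 1; // note\nSELECT 2;"
	fmt.Printf("%q\n", stripLineComments(src))
}
```

Without the guard, everything after `file:` on the first line would be treated as a comment, which is how the `PUT` statement used to swallow the statement that followed it.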