@kyle-cheung
Contributor

When transpiling a JSON extraction from Snowflake to DuckDB, SQLGlot uses the -> operator. In DuckDB, -> returns a JSON object. When comparing this result to a standard VARCHAR, DATE, or INT, DuckDB throws a Conversion Error: Malformed JSON.

D CREATE OR REPLACE TABLE test AS SELECT JSON('{"test_date": "2025-10-01"}') AS test_date;
D SELECT
    (
      test_date -> '$.test_date'
    ) = '2025-12-01'
  FROM test
  ;
Conversion Error:
Malformed JSON at byte 4 of input: unexpected content after document.  Input: "2025-12-01"

LINE 4:   ) = '2025-12-01'
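
The byte offset in the error is consistent with DuckDB trying to read the VARCHAR literal as a JSON document. As a rough analogy only (this uses Python's standard `json` module, not DuckDB's parser, and `first_invalid_offset` is a hypothetical helper), the same literal fails at the same position: `2025` parses as a number and `-12-01` is left over.

```python
import json
from typing import Optional


def first_invalid_offset(text: str) -> Optional[int]:
    """Return the offset where JSON parsing fails, or None if text is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.pos
    return None


# The bare VARCHAR literal from the failing comparison is not a JSON document.
print(first_invalid_offset("2025-12-01"))

# A JSON string carries its own double quotes, so this one parses cleanly.
print(first_invalid_offset('"2025-12-01"'))
```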

However, this works as expected in Snowflake:

CREATE OR REPLACE TEMPORARY TABLE test_data AS 
SELECT PARSE_JSON('{"test_date": "2025-12-01"}') AS test;

SELECT test:test_date = '2025-12-01'
FROM test_data;

To resolve this, we need to transpile to the ->> operator, which returns a VARCHAR and makes the comparison behave as it does in Snowflake:

D SELECT
    (
      test_date ->> '$.test_date'
    ) = '2025-12-01'
  FROM test
  ;
┌────────────────────────────────────────────────┐
│ ((test_date ->> '$.test_date') = '2025-12-01') │
│                    boolean                     │
├────────────────────────────────────────────────┤
│ false                                          │
└────────────────────────────────────────────────┘

In DuckDB, ->> still allows chaining:

CREATE OR REPLACE TABLE test_json (data JSON);
INSERT INTO test_json VALUES ('{"outer": {"inner": "target_value"}}');
SELECT * FROM test_json;

D SELECT data->>'outer'->>'inner' FROM test_json;
┌────────────────────────────────────┐
│ (("data" ->> 'outer') ->> 'inner') │
│              varchar               │
├────────────────────────────────────┤
│ target_value                       │
└────────────────────────────────────┘
D SELECT data->'outer'->>'inner' FROM test_json;
┌───────────────────────────────────┐
│ (("data" -> 'outer') ->> 'inner') │
│              varchar              │
├───────────────────────────────────┤
│ target_value                      │
└───────────────────────────────────┘

Resolves #6661

Collaborator

@VaggelisD VaggelisD left a comment

Hey @kyle-cheung, I appreciate the PR, but I'm hesitant about this change.

Both Snowflake & Databricks VARIANT accesses produce VARIANT values whereas DuckDB would now produce a VARCHAR:

snowflake> WITH tbl AS (SELECT PARSE_JSON('{"a": 1}') AS col) SELECT SYSTEM$TYPEOF(col:a) FROM tbl;
VARIANT[LOB]

databricks> WITH t AS (SELECT PARSE_JSON('{"a": 1}') AS col) SELECT typeof(col:a) FROM t;
VARIANT

duckdb> WITH tbl AS (SELECT '{"a": 1}'::JSON AS col) select typeof(col ->> '$.a') from tbl;
┌─────────────────────────┐
│ typeof((col ->> '$.a')) │
│         varchar         │
├─────────────────────────┤
│ VARCHAR                 │
└─────────────────────────┘

I'm hesitant for 2 reasons:

  1. Chaining may be allowed, but it can alter the data types of nested objects, e.g., this should probably be preserved as JSON instead of being turned into a VARCHAR:
duckdb> WITH tbl AS (SELECT '{"a": {"b": 1}}'::JSON AS col) select col ->> '$.a' from tbl;
┌─────────────────┐
│ (col ->> '$.a') │
│     varchar     │
├─────────────────┤
│ {"b":1}         │
└─────────────────┘
  2. Binary operations in Snowflake and DuckDB (for ->>) may work due to coercions, but this is not the case for Databricks:
databricks> WITH t AS (SELECT PARSE_JSON('{"a": 1}') AS col) SELECT col:a = 1 FROM t;

[DATATYPE_MISMATCH.BINARY_OP_DIFF_TYPES] Cannot resolve "(variant_get(col, $.a, 'VARIANT') = 1)" due to data type mismatch: the left and right operands of the binary operator have incompatible types ("VARIANT" and "INT"). SQLSTATE: 42K09; line 1, pos 56
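
To make the first concern concrete, here is a toy Python model of the two operators. It is only a sketch under stated assumptions: `Json`, `arrow`, and `double_arrow` are hypothetical names, lookups use plain keys rather than JSONPath, and nothing here is DuckDB or SQLGlot code. It shows that `->>` chaining still works because the text gets re-parsed, but a nested object comes back as plain text rather than as a JSON-typed value.

```python
import json


class Json(str):
    """Marker type standing in for DuckDB's JSON type (hypothetical)."""


def _compact(value) -> str:
    # Serialize without spaces, similar to how the console output above looks.
    return json.dumps(value, separators=(",", ":"))


def arrow(doc: str, key: str) -> Json:
    """Toy model of `->`: the result is always tagged as JSON."""
    return Json(_compact(json.loads(doc)[key]))


def double_arrow(doc: str, key: str) -> str:
    """Toy model of `->>`: the result is plain text; strings lose their quotes."""
    value = json.loads(doc)[key]
    return value if isinstance(value, str) else _compact(value)


nested = '{"a": {"b": 1}}'
print(type(arrow(nested, "a")).__name__, arrow(nested, "a"))
print(type(double_arrow(nested, "a")).__name__, double_arrow(nested, "a"))

# Chaining through `->>` still reaches the inner value because the text is re-parsed.
chained = '{"outer": {"inner": "target_value"}}'
print(double_arrow(double_arrow(chained, "outer"), "inner"))
```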

cc: @georgesittas curious for your thoughts

@kyle-cheung
Contributor Author

Thanks @georgesittas, I think I may have a workaround that I can follow up with in a separate PR. I'll close this and #6670.

Wrapping all JSON extracts in parentheses makes them resolve as expected.

CREATE OR REPLACE TABLE test AS SELECT JSON('{"test_bool" : false}') AS test_bool;

-- works
D SELECT test_bool FROM test WHERE test_bool -> '$.test_bool';
┌───────────┐
│ test_bool │
│   json    │
├───────────┤
│  0 rows   │
└───────────┘

-- will not work
D SELECT test_bool FROM test WHERE test_bool -> '$.test_bool' AND 1=1;
Binder Error:
No function matches the given name and argument types 'json_extract(JSON, BOOLEAN)'. You might need to add explicit type casts.
        Candidate functions:
        json_extract(VARCHAR, BIGINT) -> JSON
        json_extract(VARCHAR, VARCHAR) -> JSON
        json_extract(VARCHAR, VARCHAR[]) -> JSON[]
        json_extract(JSON, BIGINT) -> JSON
        json_extract(JSON, VARCHAR) -> JSON
        json_extract(JSON, VARCHAR[]) -> JSON[]


-- will not work
D SELECT test_bool FROM test WHERE NOT test_bool -> '$.test_bool';
Conversion Error:
Failed to cast value to numerical: {"test_bool":false}

LINE 1: SELECT test_bool FROM test WHERE NOT test_bool -> '$.test_bool';
                                             ^

-- works 
D SELECT test_bool FROM test WHERE NOT (test_bool -> '$.test_bool');
┌─────────────────────┐
│      test_bool      │
│        json         │
├─────────────────────┤
│ {"test_bool":false} │
└─────────────────────┘

-- works 
D SELECT test_bool FROM test WHERE NOT (test_bool -> '$.test_bool') AND 1=1;
┌─────────────────────┐
│      test_bool      │
│        json         │
├─────────────────────┤
│ {"test_bool":false} │
└─────────────────────┘

-- Try with another data type 
CREATE OR REPLACE TABLE test AS SELECT JSON('{"test_bool" : 1}') AS test_bool;

D SELECT test_bool FROM test WHERE test_bool -> '$.test_bool' = 1;
Binder Error:
No function matches the given name and argument types 'json_extract(JSON, BOOLEAN)'. You might need to add explicit type casts.
        Candidate functions:
        json_extract(VARCHAR, BIGINT) -> JSON
        json_extract(VARCHAR, VARCHAR) -> JSON
        json_extract(VARCHAR, VARCHAR[]) -> JSON[]
        json_extract(JSON, BIGINT) -> JSON
        json_extract(JSON, VARCHAR) -> JSON
        json_extract(JSON, VARCHAR[]) -> JSON[]

D SELECT test_bool -> '$.test_bool' FROM test WHERE (test_bool -> '$.test_bool') = 1;
┌──────────────────────────────┐
│ (test_bool -> '$.test_bool') │
│             json             │
├──────────────────────────────┤
│ 1                            │
└──────────────────────────────┘
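
The `json_extract(JSON, BOOLEAN)` errors above suggest that `->` binds more loosely than `AND` and `=`, so the right-hand side is grouped first. A minimal sketch of the workaround when building SQL strings by hand follows; `json_extract_sql` is a hypothetical helper, not part of SQLGlot, and it simply always emits the parentheses.

```python
def json_extract_sql(column: str, path: str) -> str:
    """Render a DuckDB `->` extraction wrapped in parentheses (hypothetical helper)."""
    # Double any single quotes so the path stays a valid SQL string literal.
    escaped = path.replace("'", "''")
    return f"({column} -> '{escaped}')"


# The parentheses keep NOT and AND from regrouping the extraction.
predicate = f"NOT {json_extract_sql('test_bool', '$.test_bool')} AND 1=1"
print(predicate)  # NOT (test_bool -> '$.test_bool') AND 1=1
```

Always emitting the parentheses is redundant when the extraction stands alone, but it avoids having to reason about the surrounding operators.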

@kyle-cheung kyle-cheung closed this Jan 7, 2026
Development

Successfully merging this pull request may close these issues.

Snowflake to DuckDB JSON Extraction transpiles to -> which returns JSON type

2 participants