@graup
Contributor

@graup graup commented Jan 23, 2025

Fixes #1673. This is a breaking change.

TypedString contained a String with no record of the quote style used. The parser built it with parse_literal_string, which only supports single and double quotes. In particular, it doesn't support BigQuery's triple-quoted strings, which caused the issue reported in #1673. It also doesn't round-trip properly, because it always formats the string with single quotes.

I think the cleanest fix is for TypedString to contain a Value instead, similar to IntroducedString and others. This gives us immediate support for other quote styles and fixes the formatting so that it round-trips.
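To illustrate why carrying the quote style fixes round-tripping, here is a minimal standalone sketch. The Value enum and TypedString struct below are simplified stand-ins written for this example, not the crate's actual definitions, and the sketch skips escaping of embedded quotes.

```rust
use std::fmt;

/// Simplified stand-in for a string literal that remembers its quote style.
/// (Assumption: modeled loosely on the variants discussed in this PR.)
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    SingleQuotedString(String),
    DoubleQuotedString(String),
    TripleSingleQuotedString(String),
    TripleDoubleQuotedString(String),
}

impl fmt::Display for Value {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each variant is written back with the quotes it was parsed with.
        // Escaping of embedded quotes is intentionally left out of this sketch.
        match self {
            Value::SingleQuotedString(s) => write!(f, "'{}'", s),
            Value::DoubleQuotedString(s) => write!(f, "\"{}\"", s),
            Value::TripleSingleQuotedString(s) => write!(f, "'''{}'''", s),
            Value::TripleDoubleQuotedString(s) => write!(f, "\"\"\"{}\"\"\"", s),
        }
    }
}

/// Hypothetical, flattened version of a typed string expression.
struct TypedString {
    data_type: &'static str,
    value: Value,
}

impl fmt::Display for TypedString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The type keyword is followed by the literal in its original quote style.
        write!(f, "{} {}", self.data_type, self.value)
    }
}
```

With a plain String field, the formatter has no choice but to pick one quote style. With the style stored in the variant, a BigQuery-style triple-quoted literal is written back out the way it came in.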

This is a breaking change but should be an easy fix in users' codebases, just (un)wrapping the value. Migration path:

  1. When constructing an AST node
Expr::TypedString {
    data_type: DataType::JSON,
--  value: r#"{"class" : {"students" : [{"name" : "Jane"}]}}"#.to_string()
++  value: Value::SingleQuotedString(
++      r#"{"class" : {"students" : [{"name" : "Jane"}]}}"#.to_string()
++  )
}
  2. When using AST parser results
if let Expr::TypedString { data_type, value } = expr {
--  let string_value: String = value;
++  let string_value: String = value.into_string().unwrap();
}

For convenience, I have added a method into_string -> Option<String> to Value to get the underlying string value.
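As a rough sketch of how such an accessor can behave, the example below uses a cut-down Value enum defined only for illustration. Which variants count as strings here is an assumption, and string_or_empty is a hypothetical helper showing one way to avoid unwrap() when a non-string value is possible.

```rust
/// Cut-down stand-in for the AST value type (illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(String),
    SingleQuotedString(String),
    TripleDoubleQuotedString(String),
    Null,
}

impl Value {
    /// Consumes the value and returns the inner text for string-like
    /// variants, or `None` for anything that is not a string.
    fn into_string(self) -> Option<String> {
        match self {
            Value::SingleQuotedString(s) | Value::TripleDoubleQuotedString(s) => Some(s),
            Value::Number(_) | Value::Null => None,
        }
    }
}

/// Hypothetical helper: falls back to an empty string instead of panicking
/// when the value is not string-like.
fn string_or_empty(value: Value) -> String {
    value.into_string().unwrap_or_default()
}
```

Returning an Option keeps the non-string case explicit, so callers can choose between unwrap(), a default, or proper error handling.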

@graup
Contributor Author

graup commented Jan 30, 2025

Thanks @iffyio, appreciate your suggestions. I've changed the method to into_string -> Option<String>. Also updated the PR description.

@PrettyWood

Thank you so much for working on that 🙏 We have PRQL/prql#5099 on the prql side. Very much appreciated.

Contributor

@iffyio iffyio left a comment

LGTM! Thanks @graup!
cc @alamb

@iffyio iffyio changed the title from "Make TypedString contain Value instead of String to support and preserve other quote styles" to "Make TypedString preserve quote style" Jan 31, 2025
@iffyio iffyio merged commit 447142c into apache:main Jan 31, 2025
9 checks passed
@graup graup deleted the typed-string-fix branch January 31, 2025 10:50
@alamb
Contributor

alamb commented Jan 31, 2025

Werd! The PR / code 🚂 keeps on running. Thanks again @iffyio for all you do to keep this repo moving forward

Vedin pushed a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025
Vedin added a commit to Embucket/datafusion-sqlparser-rs that referenced this pull request Feb 3, 2025
ayman-sigma pushed a commit to sigmacomputing/sqlparser-rs that referenced this pull request Apr 10, 2025
Development

Successfully merging this pull request may close these issues.

Fails to parse TypedString expressions in BigQuery with triple quoted strings