Skip to content

Conversation

@mbwhite
Copy link
Contributor

@mbwhite mbwhite commented Mar 31, 2025

As a result of an (accidental) upgrade Calcite 1.39; conversions via Isthmus started to report that table names where unknown;

Isthmus has an external API that allows tables to be lookup via calls to a function - dynamic or 'lazy' lookup. A special subclass of the CalciteSimpleSchema was created to support this. This special subclass had to be in the same package as the CalciteSimpleSchema as it took advantage of being able to override a number of protected methods.

Calcite 1.39.0 introduced similar support for 'lazy' table lookups; that implementation, though changed the protected methods sufficiently Isthmus now failed (indeed rebuilding the code fails to compile).

This PR removes the special subclass, and allows Ishtmus to work as before; One complexity is how namespace tables are handled with the original Isthmus API, and Calcite new lookup API.

This does make the code a lot more complex; for future releases, a suggestion would be to deprecate this function interface and replace it with accepting the same Calcite lookup API.

  • SubstraitCalciteSchema is the replacement class; this is not an external but used internally; it's now using the Calcite lookup API
  • Other changes are just to accommodate the new class - it's fulling the same role as the previous one, so no other logic has really changed.
  • Added a couple of end-end tests to add additional validation - as this was how the issue was first seen; though even without that the Calcite change resulted in > 40% test case failures.

@mbwhite mbwhite force-pushed the calcite-1_39_0 branch 3 times, most recently from 2ce9d6f to 1894fb1 Compare March 31, 2025 08:48
@mbwhite mbwhite marked this pull request as ready for review March 31, 2025 09:18
public void substraitToSqlViaCalcite() throws IOException {

// create the protobuf substarit plan
byte[] planProtobuf = Resources.toByteArray(Resources.getResource("substrait_sql_003.plan"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other tests are using SubstraitBuilder to provide a Substrait plan for test cases which I think is more reliable than relying on a binary file in the tests resources which might have been encoded in an older version of the Substrait spec and which might break with future Substrait spec changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair point; will look at using that builder.

Signed-off-by: MBWhite <[email protected]>

Update isthmus/src/test/java/io/substrait/isthmus/api/TestIsthmusEndToEnd.java

Co-authored-by: Niels Pardon <[email protected]>

Update isthmus/src/test/java/io/substrait/isthmus/api/TestIsthmusEndToEnd.java

Co-authored-by: Niels Pardon <[email protected]>

Update isthmus/src/main/java/io/substrait/isthmus/SubstraitCalciteSchema.java

Co-authored-by: Niels Pardon <[email protected]>

Update isthmus/src/main/java/io/substrait/isthmus/SubstraitCalciteSchema.java

Co-authored-by: Niels Pardon <[email protected]>

Update isthmus/src/main/java/io/substrait/isthmus/SubstraitCalciteSchema.java

Co-authored-by: Niels Pardon <[email protected]>

Update isthmus/src/test/java/io/substrait/isthmus/api/TestIsthmusEndToEnd.java

Co-authored-by: Niels Pardon <[email protected]>

Update isthmus/src/main/java/io/substrait/isthmus/SubstraitCalciteSchema.java

Co-authored-by: Niels Pardon <[email protected]>
@vbarua
Copy link
Member

vbarua commented Mar 31, 2025

How urgent is the 1.39 upgrade for you? If deprecating / removing the TableLookupAPI simplifies this, I'd be open to doing that first. I can take a look at that this week as I have some other changes around SQL handling that might simplify this as well.

@mbwhite
Copy link
Contributor Author

mbwhite commented Mar 31, 2025

How urgent is the 1.39 upgrade for you? If deprecating / removing the TableLookupAPI simplifies this, I'd be open to doing that first. I can take a look at that this week as I have some other changes around SQL handling that might simplify this as well.

Not urgent - our 'version checker' wanted to bump the calcite version up, therefore this came up. No other reason to move to calcite version; just pre-empting a problem!

No problem with dropping the lookup first; in fact I was thinking of adding an API to accept the Calcite 1.39 loopup API instead.. The function of a dynamic schema is still present;

Interesting to see other SQL updates - there was one I was thinking of to make

  public static List<String> toSql(Plan plan, SqlDialect dialect) {
    SimpleExtension.ExtensionCollection extensions = SimpleExtension.loadDefaults();

    SubstraitToCalcite substrait2Calcite = new SubstraitToCalcite(
        extensions, new JavaTypeFactoryImpl(SubstraitTypeSystem.TYPE_SYSTEM));

    return plan.getRoots().stream().map(root -> {
      var calciteRel = substrait2Calcite.convert(root).project();
      return SubstraitToSql.toSql(calciteRel);
    }).collect(Collectors.toList());

  }

For reference it's this API that triggers the complexity

public RelNode substraitRelToCalciteRel(
      Rel relRoot, Function<List<String>, NamedStruct> tableLookup)

@vbarua
Copy link
Member

vbarua commented Apr 1, 2025

@mbwhite my PR/proposal to remove the troublesome API is in #363

@vbarua
Copy link
Member

vbarua commented Apr 3, 2025

#363 has been merged in. Hopefully it can simplify this update.

@mbwhite
Copy link
Contributor Author

mbwhite commented Apr 9, 2025

replaced with PR #380

@mbwhite mbwhite closed this Apr 9, 2025
@mbwhite mbwhite deleted the calcite-1_39_0 branch December 16, 2025 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants