birschick-bq
Contributor

@birschick-bq birschick-bq commented Aug 8, 2025

Refactor the API to improve handling of requests and responses and to reduce the number of overloads.
Refactor the API to pass the IResponse to the Reader (IArrowArrayStream).

  • The Stream/Reader is now responsible for closing the operation.
  • The Statement is no longer responsible for keeping a singleton instance of the (most recent) response.
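
The ownership change in the two bullets above can be pictured with a minimal sketch. This is not the code from this PR: `SketchResponse`, `SketchReader`, `SketchStatement`, and the `closeOperation` callback are hypothetical stand-ins, and the `IResponse` shape shown here is an assumption made only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the driver's response type.
public interface IResponse
{
    string OperationId { get; }
}

public sealed class SketchResponse : IResponse
{
    public SketchResponse(string operationId) { OperationId = operationId; }
    public string OperationId { get; }
}

// The reader holds the response it was created from and closes the
// server-side operation when it is disposed.
public sealed class SketchReader : IDisposable
{
    private readonly IResponse _response;
    private readonly Action<IResponse> _closeOperation;
    private bool _disposed;

    public SketchReader(IResponse response, Action<IResponse> closeOperation)
    {
        _response = response ?? throw new ArgumentNullException(nameof(response));
        _closeOperation = closeOperation ?? throw new ArgumentNullException(nameof(closeOperation));
    }

    public IResponse Response => _response;

    public void Dispose()
    {
        if (_disposed) return; // a second Dispose call is a no-op
        _disposed = true;
        _closeOperation(_response);
    }
}

// The statement hands the response to a new reader and keeps no copy of it.
public sealed class SketchStatement
{
    public List<string> ClosedOperations { get; } = new List<string>();

    public SketchReader ExecuteQuery(string operationId)
    {
        var response = new SketchResponse(operationId);
        return new SketchReader(response, r => ClosedOperations.Add(r.OperationId));
    }
}
```

With this shape, a caller that wraps the reader in a `using` block closes the operation without the statement having to track the most recent response.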

Replaces #2797

@birschick-bq
Contributor Author

@CurtHagenlocher - Resolved the merge conflicts. Added a check to ensure the HiveServer2Statement doesn't get re-used as that would change the Response object and make it inconsistent for the QueryResult/Stream.

@jadewang-db Can you run the E2E tests against a Databricks system to confirm there are no regressions?

@birschick-bq birschick-bq marked this pull request as ready for review August 11, 2025 18:55
@github-actions github-actions bot added this to the ADBC Libraries 20 milestone Aug 11, 2025
@birschick-bq birschick-bq marked this pull request as draft August 11, 2025 21:22
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Thanks! I really like the improved state management here. Please consider a few small changes as described.

@@ -44,8 +48,19 @@ protected BaseDatabricksReader(IHiveServer2Statement statement, Schema schema, b

Contributor

small nit, I wonder if this logic should go in DatabricksCompositeReader

Contributor Author

@birschick-bq birschick-bq Aug 12, 2025

@toddmeng-db - When you say "this logic" are you referring to the CloseOperation call in the Dispose method?

Both BaseDatabricksReader and DatabricksCompositeReader inherit from TracingReader. So I've coded the CloseOperation call in both of their Dispose methods.

Am I missing something?

Contributor Author

@toddmeng-db - Could you run the E2E tests against a Databricks server to confirm I haven't caused any regressions? If you could also check that no operations remain open after the streams are disposed correctly, that would be great, too.

Contributor

@toddmeng-db toddmeng-db Aug 12, 2025

Oh, sorry. DatabricksCompositeReader will end up calling CloseOperation a second time during BaseDatabricksReader.Dispose() (through activeReader.Dispose()). Do we want to avoid this?

Contributor Author

Ah. Good catch. I didn't see that DatabricksCompositeReader contains a BaseDatabricksReader.
Thanks!

Contributor Author

Removed the duplicate call to CloseOperation in DatabricksCompositeReader. It should be in the BaseDatabricksReader so that inheritors get the benefit.

Contributor

@toddmeng-db toddmeng-db Aug 12, 2025

Actually, it might be better if it's in DatabricksCompositeReader? DatabricksCompositeReader doesn't inherit from BaseDatabricksReader (sorry, I know that's a bit confusing; it probably needs a rename), and it better represents the stream's lifecycle. It's possible that the BaseDatabricksReader (activeReader) owned by DatabricksCompositeReader is not yet initialized when DatabricksCompositeReader is initialized.

Currently, this is how they relate:
DatabricksCompositeReader is created first. If there are direct results, we initialize the BaseDatabricksReader right away. If there are no direct results, we wait for the first FetchResultsResponse and inspect the result to determine which BaseDatabricksReader to use.
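
The relationship described above can be sketched roughly as follows. This is only an illustration of the lazy selection, not the driver's code: `LazyCompositeReader`, `FetchedResult`, `ResultKind`, `InlineReader`, and `ExternalReader` are hypothetical names, and the real reader inspects Thrift response fields rather than an enum.

```csharp
using System;

// Hypothetical result kinds used only for this sketch.
public enum ResultKind { Inline, External }

public sealed class FetchedResult
{
    public FetchedResult(ResultKind kind) { Kind = kind; }
    public ResultKind Kind { get; }
}

// Stand-ins for the concrete readers the composite chooses between.
public abstract class InnerReader
{
    public abstract string Name { get; }
}

public sealed class InlineReader : InnerReader
{
    public override string Name => "inline";
}

public sealed class ExternalReader : InnerReader
{
    public override string Name => "external";
}

public sealed class LazyCompositeReader
{
    private readonly Func<FetchedResult> _fetchFirstResult;
    private InnerReader _activeReader; // stays null until the result kind is known

    // directResults may be null, meaning the server sent no direct results.
    public LazyCompositeReader(FetchedResult directResults, Func<FetchedResult> fetchFirstResult)
    {
        _fetchFirstResult = fetchFirstResult ?? throw new ArgumentNullException(nameof(fetchFirstResult));
        if (directResults != null)
        {
            _activeReader = Select(directResults);
        }
    }

    public bool IsInitialized => _activeReader != null;

    // Without direct results, the first fetch decides which reader to use.
    public InnerReader GetActiveReader()
    {
        if (_activeReader == null)
        {
            _activeReader = Select(_fetchFirstResult());
        }
        return _activeReader;
    }

    private static InnerReader Select(FetchedResult result)
    {
        return result.Kind == ResultKind.External
            ? (InnerReader)new ExternalReader()
            : new InlineReader();
    }
}
```

The point relevant to this thread is that the composite can be disposed while `_activeReader` is still null, so a close that lives only in the inner reader would never run in that case.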

Contributor Author

@toddmeng-db - Added a "stateful" close operation on the BaseDatabricksReader. Let me know if this is too complicated a solution.
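
One possible reading of a "stateful" close is a shared guard that lets both the composite and its inner reader request the close while only the first request reaches the server. The sketch below assumes that reading; `OperationCloseGuard`, `GuardedInnerReader`, and `GuardedCompositeReader` are hypothetical names, not the types added in this PR.

```csharp
using System;
using System.Threading;

// Remembers whether the operation has already been closed.
public sealed class OperationCloseGuard
{
    private readonly Action _close;
    private int _closed; // 0 = open, 1 = closed

    public OperationCloseGuard(Action close)
    {
        _close = close ?? throw new ArgumentNullException(nameof(close));
    }

    public bool IsClosed => Volatile.Read(ref _closed) == 1;

    // Returns true only for the call that actually performs the close.
    public bool TryClose()
    {
        if (Interlocked.Exchange(ref _closed, 1) == 1) return false;
        _close();
        return true;
    }
}

public sealed class GuardedInnerReader : IDisposable
{
    private readonly OperationCloseGuard _guard;

    public GuardedInnerReader(OperationCloseGuard guard) { _guard = guard; }

    public void Dispose() { _guard.TryClose(); }
}

public sealed class GuardedCompositeReader : IDisposable
{
    private readonly OperationCloseGuard _guard;
    private GuardedInnerReader _activeReader; // may still be null at Dispose time

    public GuardedCompositeReader(OperationCloseGuard guard) { _guard = guard; }

    public void Activate() { _activeReader = new GuardedInnerReader(_guard); }

    public void Dispose()
    {
        _activeReader?.Dispose(); // closes the operation if a reader was chosen
        _guard.TryClose();        // covers the case where none was chosen
    }
}
```

In this sketch the close runs once whether or not the inner reader was ever initialized, which covers both the duplicate-call concern and the uninitialized-reader concern raised earlier in the thread.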

Contributor

Tests look good to me

Contributor Author

Thanks for confirming!

Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Thanks!

@CurtHagenlocher CurtHagenlocher merged commit 9e1d1c2 into apache:main Aug 13, 2025
6 checks passed
@birschick-bq birschick-bq deleted the CommonThriftApi branch August 13, 2025 20:05