-
Notifications
You must be signed in to change notification settings - Fork 148
chore(csharp/src/Drivers/Apache): Cleanup HiveServer2-based code with shared Thrift request/response interfaces #3256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… shared Thrift request/response interfaces
… throw exception to disallow re-use of statement once it has a response from a previous query.
@CurtHagenlocher - Resolved the merge conflicts. Added a check to ensure the HiveServer2Statement doesn't get re-used as that would change the Response object and make it inconsistent for the QueryResult/Stream. @jadewang-db Can you run the E2E test against a Databrick system to confirm no regressions? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I really like the improved state management here. Please consider a few small changes as described.
csharp/src/Drivers/Databricks/Reader/DatabricksOperationStatusPoller.cs
Outdated
Show resolved
Hide resolved
@@ -44,8 +48,19 @@ protected BaseDatabricksReader(IHiveServer2Statement statement, Schema schema, b | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small nit, I wonder if this logic should go in DatabricksCompositeReader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toddmeng-db - When you say "this logic" are you referring to the CloseOperation
call in the Dispose
method?
Both BaseDatabricksReader
and DatabricksCompositeReader
inherit from TracingReader
. So I've coded the CloseOperation
call in both of their Dispose
methods.
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toddmeng-db - If you could run the E2E tests against a Databricks server to confirm I haven't caused any regressions. If you could check that there are no open operations after correct Dispose of streams - that would be great, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. DatabricksCompositeReader will end up calling duplicate CloseOperation during BaseDatabricksReader.Dispose() (through activeReader.dispose()), do we want to avoid this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Good catch. I didn't see the containment of BaseDatabricksReading
inside DatabricksCompositeReader
.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the duplicate call to CloseOperation
in DatabricksCompositeReader
. It should be in the BaseDatabricksReader so that inheritors get the benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it might be better if it's in DatabricksCompositeReader? DatabricksCompositeReader doesn't inherit from BaseDatabricksReader (sorry I know that's a bit confusing, probably needs a rename), and better represents the stream's lifecycle. It's possible the BaseDatabricksReader (activeReader) owned by DatabricksCompositeReader is not yet initialized when DatabricksCompositeReader is initialized.
Currently this is how they relate:
DatabricksCompositeReader is created first. If there is directresults, we initialize the BaseDatabricksReader. If there is no directresults, we wait for the first FetchResultsResponse and inspect the result to determine which BaseDatabricksReader we want to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toddmeng-db - Added a "stateful" close operation on the BaseDatabricksReader
. Let me know if this is too complicated a solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests look good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for confirming!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Refactor API to improve handling of request and responses to simplify number of overloads.
Refactor API to send the IResponse to the Reader (
IArrowArrayStream
).Replaces #2797