Commit 0000e8d
feat(csharp/src/Drivers/Databricks): Optimize GetColumnsExtendedAsync via DESC TABLE EXTENDED (#2953)
# Optimize GetColumnsExtendedAsync in Databricks via `DESC TABLE EXTENDED`
## Motivation
Currently, `HiveServer2Statement.GetColumnsExtendedAsync` makes three
Thrift calls (`GetColumns`, `GetPrimaryKeys` and `GetCrossReference`) to
collect all the information and then joins the results into one result set.
Databricks now supports the SQL statement `DESC TABLE EXTENDED <table> AS JSON`,
which returns all of this information in one query. Using it saves two
round trips and improves performance.
### Description
Override `HiveServer2Statement.GetColumnsExtendedAsync` in
`DatabricksStatement` to execute a single SQL statement, `DESC TABLE EXTENDED
<table> AS JSON`, which returns all the column info required by
`GetColumnsExtendedAsync`, and then combine that info into the
result. Because `DESC TABLE EXTENDED <table> AS JSON` is only
available in Databricks Runtime 16.2 or above, the override checks the
`ServerProtocolVersion`. If the server does not meet the requirement, it
falls back to the base class implementation.
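
The override-with-fallback pattern described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `BaseStatement`, `FastStatement`, the string return type, and the constructor flag are hypothetical stand-ins for `HiveServer2Statement`, `DatabricksStatement`, the real result type, and the capability check.

```csharp
using System.Threading.Tasks;

// Hypothetical stand-in for the base class; not the driver's real type.
public class BaseStatement
{
    // Stands in for the existing three-call Thrift path.
    public virtual Task<string> GetColumnsExtendedAsync(string table) =>
        Task.FromResult("thrift:" + table);
}

// Hypothetical stand-in for the Databricks subclass.
public class FastStatement : BaseStatement
{
    // Assumed to be computed elsewhere from the server protocol version
    // and the connection option; here it is simply injected.
    private readonly bool _canUseDescTableExtended;

    public FastStatement(bool canUseDescTableExtended) =>
        _canUseDescTableExtended = canUseDescTableExtended;

    public override Task<string> GetColumnsExtendedAsync(string table)
    {
        if (!_canUseDescTableExtended)
        {
            // Server too old or feature disabled: keep the base behavior.
            return base.GetColumnsExtendedAsync(table);
        }

        // A real implementation would quote the identifier, execute this
        // statement, and map the JSON into the result set. The sketch only
        // returns the SQL text that would be executed.
        return Task.FromResult($"DESC TABLE EXTENDED {table} AS JSON");
    }
}
```

Keeping the decision in a single boolean means the subclass never has to know why the fast path is unavailable, only that it should defer to the base class.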
### Change
- Added `DescTableExtendedResult`, which represents the result of `DESC
  TABLE EXTENDED <table> AS JSON` (see the format
  [here](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-aux-describe-table#json-formatted-output)).
  It also
  - Parses the string-based `table_constraints` into structured primary
    key and foreign key constraints
  - Adds the calculated properties `DataType`, `ColumnSize` and `FullTypeName`,
    which are derived from the column type and type-specific properties
- Changed the access modifiers of some properties and methods in
  `HiveServer2Statement` to `protected` so that they can be used or overridden
  in the subclass `DatabricksStatement`
  - `PrimaryKeyFields`
  - `ForeignKeyFields`
  - `PrimaryKeyPrefix`
  - `ForeignKeyPrefix`
  - `GetColumnsExtendedAsync`
  - `CreateEmptyExtendedColumnsResult`
- Updated `DatabricksStatement` with the changes below
  - Added `GetColumnsExtendedAsync` in `DatabricksStatement`. It executes the
    query `DESC TABLE EXTENDED <table> AS JSON` to get all the required info
    and then joins it into the `QueryResult`
  - Moved the column metadata schema creation logic from `GetColumnsAsync`
    to a reusable method, `CreateColumnMetadataSchema`
  - Moved the column metadata data array creation logic from
    `GetColumnsAsync` to a reusable method, `CreateColumnMetadataEmptyArray`
- Added a Databricks connection parameter,
  `adbc.databricks.use_desc_table_extended`
- Added an internal calculated property, `CanUseDescTableExtended`, in
  `DatabricksConnection`. `DatabricksStatement` calls it to decide whether
  to use its override of `GetColumnsExtendedAsync`
- Improved the Databricks driver integration test
  `StatementTest:CanGetColumnsExtended`
  - Sets up the required table resources during the test instead of
    depending on manual setup before the test
  - Supports running multiple test cases from the inputs
  - Adds a deep result check to make sure all the properties of all the
    columns are set correctly
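
To illustrate what parsing a string-based constraint into a structured form can look like, here is a small sketch. It assumes a constraint clause shaped like ``PRIMARY KEY (`a`, `b`)``; the actual `table_constraints` format is defined by the Databricks documentation linked above and may differ. `ConstraintSketch` and `ParsePrimaryKeyColumns` are hypothetical names, not part of `DescTableExtendedResult`.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the driver's parser.
public static class ConstraintSketch
{
    // Assumed input shape: PRIMARY KEY (`a`, `b`)
    private static readonly Regex PrimaryKey =
        new Regex(@"PRIMARY KEY\s*\(([^)]*)\)", RegexOptions.IgnoreCase);

    // Returns the primary key column names in declaration order,
    // or an empty list when the text holds no PRIMARY KEY clause.
    public static List<string> ParsePrimaryKeyColumns(string constraint)
    {
        var columns = new List<string>();
        Match match = PrimaryKey.Match(constraint);
        if (!match.Success)
        {
            return columns;
        }

        // Split the parenthesized list and strip backtick quoting.
        // Column names that themselves contain commas or parentheses
        // are not handled by this simplified sketch.
        foreach (string part in match.Groups[1].Value.Split(','))
        {
            string name = part.Trim().Trim('`');
            if (name.Length > 0)
            {
                columns.Add(name);
            }
        }
        return columns;
    }
}
```

For example, ``ConstraintSketch.ParsePrimaryKeyColumns("PRIMARY KEY (`id`, `region`)")`` yields a list containing `id` and `region`. A foreign key clause would need the same treatment plus the referenced table and columns.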
### Testing
- Added the test class `DescTableExtendedResultTest` to cover all the
  deserialization and parsing cases for the raw result of `DESC TABLE
  EXTENDED <table> AS JSON`
- Tested all the `ColumnsExtended`-related test cases in
  `Databricks/StatementTest`

1 parent cde9e7b, commit 0000e8d
File tree (11 files changed, +2273 -104 lines changed)
- csharp
  - src/Drivers
    - Apache/Hive2
    - Databricks
      - Result
  - test/Drivers/Databricks
    - Resources
    - Result