-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Add MAP type support to the ClickHouse connector #26957
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR implements native ClickHouse MAP type support by extending the client’s type mapping and write mapping logic. It injects type operators, refactors toColumnMapping to route MAP column types through a new mapColumnMapping method that builds a Trino MapType from ClickHouse key/value definitions, and adds corresponding read/write functions. Connector behavior flags and tests are updated to cover MAP round-trips. Sequence diagram for MAP type read/write round-tripsequenceDiagram
participant Client
participant ClickHouseClient
participant ClickHouseDB
Client->>ClickHouseClient: Write MAP column
ClickHouseClient->>ClickHouseDB: setObject(index, mapValue)
ClickHouseDB-->>ClickHouseClient: Returns MAP column data
ClickHouseClient->>Client: Read MAP column (buildMapValue)
Entity relationship diagram for ClickHouse MAP type mappingerDiagram
CLICKHOUSE_COLUMN {
string name
ClickHouseDataType dataType
int precision
int scale
}
TRINO_MAP_TYPE {
Type keyType
Type valueType
}
COLUMN_MAPPING {
Type type
ObjectReadFunction readFunction
ObjectWriteFunction writeFunction
}
CLICKHOUSE_COLUMN ||--o| TRINO_MAP_TYPE : "maps to"
TRINO_MAP_TYPE ||--o| COLUMN_MAPPING : "used in"
CLICKHOUSE_COLUMN ||--o| COLUMN_MAPPING : "key/value info"
Class diagram for updated ClickHouseClient MAP type supportclassDiagram
class ClickHouseClient {
- TypeOperators typeOperators
- Type uuidType
- Type ipAddressType
+ Optional<ColumnMapping> toColumnMapping(ConnectorSession, ConnectorColumnHandle, JdbcTypeHandle)
+ Optional<ColumnMapping> toColumnMapping(ConnectorSession, String, int, Optional<Integer>, Optional<Integer>)
+ Optional<ColumnMapping> mapColumnMapping(ConnectorSession, ClickHouseColumn, ClickHouseColumn)
+ WriteMapping toWriteMapping(ConnectorSession, Type)
+ static ObjectWriteFunction mapWriteFunction()
+ static void writeValue(Type, BlockBuilder, Object)
}
class MapType {
+ Type getKeyType()
+ Type getValueType()
}
class ColumnMapping {
+ static ColumnMapping objectMapping(Type, ObjectReadFunction, ObjectWriteFunction)
}
class ObjectReadFunction {
+ static ObjectReadFunction of(Class, Function)
}
class ObjectWriteFunction {
+ static ObjectWriteFunction of(Class, BiConsumer)
}
ClickHouseClient --> MapType : uses
ClickHouseClient --> ColumnMapping : returns
ClickHouseClient --> ObjectReadFunction : uses
ClickHouseClient --> ObjectWriteFunction : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:659-664` </location>
<code_context>
+ return columnMapping;
+ }
+
+ private Optional<ColumnMapping> toColumnMapping(
+ ConnectorSession session,
+ String typeName,
+ int jdbcType,
+ Optional<Integer> decimalDigits,
+ Optional<Integer> columnSize)
+ {
ClickHouseVersion version = getClickHouseServerVersion(session);
</code_context>
<issue_to_address>
**issue:** Consider documenting the contract for 'toColumnMapping' regarding Optional parameters.
Since 'orElseThrow()' is called on these Optionals, missing values will cause exceptions. Please verify that all callers supply these parameters, or update the method to handle missing values appropriately.
</issue_to_address>
### Comment 2
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:707-708` </location>
<code_context>
return Optional.of(varbinaryColumnMapping());
case UUID:
return Optional.of(uuidColumnMapping());
+ case Map:
+ return mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
default:
// no-op
</code_context>
<issue_to_address>
**suggestion:** Map type handling may not cover all edge cases for key/value types.
Since unsupported key or value types result in an empty return, this may cause silent failures. Please consider adding explicit error handling or logging to improve user feedback.
Suggested implementation:
```java
switch (columnDataType) {
case Bool:
return Optional.of(varbinaryColumnMapping());
case UUID:
return Optional.of(uuidColumnMapping());
case Map: {
Optional<ClickHouseColumnMapping> mapping = mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
if (!mapping.isPresent()) {
log.warn("Unsupported key or value type for Map column: keyInfo={}, valueInfo={}", column.getKeyInfo(), column.getValueInfo());
// Alternatively, you could throw an exception here if you want to fail fast:
// throw new IllegalArgumentException("Unsupported key or value type for Map column: keyInfo=" + column.getKeyInfo() + ", valueInfo=" + column.getValueInfo());
}
return mapping;
}
default:
// no-op
}
```
- Ensure that a logger named `log` is available in the class. If not, add:
`private static final Logger log = Logger.get(ClickHouseClient.class);`
- Decide whether you want to log a warning (as above) or throw an exception for unsupported types.
</issue_to_address>
### Comment 3
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:713` </location>
<code_context>
}
- switch (typeHandle.jdbcType()) {
+ switch (jdbcType) {
case Types.TINYINT:
return Optional.of(tinyintColumnMapping());
</code_context>
<issue_to_address>
**issue (bug_risk):** Switching on 'jdbcType' instead of 'typeHandle.jdbcType()' may introduce subtle bugs.
Please verify that 'jdbcType' and 'typeHandle.jdbcType()' will always have matching values to prevent incorrect type mapping.
</issue_to_address>
### Comment 4
<location> `plugin/trino-clickhouse/src/test/java/io/trino/plugin/clickhouse/TestClickHouseConnectorTest.java:668-675` </location>
<code_context>
}
}
+ @Test
+ @Override
+ public void testInsertMap()
+ {
+ // TODO: Add more types here
</code_context>
<issue_to_address>
**suggestion (testing):** Add more comprehensive MAP type tests, including edge cases.
Please add tests for empty maps, multiple key-value pairs, null values, complex key/value types, and error conditions like duplicate keys or unsupported types to improve coverage.
```suggestion
@Test
@Override
public void testInsertMap()
{
// Basic types
testMapRoundTrip("INTEGER", "2");
testMapRoundTrip("VARCHAR", "CAST('foobar' AS VARCHAR)");
// Empty map
try (TestTable table = newTrinoTable("test_insert_empty_map_", "(col map(INTEGER, INTEGER))")) {
assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[], ARRAY[])", 1);
assertThat(query("SELECT cardinality(col) FROM " + table.getName()))
.matches("VALUES 0");
}
// Multiple key-value pairs
try (TestTable table = newTrinoTable("test_insert_multi_map_", "(col map(INTEGER, VARCHAR))")) {
assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2,3], ARRAY['a','b','c'])", 1);
assertThat(query("SELECT col[1], col[2], col[3] FROM " + table.getName()))
.matches("VALUES ('a', 'b', 'c')");
}
// Null values
try (TestTable table = newTrinoTable("test_insert_null_value_map_", "(col map(INTEGER, VARCHAR))")) {
assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2], ARRAY['x', NULL])", 1);
assertThat(query("SELECT col[1], col[2] FROM " + table.getName()))
.matches("VALUES ('x', NULL)");
}
// Complex key/value types (e.g., ARRAY as value)
try (TestTable table = newTrinoTable("test_insert_complex_map_", "(col map(INTEGER, ARRAY(INTEGER)))")) {
assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2], ARRAY[ARRAY[10,20], ARRAY[30]])", 1);
assertThat(query("SELECT col[1], col[2] FROM " + table.getName()))
.matches("VALUES (ARRAY[10,20], ARRAY[30])");
}
// Error: duplicate keys
try (TestTable table = newTrinoTable("test_insert_duplicate_key_map_", "(col map(INTEGER, INTEGER))")) {
assertQueryFails(
"INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,1], ARRAY[10,20])",
"Duplicate keys are not allowed in map");
}
// Error: unsupported key type (e.g., map as key)
try (TestTable table = newTrinoTable("test_insert_unsupported_key_map_", "(col map(map(INTEGER, INTEGER), INTEGER))")) {
assertQueryFails(
"INSERT INTO " + table.getName() + " VALUES map(ARRAY[map(ARRAY[1], ARRAY[2])], ARRAY[10])",
".*map type is not supported as map key.*");
}
}
```
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
private Optional<ColumnMapping> toColumnMapping( | ||
ConnectorSession session, | ||
String typeName, | ||
int jdbcType, | ||
Optional<Integer> decimalDigits, | ||
Optional<Integer> columnSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Consider documenting the contract for 'toColumnMapping' regarding Optional parameters.
Since 'orElseThrow()' is called on these Optionals, missing values will cause exceptions. Please verify that all callers supply these parameters, or update the method to handle missing values appropriately.
case Map: | ||
return mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Map type handling may not cover all edge cases for key/value types.
Since unsupported key or value types result in an empty return, this may cause silent failures. Please consider adding explicit error handling or logging to improve user feedback.
Suggested implementation:
switch (columnDataType) {
case Bool:
return Optional.of(varbinaryColumnMapping());
case UUID:
return Optional.of(uuidColumnMapping());
case Map: {
Optional<ClickHouseColumnMapping> mapping = mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
if (!mapping.isPresent()) {
log.warn("Unsupported key or value type for Map column: keyInfo={}, valueInfo={}", column.getKeyInfo(), column.getValueInfo());
// Alternatively, you could throw an exception here if you want to fail fast:
// throw new IllegalArgumentException("Unsupported key or value type for Map column: keyInfo=" + column.getKeyInfo() + ", valueInfo=" + column.getValueInfo());
}
return mapping;
}
default:
// no-op
}
- Ensure that a logger named
log
is available in the class. If not, add:
private static final Logger log = Logger.get(ClickHouseClient.class);
- Decide whether you want to log a warning (as above) or throw an exception for unsupported types.
} | ||
|
||
switch (typeHandle.jdbcType()) { | ||
switch (jdbcType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Switching on 'jdbcType' instead of 'typeHandle.jdbcType()' may introduce subtle bugs.
Please verify that 'jdbcType' and 'typeHandle.jdbcType()' will always have matching values to prevent incorrect type mapping.
3cab91d
to
b520b7c
Compare
.matches("VALUES " + value); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please update BaseClickHouseTypeMapping
as well? We use ***TypeMapping
class to test type mapping in JDBC-based connectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! Added tests for all the primitives at least. Got all of them working except Timestamp w/ TZ, not exactly sure why but the precision for a timestamp in a Map is coming back as 29 but it should be 0.
Going to add some tests for nested maps tomorrow.
return Optional.of(varbinaryColumnMapping()); | ||
case UUID: | ||
return Optional.of(uuidColumnMapping()); | ||
case Map: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update Type mapping
section in clickhouse.md?
|
||
private void testMapRoundTrip(String valueType, String value) | ||
{ | ||
try (TestTable table = newTrinoTable("test_insert_map_", "(col map(INTEGER, %s) NOT NULL)".formatted(valueType))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails without the NOT NULL
right now, gotta figure that out or at least improve the error message
17fa9c9
to
db32831
Compare
db32831
to
889c0a6
Compare
Description
Add
MAP
type support to the ClickHouse connector.https://clickhouse.com/docs/sql-reference/data-types/map
ClickHouse maps are multi-value, but when I tried that out I only got one value in the client.
I'll add docs if this looks good to reviewers.
Additional context and related issues
#7103
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:
Summary by Sourcery
Add MAP type support to the ClickHouse connector by introducing read and write mappings for MapType, refactoring column mapping logic, and adding relevant tests.
New Features:
Enhancements:
Tests: