@alexjo2144 alexjo2144 commented Oct 14, 2025

Description

Add MAP type support to the ClickHouse connector.
https://clickhouse.com/docs/sql-reference/data-types/map

ClickHouse maps can contain several entries with the same key, but when I tried that out, the client returned only one value per key.
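
One possible explanation for the single value, assuming the JDBC driver materializes a Map column as a `java.util.Map` (an assumption; nothing in this thread confirms it): a Java map holds one entry per key, so duplicate keys collapse on read. A minimal sketch with hypothetical names, not connector code:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; DuplicateKeyCollapse is not part of the connector.
final class DuplicateKeyCollapse
{
    private DuplicateKeyCollapse() {}

    // Copies key/value pairs into a java.util.Map.
    // A later pair with the same key overwrites the earlier one.
    static Map<String, Integer> toJavaMap(List<Map.Entry<String, Integer>> pairs)
    {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            result.put(pair.getKey(), pair.getValue());
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1),
                Map.entry("a", 2),
                Map.entry("b", 3));
        System.out.println(toJavaMap(pairs)); // prints {a=2, b=3}
    }
}
```

If that is what happens, the behavior would line up with Trino's own MAP semantics, where keys are unique.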

I'll add docs if this looks good to reviewers.

Additional context and related issues

#7103

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

Summary by Sourcery

Add MAP type support to the ClickHouse connector by introducing read and write mappings for MapType, refactoring column mapping logic, and adding relevant tests.

New Features:

  • Add support for ClickHouse MAP type in connector, including column mapping and write mapping.

Enhancements:

  • Consolidate column mapping logic into a dedicated toColumnMapping method.
  • Inject TypeOperators into ClickHouseClient to enable MapType creation.
  • Refine decimal and timestamp mapping to use optional decimalDigits and columnSize parameters.

Tests:

  • Enable SUPPORTS_MAP_TYPE for ClickHouse connector tests.
  • Add round-trip insert tests for MapType with integer and varchar value types.

@cla-bot cla-bot bot added the cla-signed label Oct 14, 2025

sourcery-ai bot commented Oct 14, 2025

Reviewer's Guide

This PR implements native ClickHouse MAP type support by extending the client’s type mapping and write mapping logic. It injects type operators, refactors toColumnMapping to route MAP column types through a new mapColumnMapping method that builds a Trino MapType from ClickHouse key/value definitions, and adds corresponding read/write functions. Connector behavior flags and tests are updated to cover MAP round-trips.
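
The recursive derivation described above can be sketched roughly as follows. This is a simplified illustration that works on type-name strings; the actual change operates on `ClickHouseColumn` and Trino `Type` objects, and `MapTypeSketch`, `toTrinoType`, and `topLevelComma` are hypothetical names. Only three primitive types are covered.

```java
import java.util.Optional;

// Hypothetical sketch: type names are plain strings here, not ClickHouseColumn or Trino Type objects.
final class MapTypeSketch
{
    private MapTypeSketch() {}

    // Returns a Trino type name for a ClickHouse type name, or empty when the type is unsupported.
    static Optional<String> toTrinoType(String clickHouseType)
    {
        String type = clickHouseType.trim();
        if (type.startsWith("Map(") && type.endsWith(")")) {
            String arguments = type.substring("Map(".length(), type.length() - 1);
            int comma = topLevelComma(arguments);
            if (comma < 0) {
                return Optional.empty();
            }
            // Key and value are mapped with the same method, so nested maps work.
            Optional<String> key = toTrinoType(arguments.substring(0, comma));
            Optional<String> value = toTrinoType(arguments.substring(comma + 1));
            if (!key.isPresent() || !value.isPresent()) {
                // An unsupported key or value makes the whole map unsupported.
                return Optional.empty();
            }
            return Optional.of("map(" + key.get() + ", " + value.get() + ")");
        }
        switch (type) {
            case "Int32":
                return Optional.of("integer");
            case "Int64":
                return Optional.of("bigint");
            case "Float64":
                return Optional.of("double");
            default:
                return Optional.empty();
        }
    }

    // Finds the first comma that is not nested inside parentheses, or -1 if there is none.
    private static int topLevelComma(String text)
    {
        int depth = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
            }
            else if (c == ')') {
                depth--;
            }
            else if (c == ',' && depth == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

For example, `MapTypeSketch.toTrinoType("Map(Int32, Map(Int64, Float64))")` yields `map(integer, map(bigint, double))`, while an unsupported value type anywhere in the nesting yields an empty result.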

Sequence diagram for MAP type read/write round-trip

sequenceDiagram
    participant Client
    participant ClickHouseClient
    participant ClickHouseDB
    Client->>ClickHouseClient: Write MAP column
    ClickHouseClient->>ClickHouseDB: setObject(index, mapValue)
    ClickHouseDB-->>ClickHouseClient: Returns MAP column data
    ClickHouseClient->>Client: Read MAP column (buildMapValue)

Entity relationship diagram for ClickHouse MAP type mapping

erDiagram
    CLICKHOUSE_COLUMN {
        string name
        ClickHouseDataType dataType
        int precision
        int scale
    }
    TRINO_MAP_TYPE {
        Type keyType
        Type valueType
    }
    COLUMN_MAPPING {
        Type type
        ObjectReadFunction readFunction
        ObjectWriteFunction writeFunction
    }
    CLICKHOUSE_COLUMN ||--o| TRINO_MAP_TYPE : "maps to"
    TRINO_MAP_TYPE ||--o| COLUMN_MAPPING : "used in"
    CLICKHOUSE_COLUMN ||--o| COLUMN_MAPPING : "key/value info"

Class diagram for updated ClickHouseClient MAP type support

classDiagram
    class ClickHouseClient {
        - TypeOperators typeOperators
        - Type uuidType
        - Type ipAddressType
        + Optional<ColumnMapping> toColumnMapping(ConnectorSession, ConnectorColumnHandle, JdbcTypeHandle)
        + Optional<ColumnMapping> toColumnMapping(ConnectorSession, String, int, Optional<Integer>, Optional<Integer>)
        + Optional<ColumnMapping> mapColumnMapping(ConnectorSession, ClickHouseColumn, ClickHouseColumn)
        + WriteMapping toWriteMapping(ConnectorSession, Type)
        + static ObjectWriteFunction mapWriteFunction()
        + static void writeValue(Type, BlockBuilder, Object)
    }
    class MapType {
        + Type getKeyType()
        + Type getValueType()
    }
    class ColumnMapping {
        + static ColumnMapping objectMapping(Type, ObjectReadFunction, ObjectWriteFunction)
    }
    class ObjectReadFunction {
        + static ObjectReadFunction of(Class, Function)
    }
    class ObjectWriteFunction {
        + static ObjectWriteFunction of(Class, BiConsumer)
    }
    ClickHouseClient --> MapType : uses
    ClickHouseClient --> ColumnMapping : returns
    ClickHouseClient --> ObjectReadFunction : uses
    ClickHouseClient --> ObjectWriteFunction : uses

File-Level Changes

Change: Inject typeOperators and refactor toColumnMapping for MAP support
Details:
  • Add typeOperators field via TypeManager injection
  • Refactor toColumnMapping into an overload handling typeName, jdbcType, decimalDigits, columnSize
  • Wrap unsupported types to VARCHAR when configured
Files: plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java

Change: Add MAP type handling in toColumnMapping
Details:
  • Detect ClickHouseDataType.Map and call new mapColumnMapping
  • Recursively derive key/value ColumnMappings
  • Construct Trino MapType with typeOperators for reading
Files: plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java

Change: Extend toWriteMapping to support MapType
Details:
  • Handle MapType in toWriteMapping by composing key/value WriteMappings
  • Define SQL data type string as Map(keyType, valueType)
  • Return objectMapping using mapWriteFunction
Files: plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java

Change: Implement mapColumnMapping, mapWriteFunction, and writeValue helpers
Details:
  • Create mapColumnMapping to read SqlMap into Trino map blocks
  • Provide mapWriteFunction to serialize Trino maps into JDBC Maps
  • Add writeValue to dispatch various primitive and slice types
Files: plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java

Change: Enable and test MAP type support in connector tests
Details:
  • Enable SUPPORTS_MAP_TYPE behavior flag
  • Add testInsertMap and round-trip assertions for map inserts and selects
  • Clean up merge conflicts around test string formatting
Files: plugin/trino-clickhouse/src/test/java/io/trino/plugin/clickhouse/TestClickHouseConnectorTest.java
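
The write-side step "define SQL data type string as Map(keyType, valueType)" can be illustrated with a small sketch. It assumes a toy type model (`SketchType`, hypothetical) in place of Trino's `Type` and `WriteMapping`, and covers only three primitives, so treat it as an outline of the composition, not the PR's code.

```java
// Hypothetical sketch of composing a ClickHouse data type string from key and value types.
final class WriteTypeSketch
{
    private WriteTypeSketch() {}

    // Toy stand-in for a Trino type: either a named primitive or a map of two types.
    static final class SketchType
    {
        final String name;
        final SketchType key;
        final SketchType value;

        private SketchType(String name, SketchType key, SketchType value)
        {
            this.name = name;
            this.key = key;
            this.value = value;
        }

        static SketchType primitive(String name)
        {
            return new SketchType(name, null, null);
        }

        static SketchType map(SketchType key, SketchType value)
        {
            return new SketchType("map", key, value);
        }

        boolean isMap()
        {
            return key != null;
        }
    }

    // Builds the ClickHouse type string; map types recurse into their key and value types.
    static String toClickHouseType(SketchType type)
    {
        if (type.isMap()) {
            return "Map(" + toClickHouseType(type.key) + ", " + toClickHouseType(type.value) + ")";
        }
        switch (type.name) {
            case "integer":
                return "Int32";
            case "bigint":
                return "Int64";
            case "double":
                return "Float64";
            default:
                throw new IllegalArgumentException("Unsupported type: " + type.name);
        }
    }
}
```

Because the map case calls the same method for its key and value, nested maps come out as nested `Map(...)` strings without extra handling.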

@alexjo2144 alexjo2144 requested a review from krvikash October 14, 2025 23:22
@github-actions github-actions bot added the clickhouse ClickHouse connector label Oct 14, 2025
@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:659-664` </location>
<code_context>
+        return columnMapping;
+    }
+
+    private Optional<ColumnMapping> toColumnMapping(
+            ConnectorSession session,
+            String typeName,
+            int jdbcType,
+            Optional<Integer> decimalDigits,
+            Optional<Integer> columnSize)
+    {
         ClickHouseVersion version = getClickHouseServerVersion(session);
</code_context>

<issue_to_address>
**issue:** Consider documenting the contract for 'toColumnMapping' regarding Optional parameters.

Since 'orElseThrow()' is called on these Optionals, missing values will cause exceptions. Please verify that all callers supply these parameters, or update the method to handle missing values appropriately.
</issue_to_address>
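
One way to make that contract explicit is to unwrap required Optionals through a helper that names the missing field. The sketch below is only an illustration of the idea; `OptionalContractSketch`, `required`, and `describeDecimal` are hypothetical names and do not appear in the PR.

```java
import java.util.Optional;

// Hypothetical sketch of documenting an Optional-parameter contract through error messages.
final class OptionalContractSketch
{
    private OptionalContractSketch() {}

    // Unwraps a value the caller must supply, naming the field and type when it is missing.
    static int required(Optional<Integer> value, String fieldName, String typeName)
    {
        return value.orElseThrow(() ->
                new IllegalArgumentException(fieldName + " is required for type " + typeName));
    }

    // Example consumer: a decimal needs both a precision (columnSize) and a scale (decimalDigits).
    static String describeDecimal(Optional<Integer> columnSize, Optional<Integer> decimalDigits)
    {
        int precision = required(columnSize, "columnSize", "Decimal");
        int scale = required(decimalDigits, "decimalDigits", "Decimal");
        return "decimal(" + precision + ", " + scale + ")";
    }
}
```

A bare `orElseThrow()` would raise a `NoSuchElementException` with no hint about which parameter was absent; the helper keeps the same fail-fast behavior but makes the failure easier to trace.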

### Comment 2
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:707-708` </location>
<code_context>
                 return Optional.of(varbinaryColumnMapping());
             case UUID:
                 return Optional.of(uuidColumnMapping());
+            case Map:
+                return mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
             default:
                 // no-op
</code_context>

<issue_to_address>
**suggestion:** Map type handling may not cover all edge cases for key/value types.

Since unsupported key or value types result in an empty return, this may cause silent failures. Please consider adding explicit error handling or logging to improve user feedback.

Suggested implementation:

```java
        switch (columnDataType) {
            // ... earlier cases unchanged ...
            case UUID:
                return Optional.of(uuidColumnMapping());
            case Map: {
                Optional<ColumnMapping> mapping = mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
                if (mapping.isEmpty()) {
                    // Airlift's Logger formats with String.format, so placeholders are %s rather than {}
                    log.warn("Unsupported key or value type for Map column: keyInfo=%s, valueInfo=%s", column.getKeyInfo(), column.getValueInfo());
                    // Alternatively, you could throw an exception here if you want to fail fast:
                    // throw new IllegalArgumentException("Unsupported key or value type for Map column: keyInfo=" + column.getKeyInfo() + ", valueInfo=" + column.getValueInfo());
                }
                return mapping;
            }
            default:
                // no-op
        }

```

- Ensure that a logger named `log` is available in the class. If not, add:
  `private static final Logger log = Logger.get(ClickHouseClient.class);`
- Decide whether you want to log a warning (as above) or throw an exception for unsupported types.
</issue_to_address>

### Comment 3
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:713` </location>
<code_context>
         }

-        switch (typeHandle.jdbcType()) {
+        switch (jdbcType) {
             case Types.TINYINT:
                 return Optional.of(tinyintColumnMapping());
</code_context>

<issue_to_address>
**issue (bug_risk):** Switching on 'jdbcType' instead of 'typeHandle.jdbcType()' may introduce subtle bugs.

Please verify that 'jdbcType' and 'typeHandle.jdbcType()' will always have matching values to prevent incorrect type mapping.
</issue_to_address>
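
If the handle-based method only unpacks the handle and forwards its fields, the two values match by construction. A rough sketch of that shape, using hypothetical stand-ins (`TypeHandleSketch` and string results instead of Trino's `JdbcTypeHandle` and `ColumnMapping`); whether the PR is structured this way is an assumption:

```java
import java.sql.Types;
import java.util.Optional;

// Hypothetical stand-ins; the real JdbcTypeHandle and ColumnMapping live in Trino's JDBC plugin.
final class DelegatingOverloadSketch
{
    private DelegatingOverloadSketch() {}

    // Minimal stand-in for a JDBC type handle.
    static final class TypeHandleSketch
    {
        final int jdbcType;
        final String typeName;

        TypeHandleSketch(int jdbcType, String typeName)
        {
            this.jdbcType = jdbcType;
            this.typeName = typeName;
        }
    }

    // Entry point for handle-based callers: it only unpacks the handle and forwards.
    static Optional<String> toColumnMapping(TypeHandleSketch handle)
    {
        return toColumnMapping(handle.typeName, handle.jdbcType);
    }

    // Shared logic that switches on the plain jdbcType value (typeName is unused in this sketch).
    static Optional<String> toColumnMapping(String typeName, int jdbcType)
    {
        switch (jdbcType) {
            case Types.TINYINT:
                return Optional.of("tinyint");
            case Types.INTEGER:
                return Optional.of("integer");
            default:
                return Optional.empty();
        }
    }
}
```

With this layout, a mismatch is only possible if some other caller passes a `jdbcType` that disagrees with its `typeName`, which narrows what needs to be verified.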

### Comment 4
<location> `plugin/trino-clickhouse/src/test/java/io/trino/plugin/clickhouse/TestClickHouseConnectorTest.java:668-675` </location>
<code_context>
         }
     }

+    @Test
+    @Override
+    public void testInsertMap()
+    {
+        // TODO: Add more types here
</code_context>

<issue_to_address>
**suggestion (testing):** Add more comprehensive MAP type tests, including edge cases.

Please add tests for empty maps, multiple key-value pairs, null values, complex key/value types, and error conditions like duplicate keys or unsupported types to improve coverage.

```suggestion
    @Test
    @Override
    public void testInsertMap()
    {
        // Basic types
        testMapRoundTrip("INTEGER", "2");
        testMapRoundTrip("VARCHAR", "CAST('foobar' AS VARCHAR)");

        // Empty map
        try (TestTable table = newTrinoTable("test_insert_empty_map_", "(col map(INTEGER, INTEGER))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[], ARRAY[])", 1);
            assertThat(query("SELECT cardinality(col) FROM " + table.getName()))
                    .matches("VALUES 0");
        }

        // Multiple key-value pairs
        try (TestTable table = newTrinoTable("test_insert_multi_map_", "(col map(INTEGER, VARCHAR))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2,3], ARRAY['a','b','c'])", 1);
            assertThat(query("SELECT col[1], col[2], col[3] FROM " + table.getName()))
                    .matches("VALUES ('a', 'b', 'c')");
        }

        // Null values
        try (TestTable table = newTrinoTable("test_insert_null_value_map_", "(col map(INTEGER, VARCHAR))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2], ARRAY['x', NULL])", 1);
            assertThat(query("SELECT col[1], col[2] FROM " + table.getName()))
                    .matches("VALUES ('x', NULL)");
        }

        // Complex key/value types (e.g., ARRAY as value)
        try (TestTable table = newTrinoTable("test_insert_complex_map_", "(col map(INTEGER, ARRAY(INTEGER)))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2], ARRAY[ARRAY[10,20], ARRAY[30]])", 1);
            assertThat(query("SELECT col[1], col[2] FROM " + table.getName()))
                    .matches("VALUES (ARRAY[10,20], ARRAY[30])");
        }

        // Error: duplicate keys
        try (TestTable table = newTrinoTable("test_insert_duplicate_key_map_", "(col map(INTEGER, INTEGER))")) {
            assertQueryFails(
                    "INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,1], ARRAY[10,20])",
                    ".*Duplicate map keys.*are not allowed.*");
        }

        // Error: unsupported key type (e.g., map as key)
        try (TestTable table = newTrinoTable("test_insert_unsupported_key_map_", "(col map(map(INTEGER, INTEGER), INTEGER))")) {
            assertQueryFails(
                    "INSERT INTO " + table.getName() + " VALUES map(ARRAY[map(ARRAY[1], ARRAY[2])], ARRAY[10])",
                    ".*map type is not supported as map key.*");
        }
    }
```
</issue_to_address>


@alexjo-dd alexjo-dd force-pushed the alex.jo/map-type-clickhouse branch from 3cab91d to b520b7c Compare October 14, 2025 23:32
.matches("VALUES " + value);
}
}

Member

Could you please update BaseClickHouseTypeMapping as well? We use the *TypeMapping classes to test type mapping in JDBC-based connectors.

Yep! Added tests for all the primitives at least. All of them work except Timestamp w/ TZ. Not exactly sure why, but the precision for a timestamp in a Map comes back as 29 when it should be 0.

Going to add some tests for nested maps tomorrow.

return Optional.of(varbinaryColumnMapping());
case UUID:
return Optional.of(uuidColumnMapping());
case Map:
Member

Could you update the Type mapping section in clickhouse.md?


private void testMapRoundTrip(String valueType, String value)
{
try (TestTable table = newTrinoTable("test_insert_map_", "(col map(INTEGER, %s) NOT NULL)".formatted(valueType))) {
Member Author

This fails without the NOT NULL right now. Need to figure that out, or at least improve the error message.
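
A plausible cause, not confirmed in this thread: ClickHouse does not allow `Nullable(...)` around composite types such as Map, Array, and Tuple, so a nullable map column cannot be declared if the connector wraps nullable columns in `Nullable(...)`. A sketch of a check that would at least produce a clearer message; `NullableColumnSketch`, `canBeNullable`, and `columnType` are hypothetical names:

```java
// Hypothetical sketch; works on ClickHouse type-name strings and is not the connector's DDL code.
final class NullableColumnSketch
{
    private NullableColumnSketch() {}

    // ClickHouse rejects Nullable(...) around the composite types Map, Array, and Tuple.
    static boolean canBeNullable(String clickHouseType)
    {
        return !(clickHouseType.startsWith("Map(")
                || clickHouseType.startsWith("Array(")
                || clickHouseType.startsWith("Tuple("));
    }

    // Returns the column type to use in DDL, failing early with a descriptive message.
    static String columnType(String clickHouseType, boolean nullable)
    {
        if (!nullable) {
            return clickHouseType;
        }
        if (!canBeNullable(clickHouseType)) {
            throw new IllegalArgumentException(
                    "ClickHouse type " + clickHouseType + " cannot be nullable; declare the column NOT NULL");
        }
        return "Nullable(" + clickHouseType + ")";
    }
}
```

Values inside a map can still be nullable in ClickHouse (for example `Map(Int32, Nullable(String))`); the restriction applies to wrapping the map itself.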

@github-actions github-actions bot added the docs label Oct 15, 2025
@alexjo-dd alexjo-dd force-pushed the alex.jo/map-type-clickhouse branch from 17fa9c9 to db32831 Compare October 16, 2025 17:45
@alexjo-dd alexjo-dd force-pushed the alex.jo/map-type-clickhouse branch from db32831 to 889c0a6 Compare October 16, 2025 18:48