Skip to content

Add embedded test for object-storage-encoding format for nested field#18818

Merged
cecemei merged 19 commits intoapache:masterfrom
cecemei:nested-it
Dec 9, 2025
Merged

Add embedded test for object-storage-encoding format for nested field#18818
cecemei merged 19 commits intoapache:masterfrom
cecemei:nested-it

Conversation

@cecemei
Copy link
Contributor

@cecemei cecemei commented Dec 5, 2025

Description

Add embedded test for object-storage-encoding format for nested field.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the new test, @cecemei !
I have left some minor suggestions, even though the PR is still in draft.

@cecemei cecemei marked this pull request as ready for review December 8, 2025 19:26
<failsOnError>true</failsOnError>
<excludes>
*com/fasterxml/jackson/databind/*
*com/fasterxml/jackson/databind/*,**/NestedDataFormatsTest.java
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the new test excluded from checkstyle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parsing error

final String scanQuery = "select event, to_json_string(agent) as agent from %s where json_value(event, '$.type') = 'PercentClear' and json_value(agent, '$.os') = 'Android' order by __time asc limit 1";
final String scanQueryResultDefaultFormat = cluster.callApi().runSql(scanQuery, defaultFormat);
final String scanQueryResultNoneObjectStorage = cluster.callApi().runSql(scanQuery, dataSource);
// CHECKSTYLE: text blocks not supported in current Checkstyle version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consider updating the checkstyle version in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only later version of checkstyle support the """ style (java 15+), and had to exclude this file since the error thrown was parsing error. in the later version of checkstyle a bunch of other things failed as well, so prefer to make checkstyle a separate pr besides this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for the details.

I think you can also use // CHECKSTYLE.OFF to turn off checkstyle without having to exclude the entire file in pom. e.g. see StringUtils class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still complains about error, i think parsing was before style check

Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle (default-cli) on project druid-embedded-tests: An error has occurred in Checkstyle report generation.: Failed during checkstyle configuration: Exception was thrown while processing /Users/cecemei/codespace/bitmap/embedded-tests/src/test/java/org/apache/druid/testing/embedded/indexing/NestedDataFormatsTest.java: IllegalStateException occurred while parsing file /Users/cecemei/codespace/bitmap/embedded-tests/src/test/java/org/apache/druid/testing/embedded/indexing/NestedDataFormatsTest.java. /Users/cecemei/codespace/bitmap/embedded-tests/src/test/java/org/apache/druid/testing/embedded/indexing/NestedDataFormatsTest.java:118:5: unexpected token: Assertions -> [Help 1]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for checking. Let's just exclude it in the pom until we upgrade the checkstyle version in the follow up PR.

cecemei and others added 8 commits December 8, 2025 20:09
…/indexing/NestedDataFormatsTest.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
…/indexing/NestedDataFormatsTest.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
@cecemei
Copy link
Contributor Author

cecemei commented Dec 9, 2025

Thanks for adding the new test, @cecemei ! I have left some minor suggestions, even though the PR is still in draft.

Updated, please take another look, thx!

Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the updates, @cecemei !

cluster.callApi().waitForAllSegmentsToBeAvailable(dataSource, coordinator, broker);

// Test ingesting with skipping raw json smile format works, same row count, with ~20% storage saving
final String metadata = "select sum(num_rows), sum(size) from sys.segments where datasource = '%s' group by datasource";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the GROUP BY clause can be omitted.

Suggested change
final String metadata = "select sum(num_rows), sum(size) from sys.segments where datasource = '%s' group by datasource";
final String metadataSql = "select sum(num_rows), sum(size) from sys.segments where datasource = '%s'";

Assertions.assertEquals(queryResultDefaultFormat, queryResultNoneObjectStorage);

// Test reconstruct json column works, the ordering of the fields has changed, but all values are perserved.
final String scanQuery = "select event, to_json_string(agent) as agent from %s where json_value(event, '$.type') = 'PercentClear' and json_value(agent, '$.os') = 'Android' order by __time asc limit 1";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to break the SQL string into multiple lines.

@cecemei cecemei merged commit 8a2d2ed into apache:master Dec 9, 2025
55 checks passed
@jtuglu1
Copy link
Contributor

jtuglu1 commented Dec 10, 2025

Hey folks, looks like this updated to JDK 17 compiler. This was a change previously reverted here: #18759 as it breaks a few things.

@kgyrtkirk kgyrtkirk added this to the 36.0.0 milestone Jan 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments