
Improve error message when column's serialized size exceeds the smoosher's maximum #18845

Merged
GWphua merged 1 commit into master from column_name_smoosher_error on Dec 31, 2025
Conversation

@abhishekrb19 (Contributor) commented Dec 16, 2025

When the serialized size of a column exceeds 2 GB (the smoosher's maximum), for example due to large blobs in that column, the ingestion task fails with the following stack trace, which doesn't indicate the problematic column:

java.lang.RuntimeException: org.apache.druid.java.util.common.IAE: Asked to add buffers[2,171,458,617] larger than configured max[2,147,483,647]
	at org.apache.druid.segment.realtime.appenderator.StreamAppenderator.mergeAndPush(StreamAppenderator.java:1015) ~[druid-server-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.realtime.appenderator.StreamAppenderator.lambda$push$1(StreamAppenderator.java:826) ~[druid-server-32.0.1.jar:32.0.1]
	at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.doTransform(AbstractTransformFuture.java:252) ~[guava-32.0.1-jre.jar:?]
	at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.doTransform(AbstractTransformFuture.java:242) ~[guava-32.0.1-jre.jar:?]
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:123) [guava-32.0.1-jre.jar:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: org.apache.druid.java.util.common.IAE: Asked to add buffers[2,171,458,617] larger than configured max[2,147,483,647]
	at org.apache.druid.java.util.common.io.smoosh.FileSmoosher.addWithSmooshedWriter(FileSmoosher.java:176) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.makeColumn(IndexMergerV9.java:788) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:291) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:1359) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.multiphaseMerge(IndexMergerV9.java:1177) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:1119) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.realtime.appenderator.StreamAppenderator.mergeAndPush(StreamAppenderator.java:957) ~[druid-server-32.0.1.jar:32.0.1]

With this patch, the DruidException message includes the column name along with a few remediation suggestions as follows:

Serialized buffer size[10] for column[foo] exceeds the maximum[5]. Consider adjusting the tuningConfig - for example, reduce maxRowsPerSegment, or partition your data further.
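The improved check can be sketched roughly as follows. This is a minimal illustration, not Druid's actual IndexMergerV9/FileSmoosher code; the class and method names here are hypothetical, and only the message format mirrors the patch:

```java
// Minimal sketch (hypothetical names): a size check that names the offending
// column in its error message instead of only reporting the buffer sizes.
public class SmooshSizeCheck {
  /**
   * Throws if a column's serialized size exceeds the writer's maximum,
   * naming the column and suggesting remediation.
   */
  public static void checkSerializedSize(String columnName, long serializedSize, long maxSize) {
    if (serializedSize > maxSize) {
      throw new IllegalArgumentException(String.format(
          "Serialized buffer size[%d] for column[%s] exceeds the maximum[%d]. "
              + "Consider adjusting the tuningConfig - for example, reduce "
              + "maxRowsPerSegment, or partition your data further.",
          serializedSize, columnName, maxSize));
    }
  }

  public static void main(String[] args) {
    checkSerializedSize("foo", 4L, 5L); // within limit: no exception
    try {
      checkSerializedSize("foo", 10L, 5L);
      throw new AssertionError("expected IllegalArgumentException");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The key point is that the check runs per column, so the column name is in scope when the exception is built and can be included in the message.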
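One of the suggested remediations, reducing maxRowsPerSegment, lives in the ingestion spec's tuningConfig. A sketch for a native batch (index_parallel) task, with an illustrative value:

```
{
  "tuningConfig": {
    "type": "index_parallel",
    "maxRowsPerSegment": 1000000
  }
}
```

Lowering this value caps how many rows land in each segment, which in turn bounds how large any one column's serialized buffer can grow.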

This PR has:

  • been self-reviewed.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

- Include column name and a suggestion on how to remediate
@GWphua (Contributor) left a comment


LGTM. Thanks!

@GWphua GWphua merged commit ccfcb1c into master Dec 31, 2025
142 of 145 checks passed
@kgyrtkirk kgyrtkirk added this to the 36.0.0 milestone Jan 19, 2026

3 participants