Skip to content

Conversation

@kavirajk
Copy link
Contributor

@kavirajk kavirajk commented Oct 20, 2025

Changelog category (leave one):

  • Backward Incompatible Change

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Support exception tagging for HTTP results response to enable clients to parse exceptions more reliably. Resolves #75175. The setting http_write_exception_in_output_format is disabled by default for consistency across formats.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Details

This PR adds a consistent and reliable way to return exceptions that occur mid-stream during HTTP request serving. The behavior is enabled by default and adds an X-ClickHouse-Exception-Tag header to every response.

When an exception occurs, it will be written in the following form which makes it easier for clients to parse the exception message reliably:

HTTP/1.1 200 OK
X-ClickHouse-Exception-Tag: PU1FNUFH98

bodybodybodybody
__exception__PU1FNUFH98
Big bam accrued right while reading the data
45 PU1FNUFH98__exception__

Here 45 is the size of the exception message, and PU1FNUFH98 is the exception token.

Resolves #75175

@clickhouse-gh
Copy link

clickhouse-gh bot commented Oct 20, 2025

Workflow [PR], commit [848832b]

Summary:

job_name test_name status info comment
Stateless tests (arm_binary, sequential) failure
00167_read_bytes_from_fs FAIL cidb
Integration tests (amd_asan, old analyzer, 1/6) failure
test_storage_s3_queue/test_parallel_inserts.py::test_parallel_inserts_with_failures[1] FAIL cidb, flaky
Integration tests (amd_asan, old analyzer, 4/6) failure
test_replicated_database_recover_digest_mismatch/test.py::test_recover_digest_mismatch FAIL cidb
Integration tests (amd_binary, 2/5) failure
test_merge_tree_s3/test.py::test_merge_canceled_by_s3_errors[node-broken_s3_always_multi_part] FAIL cidb, flaky

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Oct 20, 2025
Fixes: ClickHouse#75175

This PR adds consistent and reliable way to return exceptions occured in
mid-stream during serving HTTP request.

It implements the proposal agreed on the attached issue. Everytime the
exception occurs, it will be written in following form (which makes it
easier for clients to parse the exception message reliably)

```
HTTP/1.1 200 OK↵
X-ClickHouse-Exception-Tag: PU1FNUFH98↵
↵
bodybodybodybody↵
__exception__PU1FNUFH98↵
Big bam accrued right while reading the data↵
45 PU1FNUFH98__exception__↵
```

Here 45 is the size of the Big bam accrued right while reading the data↵ message, and PU1FNUFH98 is the exception token.

Also this behavior change comes into play only when the settings
`http_exception_tagging` is 1 (default 0)
Otherwise it's a no-op

Signed-off-by: Kaviraj <[email protected]>
@kavirajk kavirajk force-pushed the kavirajk/support-exception-tag-consistent-error-handling branch from d056aec to b8ee6cc Compare October 20, 2025 21:56
Copy link
Contributor

@slvrtrn slvrtrn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tavplubix PTAL

DECLARE(UInt64, http_response_buffer_size, 0, R"(
The number of bytes to buffer in the server memory before sending a HTTP response to the client or flushing to disk (when http_wait_end_of_query is enabled).
)", 0) \
DECLARE(Bool, http_exception_tagging, false, R"(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also test if we can enable it for a readonly user

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also curious: does it throw if it is set on an older CH version, e.g. do we need to track CH version on the language client side before setting it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a chat with @slabko and @tavplubix. We decided not to have this settings at all.

Given, we made sure we have a \r\n before between exception marker and a tag. It wouldn't break any existing behavior and this feature can be enabled by default without needing any toggle.

I will remove the this setting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it's not supposed to break any existing behavior (in the worst case, a weird string will be added to exception message), it still makes sense to use "Backward Incompatible Change" as the changelog category just for awareness (because who knows how some trash-scripts may parse the exception message)

Also, afair we were considering disabling http_write_exception_in_output_format by default (but it's also okay to keep this setting as is and disable it on the client side if the client expects new exception format for all output formats including XML/JSON*)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but it's also okay to keep this setting as is and disable it on the client side if the client expects new exception format for all output formats including XML/JSON*)

As a side note, we strive to minimise the number of settings configured by language clients to prevent changes to the default server behaviour and potential permission issues for read-only users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of a setting that is supposed to be changed by language clients (like other format/protocol settings). Because if you use a language client - you don't care about how the client communicates with the server, the client hides this from you. But the client may need to change settings like this.

potential permission issues for read-only users

This is the main problem. We can make such settings changeable for readonly users (it's already possible with <changeable_in_readonly/> settings constraint) if you think that it can be useful for other format/protocol settings as well. If not - we can change the default of this setting (but old versions will still have the old default, and you will have similar problems for a year until the old versions reach EOL)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a chat with @tavplubix about this. We have following options.

  1. Disable http_write_exception_in_output_format by default so that language clients will have consistent exception block (as implemented in this PR) for all the formats including JSON/XML without sending any additional settings. But it will be only available in future clickhouse server versions. Language clients still have to deal with it for old clickhouse server anyway. And this would still be a problem for ReadOnly user (who cannot change this settings) for older clickhouse server.

  2. We keep the setting as it is now and Make this settings http_write_exception_in_output_format changeable even by ReadOnly user. But again we cannot backport this and still only be available in new clickhouse server and language clients have to handle it for older clickhouse servers.

  3. Do both (1) and (2). Again only for future clickhouse server. No backport.

@tavplubix also mentioned, for option (2), we can backport it on the ClickHouse cloud though. But still we have to tell our both OSS users and Cloud users about this difference in our docs.

IMHO, I personally think (1) is good option without much "docs" complexity. And on the client side, we can do always pass http_write_exception_in_output_format=0 for now till older versions reach EOL. And for the ReadOnly user, if the parsing failed -> we look for __exception__ marker.

I will bring this up in today's team sync and see what everyone thinks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for option #1 to change the default

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 1 for sure

@slvrtrn slvrtrn requested a review from tavplubix October 21, 2025 07:17
@mshustov mshustov requested a review from slabko October 21, 2025 10:31
After a discussion with people, we agreed this can be default behavior
without needing any toggle by new settings.

Signed-off-by: Kaviraj <[email protected]>
@kavirajk kavirajk marked this pull request as ready for review October 22, 2025 20:25
@tavplubix tavplubix self-assigned this Oct 28, 2025
Comment on lines 13 to 15
echo "Test 2: Verify exception tag is alphanumeric"
TAG=$(${CLICKHOUSE_CURL} -sS -D - "${CLICKHOUSE_URL}" --data-binary "SELECT 1" | grep "X-ClickHouse-Exception-Tag" | cut -d':' -f2 | tr -d ' \r\n')
if [[ "$TAG" =~ ^[A-Za-z0-9]+$ ]] && [[ ${#TAG} -eq 8 ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed that getRandomASCIIString (unlike FunctionRandomPrintableASCII) only uses [a-z], which means 26^8 = 37.6 bits of entropy, which is probably good enough, but it's much less than UUIDv4 has, for example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. given it's "per-request", do you think it may cause collision at our peek scale? happy to change it in that case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we care about collisions between different requests?

The main purposeof having a random string is to be able to distinguish the exception marker from the data in the response. It's already quite unlikely to find the same random string in the data, but why not make it longer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. @tavplubix explained the actual need for bigger tag :) . It's not to avoid any collision. It's to make sure tag is random enough not to appear in actual dataset.

updated to 16 byte (from 8 bytes previously)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use UUID? Do we have any performance concerns in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can. It's just not that useful I think. Given we have entropy of 26^16 = 75.2 bits now, it's more than good enough.


echo "Test 3: Check exception marker format on error"
# Get a fresh TAG from the error response itself
RESPONSE=$(${CLICKHOUSE_CURL} -sS -D - "${CLICKHOUSE_URL}&http_response_buffer_size=1" --data-binary "SELECT throwIf(number=3, 'there is a exception') FROM system.numbers SETTINGS max_block_size=1" 2>&1 || true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was failing because of buffering: the result is too small, so the exception happens before clickhouse-server starts sending the response (so clickhouse-server can respond 500 and just send the exception in the body)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tavplubix for this change :). Makes sense.

Also I notice the same two tests still failing now.

Digging into it, I noticed. It's always Test 3 or Test 4. It's always exception marker is missing in the response sometimes. I think reason is still sometimes the test get directly 500 from the server. Looks like we may also tweak other settings to run this deterministically?

2025-10-28 19:22:02  Test 4: Check reverse marker format (length + tag + __exception__)
2025-10-28 19:22:02 -Reverse marker found: OK
2025-10-28 19:22:02 +Reverse marker not found
025-10-29 02:31:45  Test 3: Check exception marker format on error
2025-10-29 02:31:45 -Tagged exception marker found: OK
2025-10-29 02:31:45 +Tagged exception marker not found

And for the buffering failures. It does expect the exception in the mid-stream may be?

2025-10-28 23:55:02 Reason: having exception in stdout:  
2025-10-28 23:55:02 FAIL 4 max_result_bytes=4000000&http_response_buffer_size=2000000&http_wait_end_of_query=0 5000000

kavirajk and others added 5 commits October 29, 2025 11:57
1. Remove unused `getExceptionTag()`
2. Add exception tag to `Access-Control-Expose-Headers`
3. Calculate the proper trim size of exception message. Making sure the
   whole exception block doesn't exceed MAX_EXCEPTION_SIZE
4. Update the tests accordingly

Signed-off-by: Kaviraj <[email protected]>
Needed as we added exception tag header X-ClickHouse-Exception-Tag in Access-Control-Expose-Headers

Signed-off-by: Kaviraj <[email protected]>
1. Uses the http buffering, response size, block size consistently
2. Remove any additional LIMIT clause

Signed-off-by: Kaviraj <[email protected]>

// 2 bytes represents - \r\n
// 1 byte represents - ' ' (space between <message_length> <TAG>) in the above exception block format
size_t size_message_excluded = 2 + EXCEPTION_MARKER.size() + 2 + EXCEPTION_TAG_LENGTH + 2 + sizeof(size_t) + 1 + EXCEPTION_TAG_LENGTH + 2 + EXCEPTION_MARKER.size() + 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sizeof(size_t) = 8, while toString(size_t_variable).size() can be up to 20 (18446744073709551615 fits into size_t). Although limited_message.size() is limited and 8 bytes is more than enough, this looks confusing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but limited_message.size() return size_t and the actual size depends on the arch no? say 4-byte in 32-bit machine and 8 byte in 64-bit machine.

That was main rationale for me to do that way.

I'm happy to hard-code it to 8 byte if you can confirm this won't be a problem :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but limited_message.size() return size_t and the actual size depends on the arch no? say 4-byte in 32-bit machine and 8 byte in 64-bit machine.

Yes, you're right, but we don't build clickhouse for 32-bit architectures, it's not supported

However, if we did - this code would be incorrect:

  • sizeof(size_t) would be 4, but this is for binary representation of the number
  • limited_message.size() can be, for example, 10000 (it's still way below the limit), but you need 5 bytes to write this number as string because there are 5 decimal digits, and it doesn't fit into 4 bytes

So this would be a bug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. Updated to 8 bytes.

1. Change exception tag size to 16 bytes
2. Hardcode the exception size to 8 bytes

Signed-off-by: Kaviraj <[email protected]>
@tavplubix
Copy link
Member

Could you please update docs as well?

@slvrtrn
Copy link
Contributor

slvrtrn commented Oct 31, 2025

@kavirajk, @tavplubix, @mshustov: before merging this one, let's double-check the implementation with one of the language clients and see if it is ergonomic enough to use. Since 25.10 is out, we still have plenty of time before 25.11.

@clickhouse-gh clickhouse-gh bot added pr-backward-incompatible Pull request with backwards incompatible changes and removed pr-improvement Pull request with some product improvements labels Nov 2, 2025
@kavirajk
Copy link
Contributor Author

kavirajk commented Nov 2, 2025

@tavplubix when implementing the support on client side (in Golang client), I found a subtle bug.

Bug: Currently during exception, CH server doesn't send zero-chunk at the end. Which breaks the HTTP RFC 7230 Chunked Transfer Encoding protocol.

Steps to reproduce with CH server built on this PR

Notice there is no zero chunk 0\r\n at the end

$ curl -i --raw -s --max-time 120 \
        'http://localhost:8123/?database=default&wait_end_of_query=0&max_threads=1' \
        --data-binary "SELECT throwIf(number=3, 'there is a exception') FROM system.numbers SETTINGS max_block_size=1 FORMAT Native" \
        --output broken_response.raw

$ tail -c 60 broken_response.raw | hexdump -c
0000000       (   v   e   r   s   i   o   n       2   5   .   1   1   .
0000010   1   .   1   )  \n   3   0   3       r   w   z   h   v   n   a
0000020   m   h   q   c   f   y   f   b   h  \r  \n   _   _   e   x   c
0000030   e   p   t   i   o   n   _   _  \r  \n  \r  \n
000003c

$ tail -c 60 broken_response.raw | cat -v
 (version 25.11.1.1)
303 rwzhvnamhqcfyfbh^M
__exception__^M
^M

Go client (like any standard HTTP client), fail to parse the last chunk(returning EOF early), but it won't return the "partial chunk" because it didn't find the last zero-chunk. Hence we cannot extract the "exception message" from the partial chunk. You can find the demo_zero_chunk.go here

$ go run demo_zero_chunk.go
❌ Error reading body: unexpected EOF
   This happens because ClickHouse doesn't send proper chunk terminator!
   Captured 0 bytes before error
exit status 1

I fixed the flush code and you see the zero-chunk in the hexdump and cat -v

$ curl -i --raw -s --max-time 120 \
        'http://localhost:8123/?database=default&wait_end_of_query=0&max_threads=1' \
        --data-binary "SELECT throwIf(number=3, 'there is a exception') FROM system.numbers SETTINGS max_block_size=1 FORMAT Native" \
        --output correct_response.raw

$ tail -c 60 correct_response.raw | hexdump -c
0000000   s   i   o   n       2   5   .   1   1   .   1   .   1   )  \n
0000010   3   0   3       a   i   o   z   l   b   c   t   b   f   l   s
0000020   f   g   d   o  \r  \n   _   _   e   x   c   e   p   t   i   o
0000030   n   _   _  \r  \n  \r  \n   0  \r  \n  \r  \n
000003c

$ tail -c 60 correct_response.raw | cat -v
sion 25.11.1.1)
303 aiozlbctbflsfgdo^M
__exception__^M
^M
0^M
^M

Also Go client is happy as well 🎉

$ go run demo_zero_chunk.go
✅ Successfully read complete response (555 bytes)

⚠️  Query completed with exception (streamed in body)
Exception: Code: 395. DB::Exception: there is a exception: while executing 'FUNCTION throwIf(equals(__table1.number, 3_UInt8) :: 0, 'there is a exception'_String :: 2) -> throwIf(equals(__table1.number, 3_UInt8), 'there is a exception'_String) UInt8 : 1'. (FUNCTION_THROW_IF_VALUE_IS_NON_ZERO) (version 25.11.1.1)
exit status 1

Here is the full (draft) Go client support implementation.

Let me know what you think?

EDIT: After internal discussion with the team. We decided not to add zero-chunk to explicitly communicate to the client that there is an exception and it's not a normal chunked response.

@tavplubix
Copy link
Member

Bug: Currently during exception, CH server doesn't send zero-chunk at the end.

ClickHouse does this intentionally. See #68800

```

Where `<TAG>` is a random tag of 16bytes and `<error message>` can be up to 16 KiB max (exact length can be found in `<message_length>`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<TAG> is a random tag

Should we mention the X-ClickHouse-Exception-Tag header here?

<error message> can be up to 16 KiB max

The whole block is up to 16 KiB max, the error message is slightly shorter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes to both 👍

@kavirajk
Copy link
Contributor Author

kavirajk commented Nov 5, 2025

@tavplubix thanks for the review ❤️ Addressed the doc comments. The failing tests looks unrelated. Let me know if needed anything before merging the PR :)

@tavplubix tavplubix enabled auto-merge November 5, 2025 09:54
@tavplubix tavplubix added this pull request to the merge queue Nov 5, 2025
Merged via the queue into ClickHouse:master with commit 239ac0e Nov 5, 2025
120 of 125 checks passed
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Nov 5, 2025
slabko added a commit to ClickHouse/clickhouse-odbc that referenced this pull request Nov 6, 2025
ClickHouse now implements an improved format for mid-stream exceptions:
ClickHouse/ClickHouse#88818

This change adopts the new format and uses it when possible. Otherwise, it
falls back to the older approach of searching for the __exception__ marker
at the end of the buffered data.
slabko added a commit to ClickHouse/clickhouse-odbc that referenced this pull request Nov 6, 2025
ClickHouse now implements an improved format for mid-stream exceptions:
ClickHouse/ClickHouse#88818

This change adopts the new format and uses it when possible. Otherwise, it
falls back to the older approach of searching for the __exception__ marker
at the end of the buffered data.
slabko added a commit to ClickHouse/clickhouse-odbc that referenced this pull request Nov 6, 2025
ClickHouse now implements an improved format for mid-stream exceptions:
ClickHouse/ClickHouse#88818

This change adopts the new format and uses it when possible. Otherwise, it
falls back to the older approach of searching for the __exception__ marker
at the end of the buffered data.
slabko added a commit to ClickHouse/clickhouse-odbc that referenced this pull request Nov 6, 2025
ClickHouse now implements an improved format for mid-stream exceptions:
ClickHouse/ClickHouse#88818

This change adopts the new format and uses it when possible. Otherwise, it
falls back to the older approach of searching for the __exception__ marker
at the end of the buffered data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backward-incompatible Pull request with backwards incompatible changes pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HTTP interface: more reliable way to detect errors in the middle of a stream

5 participants