pja-ant
Contributor

@pja-ant pja-ant commented Sep 10, 2025

Motivation and Context

httpx-sse parses newlines incorrectly: it treats Unicode line separator characters such as U+2028 and U+2029 as line breaks, which the SSE specification does not. See florimondmanca/httpx-sse#34 for details.
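To illustrate this class of bug, here is a minimal sketch. It is not httpx-sse's actual code. Python's `str.splitlines()` treats U+2028 and U+2029 as line boundaries, whereas the SSE specification recognizes only CRLF, LF, and CR. The helper name `split_sse_lines` is hypothetical and exists only for the comparison.

```python
import re

# A JSON payload whose string value contains U+2028 (LINE SEPARATOR).
payload = 'data: {"text": "before\u2028after"}\n\n'

# str.splitlines() splits on many Unicode boundaries, including
# U+2028 and U+2029, so the single "data:" line is broken in two.
naive_lines = payload.splitlines()


def split_sse_lines(text: str) -> list[str]:
    """Hypothetical helper: split only on the line endings SSE allows."""
    return re.split(r"\r\n|\n|\r", text)


# U+2028 stays inside the data line, as the SSE specification requires.
compliant_lines = split_sse_lines(payload)

print(naive_lines)
print(compliant_lines)
```

With the naive split, the JSON value is cut in half and can no longer be decoded as a single message, which matches the kind of incorrect output described here.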

Our bug, #1356, shows that this produces incorrect output and also hangs stream processing entirely (because of latent exception handling issues, which also need fixing).

I'm fixing the SSE parsing on our side, but will also look into fixing at the source. Not sure how long that will take to get pushed through.

How Has This Been Tested?

Two unit tests added.

Breaking Changes

None

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

@pja-ant pja-ant requested review from a team and dsp-ant September 10, 2025 13:52
@Kludex
Member

Kludex commented Sep 11, 2025

> I'm fixing the SSE parsing on our side, but will also look into fixing at the source. Not sure how long that will take to get pushed through.

Florimond (httpx-sse) replied in an hour... Please create a PR on that side.

@pja-ant pja-ant marked this pull request as draft September 11, 2025 11:05
@pja-ant
Contributor Author

pja-ant commented Sep 11, 2025

httpx-sse PR: florimondmanca/httpx-sse#37

@felixweinberger felixweinberger added the pending publish, pending dependency updates, needs more work, and needs sync labels Sep 26, 2025
Use compliant_aiter_sse in streamable_http.py to handle Unicode
line separator characters correctly, preventing the same issue
that affected the SSE client.
Previously, the SSE parser could mishandle a CRLF line ending when \r
appeared at the end of one chunk and \n at the beginning of the next,
treating them as two separate line breaks instead of a single CRLF
sequence.

This fix implements proper CRLF handling by:
- Tracking when a chunk ends with \r using a skip_leading_lf flag
- Skipping a leading \n in the next chunk if the previous ended with \r
- Ensuring Unicode line/paragraph separators (U+2028/U+2029) are treated
  as regular content, not line breaks, per the SSE specification

Added comprehensive test coverage for the edge case of split CRLF sequences
across chunk boundaries.
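The chunk-boundary handling described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the function name `iter_sse_lines` is hypothetical, and only the `skip_leading_lf` flag name is taken from the message above.

```python
import re
from typing import Iterable, Iterator

# The SSE specification only recognizes these three line endings.
_LINE_BREAK = re.compile(r"\r\n|\n|\r")


def iter_sse_lines(chunks: Iterable[str]) -> Iterator[str]:
    """Hypothetical sketch: yield lines from text chunks, SSE-style."""
    buffer = ""
    skip_leading_lf = False  # True when the previous chunk ended with "\r"

    for chunk in chunks:
        if not chunk:
            continue
        # The "\r" that ended the previous chunk already closed a line, so a
        # leading "\n" here is the second half of the same CRLF pair.
        if skip_leading_lf and chunk[0] == "\n":
            chunk = chunk[1:]
        skip_leading_lf = chunk.endswith("\r")

        buffer += chunk
        parts = _LINE_BREAK.split(buffer)
        buffer = parts.pop()  # incomplete trailing line, kept for next chunk
        yield from parts

    if buffer:
        yield buffer  # trailing data without a final line ending


# CRLF split across two chunks, plus U+2028 inside a data line.
lines = list(iter_sse_lines(["data: a\r", "\ndata: b\u2028c\r\n", "\r\n"]))
print(lines)
```

Without the flag, the `\n` that opens the second chunk would be read as an extra empty line, and in SSE an empty line dispatches the event early.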
The Unicode line separator issue (U+2028 and U+2029 characters being
incorrectly treated as newlines) has been fixed in httpx-sse 0.4.2.
See: florimondmanca/httpx-sse#39

Revert the compliant_aiter_sse workaround and use the standard
event_source.aiter_sse() method again. Upgrade httpx-sse to >=0.4.2
to get the fix. Keep the high-level issue test to ensure the problem
doesn't regress.

Github-Issue:#1356
@pja-ant pja-ant force-pushed the fix/sse-unicode-line-separator branch from 4021da2 to 45109d2 Compare October 7, 2025 15:26