Skip to content

Comments

fix: propagate SSE stream errors to waiting requests#2122

Open
heyhayes wants to merge 5 commits intomodelcontextprotocol:mainfrom
heyhayes:fix/sse-error-propagation-1401
Open

fix: propagate SSE stream errors to waiting requests#2122
heyhayes wants to merge 5 commits intomodelcontextprotocol:mainfrom
heyhayes:fix/sse-error-propagation-1401

Conversation

@heyhayes
Copy link

@heyhayes heyhayes commented Feb 21, 2026

Summary

Fixes #1401. Also fixes #1789 (closed as duplicate).

When an SSE read timeout occurs during a StreamableHTTP POST request, the pending send_request call hangs indefinitely. The transport catches the exception but never sends an error back through the read stream, leaving the caller blocked on response_stream_reader.receive() with nothing to receive.

This PR fixes error propagation at the transport level so that SSE stream failures produce a JSONRPCError keyed to the original request ID. BaseSession._handle_response routes it to the correct per-request response stream, and send_request surfaces it as MCPError to the caller. This approach keeps failures isolated to the affected request rather than tearing down the entire session.

What changed

_handle_sse_response now sends a JSONRPCError(INTERNAL_ERROR, "SSE stream ended without a response") when the SSE stream ends without delivering a complete response, whether due to a read timeout, network error, or unexpected server close. If a last_event_id was received, reconnection is attempted first; the error is only sent after reconnection is exhausted.

_handle_reconnection returns bool instead of None so callers can distinguish success (response delivered) from failure (attempts exhausted). The method also fixes an infinite recursion bug: the attempt counter was reset to 0 on every stream end (even when no complete response was delivered), which combined with httpx read timeouts causing graceful stream termination meant the reconnection loop could run forever.

handle_get_stream applies the same fix to the GET stream's reconnection loop: the attempt counter only resets when events were actually received during the connection. Empty connections that close immediately count toward MAX_RECONNECTION_ATTEMPTS.

_default_message_handler now logs a warning for exceptions instead of silently discarding them, providing observability for transport errors not tied to a specific request.

Test plan

  • New E2E test: client with 0.5s SSE read timeout reads a slow resource (2s server delay), asserts MCPError is raised instead of hanging
  • New unit test: _handle_reconnection returns False when called at max attempts
  • All 51 existing streamable HTTP tests pass (including reconnection, polling, and multi-reconnection tests)
  • All 233 client tests pass
  • Linting (ruff check .) and type checking (pyright) clean

…rsion

_handle_reconnection previously returned None, making it impossible for
callers to distinguish between a successful response delivery and
exhausted retries. This changes the return type to bool (True on success,
False when max attempts exceeded) and fixes two issues:

- The attempt counter at line 426 was reset to 0 on each reconnection,
  causing infinite recursion when streams kept ending without delivering
  a response. Now increments attempt on each recursion.

- All recursive calls now use `return await` so the result propagates
  back to the original caller.

MAX_RECONNECTION_ATTEMPTS increased from 2 to 5 to accommodate
legitimate multi-reconnection patterns where the server intentionally
closes streams between checkpoints.

Github-Issue: modelcontextprotocol#1401
When an SSE stream ends prematurely (e.g. due to a read timeout), the
client would hang forever waiting for a response that will never arrive.

Now _handle_sse_response checks the return value of _handle_reconnection
and, if reconnection did not deliver a response, sends a JSONRPCError
with INTERNAL_ERROR to the read stream. This unblocks the waiting
request and surfaces the failure as an MCPError to the caller.

Github-Issue: modelcontextprotocol#1401
Only reset the attempt counter when events were actually received
during the connection. Connections that close immediately without
delivering events now count toward MAX_RECONNECTION_ATTEMPTS.

Github-Issue:modelcontextprotocol#1401
Transport errors that are not tied to a specific pending request (e.g.,
GET stream failures) were silently swallowed by the default message
handler. Add a warning log so these exceptions are at least visible in
logs as an observability safety net.

Github-Issue: modelcontextprotocol#1401
@heyhayes heyhayes marked this pull request as draft February 21, 2026 13:22
@heyhayes heyhayes force-pushed the fix/sse-error-propagation-1401 branch 2 times, most recently from a20a405 to c1fffe8 Compare February 21, 2026 13:51
Add test_sse_error_when_reconnection_exhausted to exercise the
_handle_sse_response path where SSE events are received (setting
last_event_id) but reconnection fails, ensuring the JSONRPCError
is sent to unblock the waiting request.
@heyhayes heyhayes force-pushed the fix/sse-error-propagation-1401 branch from 90c334d to f0af07e Compare February 21, 2026 14:03
@heyhayes heyhayes marked this pull request as ready for review February 21, 2026 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ClientSession Error Handling Client hangs forever

1 participant