
Widget Chat Streaming API Contract (O5)

This document defines the API contract for streaming widget chat responses over Server-Sent Events (SSE). It extends the existing non-streaming POST /api/widgets/:tenantId/chat behavior.

Reference: Implementation plan docs/plans/2026-03-01-O4-O5-summarization-streaming.md (Part B — O5).
Related: Non-streaming chat is documented in Live Chat API Reference.


Endpoint

POST /api/widgets/:tenantId/chat
  • Path parameter: tenantId — Tenant (widget) identifier; same as widgetId in the non-streaming doc.
  • Streaming is enabled through the request body: include "stream": true in the JSON body. When stream is absent or not true, the endpoint behaves as the existing non-streaming chat and returns a JSON response.

Request body

Same fields as the non-streaming widget chat, plus stream: true:

| Field | Type | Required | Description |
|---|---|---|---|
| sessionId | string | Yes | Unique session identifier. |
| message | string | Yes | User message text. |
| stream | boolean | For streaming | Must be true to receive an SSE stream; otherwise a normal JSON response is returned. |
| conversationId | string | No | Conversation to continue; omitted or null for a new conversation. |
| preferredConversationId | string | No | Preferred conversation for resume/stitching. |
| assistantId | string | No | Assistant to use; defaults to tenant default if not provided. |
| playgroundMode | boolean | No | If true, conversation is marked for playground TTL cleanup; requires playground auth. |
| email | string | No | Contact email when available. |
| tracking_cursor_last_event_id | string | No | Tracking cursor for analytics. |

Example (streaming):

{
  "sessionId": "session_abc123",
  "message": "What are your opening hours?",
  "stream": true,
  "conversationId": "conv-456",
  "preferredConversationId": "conv-456",
  "assistantId": "asst-123",
  "playgroundMode": false,
  "email": "user@example.com",
  "tracking_cursor_last_event_id": "evt_xyz"
}
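As a rough illustration of the field rules above, the sketch below serializes a request body and leaves out optional fields that were not set. The helper name `build_chat_body` and its parameter names are hypothetical; only the JSON keys come from this contract.

```python
import json
from typing import Any, Dict, Optional


def build_chat_body(
    session_id: str,
    message: str,
    stream: bool = True,
    conversation_id: Optional[str] = None,
    assistant_id: Optional[str] = None,
) -> str:
    """Serialize a widget chat request body (hypothetical helper)."""
    if not session_id or not message:
        raise ValueError("sessionId and message are required")
    body: Dict[str, Any] = {"sessionId": session_id, "message": message}
    if stream:
        # Only "stream": true switches the endpoint to SSE.
        body["stream"] = True
    # Optional fields are omitted rather than sent as null.
    if conversation_id is not None:
        body["conversationId"] = conversation_id
    if assistant_id is not None:
        body["assistantId"] = assistant_id
    return json.dumps(body)
```

Calling `build_chat_body("session_abc123", "What are your opening hours?")` produces a body with only sessionId, message, and stream, which starts a new conversation with the tenant's default assistant.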

Authentication

Same as current widget chat:

  • Production widget: Authorization: Bearer <widget_token> (tenant widget token).
  • Playground: When playgroundMode is true, X-Widget-Playground-Token: <playground_token> (in addition to or instead of Bearer, per current implementation).

Domain checks and playground validation apply as in the non-streaming endpoint. Unauthorized or invalid auth returns 401 or 403 with a JSON body before any stream is started.
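A minimal sketch of how a client might assemble the request headers for both modes. The function name `build_auth_headers` is hypothetical, and sending both headers when both tokens are supplied is an assumption; the contract leaves the exact combination to the current implementation.

```python
from typing import Dict, Optional


def build_auth_headers(
    widget_token: Optional[str] = None,
    playground_token: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers for the widget chat endpoint (hypothetical helper)."""
    if widget_token is None and playground_token is None:
        raise ValueError("a widget token or a playground token is required")
    headers = {"Content-Type": "application/json"}
    if widget_token is not None:
        headers["Authorization"] = f"Bearer {widget_token}"
    if playground_token is not None:
        # Assumption: the playground header may accompany the Bearer token.
        headers["X-Widget-Playground-Token"] = playground_token
    return headers
```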


Rate limits

Unchanged from non-streaming widget chat. The same limits apply to streaming requests.


Response (when stream: true)

  • Status: 200 OK
  • Content-Type: text/event-stream
  • Cache-Control: no-cache
  • Body: An SSE stream encoded as UTF-8, not a JSON document. Clients must parse the stream line by line, reading the event: and data: fields.
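
Network chunks can end in the middle of a line or even in the middle of a multi-byte UTF-8 character, so decoding each chunk independently can fail. The sketch below shows one way to turn raw byte chunks into complete lines with an incremental decoder. The name `iter_sse_lines` is hypothetical, and the sketch handles LF and CRLF line endings only.

```python
import codecs
from typing import Iterable, Iterator


def iter_sse_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield complete text lines from raw response chunks (hypothetical helper)."""
    # The incremental decoder buffers a partial multi-byte character
    # until the remaining bytes arrive in a later chunk.
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            yield line.rstrip("\r")
    # Flush whatever remains when the stream ends without a final newline.
    buffer += decoder.decode(b"", final=True)
    if buffer:
        yield buffer.rstrip("\r")
```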

Each event is sent as one or more lines ending with a blank line:

  • event: <event-type>
  • data: <JSON payload>
  • (blank line)
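
The framing above can be sketched as a small parser that groups already-decoded lines into events and parses each data: value as JSON. The name `parse_sse` is hypothetical. Falling back to the event type "message" when no event: line is present follows general SSE convention and is not something this contract specifies.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_type, payload) pairs from decoded SSE lines."""
    event_type = "message"  # general SSE default, not defined by this contract
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_type, json.loads("\n".join(data_parts))
            event_type, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
```

An event that is not followed by a blank line before the stream ends is dropped, which matches how SSE parsers generally treat incomplete events.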

SSE events

1. content (zero or more)

Incremental assistant reply text.

  • event: content
  • data: JSON object: {"delta":"..."}
    • delta: string — next segment of the assistant message (UTF-8). Multiple content events are sent until the full reply is streamed.

Example:

event: content
data: {"delta":"Hello, "}

event: content
data: {"delta":"I can help with that.\n"}

2. done (exactly one, on success)

Sent when the streamed reply is complete.

  • event: done
  • data: JSON object: {"conversationId":"...","message":"..."}
    • conversationId: string — conversation ID (same as or derived from the request).
    • message: string — full assistant message text (complete reply).

Example:

event: done
data: {"conversationId":"conv-456","message":"Hello, I can help with that.\n"}

3. error (optional)

Sent when an error occurs (e.g. LLM failure, validation error after stream start). If the error happens before the stream is started, the server responds with a normal HTTP error status and JSON body instead of SSE.

  • event: error
  • data: JSON object: {"error":"...","code":"..."}
    • error: string — human-readable error message.
    • code: string — (optional) machine-readable error code.

Example:

event: error
data: {"error":"Service temporarily unavailable","code":"LLM_UNAVAILABLE"}

Ordering and client behavior

  1. content events may appear multiple times (or zero if the reply is empty).
  2. done is sent once when the full reply is streamed; after done, the stream may be closed by the server.
  3. error may be sent instead of done if something fails during streaming. After error, the stream may be closed.

Clients should:

  • Decode the stream as UTF-8.
  • Parse SSE by splitting the stream into lines; a blank line ends an event. For each event, read the event: and data: fields.
  • Parse the value after the data: prefix as JSON.
  • Append each content.delta to the displayed assistant message.
  • On done, finalize the message and apply tracking/conversation handling (e.g. store conversationId, update tracking cursor).
  • On error, show the error to the user and optionally retry.
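
The client behavior listed above might look roughly like the following sketch, which consumes already-parsed (event type, payload) pairs. `ChatResult`, `StreamError`, and `consume_events` are hypothetical names. Treating a stream that ends without done or error as a failure is an assumption, since the contract does not cover that case.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


class StreamError(Exception):
    """Raised for an error event or an incomplete stream (hypothetical type)."""

    def __init__(self, message: str, code: Optional[str] = None):
        super().__init__(message)
        self.code = code


@dataclass
class ChatResult:
    conversation_id: str
    message: str


def consume_events(
    events: Iterable[Tuple[str, Dict[str, Any]]],
    on_delta: Optional[Callable[[str], None]] = None,
) -> ChatResult:
    """Apply content deltas and return the final reply from the done event."""
    for event_type, payload in events:
        if event_type == "content":
            if on_delta is not None:
                on_delta(payload["delta"])  # e.g. append to the displayed message
        elif event_type == "done":
            # The done payload carries the complete reply text.
            return ChatResult(payload["conversationId"], payload["message"])
        elif event_type == "error":
            raise StreamError(payload["error"], payload.get("code"))
    # Assumption: a stream without a terminal event counts as a failure.
    raise StreamError("stream ended without done or error")
```

A caller would typically catch `StreamError`, show its message to the user, and decide whether to retry based on `code`.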

Non-streaming behavior

When the request body does not include "stream": true, the endpoint behaves exactly as the existing non-streaming widget chat: response is 200 OK with Content-Type: application/json and a JSON body (e.g. success, conversationId, message, metadata). See Live Chat API Reference.
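
Because the same endpoint can answer with an SSE stream, a JSON body, or a JSON error sent before any stream starts, a client may want to branch on the status and Content-Type before reading the body. The sketch below shows one possible way to do that; the function name `classify_response` and its return labels are hypothetical.

```python
def classify_response(status: int, content_type: str) -> str:
    """Decide how to read a widget chat response (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if status != 200:
        # Errors before the stream starts (e.g. 401, 403) carry a JSON body.
        return "http_error"
    if media_type == "text/event-stream":
        return "sse"
    if media_type == "application/json":
        return "json"
    return "unexpected"
```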


Summary

| Item | Value |
|---|---|
| Endpoint | POST /api/widgets/:tenantId/chat |
| Stream trigger | Body stream: true |
| Response (streaming) | 200, Content-Type: text/event-stream, Cache-Control: no-cache, no JSON body |
| Events | content {"delta":"..."}; done {"conversationId","message"}; error {"error","code"?} |
| Auth | Same as non-streaming (Bearer widget token; X-Widget-Playground-Token for playground) |
| Rate limits | Same as non-streaming widget chat |
