Widget Chat Streaming API Contract (O5)

This document defines the API contract for streaming widget chat responses over Server-Sent Events (SSE). It extends the existing non-streaming POST /api/widgets/:tenantId/chat behavior.

Reference: Implementation plan docs/plans/2026-03-01-O4-O5-summarization-streaming.md (Part B — O5).
Related: Non-streaming chat is documented in Live Chat API Reference.

Endpoint

POST /api/widgets/:tenantId/chat

- Path parameter: tenantId, the tenant (widget) identifier; same as widgetId in the non-streaming doc.
- Streaming is enabled through the request body: include "stream": true in the JSON body. When stream is absent or is any value other than true, the endpoint behaves as the existing non-streaming chat and returns a JSON response.

Request body

Same fields as the non-streaming widget chat, plus stream: true:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `sessionId` | string | Yes | Unique session identifier. |
| `message` | string | Yes | User message text. |
| `stream` | boolean | For streaming | Must be `true` to receive an SSE stream; otherwise a normal JSON response is returned. |
| `conversationId` | string | No | Conversation to continue; omitted or null for a new conversation. |
| `preferredConversationId` | string | No | Preferred conversation for resume/stitching. |
| `assistantId` | string | No | Assistant to use; defaults to tenant default if not provided. |
| `playgroundMode` | boolean | No | If `true`, conversation is marked for playground TTL cleanup; requires playground auth. |
| `email` | string | No | Contact email when available. |
| `tracking_cursor_last_event_id` | string | No | Tracking cursor for analytics. |

Example (streaming):

{
  "sessionId": "session_abc123",
  "message": "What are your opening hours?",
  "stream": true,
  "conversationId": "conv-456",
  "preferredConversationId": "conv-456",
  "assistantId": "asst-123",
  "playgroundMode": false,
  "email": "user@example.com",
  "tracking_cursor_last_event_id": "evt_xyz"
}
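
As a rough illustration of assembling this body on the client, the sketch below serializes the two required fields, sets stream, and drops optional fields whose value is None (the contract allows conversationId to be omitted for a new conversation). The helper name build_chat_body is hypothetical and not part of the API.

```python
import json
from typing import Any


def build_chat_body(session_id: str, message: str, *, stream: bool = True, **optional: Any) -> str:
    """Serialize a widget chat request body (hypothetical helper)."""
    body = {"sessionId": session_id, "message": message}
    if stream:
        # Only the literal JSON value true switches the endpoint to SSE.
        body["stream"] = True
    # Optional fields are passed by their wire names; unset (None) ones are omitted.
    body.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(body)


example = build_chat_body(
    "session_abc123",
    "What are your opening hours?",
    conversationId=None,        # new conversation: field is left out
    assistantId="asst-123",
)
```

Passing stream=False leaves the field out entirely, which selects the non-streaming JSON response.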

Authentication

Same as current widget chat:

- Production widget: send Authorization: Bearer <widget_token> (the tenant widget token).
- Playground: when playgroundMode is true, send X-Widget-Playground-Token: <playground_token>. Whether this header accompanies or replaces the Bearer token follows the current implementation.

Domain checks and playground validation apply as in the non-streaming endpoint. Unauthorized or invalid auth returns 401 or 403 with a JSON body before any stream is started.
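
A minimal sketch of building the headers described above, assuming the caller simply sends whichever tokens it holds. The function name build_headers is hypothetical, and the Content-Type header is an assumption based on the body being JSON; the contract itself only names the two auth headers.

```python
from typing import Dict, Optional


def build_headers(widget_token: Optional[str] = None,
                  playground_token: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers from the tokens the caller has (hypothetical helper)."""
    # Assumption: the JSON body is declared with a standard Content-Type header.
    headers = {"Content-Type": "application/json"}
    if widget_token:
        headers["Authorization"] = f"Bearer {widget_token}"
    if playground_token:
        headers["X-Widget-Playground-Token"] = playground_token
    return headers
```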

Rate limits

Unchanged from non-streaming widget chat. The same limits apply to streaming requests.

Response (when `stream: true`)

Status: 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Body: No JSON body; the response is an SSE stream encoded as UTF-8. Clients must parse the stream line by line, reading the event: and data: fields of each event.

Each event is sent as one or more lines ending with a blank line:

event: <event-type>
data: <JSON payload>
(blank line)
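
The framing above can be parsed with a small line-based reader. The sketch below is one possible approach, not a required implementation: it follows the usual SSE conventions (a blank line dispatches the event, lines starting with a colon are comments, one leading space after the colon is dropped) and decodes each data payload as JSON. The name iter_sse_events is hypothetical.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_type, decoded_json_payload) for each complete SSE event."""
    event_type = "message"          # SSE default when no event: line is present
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the event; dispatch only if data was seen.
            if data_lines:
                yield event_type, json.loads("\n".join(data_lines))
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue                # comment line, ignored
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]       # drop the single optional space after the colon
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    # An event not terminated by a blank line is treated as incomplete and dropped.
```

The input is any iterable of already-decoded text lines, so the same function works on a list in tests or on a line iterator from an HTTP client.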

SSE events

1. `content` (zero or more)

Incremental assistant reply text.

event: content
data: JSON object: {"delta":"..."}
- delta: string — next segment of the assistant message (UTF-8). Multiple content events are sent until the full reply is streamed.

Example:

event: content
data: {"delta":"Hello, "}

event: content
data: {"delta":"I can help with that.\n"}
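
One detail worth illustrating: the \n inside the second payload is a JSON escape within a single data: line, not an SSE line break, so it only becomes a real newline after JSON decoding. A small sketch, using a hypothetical helper append_delta:

```python
import json


def append_delta(current: str, data_field: str) -> str:
    """Decode one content payload and append its delta to the text so far."""
    payload = json.loads(data_field)
    return current + payload["delta"]


text = ""
# The data values exactly as they appear on the wire (backslash followed by n).
for data_field in ('{"delta":"Hello, "}', '{"delta":"I can help with that.\\n"}'):
    text = append_delta(text, data_field)
```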

2. `done` (exactly one, on success)

Sent when the streamed reply is complete.

event: done
data: JSON object: {"conversationId":"...","message":"..."}
- conversationId: string — conversation ID (same as or derived from the request).
- message: string — full assistant message text (complete reply).

Example:

event: done
data: {"conversationId":"conv-456","message":"Hello, I can help with that.\n"}

3. `error` (optional)

Sent when an error occurs (e.g. LLM failure, validation error after stream start). If the error happens before the stream is started, the server responds with a normal HTTP error status and JSON body instead of SSE.

event: error
data: JSON object: {"error":"...","code":"..."}
- error: string — human-readable error message.
- code: string — (optional) machine-readable error code.

Example:

event: error
data: {"error":"Service temporarily unavailable","code":"LLM_UNAVAILABLE"}
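
Because code is optional, client code should not assume it is present. The sketch below maps an error payload to an exception; the class StreamError and the function to_stream_error are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, Optional


class StreamError(Exception):
    """Raised by the client when the stream delivers an error event (hypothetical)."""

    def __init__(self, message: str, code: Optional[str] = None) -> None:
        super().__init__(message)
        self.code = code


def to_stream_error(payload: Dict[str, Any]) -> StreamError:
    # error is always expected; code may be missing, so read it with .get().
    return StreamError(payload["error"], payload.get("code"))


err = to_stream_error({"error": "Service temporarily unavailable", "code": "LLM_UNAVAILABLE"})
```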

Ordering and client behavior

- content events may appear any number of times, including zero if the reply is empty.
- done is sent exactly once, after the full reply has been streamed; the server may close the stream after done.
- error may be sent instead of done if something fails during streaming; the server may close the stream after error.
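
To make the ordering concrete, here is a hedged server-side sketch that emits frames in this order. It is not the service's actual implementation: the functions format_sse and stream_reply are hypothetical, the chunk source stands in for whatever produces reply text, and the error payload reuses the example values from this document.

```python
import json
from typing import Dict, Iterable, Iterator, List


def format_sse(event: str, payload: Dict[str, str]) -> str:
    """Render one SSE frame; json.dumps escapes newlines, so data stays on one line."""
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"


def stream_reply(chunks: Iterable[str], conversation_id: str) -> Iterator[str]:
    """Yield content frames, then exactly one done frame, or an error frame on failure."""
    parts: List[str] = []
    try:
        for chunk in chunks:
            parts.append(chunk)
            yield format_sse("content", {"delta": chunk})
    except Exception:
        # A failure mid-stream is reported as an error event instead of done.
        yield format_sse("error", {"error": "Service temporarily unavailable",
                                   "code": "LLM_UNAVAILABLE"})
        return
    yield format_sse("done", {"conversationId": conversation_id,
                              "message": "".join(parts)})


frames = list(stream_reply(["Hello, ", "I can help with that.\n"], "conv-456"))
```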

Clients should:

- Decode the stream as UTF-8.
- Split the stream into lines and treat each blank line as the end of an event; for each event, read its event: and data: fields.
- Parse the value that follows the data: prefix as JSON.
- Append each content.delta to the displayed assistant message.
- On done, finalize the message and apply tracking/conversation handling (e.g. store conversationId, update tracking cursor).
- On error, show the error to the user and optionally retry.
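
The steps above can be sketched as a small reducer over already-parsed events, given as (event type, payload) pairs. This is an illustration under stated assumptions rather than a reference client: ChatResult and consume_events are hypothetical names, unknown event types are silently ignored, and done.message is treated as the authoritative final text.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Tuple


@dataclass
class ChatResult:
    text: str = ""
    conversation_id: Optional[str] = None
    error: Optional[str] = None
    error_code: Optional[str] = None
    finished: bool = False


def consume_events(events: Iterable[Tuple[str, Dict[str, Any]]]) -> ChatResult:
    """Fold a sequence of parsed SSE events into a final chat result."""
    result = ChatResult()
    for event_type, payload in events:
        if event_type == "content":
            result.text += payload["delta"]          # incremental display text
        elif event_type == "done":
            result.text = payload["message"]         # assumption: full text wins
            result.conversation_id = payload["conversationId"]
            result.finished = True
            break                                     # nothing follows done
        elif event_type == "error":
            result.error = payload["error"]
            result.error_code = payload.get("code")   # code is optional
            break                                     # nothing follows error
    return result
```

A caller would typically render result.text as it grows, then persist conversation_id once finished is true, or surface error and offer a retry.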

Non-streaming behavior

When the request body does not include "stream": true, the endpoint behaves exactly as the existing non-streaming widget chat: the response is 200 OK with Content-Type: application/json and a JSON body (e.g. success, conversationId, message, metadata). See Live Chat API Reference.
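
Since the same endpoint can answer with either an SSE stream or JSON (including the JSON error bodies returned before a stream starts), a client may want to branch on the response headers. A minimal sketch, with the hypothetical helper response_mode:

```python
def response_mode(status: int, content_type: str) -> str:
    """Classify a response as 'stream', 'json', or 'unknown' from its headers."""
    # Ignore parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if status == 200 and media_type == "text/event-stream":
        return "stream"
    if media_type == "application/json":
        return "json"      # non-streaming success or a pre-stream error body
    return "unknown"
```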

Summary

| Item | Value |
| --- | --- |
| Endpoint | `POST /api/widgets/:tenantId/chat` |
| Stream trigger | Body `stream: true` |
| Response (streaming) | 200, `Content-Type: text/event-stream`, `Cache-Control: no-cache`, no JSON body |
| Events | `content` → `{"delta":"..."}`; `done` → `{"conversationId","message"}`; `error` → `{"error","code"?}` |
| Auth | Same as non-streaming (Bearer widget token; X-Widget-Playground-Token for playground) |
| Rate limits | Same as non-streaming widget chat |
