Streaming
Responses can be streamed as Server‑Sent Events (SSE).
Set stream: true to receive incremental deltas as they are generated. Streaming reduces perceived latency and enables real-time UIs.
How to enable
POST /api/v1/chat/completions
{
"model": "provider/model",
"messages": [{"role":"user","content":"Stream this response."}],
"stream": true
}
SSE format
Events
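The response is an SSE stream. Each event is a line that begins with data: followed by a JSON chunk, and events are separated by a blank line. The stream terminates with the literal sentinel data: [DONE], which is not JSON and should not be parsed as such.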
data: {"id":"...","choices":[{"index":0,"delta":{"content":"Hel"}}]}
data: {"id":"...","choices":[{"index":0,"delta":{"content":"lo"}}]}
data: [DONE]
Client-side tips
- Set sensible timeouts; streaming responses can be long-lived.
- Handle partial UTF‑8 chunks safely (most SDKs already do).
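- Reconnect logic should be request-level, not chunk-level: if the connection drops mid-stream, retry the whole request rather than trying to resume from the last received chunk.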
- On 429, apply exponential backoff and consider lowering concurrency.
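The tips above can be illustrated with a minimal Python sketch that uses only the standard library. It assumes the stream arrives as an iterable of raw byte chunks from whatever HTTP client you use, and that each event carries a single data: line, as in the sample events in this section. The function names (iter_sse_data, collect_text, backoff_delay) and the backoff parameters are hypothetical and chosen for illustration; they are not part of the API.

```python
import codecs
import json
import random
from typing import Iterable, Iterator


def iter_sse_data(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield the payload of each `data:` line from raw byte chunks."""
    # The incremental decoder holds back incomplete multi-byte UTF-8
    # sequences until the remaining bytes arrive in a later chunk.
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.startswith("data:"):
                yield line[len("data:"):].strip()
    # Flush the decoder and handle a final line that has no trailing newline.
    buffer += decoder.decode(b"", final=True)
    if buffer.startswith("data:"):
        yield buffer[len("data:"):].strip()


def collect_text(chunks: Iterable[bytes]) -> str:
    """Concatenate the delta content of every chunk until [DONE]."""
    parts = []
    for payload in iter_sse_data(chunks):
        if payload == "[DONE]":  # sentinel, not JSON
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter, in seconds (attempt is 0-based).

    Assumed policy for illustration: wait this long before retrying the
    whole request after a 429.
    """
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

To use it, pass the byte-chunk iterator of a streaming HTTP response to collect_text, or iterate iter_sse_data directly to update a UI as each delta arrives. The sketch deliberately ignores other SSE fields and multi-line data: events; a full SSE client would need to handle those.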