| Capability | chat.agent() | chat.createSession() | Raw primitives |
|---|---|---|---|
| Turn loop, stop signals, accumulation | Managed | Managed | You write it |
| Lifecycle hooks | Yes | No — inline code per turn | No |
| Continuation recovery on new runs | Automatic | Manual seeding | Manual seeding |
| Compaction / steering | Built-in | Built-in | Manual |
| Head Start, actions, tool approvals | Yes | No | No |
| Custom stream conversion | No | Limited | Full control |
| Agent dashboard visibility | Yes | Yes (via customAgent) | Yes |
Both lower levels use chat.customAgent() as the wrapper, which is what makes the task visible to the agent dashboard.
Start with chat.agent(). Drop to chat.createSession() when you want to own the per-turn code (model routing, persistence, custom telemetry) without rebuilding the turn loop. Drop to raw primitives only when you need full control over stream conversion or a custom protocol.
chat.agent()
The highest-level approach. Handles message accumulation, stop signals, turn lifecycle, and auto-piping.
Simple: return a StreamTextResult
Return the streamText result from run and it’s automatically piped to the frontend:
Using chat.pipe() for complex flows
For complex agent flows where streamText is called deep inside your code, use chat.pipe(). It works from anywhere inside a task, even nested function calls.
Custom data parts
Add custom data-* parts to the assistant’s response message via chat.response.write() (from run()) or the writer parameter in lifecycle hooks. Non-transient data-* chunks are automatically added to responseMessage.parts and surface in onTurnComplete for persistence:
Add transient: true to data chunks that should stream to the frontend but NOT persist in the response message. Use this for progress indicators, loading states, and other temporary UI:
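The persistence rule can be modelled in a few lines. This is a standalone sketch, not the SDK’s implementation: the `Chunk` type and the `accumulateParts` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical chunk shape, for illustration only (not the SDK's real type).
type Chunk = {
  type: string; // e.g. "data-progress", "data-citation", "text-delta"
  data?: unknown;
  transient?: boolean;
};

// Keeps only the chunks that this path would add to responseMessage.parts:
// data-* chunks that are not marked transient. Non-data chunks are filtered
// out here because they are captured elsewhere (by streamText).
function accumulateParts(chunks: Chunk[]): Chunk[] {
  return chunks.filter(
    (chunk) => chunk.type.startsWith("data-") && chunk.transient !== true
  );
}

const written: Chunk[] = [
  { type: "data-progress", data: { percent: 50 }, transient: true },
  { type: "data-citation", data: { url: "https://example.com" } },
];

// Only the citation survives; the progress chunk is streamed but not stored.
const persisted = accumulateParts(written);
```

Both chunks would still reach the frontend in this model; the filter only decides what is stored with the message.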
This matches the AI SDK’s semantics:
data-* chunks persist to message.parts by default. Only transient: true chunks are ephemeral. Non-data chunks (text-delta, tool-*, etc.) are handled by streamText and captured via onFinish, so they don’t need chat.response.
chat.response and the writer accumulation behavior work with chat.agent and chat.createSession. If you’re using chat.customAgent, you own the accumulator; see the raw-task example for the manual pattern.
Raw streaming with chat.stream
For low-level stream access (piping from subtasks, reading streams by run ID), use chat.stream. Chunks written via chat.stream go directly to the realtime output — they are NOT accumulated into the response message regardless of the transient flag.
chat.stream exposes the full stream API:
| Method | Description |
|---|---|
| chat.stream.writer(options) | Write individual chunks via a callback |
| chat.stream.pipe(stream, options?) | Pipe a ReadableStream or AsyncIterable |
| chat.stream.append(value, options?) | Append raw data |
| chat.stream.read(runId, options?) | Read the stream by run ID |
target: "root"), see the Sub-agents pattern.
Backed by a Session
Every chat.agent conversation is backed by a durable Session: externalId is your chatId, type is "chat.agent", and taskIdentifier is the agent’s task ID. The session is the run manager. It owns the chat’s runs, persists across run lifecycles, and orchestrates handoffs (idle continuation, chat.requestUpgrade). You rarely touch it directly, since chat.stream, chat.messages, and chat.stopSignal wrap everything, but payload.sessionId is there when you need to reach in, e.g. sessions.open(payload.sessionId) to write from a sub-agent or from outside the turn loop.
Tools
Declare your tools on the agent config, then read them back (typed) from the run() payload. Declaring them on the config, not just on streamText, is what lets the SDK re-apply each tool’s toModelOutput when it re-converts history on later turns.
toModelOutput across turns, per-turn dynamic tools, the typed run payload, and how config tools relate to skills.
Lifecycle hooks
chat.agent({ ... }) accepts hooks that fire in a fixed order around each turn, plus dedicated suspend/resume hooks. The full reference lives on its own page:
- Lifecycle hooks: onPreload, onChatStart, onValidateMessages, hydrateMessages, onTurnStart, onBeforeTurnComplete, onTurnComplete, onChatSuspend / onChatResume, exitAfterPreloadIdle, plus how ctx plumbs through every callback.
onValidateMessages → hydrateMessages → onChatStart (chat’s first message only) → onTurnStart → run() → onBeforeTurnComplete → onTurnComplete.
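The ordering above can be written down as a small model. This is an illustrative sketch only: `hooksForTurn` is a hypothetical helper, not an SDK function, and it assumes every optional hook (such as hydrateMessages) is configured.

```typescript
// Names mirror the hooks listed above; the runner itself is hypothetical.
type HookName =
  | "onValidateMessages"
  | "hydrateMessages"
  | "onChatStart"
  | "onTurnStart"
  | "run"
  | "onBeforeTurnComplete"
  | "onTurnComplete";

// Returns the hooks that fire for one turn, in order.
function hooksForTurn(isFirstMessage: boolean): HookName[] {
  const order: HookName[] = ["onValidateMessages", "hydrateMessages"];
  if (isFirstMessage) {
    order.push("onChatStart"); // fires on the chat's first message only
  }
  order.push("onTurnStart", "run", "onBeforeTurnComplete", "onTurnComplete");
  return order;
}

const firstTurn = hooksForTurn(true);
const laterTurn = hooksForTurn(false);
```

The only difference between the two sequences is onChatStart, so per-chat setup belongs there and per-turn setup belongs in onTurnStart.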
Using prompts
Use AI Prompts to manage your system prompt as versioned, overridable config. Store the resolved prompt in a lifecycle hook with chat.prompt.set(), then spread chat.toStreamTextOptions() into streamText, which brings in the system prompt, model, config, and telemetry automatically.
chat.toStreamTextOptions() returns an object with system, model (resolved via the registry), temperature, and experimental_telemetry — all from the stored prompt. Properties you set after the spread (like a client-selected model) take precedence.
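The precedence rule is ordinary object-spread behavior. The sketch below uses a plain object as a stand-in for the value chat.toStreamTextOptions() returns; the field values and the `clientModel` variable are made up for illustration, and in a real task the model would be a resolved model instance rather than a string.

```typescript
// Stand-in for the options derived from the stored prompt (illustrative values).
const storedPromptOptions = {
  system: "You are a helpful assistant.",
  model: "anthropic:claude-sonnet-4-5",
  temperature: 0.2,
};

// Hypothetical model choice sent by the client with this request.
const clientModel = "openai:gpt-4o";

const options = {
  ...storedPromptOptions,
  model: clientModel, // written after the spread, so it replaces the stored model
};
```

Placing the override before the spread would have the opposite effect: the stored prompt’s model would win.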
Which form to call:
| Form | Use when |
|---|---|
| chat.toStreamTextOptions() | Default. Wires up prepareStep (compaction, steering, background injection), the stored prompt’s system / model / config, and telemetry metadata. |
| chat.toStreamTextOptions({ registry }) | You’re using Prompts with a provider-prefixed model string (e.g. "anthropic:claude-sonnet-4-5"). The registry resolves the prefix to a real model instance via createProviderRegistry({ anthropic, openai, ... }). |
| chat.toStreamTextOptions({ tools }) | You want HITL tool approvals: pass the same tools object you give to streamText. The SDK then knows which tool calls need to pause on needsApproval: true. |
| chat.toStreamTextOptions({ registry, tools }) | Both of the above. |
Stop generation
How stop works
Calling stop() from useChat sends a stop signal to the running task via input streams. The task’s streamText call aborts (if you passed signal or stopSignal), but the run stays alive and waits for the next message. The partial response is captured and accumulated normally.
Abort signals
The run function receives three abort signals:
| Signal | Fires when | Use for |
|---|---|---|
| signal | Stop or cancel | Pass to streamText; it handles both cases. Use this in most cases. |
| stopSignal | Stop only (per-turn, reset each turn) | Custom logic that should only run on user stop, not cancellation |
| cancelSignal | Run cancel, expire, or maxDuration exceeded | Cleanup that should only happen on full cancellation |
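The relationship between the three signals can be modelled with plain AbortControllers. This is a standalone sketch, not SDK code: the `combine` helper is hypothetical, and in a real task the SDK hands you all three signals already wired up.

```typescript
// Returns a signal that aborts as soon as any of the inputs aborts.
function combine(...signals: AbortSignal[]): AbortSignal {
  const controller = new AbortController();
  for (const source of signals) {
    if (source.aborted) {
      controller.abort();
      break;
    }
    source.addEventListener("abort", () => controller.abort(), { once: true });
  }
  return controller.signal;
}

const stop = new AbortController(); // models stopSignal: user pressed stop
const cancel = new AbortController(); // models cancelSignal: run cancelled or expired
const signal = combine(stop.signal, cancel.signal); // models the combined signal

// A user stop aborts the combined signal but leaves the cancel signal untouched.
stop.abort();
```

Code that listens only to the cancel side therefore never runs on a user stop, which is the distinction the table draws.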
Detecting stop in callbacks
The onTurnComplete event includes a stopped boolean that indicates whether the user stopped generation during that turn:
chat.isStopped(). This is useful inside streamText’s onFinish callback where the AI SDK’s isAborted flag can be unreliable (e.g. when using createUIMessageStream + writer.merge()):
Cleaning up aborted messages
When stop happens mid-stream, the captured response message can contain parts in an incomplete state: tool calls stuck in partial-call, reasoning blocks still marked as streaming, etc. These can cause UI issues like permanent spinners.
chat.agent automatically cleans up the responseMessage when stop is detected before passing it to onTurnComplete. If you use chat.pipe() manually and capture response messages yourself, use chat.cleanupAbortedParts():
This removes tool parts stuck in partial-call state and marks any streaming text or reasoning parts as done.
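A rough model of that cleanup, for readers who want to see the shape of the transformation. The `Part` type and `cleanupParts` function are hypothetical and much simpler than real UIMessage parts, and the sketch assumes that stuck tool parts are dropped rather than repaired.

```typescript
// Hypothetical part shapes; the "result" tool state is an assumed placeholder
// for any completed tool state.
type Part =
  | { type: "text"; text: string; state: "streaming" | "done" }
  | { type: "reasoning"; text: string; state: "streaming" | "done" }
  | { type: "tool"; toolCallId: string; state: "partial-call" | "result" };

function cleanupParts(parts: Part[]): Part[] {
  const cleaned: Part[] = [];
  for (const part of parts) {
    if (part.type === "tool") {
      // Assumption: a tool call that never finished streaming its input is dropped.
      if (part.state !== "partial-call") cleaned.push(part);
    } else {
      // Text and reasoning keep their partial content but stop "streaming".
      cleaned.push({ ...part, state: "done" });
    }
  }
  return cleaned;
}

const aborted: Part[] = [
  { type: "reasoning", text: "Let me check", state: "streaming" },
  { type: "text", text: "The answer is", state: "streaming" },
  { type: "tool", toolCallId: "call_1", state: "partial-call" },
];

const cleaned = cleanupParts(aborted);
```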
Stop signal delivery is best-effort. There is a small race window where the model may finish before the stop signal arrives, in which case the turn completes normally with stopped: false. This is expected and does not require special handling.
Tool approvals
Tools with needsApproval: true pause execution until the user approves or denies via the frontend. Define the tool as normal and pass it to streamText; chat.agent handles the rest:
approval-requested state. After the user approves on the frontend, the updated message is sent back and chat.agent replaces it in the conversation accumulator by matching the message ID. streamText then executes the approved tool and continues.
See Tool approvals in the frontend docs for the UI setup.
Persistence
To build a chat app that survives page refreshes you persist two things, both server-side from inside the agent:
- Conversation state. Full UIMessage[] keyed by chatId. Written from onTurnStart (so the user message is durable before streaming begins) and onTurnComplete (so the assistant reply lands).
- Session state. The transport’s reconnect metadata: publicAccessToken and lastEventId. Written alongside the messages from the same hooks.
Sessions let the transport reconnect to an existing run after a page refresh. Without them, every page load would start a new run, losing the conversation context that was accumulated in the previous run.
lastEventId writes, why not to use chat.defer in onTurnStart), token renewal via the accessToken callback, and an end-to-end three-file example, see Database persistence.
Pending messages (steering)
Users can send messages while the agent is executing tool calls. With pendingMessages, these messages are injected between tool-call steps, steering the agent mid-execution:
usePendingMessages hook handles sending, tracking, and rendering injection points.
Background injection
Inject context from background work into the conversation using chat.inject(). Combine with chat.defer() to run analysis between turns and inject results before the next response: self-review, RAG augmentation, safety checks, etc.
Actions
Custom actions let the frontend send structured commands (undo, rollback, edit, regenerate) that modify the conversation state. Actions are not turns: they fire hydrateMessages (if set) and onAction only. The full surface (defining actionSchema, returning a model response from onAction, gating against pending HITL tool calls, and sending actions from the frontend) lives on its own page.
See Actions.
Chat history
Imperative API for reading and modifying the accumulated message history. Works from any hook (onAction, onTurnStart, onBeforeTurnComplete, onTurnComplete, hydrateMessages) or from run() and AI SDK tools.
The agent’s accumulator, not session.out, is the source of truth for the full conversation. The .out stream is a bounded sliding window (roughly one turn at steady state, see Records on session.out); the durable history lives in the agent’s accumulator and is persisted to S3 between turns for fast next-run boots. chat.history reads and mutates that accumulator directly.
| Method | Description |
|---|---|
| chat.history.all() | Returns a copy of the current accumulated UI messages. |
| chat.history.getChain() | Same as all(). Use whichever name reads better in context. |
| chat.history.findMessage(messageId) | Returns the message with that id, or undefined. |
| chat.history.getPendingToolCalls() | Tool calls on the most recent assistant message that are still in input-available state (waiting on addToolOutput). |
| chat.history.getResolvedToolCalls() | All tool calls in the chain in output-available or output-error state. |
| chat.history.extractNewToolResults(message) | Tool results in message whose toolCallId is not already resolved in the chain. Most useful in hydrateMessages against an incoming wire message, before the runtime merges it. |
{ toolCallId, toolName, messageId }. Each new-result entry is { toolCallId, toolName, output, errorText? }, where errorText is set only for output-error parts.
Mutations. Applied at lifecycle checkpoints (after hooks return). Multiple mutations in the same hook compose correctly.
| Method | Description |
|---|---|
| chat.history.set(messages) | Replace all messages. Same as chat.setMessages(). |
| chat.history.remove(messageId) | Remove a specific message by ID. |
| chat.history.rollbackTo(messageId) | Keep messages up to and including the given ID (undo). |
| chat.history.replace(messageId, message) | Replace a specific message by ID (edit). |
| chat.history.slice(start, end?) | Keep only messages in the given range. |
extractNewToolResults compares against the current chain. Inside onTurnComplete, the chain already contains the just-finished responseMessage, so it returns []. Use it where the message is from outside the accumulator: hydrateMessages (incoming wire), onAction if the action carries a message, or any custom pre-merge code path.
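To make the semantics of rollbackTo and extractNewToolResults concrete, here is a sketch that models them as pure functions over a simplified message array. The `Message` and `ToolPart` types are hypothetical, the real methods operate on the agent’s accumulator rather than on arguments, and leaving the chain unchanged when the ID is not found is an assumption.

```typescript
// Simplified, hypothetical shapes; real UIMessages carry more fields.
type ToolPart = {
  toolCallId: string;
  toolName: string;
  state: "input-available" | "output-available" | "output-error";
  output?: unknown;
};
type Message = { id: string; role: "user" | "assistant"; toolParts: ToolPart[] };

// Keep messages up to and including messageId (the undo case).
function rollbackTo(chain: Message[], messageId: string): Message[] {
  const index = chain.findIndex((m) => m.id === messageId);
  return index === -1 ? chain : chain.slice(0, index + 1);
}

function isResolved(part: ToolPart): boolean {
  return part.state === "output-available" || part.state === "output-error";
}

// Tool results in `incoming` whose toolCallId is not already resolved in the chain.
function extractNewToolResults(chain: Message[], incoming: Message): ToolPart[] {
  const resolvedIds = new Set<string>();
  for (const message of chain) {
    for (const part of message.toolParts) {
      if (isResolved(part)) resolvedIds.add(part.toolCallId);
    }
  }
  return incoming.toolParts.filter(
    (part) => isResolved(part) && !resolvedIds.has(part.toolCallId)
  );
}

const pending: Message = {
  id: "a1",
  role: "assistant",
  toolParts: [{ toolCallId: "t1", toolName: "search", state: "input-available" }],
};
const answered: Message = {
  id: "a1",
  role: "assistant",
  toolParts: [{ toolCallId: "t1", toolName: "search", state: "output-available", output: "ok" }],
};
const chain: Message[] = [{ id: "u1", role: "user", toolParts: [] }, pending];

// Before the merge the result is new; once the chain holds it, nothing is new.
const beforeMerge = extractNewToolResults(chain, answered);
const afterMerge = extractNewToolResults([chain[0]!, answered], answered);
```

The second call mirrors the onTurnComplete situation described above: the chain already contains the resolved part, so the result is empty.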
prepareMessages
Transform model messages before they’re used anywhere: in run(), in compaction rebuilds, and in compaction results. Define once, applied everywhere.
Use this for Anthropic cache breaks, injecting system context, stripping PII, etc.
The reason field tells you why messages are being prepared:
| Reason | Description |
|---|---|
| "run" | Messages being passed to run() for streamText |
| "compaction-rebuild" | Rebuilding from a previous compaction summary |
| "compaction-result" | Fresh compaction just produced these messages |
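A sketch of a transform that branches on the reason. The function signature and the `ModelMessage` shape here are simplifying assumptions (real model messages can hold structured content), and the redaction regex is deliberately crude.

```typescript
type Reason = "run" | "compaction-rebuild" | "compaction-result";
// Hypothetical, text-only message shape for illustration.
type ModelMessage = { role: "system" | "user" | "assistant"; content: string };

// Redacts email addresses for every reason, and prepends extra system context
// only when the messages are headed to run().
function prepareMessages(messages: ModelMessage[], reason: Reason): ModelMessage[] {
  const redacted: ModelMessage[] = messages.map((message) => ({
    ...message,
    content: message.content.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[email]"),
  }));
  if (reason !== "run") {
    return redacted;
  }
  return [{ role: "system", content: "Answer concisely." }, ...redacted];
}

const input: ModelMessage[] = [
  { role: "user", content: "mail me at jane@example.com please" },
];

const forRun = prepareMessages(input, "run");
const forCompaction = prepareMessages(input, "compaction-result");
```

Keeping the PII stripping outside the reason check is the point of the single hook: compaction summaries never see the raw address either.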
Version upgrades
Chat agent runs are pinned to the worker version they started on. When you deploy a new version, suspended runs resume on the old code. Call chat.requestUpgrade() in onTurnStart to skip run() and exit immediately; the transport re-triggers the same message on the latest version. See the Version Upgrades pattern for the full guide.
Ending a run on your terms
By default, a chat agent stays idle after each turn waiting for the next user message. Call chat.endRun() from run(), chat.defer(), onBeforeTurnComplete, or onTurnComplete to exit the loop once the current turn finishes, with no upgrade signal and no idle wait.
onBeforeTurnComplete / onTurnComplete fire, the turn-complete chunk is written, and the run exits instead of suspending. The next user message on the same chatId starts a fresh run via the standard continuation flow.
Use this when the agent knows its work is done (budget exhausted, goal achieved, one-shot response) rather than relying on the idle timeout. Unlike chat.requestUpgrade(), no upgrade-required signal is sent to the client, so there’s no version-migration semantics.
Runtime configuration
chat.setTurnTimeout()
Override how long the run stays suspended waiting for the next message. Call from inside run():
chat.setIdleTimeoutInSeconds()
Override how long the run stays idle (active, using compute) after each turn:
Longer idle timeout means faster responses but more compute usage. Set to 0 to suspend immediately after each turn (minimum latency cost, slight delay on next message).
Stream options
Control how streamText results are converted to the frontend stream via toUIMessageStream(). Set static defaults on the task, or override per-turn.
Error handling with onError
When streamText encounters an error mid-stream (rate limits, API failures, network errors), the onError callback converts it to a string that’s sent to the frontend as a { type: "error", errorText } chunk. The AI SDK’s useChat receives this via its onError callback.
By default, the raw error message is sent to the frontend. Use onError to sanitize errors and avoid leaking internal details:
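One possible shape for such a sanitizer, written as a standalone function that could be passed as the onError callback. The function name, the matching rule, and the user-facing strings are illustrative assumptions.

```typescript
// Maps any thrown value to a message that is safe to show to end users.
function sanitizeError(error: unknown): string {
  const message = error instanceof Error ? error.message : String(error);
  if (/rate limit/i.test(message)) {
    return "The model is busy right now. Please try again in a moment.";
  }
  // Never forward the raw message: it may contain keys, URLs, or stack details.
  return "Something went wrong while generating a response.";
}

const rateLimited = sanitizeError(new Error("429: Rate limit exceeded for key sk-abc"));
const unknownFailure = sanitizeError("ECONNRESET at 10.0.0.12:443");
```

Logging the original error server-side before returning the sanitized string keeps the detail available for debugging.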
onError is also called for tool execution errors, so a single handler covers both LLM errors and tool failures.
On the frontend, handle the error in useChat:
Reasoning and sources
Control which AI SDK features are forwarded to the frontend:
Custom message IDs
By default, response message IDs are generated using the AI SDK’s built-in generateId. Pass a custom generateMessageId function to use your own ID format (e.g. UUID-v7):
.withUIMessage() builder, set it under streamOptions:
The generated ID is sent to the frontend in the stream’s start chunk, so frontend and backend always reference the same ID for each message. This is important for features like tool approvals, where the frontend resends an assistant message and the backend needs to match it by ID in the conversation accumulator.
Per-turn overrides
Override per-turn with chat.setUIMessageStreamOptions(). Per-turn values merge with the static config (per-turn wins on conflicts). The override is cleared automatically after each turn.
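The merge-then-clear behavior can be pictured with a small state holder. This models the described semantics only: the `StreamOptionsState` class is hypothetical, and just two option fields are included for illustration.

```typescript
// Two illustrative fields; the real options object has more.
type StreamOptions = { sendReasoning?: boolean; sendSources?: boolean };

class StreamOptionsState {
  private staticConfig: StreamOptions;
  private perTurn: StreamOptions = {};

  constructor(staticConfig: StreamOptions) {
    this.staticConfig = staticConfig;
  }

  // Models a per-turn override call.
  setPerTurn(options: StreamOptions): void {
    this.perTurn = { ...this.perTurn, ...options };
  }

  // Per-turn values are spread last, so they win on conflicts.
  resolve(): StreamOptions {
    return { ...this.staticConfig, ...this.perTurn };
  }

  // Models the end of a turn: the override does not carry over.
  endTurn(): void {
    this.perTurn = {};
  }
}

const state = new StreamOptionsState({ sendReasoning: false, sendSources: true });
state.setPerTurn({ sendReasoning: true });
const duringTurn = state.resolve();
state.endTurn();
const nextTurn = state.resolve();
```

In this model the static sendSources value stays in effect during the overridden turn, and sendReasoning falls back to the static value on the next one.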
chat.setUIMessageStreamOptions() works across all abstraction levels — chat.agent(), chat.createSession() / turn.complete(), and chat.pipeAndCapture().
See ChatUIMessageStreamOptions for the full reference.
onFinish is managed internally for response capture and cannot be overridden here. Use streamText’s onFinish callback for custom finish handling, or use raw task mode for full control over toUIMessageStream().
Manual mode with task()
If you need full control over task options, use the standard task() with ChatTaskPayload and chat.pipe():
Custom agents
Both lower levels, chat.createSession() (managed turn iterator, your turn body) and chat.customAgent() with raw primitives (hand-rolled loop, full stream-conversion control), are covered together on the Custom agents page, including the ChatTurn surface, the continuation-seeding pattern, and the hand-rolled-loop checklist:
Custom agents
Build agents without the managed lifecycle — createSession or raw primitives.

