Resonate HQ · 5 min read

Durable multi-turn chatbot with checkpointed LLM calls in TypeScript

How a multi-turn chatbot stops re-prompting users and re-charging tokens when every LLM call is a Resonate durable checkpoint.


A multi-turn chatbot has to call an LLM once per turn, keep the full conversation history available when an API call fails mid-flight, and avoid both re-prompting the user and paying twice for the same completion when the worker retries. On Resonate, each turn is invoked as a workflow keyed by session-{ts}/turn-{n}, and the single Claude API call inside that workflow is wrapped in ctx.run. Three things follow: the SDK retries a transient failure against the same workflow ID, the conversation history is already pinned as the workflow's input, and a successful response is persisted in the promise store and never re-issued on replay. This example (example-durable-chatbot-ts, @resonatehq/sdk ^0.10.0) implements the full pattern in ~130 lines of TypeScript. Its processTurn generator has a 10-line body with no retry logic, no try/catch blocks, and no external state store.

The shape of the solution

export function* processTurn(
  ctx: Context,
  history: ChatMessage[],
  turnKey: string,
  isCrashTurn: boolean,
): Generator<any, string, any> {
  // This is the only line that matters for durability.
  // ctx.run creates a durable promise for this LLM call.
  // Success: result cached, won't call LLM again on replay.
  // Failure: Resonate retries with exponential backoff — automatically.
  const response = yield* ctx.run(callClaude, history, turnKey, isCrashTurn);
  return response;
}
// from example-durable-chatbot-ts/src/workflow.ts:22

The workflow is invoked from src/index.ts:51 as resonate.run(turnKey, processTurn, [...history], turnKey, false) where turnKey = `${sessionId}/turn-${turn}` and sessionId = `session-${Date.now()}` (src/index.ts:18, 49). The same construction is used in crash mode at src/index.ts:88, 93-99. The string turnKey is the workflow ID — it is the identifier Resonate uses to deduplicate concurrent invocations of the same turn and to find the cached result on replay.
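To make the ID scheme concrete, here is a minimal sketch of how a per-turn key could be built and why a stable key gives idempotency. The helpers makeTurnKey and runOnce are hypothetical, and the in-memory Map is only a stand-in for Resonate's promise store, not the SDK's implementation.

```typescript
// Hypothetical helper: builds the per-turn workflow ID in the shape the post describes.
function makeTurnKey(sessionId: string, turn: number): string {
  return `${sessionId}/turn-${turn}`;
}

// Toy stand-in for the promise store: one settled result per workflow ID.
const results = new Map<string, string>();

// Hypothetical helper: runs `work` only if no result is stored under `id`.
function runOnce(id: string, work: () => string): string {
  const cached = results.get(id);
  if (cached !== undefined) return cached; // same ID: serve the stored result
  const value = work();
  results.set(id, value);
  return value;
}

// Fake LLM that counts how often it is actually invoked.
let llmCalls = 0;
const fakeLlm = (): string => {
  llmCalls += 1;
  return `reply #${llmCalls}`;
};

const sessionId = "session-1700000000000"; // the example uses `session-${Date.now()}`
const turnKey = makeTurnKey(sessionId, 1);

const first = runOnce(turnKey, fakeLlm);
const second = runOnce(turnKey, fakeLlm); // duplicate invocation of the same turn
const nextTurn = runOnce(makeTurnKey(sessionId, 2), fakeLlm); // new ID, new call
```

In this sketch the second call under the same turnKey returns the stored reply without invoking fakeLlm again, while the next turn's key triggers a fresh call.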

The durable primitives in play

  • new Resonate() — embedded-mode SDK handle. No external server, no transport configuration. src/index.ts:10.
  • resonate.register(processTurn) — registers the generator as an addressable workflow function. src/index.ts:11.
  • resonate.run(id, fn, ...args) — starts (or attaches to) the workflow keyed by id. Returns the workflow's final value. The ID ${sessionId}/turn-${turn} is the idempotency key for the turn; calling it twice with the same ID returns the cached result, not a second invocation. src/index.ts:51, 93-99.
  • ctx.run(callClaude, history, turnKey, isCrashTurn) — the durable child step. On success, the returned string is persisted against the workflow's promise store; on replay, it returns immediately from cache without re-calling Claude. On thrown error, the SDK schedules a retry with exponential backoff and re-invokes callClaude against the same step. src/workflow.ts:32.
  • Generator yield* delegation — the mechanism by which ctx.run participates in the workflow's execution position. The generator only advances past yield* once that step's promise is resolved. src/workflow.ts:32.
  • resonate.stop() — graceful shutdown after the REPL or crash demo finishes. src/index.ts:29.

There is no ctx.sleep, no ctx.detached, no scheduled invocation, no signal, no RFI. The example deliberately uses the smallest possible primitive set: register a generator, run one workflow per turn under a stable ID, wrap the single LLM call in ctx.run.
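The checkpoint-and-replay behaviour of yield* ctx.run can be illustrated with a toy model. This is a sketch under loud assumptions: ToyContext, runDurably, and the checkpoints Map are hypothetical, everything is synchronous and in-memory, and none of it reflects how the Resonate SDK is actually implemented. It only shows why a generator that delegates each step can resume from recorded values instead of re-running the step.

```typescript
// Toy model only: ToyContext, runDurably, and the checkpoint Map are hypothetical
// and synchronous; the real SDK persists promises and handles async functions.
type Step = { call: () => unknown };

class ToyContext {
  // Mirrors the `yield* ctx.run(fn, ...args)` call shape used in the post.
  *run<A extends unknown[], R>(fn: (...args: A) => R, ...args: A): Generator<Step, R, unknown> {
    const settled = yield { call: () => fn(...args) };
    return settled as R;
  }
}

const checkpoints = new Map<string, unknown>(); // key: `${workflowId}#${stepIndex}`

function runDurably<R>(
  workflowId: string,
  workflow: (ctx: ToyContext) => Generator<Step, R, unknown>,
): R {
  const gen = workflow(new ToyContext());
  let stepIndex = 0;
  let next = gen.next();
  while (!next.done) {
    const key = `${workflowId}#${stepIndex}`;
    stepIndex += 1;
    if (!checkpoints.has(key)) {
      // First execution of this step: run it and record the result.
      checkpoints.set(key, (next.value as Step).call());
    }
    // Resume the generator with the recorded value (fresh or replayed).
    next = gen.next(checkpoints.get(key));
  }
  return next.value as R;
}

let modelCalls = 0;
function fakeModel(prompt: string): string {
  modelCalls += 1;
  return `echo: ${prompt}`;
}

function* turn(ctx: ToyContext): Generator<Step, string, unknown> {
  const reply = yield* ctx.run(fakeModel, "hello");
  return reply;
}

const firstRun = runDurably("session-1/turn-1", turn);
const replayed = runDurably("session-1/turn-1", turn); // served from the checkpoint
```

Running the same workflow ID twice yields the same reply while fakeModel is invoked only once, which is the property the post attributes to ctx.run on replay.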

What the SDK handles vs. what you write

| You write | The SDK handles |
| --- | --- |
| The generator processTurn and the single yield* ctx.run(callClaude, ...) inside it (src/workflow.ts:22-34) | Recording the LLM response against the workflow ID's promise store and returning it on any subsequent replay without re-invoking callClaude |
| callClaude as a plain async function, with no retry loop and no backoff (src/llm.ts:18-48) | Catching thrown errors out of ctx.run, persisting the failure, scheduling a retry with backoff, and re-invoking callClaude |
| A stable workflow ID per turn (${sessionId}/turn-${turn} at src/index.ts:49, 88) | Deduplicating concurrent invocations of the same ID and serving the cached result on replay |
| The outer REPL loop that maintains history: ChatMessage[] and re-passes a snapshot on each turn (src/index.ts:43-56) | Pinning that snapshot as workflow input so any retry of the in-flight turn sees the exact same history without the user re-sending |
| Nothing in processTurn or callClaude for the retry case; neither file contains try or catch | The runtime line "Runtime. Function 'callClaude' failed with '...' (retrying in N secs)" shown in the README's crash demo (README.md:89), emitted by the SDK, not by application code |

The workflow body is one line of behaviour: yield* ctx.run(callClaude, ...). Every retry, every cache, every persistence concern is absorbed by that single primitive plus the workflow ID.
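For contrast, this is roughly the kind of hand-written loop that the single ctx.run line stands in for. It is an illustrative sketch only: retryWithBackoff, its options, and the delay values are assumptions made for this example and are not Resonate's actual retry policy. The sleep function is injected so the sketch runs synchronously; a real loop would await a timer.

```typescript
// Illustrative only: the retry limits and delays are assumptions, not the SDK's policy.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  factor: number;
}

// Hypothetical helper: calls `fn`, retrying on thrown errors with exponential backoff.
// `sleep` is injected so the caller decides how to wait.
function retryWithBackoff<T>(
  fn: (attempt: number) => T,
  opts: RetryOptions,
  sleep: (ms: number) => void,
): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts) {
        // Delay grows as base * factor^(attempt - 1).
        sleep(opts.baseDelayMs * opts.factor ** (attempt - 1));
      }
    }
  }
  throw lastError; // every attempt failed
}

// Usage: a call that fails twice, then succeeds on the third attempt.
const waits: number[] = [];
const outcome = retryWithBackoff(
  (attempt) => {
    if (attempt < 3) throw new Error("transient failure");
    return "completion";
  },
  { maxAttempts: 5, baseDelayMs: 1000, factor: 2 },
  (ms) => waits.push(ms),
);
```

Even this small loop has to track attempts, compute delays, and decide when to give up, and it still loses its progress if the process dies. That is the bookkeeping the post says moves out of application code.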

Failure modes covered

  • The Claude API call fails transiently mid-turn. In crash mode, callClaude throws Error: LLM API connection timeout (simulated) on its first attempt for the designated crash turn (src/llm.ts:30-33). There is no try/catch in processTurn and no retry policy declared anywhere in the example. The SDK catches the thrown error from ctx.run, schedules a retry, and re-invokes callClaude against the same step. On the second attempt, attempt === 2 so the simulated-failure branch is skipped and the real Anthropic call goes through (src/llm.ts:35-40). The user never re-sends the message; the prior turns never re-run.
  • The worker crashes between turns. Each turn is its own top-level resonate.run invocation keyed by ${sessionId}/turn-${turn} (src/index.ts:51, 93-99). A crash after turn N has completed but before turn N+1 starts loses no work: turn N's response is persisted under its workflow ID, and turn N+1 begins fresh when the next resonate.run call is made.
  • The same turn is invoked twice with the same ID. Two resonate.run calls with the same turnKey resolve to the same workflow and receive the same cached result. The LLM is called once, not twice. This is workflow idempotency at the ID level — the README states it directly: "Idempotent turns: each turn has a stable promise ID — calling it twice returns the cached result, not two LLM calls" (README.md:21).
  • A successful turn is replayed. The ctx.run(callClaude, ...) call returns the persisted response from the promise store without re-issuing the Claude API request. Tokens are paid for once per (sessionId, turn), not once per retry.
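The crash-mode behaviour in the first bullet can be sketched with a stand-in for callClaude. The post does not show src/llm.ts, so the attempt-counter Map, fakeCallClaude, runStepWithRetry, and the ChatMessage shape below are all assumptions made for illustration. No network call is made, and the retry helper has no backoff.

```typescript
// Assumed message shape; the post does not show the ChatMessage definition.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Per-turn attempt counter (an assumption about how src/llm.ts tracks attempts).
const attempts = new Map<string, number>();

// Stand-in for callClaude: no network call, just the failure pattern from the post.
function fakeCallClaude(history: ChatMessage[], turnKey: string, isCrashTurn: boolean): string {
  const attempt = (attempts.get(turnKey) ?? 0) + 1;
  attempts.set(turnKey, attempt);
  if (isCrashTurn && attempt === 1) {
    throw new Error("LLM API connection timeout (simulated)");
  }
  const last = history[history.length - 1];
  return `reply to "${last?.content ?? ""}" (attempt ${attempt})`;
}

// Hypothetical stand-in for the SDK re-invoking a failed step: same function, same arguments.
function runStepWithRetry<T>(step: () => T, maxAttempts = 3): T {
  for (let i = 1; ; i++) {
    try {
      return step();
    } catch (err) {
      if (i >= maxAttempts) throw err;
    }
  }
}

// The history is passed in once; the retry reuses it without the user re-sending.
const pinnedHistory: ChatMessage[] = [{ role: "user", content: "What is a durable promise?" }];
const crashReply = runStepWithRetry(() =>
  fakeCallClaude(pinnedHistory, "session-1/turn-3", true),
);
```

In this sketch the first attempt throws, the second succeeds with the same pinned history, and the application-level function contains no try/catch of its own.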

What this example deliberately does NOT cover: concurrent mutation of the same session by multiple callers under a lock. The README points to the distributed-mutex pattern for that case (README.md:156). Conversation state in this example lives in a plain JavaScript variable inside the outer loop; it is passed into every resonate.run call as the workflow's input, so it survives the workflow's own retries but is not itself a Resonate-managed entity.
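The way history lives in the outer loop and is handed to each turn as a snapshot can be sketched as follows. Here runTurn is a hypothetical stand-in for the resonate.run(turnKey, processTurn, [...history], turnKey, false) call quoted earlier, the ChatMessage shape is assumed, and starting turn numbering at 1 is also an assumption.

```typescript
// Assumed message shape; the post does not show the ChatMessage definition.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical stand-in for resonate.run(turnKey, processTurn, snapshot, turnKey, false).
// It records the input it was given so the snapshot behaviour can be inspected.
const recordedInputs = new Map<string, ChatMessage[]>();
function runTurn(turnKey: string, snapshot: ChatMessage[]): string {
  recordedInputs.set(turnKey, snapshot);
  return `answer ${snapshot.length}`;
}

function chat(sessionId: string, userInputs: string[]): ChatMessage[] {
  const history: ChatMessage[] = []; // plain variable owned by the outer loop
  userInputs.forEach((content, index) => {
    const turnKey = `${sessionId}/turn-${index + 1}`; // numbering from 1 is an assumption
    history.push({ role: "user", content });
    // [...history] copies the array, so later pushes do not alter this turn's input.
    const reply = runTurn(turnKey, [...history]);
    history.push({ role: "assistant", content: reply });
  });
  return history;
}

const transcript = chat("session-1", ["hi", "tell me more"]);
```

Because each turn receives a copy, the input recorded for turn 1 still holds one message after turn 2 has appended to history. The history variable itself is ordinary process memory, which matches the post's note that it is not a Resonate-managed entity.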

When to reach for this pattern

  • If you have a multi-turn LLM conversation where mid-turn failures would otherwise force the user to re-send their message or your worker to re-pay for a completion already issued.
  • If you want the entire retry-and-checkpoint machinery for an LLM call expressed as one line of workflow code (yield* ctx.run(callClaude, ...)) rather than a try/catch/backoff loop in application code.
  • If conversation history is naturally a value you can pass as workflow input on each turn — one session, long-lived, linear turn progression — and you don't need a separate state store to track which turn you're on.
  • If you want each turn to be an idempotency boundary (same turnKey returns the cached completion, not a duplicate call), without writing the dedupe yourself.
  • If you don't need concurrent-mutation exclusivity across overlapping callers on the same session key — that is a different pattern (distributed mutex) and this example is explicit about deferring it (README.md:156).

Sources