May 25, 2026 · 5 min read · Resonate HQ

Multi-agent pipeline with durable handoffs in TypeScript on Resonate

How three sequential LLM calls collapse to three lines of generator code when every yield is a Resonate checkpoint.

Tags: typescript, agent, human-in-the-loop, for-agents

A pipeline of three LLM agents (researcher, writer, reviewer) must survive the failure of any single agent without re-running the agents before it, because each call is slow, costly, and non-deterministic. The Resonate approach is to write each agent as a plain async function and orchestrate them from a generator workflow in which every yield* ctx.run(...) is a durable checkpoint: an agent that throws is retried in place, while the results of the agents that already completed stay cached in the promise store. The example runs the pipeline on the happy path and with a forced first-attempt failure in the writer, and includes a commented-out swap to a real human-in-the-loop step via ctx.promise.

The shape of the solution

// OrchestrationResult defined at src/workflow.ts:21-27
export function* orchestrate(
  ctx: Context,
  topic: string,
  crashOnWriter: boolean,
): Generator<any, OrchestrationResult, any> {
  // Step 1: Research — gather findings
  const findings = yield* ctx.run(researcher, topic);
 
  // Step 2: Write — produce a draft from findings
  // If crashOnWriter=true, the writer fails on first attempt and retries.
  // The researcher does NOT re-run on retry — its result is cached.
  const draft = yield* ctx.run(writer, topic, findings, crashOnWriter);
 
  // Step 3: Review — check the draft quality
  const review = yield* ctx.run(reviewer, draft);
 
  // Step 4: Human approval (simulated in demo)
  // In production:
  //   const approval = yield* ctx.promise({});
  //   // surface approval.id externally (email, dashboard) so a human can resolve it
  //   const decision = yield* approval;
  // This blocks until an external system resolves the promise.
  const approved = review.toUpperCase().includes("APPROVED");
 
  return {
    status: approved ? "published" : "rejected",
    topic,
    findings,
    draft,
    review,
  };
}
// from example-multi-agent-orchestration-ts/src/workflow.ts:29-60

The orchestrator is a generator function (function*), not async. Each yield* ctx.run(agent, ...) invokes the child agent under a durable promise and suspends the orchestrator until the result is checkpointed in the promise store.
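To make the checkpoint idea concrete, here is a minimal toy model of replay. It is not the Resonate SDK: the names Step, step, drive, and pipeline are hypothetical, and an in-memory Map stands in for the promise store. It models only the restart-and-replay path, where a failed run is driven again from the top and any step whose result is already stored is not executed a second time.

```typescript
// Toy model of checkpoint-and-replay. NOT the Resonate SDK:
// Step, step, drive, and pipeline are hypothetical names for illustration.
type Step = { key: string; fn: () => string };

// Build a step descriptor; the key plays the role of a durable promise id.
const step = (key: string, fn: () => string): Step => ({ key, fn });

// Drive a generator to completion, storing each step result in `store`.
// A step whose key is already in the store is answered from the store
// without calling its function again.
function drive<R>(gen: Generator<Step, R, string>, store: Map<string, string>): R {
  let next = gen.next();
  while (!next.done) {
    const { key, fn } = next.value;
    if (!store.has(key)) store.set(key, fn()); // checkpoint on first success
    next = gen.next(store.get(key)!);
  }
  return next.value as R;
}

let researcherCalls = 0;
let writerAttempts = 0;

function* pipeline(topic: string): Generator<Step, string, string> {
  const findings = yield step("research", () => {
    researcherCalls++;
    return `findings on ${topic}`;
  });
  const draft = yield step("write", () => {
    writerAttempts++;
    if (writerAttempts === 1) throw new Error("writer failed (simulated)");
    return `draft from ${findings}`;
  });
  return draft;
}

const store = new Map<string, string>();
let result = "";
try {
  // First run: research is checkpointed, then the writer throws.
  result = drive(pipeline("durable execution"), store);
} catch {
  // Replay: research is served from the store, only the writer runs again.
  result = drive(pipeline("durable execution"), store);
}
console.log(result, { researcherCalls, writerAttempts });
```

In this sketch the research step runs once even though the pipeline is driven twice, which is the property the orchestrator relies on. A real durable runtime persists the store outside the process; the Map here disappears on exit.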

The durable primitives in play

new Resonate() — constructs the Resonate client embedded in the worker process; no external server required for the default run. src/index.ts:8.
resonate.register(orchestrate) — registers the top-level workflow so the worker can claim and execute it under a caller-supplied id. src/index.ts:9.
resonate.run(runId, orchestrate, topic, crashMode) — invokes the registered workflow with a caller-supplied id; the runId is defined at src/index.ts:27 (orchestration-${Date.now()}) and the call itself is at src/index.ts:29. Resolves to the workflow's return value.
yield* ctx.run(fn, ...args) — runs an async function as a durable child step. The return value is persisted at the call site; on replay the SDK returns the cached value rather than re-invoking fn. Used for all three agent calls. src/workflow.ts:35, :40, :43.
yield* ctx.promise({}) — referenced in the commented production block (src/workflow.ts:47-49, README:107-140) as the human-in-the-loop primitive. Returns a durable promise with an auto-generated id; yielding the promise blocks the workflow until something external resolves it.
resonate.stop() — shuts down the underlying network transport and message source after the run completes; the client should not be used for further operations. src/index.ts:43; SDK doc at resonate.d.ts:154-156.

ctx.sleep, ctx.detached, ctx.beginRun, and resonate.schedule are not used in this example.
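The ctx.promise semantics described above (a promise with a generated id that stays pending until something external resolves it) can be pictured with a small in-memory stand-in. This is an illustrative sketch only: PromiseStore, create, and resolve are hypothetical names, the id format is made up, and nothing here is durable or part of the Resonate API.

```typescript
// Toy in-memory stand-in for a promise resolved from outside the workflow.
// NOT the Resonate API: PromiseStore, create, and resolve are hypothetical.
class PromiseStore {
  private nextId = 0;
  private pending = new Map<string, (value: string) => void>();

  // Create a pending promise and return its id plus a handle to await.
  create(): { id: string; value: Promise<string> } {
    const id = `approval-${this.nextId++}`;
    const value = new Promise<string>((resolve) => this.pending.set(id, resolve));
    return { id, value };
  }

  // Called by the outside world (a dashboard button, a webhook handler).
  // Returns false if the id is unknown or was already resolved.
  resolve(id: string, decision: string): boolean {
    const settle = this.pending.get(id);
    if (!settle) return false;
    this.pending.delete(id);
    settle(decision);
    return true;
  }
}

const approvals = new PromiseStore();
const approval = approvals.create();

// The "workflow" side: nothing after this point runs until a decision arrives.
const outcome = approval.value.then((decision) =>
  decision === "approve" ? "published" : "rejected",
);

// The "external" side: the id is what would be surfaced by email or dashboard.
const firstResolve = approvals.resolve(approval.id, "approve");
const secondResolve = approvals.resolve(approval.id, "reject"); // ignored
outcome.then((status) => console.log(status));
```

The point of the sketch is the split between the side that waits on the handle and the side that only needs the id. In the example's production swap, that id is what gets surfaced externally so a human can resolve it.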

What the SDK handles vs. what you write

SDK handles	You write
Checkpointing the return value of each `yield* ctx.run(agent, ...)` in the durable promise store	The three `yield* ctx.run(...)` lines and the three agent functions (`researcher`, `writer`, `reviewer`)
Suspending the generator after each `yield*` and resuming when the child promise resolves	The straight-line orchestrator body (`findings`, `draft`, `review`, `approved`)
Replaying the orchestrator after a crash using cached step values rather than re-running completed agents	No replay code — the orchestrator is written as if it never crashes
Retrying a child function passed to `ctx.run` when it throws, under the SDK's default `Exponential` retry policy, and logging `Runtime. Function 'writer' failed with ... (retrying in 2 secs)`. Only `orchestrate` is registered (`src/index.ts:9`); the agents are imported in `src/workflow.ts:2` and passed by reference to `ctx.run(...)`	The failure itself (`throw new Error("Writer agent connection reset (simulated)")` at `src/agents.ts:58`), with no retry decorator and no try/catch
Holding the workflow on a durable promise via `ctx.promise({})` until it is resolved externally	The promise reference (`const approval = yield* ctx.promise({})`) and the resolver call (`POST /promises/<id>/resolve` from outside)

The orchestrator body is three assignments, one boolean, and a return. The retry behaviour, the cached intermediate results, the resume-after-crash semantics, and the blocking on external approval all live in the SDK and (for ctx.promise resolution) the Resonate server, not in the code the author writes. README:103 frames it as "The entire orchestrator is 15 lines."
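For contrast, this is roughly the retry scaffolding an author would otherwise write around each agent call by hand. It is a sketch under stated assumptions: withRetry and its parameters are hypothetical, and the delay values are illustrative rather than the SDK's actual Exponential policy settings.

```typescript
// Hand-rolled retry scaffolding of the kind the orchestrator does not contain.
// withRetry is a hypothetical helper; the delays are illustrative and are
// not the SDK's actual retry policy values.
async function withRetry<T>(
  fn: (attempt: number) => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 10,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break; // out of attempts, give up
      const delay = baseDelayMs * 2 ** (attempt - 1); // 10ms, 20ms, 40ms, ...
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw lastError;
}

// A stand-in writer that fails on its first attempt, like the example's
// forced failure, and succeeds on the second.
const attemptsSeen: number[] = [];
const draftPromise = withRetry(async (attempt) => {
  attemptsSeen.push(attempt);
  if (attempt === 1) throw new Error("Writer agent connection reset (simulated)");
  return "draft";
});
draftPromise.then((draft) => console.log(draft, attemptsSeen));
```

Each retry re-invokes the whole function, so any LLM call inside it is made again. A wrapper like this also keeps nothing across a process restart, which is the gap the durable checkpoints fill.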

Failure modes covered

Writer throws on its first attempt. src/agents.ts:56-59 throws Error("Writer agent connection reset (simulated)") when crashOnFirst is set and attempt === 1. The SDK retries the writer function that was passed by reference to ctx.run (only orchestrate is registered at src/index.ts:9; writer is imported in src/workflow.ts:2). The README crash-mode transcript (README:70-85) shows [researcher] Complete (312 chars) printed once, then [writer] Writing article (attempt 1), then the Resonate retry log line Runtime. Function 'writer' failed with 'Error: Writer agent connection reset (simulated)' (retrying in 2 secs), then [writer] Writing article (attempt 2) — the researcher does not re-run because its return value is already checkpointed at src/workflow.ts:35.
Worker crashes between agents. The orchestrator is invoked with a caller-supplied id (src/index.ts:27-29). On restart, the SDK replays the orchestrator under that id and finds the prior ctx.run(...) results in the promise store; only the unfinished step re-enters the agent function. README:98 states this explicitly: "add process.exit(1) after any agent call and restart — resumes from there".
The orchestration is invoked twice with the same id. This is a property of the SDK, not one this example exercises: when resonate.run(runId, ...) is called with a previously-used runId, it resolves against the existing durable promise rather than starting a parallel pipeline. The example regenerates the id per process with orchestration-${Date.now()} at src/index.ts:27, so the deduplication is not actually demonstrated here — a stable caller-side id would make it observable.
Human approval never arrives (production extension). Swapping in yield* ctx.promise({}) per src/workflow.ts:47-49 and README:125-130 makes the workflow block indefinitely on the durable promise. The worker can restart, the server can restart — the promise stays in the store until something external POSTs to http://localhost:8001/promises/<approvalPromise.id>/resolve (README:135-138). Port 8001 is the Resonate server's default HTTP port; the URL only works once a Resonate server is running, which the example does not start. The default bun start does not enable this path; it is documented as the swap-in for the server-backed deployment.
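The same-id behaviour described in the list above can be pictured with a toy registry keyed by run id. This is an assumption-laden sketch, not the SDK: RunRegistry, run, and the id orchestration-article-42 are hypothetical, and the registry lives in memory rather than in a promise store.

```typescript
// Toy model of id-based deduplication. NOT the Resonate SDK:
// RunRegistry and run are hypothetical names for illustration.
class RunRegistry {
  private runs = new Map<string, Promise<string>>();

  // Start `work` under `id`, or return the existing promise for that id.
  run(id: string, work: () => Promise<string>): Promise<string> {
    const existing = this.runs.get(id);
    if (existing) return existing;
    const started = work();
    this.runs.set(id, started);
    return started;
  }
}

let pipelineStarts = 0;
const pipelineWork = async (): Promise<string> => {
  pipelineStarts++; // counts how many pipelines actually started
  return "published";
};

const registry = new RunRegistry();

// A stable caller-side id: the second call attaches to the first run.
const first = registry.run("orchestration-article-42", pipelineWork);
const second = registry.run("orchestration-article-42", pipelineWork);

// A timestamp-based id, as in the example, starts a fresh run.
const third = registry.run(`orchestration-${Date.now()}`, pipelineWork);

console.log(first === second, first === third, pipelineStarts);
```

With a stable id the second call returns the same promise as the first, so only two pipelines start across the three calls. That is why a timestamp-based id hides the deduplication and a stable id would expose it.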

The example does not pass idempotency keys to the Anthropic API. A retry of ctx.run(writer, ...) will call the Anthropic SDK again — retries are at the agent level, not the LLM-call level.

When to reach for this pattern

If you are chaining multiple LLM agents in sequence and re-running an earlier agent on a later-step failure is unacceptable (cost, latency, non-determinism).
If you want straight-line orchestration code for a sequential agent pipeline instead of a DAG framework or a hosted multi-agent runtime.
If the pipeline needs to survive worker restarts mid-run and resume at the failed step.
If a downstream step (review, approval, publish) needs to block on a human decision that may take minutes, hours, or days — ctx.promise({}) makes the wait durable across restarts and across services.
If you need per-agent retry without writing per-agent retry decorators or try/catch scaffolding around each call.
If the agents themselves are plain async functions that happen to call an LLM, and you want the orchestration concern lifted out of them entirely.

Sources

Example repo: https://github.com/resonatehq-examples/example-multi-agent-orchestration-ts
TypeScript SDK repo: https://github.com/resonatehq/resonate-sdk-ts
Resonate documentation: https://docs.resonatehq.io
Files cited in this post:
- src/workflow.ts:29-60 — the orchestrate generator workflow
- src/workflow.ts:35, :40, :43 — the three yield* ctx.run(...) calls
- src/workflow.ts:47-49 — commented production ctx.promise({}) swap
- src/agents.ts:19-36 — researcher
- src/agents.ts:44-80 — writer (failure injection at :56-59)
- src/agents.ts:87-104 — reviewer
- src/index.ts:8-9 — new Resonate() + resonate.register(orchestrate)
- src/index.ts:27 — runId definition; src/index.ts:29 — resonate.run(runId, orchestrate, topic, crashMode) call
- package.json:14 — @resonatehq/sdk: ^0.10.0 pin
Related example: https://github.com/resonatehq-examples/example-human-in-the-loop-ts — referenced from README:167 for the production human-in-the-loop pattern.