A pipeline of three LLM agents — researcher, writer, reviewer — must survive any single agent failure without re-running the earlier agents, because each call is slow, costly, and non-deterministic. The Resonate shape of the solution is to register each agent as a plain async function and orchestrate them from a generator workflow where every yield* ctx.run(...) is a durable checkpoint; an agent that throws is retried in place while siblings stay cached in the promise store. The example shows the pipeline under the happy path and under a forced first-attempt failure on the writer, plus a commented-out swap to a real human-in-the-loop step via ctx.promise.
The shape of the solution
```typescript
// OrchestrationResult defined at src/workflow.ts:21-27
export function* orchestrate(
  ctx: Context,
  topic: string,
  crashOnWriter: boolean,
): Generator<any, OrchestrationResult, any> {
  // Step 1: Research — gather findings
  const findings = yield* ctx.run(researcher, topic);

  // Step 2: Write — produce a draft from findings
  // If crashOnWriter=true, the writer fails on first attempt and retries.
  // The researcher does NOT re-run on retry — its result is cached.
  const draft = yield* ctx.run(writer, topic, findings, crashOnWriter);

  // Step 3: Review — check the draft quality
  const review = yield* ctx.run(reviewer, draft);

  // Step 4: Human approval (simulated in demo)
  // In production:
  //   const approval = yield* ctx.promise({});
  //   // surface approval.id externally (email, dashboard) so a human can resolve it
  //   const decision = yield* approval;
  // This blocks until an external system resolves the promise.
  const approved = review.toUpperCase().includes("APPROVED");

  return {
    status: approved ? "published" : "rejected",
    topic,
    findings,
    draft,
    review,
  };
}
// from example-multi-agent-orchestration-ts/src/workflow.ts:29-60
```

The orchestrator is a generator function (`function*`), not `async`. Each `yield* ctx.run(agent, ...)` invokes the child agent under a durable promise and suspends the orchestrator until the result is checkpointed in the promise store.
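To make the checkpoint-and-replay idea concrete, here is a minimal toy model of a generator being driven against a step cache. It is not the Resonate SDK and makes no claim about how the SDK is implemented: `Step`, `drive`, `pipeline`, and the in-memory `Map` are hypothetical names, and the steps are synchronous only to keep the sketch short.

```typescript
// Toy model of checkpointed replay. NOT the Resonate SDK: every name here is hypothetical.
type Step = { key: string; fn: () => string };

// Drive a generator of steps, serving already-completed steps from the store.
function drive(
  workflow: () => Generator<Step, string, string>,
  store: Map<string, string>,
): string {
  const gen = workflow();
  let next = gen.next();
  while (!next.done) {
    const { key, fn } = next.value;
    let value = store.get(key);
    if (value === undefined) {
      value = fn();          // first execution of this step
      store.set(key, value); // checkpoint before resuming the generator
    }
    next = gen.next(value);  // resume the workflow with the fresh or cached result
  }
  return next.value;
}

const calls = { researcher: 0, writer: 0 };
let failWriterOnce = true;

function* pipeline(): Generator<Step, string, string> {
  const findings = yield {
    key: "researcher",
    fn: () => {
      calls.researcher++;
      return "findings";
    },
  };
  const draft = yield {
    key: "writer",
    fn: () => {
      calls.writer++;
      if (failWriterOnce) {
        failWriterOnce = false;
        throw new Error("simulated crash in writer");
      }
      return `draft from ${findings}`;
    },
  };
  return draft;
}

const store = new Map<string, string>();
let crashed = false;
try {
  drive(pipeline, store); // first run: the writer step throws
} catch {
  crashed = true; // stands in for a worker crash; the store survives
}
const replayed = drive(pipeline, store); // restart: replay against the same store
console.log(replayed, calls);
```

On the second `drive` call the generator starts from the top, but the researcher step is answered from the store, so only the writer function runs again.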
The durable primitives in play
- `new Resonate()`: constructs the Resonate client embedded in the worker process; no external server is required for the default run. `src/index.ts:8`.
- `resonate.register(orchestrate)`: registers the top-level workflow so the worker can claim and execute it under a caller-supplied id. `src/index.ts:9`.
- `resonate.run(runId, orchestrate, topic, crashMode)`: invokes the registered workflow with a caller-supplied id; the `runId` is defined at `src/index.ts:27` (`orchestration-${Date.now()}`) and the call itself is at `src/index.ts:29`. Resolves to the workflow's return value.
- `yield* ctx.run(fn, ...args)`: runs an async function as a durable child step. The return value is persisted at the call site; on replay the SDK returns the cached value rather than re-invoking `fn`. Used for all three agent calls. `src/workflow.ts:35`, `:40`, `:43`.
- `yield* ctx.promise({})`: referenced in the commented production block (`src/workflow.ts:47-49`, README:107-140) as the human-in-the-loop primitive. Returns a durable promise with an auto-generated id; yielding the promise blocks the workflow until something external resolves it.
- `resonate.stop()`: shuts down the underlying network transport and message source after the run completes; the client should not be used for further operations. `src/index.ts:43`; SDK doc at `resonate.d.ts:154-156`.

`ctx.sleep`, `ctx.detached`, `ctx.beginRun`, and `resonate.schedule` are not used in this example.
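The `ctx.promise({})` entry is the least obvious primitive in the list, so a toy model may help: a promise record with a generated id sits in a store in the pending state until an outside party resolves it by id. The sketch below only illustrates that idea; `PromiseStore`, `create`, `resolve`, and `checkApproval` are hypothetical names, not SDK or server APIs.

```typescript
// Toy model of a promise that is resolved from outside the workflow.
// NOT the Resonate SDK or server: every name here is hypothetical.
type DurablePromise =
  | { state: "pending" }
  | { state: "resolved"; value: string };

class PromiseStore {
  private promises = new Map<string, DurablePromise>();
  private nextId = 1;

  // Create a pending promise and hand back its generated id.
  create(): string {
    const id = `promise-${this.nextId++}`;
    this.promises.set(id, { state: "pending" });
    return id;
  }

  // Called by an external actor; stands in for the HTTP resolve request.
  resolve(id: string, value: string): void {
    if (!this.promises.has(id)) throw new Error(`unknown promise ${id}`);
    this.promises.set(id, { state: "resolved", value });
  }

  get(id: string): DurablePromise | undefined {
    return this.promises.get(id);
  }
}

// What a waiting workflow would see: undefined while pending, the decision once resolved.
function checkApproval(store: PromiseStore, id: string): string | undefined {
  const p = store.get(id);
  return p !== undefined && p.state === "resolved" ? p.value : undefined;
}

const approvals = new PromiseStore();
const approvalId = approvals.create();
const beforeDecision = checkApproval(approvals, approvalId);

// The pending entry lives in the store, not in the worker, so a worker
// restart at this point would not lose the wait.
approvals.resolve(approvalId, "approved");
const afterDecision = checkApproval(approvals, approvalId);
console.log(beforeDecision, afterDecision);
```

The design point is that the wait is data in a store rather than a suspended call stack, which is why it can outlive the process that created it.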
What the SDK handles vs. what you write
| SDK handles | You write |
|---|---|
| Checkpointing the return value of each `yield* ctx.run(agent, ...)` in the durable promise store | The three `yield* ctx.run(...)` lines and the three agent functions (`researcher`, `writer`, `reviewer`) |
| Suspending the generator after each `yield*` and resuming when the child promise resolves | The straight-line orchestrator body (`findings`, `draft`, `review`, `approved`) |
| Replaying the orchestrator after a crash using cached step values rather than re-running completed agents | No replay code; the orchestrator is written as if it never crashes |
| Retrying a child function passed to `ctx.run` when it throws, and logging `Runtime. Function 'writer' failed with ... (retrying in 2 secs)`. Only `orchestrate` is registered (`src/index.ts:9`); the agents are imported in `src/workflow.ts:2` and passed by reference to `ctx.run(...)`, and the SDK applies its default Exponential retry policy to them | The actual failure (`throw new Error("Writer agent connection reset (simulated)")` at `src/agents.ts:58`); no retry decorator, no try/catch |
| Keeping the workflow blocked on the durable promise created by `ctx.promise({})` until it is resolved externally | The promise reference (`const approval = yield* ctx.promise({})`) and the resolver call (`POST /promises/<id>/resolve` from outside) |
The orchestrator body is three assignments, one boolean, and a return. The retry behaviour, the cached intermediate results, the resume-after-crash semantics, and the blocking on external approval all live in the SDK and (for ctx.promise resolution) the Resonate server, not in the code the author writes. README:103 frames it as "The entire orchestrator is 15 lines."
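For a sense of the scaffolding the author does not have to write, here is a rough sketch of a hand-rolled retry loop with exponential backoff. It is an illustration, not the SDK's retry policy: `retryWithBackoff`, `maxAttempts`, `baseDelayMs`, and `flakyWriter` are hypothetical, the 2000 ms base is chosen only to echo the "retrying in 2 secs" log line, and delays are recorded rather than slept so the sketch stays synchronous.

```typescript
// Hand-rolled retry with exponential backoff. NOT the SDK's implementation;
// names and parameters are assumptions made for illustration.
function retryWithBackoff<T>(
  fn: (attempt: number) => T,
  maxAttempts: number,
  baseDelayMs: number,
): { value: T; delaysMs: number[] } {
  const delaysMs: number[] = [];
  for (let attempt = 1; ; attempt++) {
    try {
      return { value: fn(attempt), delaysMs };
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // out of attempts: surface the error
      // Backoff schedule: base, 2 * base, 4 * base, ...
      // A real loop would wait this long before the next attempt.
      delaysMs.push(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Mirrors the example's failure injection: throw on the first attempt only.
function flakyWriter(attempt: number): string {
  if (attempt === 1) {
    throw new Error("Writer agent connection reset (simulated)");
  }
  return `draft written on attempt ${attempt}`;
}

const outcome = retryWithBackoff(flakyWriter, 3, 2000);
console.log(outcome.value, outcome.delaysMs);
```

Wrapping every agent call in a loop like this, plus persisting results between attempts, is the code the table's left column replaces.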
Failure modes covered
- Writer throws on its first attempt. `src/agents.ts:56-59` throws `Error("Writer agent connection reset (simulated)")` when `crashOnFirst` is set and `attempt === 1`. The SDK retries the `writer` function that was passed by reference to `ctx.run` (only `orchestrate` is registered at `src/index.ts:9`; `writer` is imported in `src/workflow.ts:2`). The README crash-mode transcript (README:70-85) shows `[researcher] Complete (312 chars)` printed once, then `[writer] Writing article (attempt 1)`, then the Resonate retry log line `Runtime. Function 'writer' failed with 'Error: Writer agent connection reset (simulated)' (retrying in 2 secs)`, then `[writer] Writing article (attempt 2)`. The researcher does not re-run because its return value is already checkpointed at `src/workflow.ts:35`.
- Worker crashes between agents. The orchestrator is invoked with a caller-supplied id (`src/index.ts:27-29`). On restart, the SDK replays the orchestrator under that id and finds the prior `ctx.run(...)` results in the promise store; only the unfinished step re-enters the agent function. README:98 states this explicitly: "add `process.exit(1)` after any agent call and restart — resumes from there".
- The orchestration is invoked twice with the same id. This is a property of the SDK, not one this example exercises: when `resonate.run(runId, ...)` is called with a previously-used `runId`, it resolves against the existing durable promise rather than starting a parallel pipeline. The example regenerates the id per process with `orchestration-${Date.now()}` at `src/index.ts:27`, so the deduplication is not actually demonstrated here; a stable caller-side id would make it observable.
- Human approval never arrives (production extension). Swapping in `yield* ctx.promise({})` per `src/workflow.ts:47-49` and README:125-130 makes the workflow block indefinitely on the durable promise. The worker can restart and the server can restart; the promise stays in the store until something external POSTs to `http://localhost:8001/promises/<approvalPromise.id>/resolve` (README:135-138). Port `8001` is the Resonate server's default HTTP port; the URL only works once a Resonate server is running, which the example does not start. The default `bun start` does not enable this path; it is documented as the swap-in for the server-backed deployment.
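The same-id case in the list above can be pictured with a small toy: a run keyed by a stable id executes once, and a second call with that id gets the stored result. This is a simplification, not SDK code. `runOnce`, `results`, and the id strings are hypothetical, and unlike a durable promise, this toy only remembers completed runs.

```typescript
// Toy model of deduplication by run id. NOT the Resonate SDK: names are hypothetical,
// and only completed results are stored (a real durable promise can also be pending).
const results = new Map<string, string>();
let executions = 0;

function runOnce(id: string, work: () => string): string {
  const existing = results.get(id);
  if (existing !== undefined) return existing; // same id: reuse the stored result
  executions++;
  const value = work();
  results.set(id, value);
  return value;
}

// A stable, caller-chosen id makes the second call a no-op.
const stableId = "orchestration-quarterly-report"; // hypothetical id
const first = runOnce(stableId, () => "published");
const second = runOnce(stableId, () => "this work never runs");

// A different id is a different run, which is why a per-process id such as
// `orchestration-${Date.now()}` does not exercise the deduplication.
const other = runOnce("orchestration-other-topic", () => "rejected");
console.log(first, second, other, executions);
```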
The example does not pass idempotency keys to the Anthropic API. A retry of ctx.run(writer, ...) will call the Anthropic SDK again — retries are at the agent level, not the LLM-call level.
When to reach for this pattern
- If you are chaining multiple LLM agents in sequence and re-running an earlier agent on a later-step failure is unacceptable (cost, latency, non-determinism).
- If you want straight-line orchestration code for a sequential agent pipeline instead of a DAG framework or a hosted multi-agent runtime.
- If the pipeline needs to survive worker restarts mid-run and resume at the failed step.
- If a downstream step (review, approval, publish) needs to block on a human decision that may take minutes, hours, or days: `ctx.promise({})` makes the wait durable across restarts and across services.
- If you need per-agent retry without writing per-agent retry decorators or try/catch scaffolding around each call.
- If the agents themselves are plain async functions that happen to call an LLM, and you want the orchestration concern lifted out of them entirely.
Sources
- Example repo: https://github.com/resonatehq-examples/example-multi-agent-orchestration-ts
- TypeScript SDK repo: https://github.com/resonatehq/resonate-sdk-ts
- Resonate documentation: https://docs.resonatehq.io
- Files cited in this post:
  - `src/workflow.ts:29-60`: the `orchestrate` generator workflow
  - `src/workflow.ts:35`, `:40`, `:43`: the three `yield* ctx.run(...)` calls
  - `src/workflow.ts:47-49`: commented production `ctx.promise({})` swap
  - `src/agents.ts:19-36`: `researcher`
  - `src/agents.ts:44-80`: `writer` (failure injection at `:56-59`)
  - `src/agents.ts:87-104`: `reviewer`
  - `src/index.ts:8-9`: `new Resonate()` + `resonate.register(orchestrate)`
  - `src/index.ts:27`: `runId` definition; `src/index.ts:29`: `resonate.run(runId, orchestrate, topic, crashMode)` call
  - `package.json:14`: `@resonatehq/sdk: ^0.10.0` pin
- Related example: https://github.com/resonatehq-examples/example-human-in-the-loop-ts — referenced from README:167 for the production human-in-the-loop pattern.
