AWS Lambda functions cannot run longer than 15 minutes, cannot hold state across timeouts, and restart the entire function on retry. None of that is acceptable for a document-processing pipeline that downloads from S3, OCRs pages, calls an LLM, writes to a database, and notifies a requester. The Resonate solution treats Lambda as a stateless trigger that calls resonate.run against a Resonate Server and returns 202 immediately, while the durable workflow runs on a separate long-lived worker process keyed by an idempotent promise id. This example shows the Lambda handler, the registered workflow, and the GET endpoint that reads the durable result back through the same promise id.
The shape of the solution
// ...
import { Resonate } from "@resonatehq/sdk";
import { processDocument, type DocumentJob, type DocumentResult } from "./workflow.js";
const resonate = new Resonate({ url: process.env["RESONATE_URL"] ?? "http://localhost:8001" });
resonate.register("processDocument", processDocument);
// ...
async function handleProcessDocument(event: APIGatewayEvent): Promise<LambdaResponse> {
  const body = JSON.parse(event.body ?? "{}") as Partial<DocumentJob>;
  // ... validation ...
  const job: DocumentJob = { /* ... */ };
  // Fire-and-forget: workflow runs on the Resonate worker, not in Lambda.
  // This returns immediately; the Lambda function exits without waiting.
  resonate.run(`doc/${job.jobId}`, processDocument, job).catch(console.error);
  return {
    statusCode: 202,
    headers: JSON_HEADERS,
    body: JSON.stringify({
      status: "accepted",
      jobId: job.jobId,
      statusUrl: `/status/${job.jobId}`,
      message: "Processing in background. Poll statusUrl for results.",
    }),
  };
}
// from example-aws-lambda-ts/src/handler.ts:30-31, 40-41, 66-99
The registered workflow itself is a generator that yields each step through ctx.run:
// ...
import type { Context } from "@resonatehq/sdk";
// ... step functions: downloadDocument, extractText, analyzeDocument, storeResults, notifyRequester ...
export function* processDocument(
  ctx: Context,
  job: DocumentJob,
): Generator<any, DocumentResult, any> {
  const pageCount = yield* ctx.run(downloadDocument, job);
  const text = yield* ctx.run(extractText, job, pageCount);
  const { summary, data } = yield* ctx.run(analyzeDocument, job, text);
  const storedAt = yield* ctx.run(storeResults, job, summary, data);
  const notifiedAt = yield* ctx.run(notifyRequester, job, storedAt);
  return {
    jobId: job.jobId,
    type: job.type,
    pageCount,
    summary,
    extractedData: data,
    storedAt,
    notifiedAt,
  };
}
// from example-aws-lambda-ts/src/workflow.ts:1, 114-133
The durable primitives in play
- resonate.register("processDocument", processDocument): names the workflow so the Resonate Server can resolve and run it. Called at module scope so it runs once per Lambda container cold start. src/handler.ts:41.
- resonate.run(id, fn, args): starts (or rejoins) a durable workflow keyed by the promise id doc/${job.jobId}. The call is async and returns a Promise that resolves when the workflow completes; the Lambda handler does not await it, so the handler exits while the workflow continues on the worker. src/handler.ts:88.
- ctx.run(stepFn, ...args): each invocation is a durable checkpoint. The result of a successful step is persisted; on resume, completed checkpoints are not re-executed. Five of them: download, extract, analyze, store, notify. src/workflow.ts:118-122.
- resonate.get(id) + handle.done() + handle.result(): the GET /status endpoint fetches the durable promise by the same id Lambda created it with and reads its state. src/handler.ts:118-129.
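The status read described above can be sketched without the SDK. This is a minimal illustration, not the code at src/handler.ts:118-129: ResultHandle, PromiseReader, StatusResponse, and handleStatus are hypothetical names, and only the method names get, done, and result come from the text. Their exact signatures and the response bodies are assumptions. Handling of an unknown id is left out because the text does not say how resonate.get reports it.

```typescript
// Hypothetical minimal shapes standing in for the SDK objects. The method names
// get/done/result come from the text above; their exact signatures are assumed.
interface ResultHandle<T> {
  done(): boolean | Promise<boolean>;
  result(): T | Promise<T>;
}

interface PromiseReader<T> {
  get(id: string): ResultHandle<T> | Promise<ResultHandle<T>>;
}

interface StatusResponse {
  statusCode: number;
  body: string;
}

// Read a job's durable state through the same promise id the trigger used.
async function handleStatus<T>(
  reader: PromiseReader<T>,
  jobId: string,
): Promise<StatusResponse> {
  const handle = await reader.get(`doc/${jobId}`);

  // Still running: report progress without blocking the short-lived handler.
  if (!(await handle.done())) {
    return { statusCode: 200, body: JSON.stringify({ status: "processing", jobId }) };
  }

  try {
    const result = await handle.result();
    return { statusCode: 200, body: JSON.stringify({ status: "completed", jobId, result }) };
  } catch (err) {
    // Assumption: a failed workflow surfaces as a rejection from result().
    const message = err instanceof Error ? err.message : String(err);
    return { statusCode: 500, body: JSON.stringify({ status: "failed", jobId, error: message }) };
  }
}
```

Because the handler only needs the id, any invocation that knows the jobId can answer a status request; nothing is cached in the Lambda container.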
What the SDK handles vs. what you write
You write: a Lambda handler that validates the request, builds a DocumentJob, calls resonate.run once with doc/${jobId} as the promise id, and returns 202. You write the workflow as a generator that yields each step through ctx.run. You write the individual step functions (download, extract, analyze, store, notify) as ordinary async functions.
The SDK handles: initiating durable promise registration against the Resonate Server when resonate.run is called; routing the work to a registered worker; checkpointing each ctx.run result so that retries skip completed steps; deduplicating concurrent submissions of the same doc/${jobId} on the server side; exposing the durable result through resonate.get(id) so a separate Lambda invocation can read it back. The workflow is intended to run on a separate long-lived worker process; the Lambda handler triggers and reads.
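The claim that retries skip completed steps can be illustrated with a toy in-memory model. This is not how the Resonate SDK or Server is implemented: CheckpointStore, runPipeline, and runWithRetry are hypothetical names, the store lives in process memory rather than on a server, and the retry loop re-runs the whole pipeline function while the store supplies earlier results.

```typescript
// Toy in-memory model of checkpoint replay. NOT the Resonate SDK:
// every name in this block is hypothetical and exists only for illustration.
class CheckpointStore {
  private readonly results = new Map<string, unknown>();

  // Run `step` at most once per key; later calls with the same key reuse the stored result.
  async run<T>(key: string, step: () => Promise<T>): Promise<T> {
    if (this.results.has(key)) {
      return this.results.get(key) as T;
    }
    const value = await step(); // if this throws, nothing is stored and the step re-runs on retry
    this.results.set(key, value);
    return value;
  }
}

// Counts how many times each step body actually executed.
const calls = { download: 0, analyze: 0 };

async function runPipeline(store: CheckpointStore): Promise<string> {
  const pages = await store.run("download", async () => {
    calls.download += 1;
    return 3;
  });
  return store.run("analyze", async () => {
    calls.analyze += 1;
    if (calls.analyze === 1) throw new Error("transient LLM failure");
    return `summary of ${pages} pages`;
  });
}

// Re-run the pipeline against the same store until it succeeds or attempts run out.
async function runWithRetry(store: CheckpointStore, maxAttempts: number): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await runPipeline(store);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Calling runWithRetry(new CheckpointStore(), 3) executes the download step once and the analyze step twice: the second attempt reads the download result from the store instead of running it again.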
Failure modes covered
- Workflow exceeds Lambda's 15-minute limit. The workflow does not run in Lambda. resonate.run hands execution to a worker process that has no Lambda timeout. src/handler.ts:86-88; constraint comparison in README.md:34-41.
- Lambda container freezes immediately after returning 202. The handler calls resonate.run(...).catch(console.error) without await (src/handler.ts:88). The SDK's first await inside run is the taskCreate network call to the Resonate Server, so the registration request is initiated before the handler returns but not necessarily completed. AWS Lambda may freeze the execution environment as soon as the async handler returns, so the durable promise landing on the server before Lambda exits is best-effort, not guaranteed; this is the example's known fire-and-forget tradeoff. A deployment that needs the promise to be durably recorded before responding should await resonate.beginRun(...) instead of using fire-and-forget: beginRun awaits only the taskCreate registration and returns a handle. await resonate.run(...) would block the handler until the workflow's terminal result (potentially hours), reintroducing the 15-minute timeout problem this pattern exists to avoid, and is not appropriate inside the Lambda handler. src/handler.ts:88; SDK at resonate-sdk-ts/src/resonate.ts:290-330.
- The same request arrives twice (API Gateway retry, client retry). Both calls resolve to the same doc/${jobId} promise, so the workflow executes once. src/handler.ts:88, README.md:173-179.
- The LLM step throws a transient error. Only that step retries; previously checkpointed steps (download, extract) do not re-run. Demonstrated in the local crash demo: local-demo/src/workflow.ts:54-60 throws on attempt 1 of analyzeDocument; the replay behavior is a property of the surrounding ctx.run sequence at local-demo/src/workflow.ts:102-111, where the prior steps' checkpoints are reused on retry rather than re-executed. local-demo/src/index.ts:142-147.
- The worker process crashes mid-workflow. Same checkpoint mechanism: on resume, the workflow re-enters at the first non-completed ctx.run. The five-step pipeline is resilient to crashes at any point. src/workflow.ts:118-122.
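The await-registration alternative to fire-and-forget mentioned in the list above can be sketched as follows. To avoid asserting the SDK's signature, the registration call is passed in as a callback; in a real handler that callback would presumably wrap resonate.beginRun(...), whose exact arguments are not shown in this article. JobRef, AcceptResponse, RegisterJob, and acceptJob are hypothetical names, and returning 503 on a failed registration is a design choice of this sketch, not behavior taken from the example repo.

```typescript
interface JobRef {
  jobId: string;
}

interface AcceptResponse {
  statusCode: number;
  body: string;
}

// Hypothetical callback: expected to resolve once the durable promise is
// registered on the server, NOT when the workflow reaches its terminal result.
type RegisterJob = (promiseId: string, job: JobRef) => Promise<unknown>;

async function acceptJob(register: RegisterJob, job: JobRef): Promise<AcceptResponse> {
  const promiseId = `doc/${job.jobId}`;
  try {
    // Awaiting here keeps the handler alive until registration has completed,
    // so a freeze after the response cannot lose the request.
    await register(promiseId, job);
  } catch (err) {
    // Sketch-level choice: tell the client the job was not accepted so it can
    // retry with the same jobId, which maps to the same promise id.
    const message = err instanceof Error ? err.message : String(err);
    return {
      statusCode: 503,
      body: JSON.stringify({ status: "unavailable", jobId: job.jobId, error: message }),
    };
  }
  return {
    statusCode: 202,
    body: JSON.stringify({
      status: "accepted",
      jobId: job.jobId,
      statusUrl: `/status/${job.jobId}`,
    }),
  };
}
```

The cost is one extra network round trip before the 202 is sent; the benefit is that an accepted response implies the durable promise exists.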
When to reach for this pattern
- If you have an HTTP/webhook entry point with a short timeout budget (Lambda, API Gateway, edge function) but the actual work is long-running, multi-step, or unbounded.
- If you need duplicate requests on the same key to be deduplicated server-side without writing a dedup table.
- If a workflow step can fail transiently and you want only that step retried, not the entire pipeline.
- If you want to poll for the result of a backgrounded job from a separate short-lived handler invocation (a different Lambda, a separate request) without storing job state yourself.
- If you want to keep the trigger and the executor on separate compute (cheap stateless trigger, long-lived worker) without coordinating IAM, service registration, or per-step deployment.
Sources
- Example repo: https://github.com/resonatehq-examples/example-aws-lambda-ts
- Resonate TypeScript SDK: https://github.com/resonatehq/resonate-sdk-ts
- Handler source: https://github.com/resonatehq-examples/example-aws-lambda-ts/blob/main/src/handler.ts
- Workflow source: https://github.com/resonatehq-examples/example-aws-lambda-ts/blob/main/src/workflow.ts
- Local demo (crash/retry simulation): https://github.com/resonatehq-examples/example-aws-lambda-ts/blob/main/local-demo/src/workflow.ts
- Resonate docs: https://docs.resonatehq.io
