5 min read · Resonate HQ · Just published

Durable batch processor with checkpoint resume in TypeScript on Resonate

How the batch-processor pattern collapses to a for-loop of ctx.run calls when each iteration is a durable checkpoint.


A bulk import of N records has to make forward progress across crashes: if the worker dies mid-import, the next run must resume at the first uncompleted batch and must not re-process batches that already succeeded. The Resonate shape of the solution is to chunk the input into batches and wrap each batch in ctx.run(...) inside a plain for-loop; every call is a durable promise, so on replay the completed batches return their stored result and the loop advances to the first uncompleted one. The example runs in embedded mode under Bun and ships both a happy-path entrypoint and a --crash entrypoint that forces batch 3 to throw on its first attempt.

The shape of the solution

export function* importRecords(
  ctx: Context,
  records: Record[],
  batchSize: number,
  crashAtBatch: number,
): Generator<any, ProcessingResult, any> {
  const batches = chunkArray(records, batchSize);
  const batchResults: BatchResult[] = [];
 
  for (let i = 0; i < batches.length; i++) {
    const batch = batches[i]!;
    // Each yield* is a durable checkpoint. On crash+resume,
    // completed batches are returned from cache — not re-processed.
    const result = yield* ctx.run(processBatchChunk, i, batch, crashAtBatch);
    batchResults.push(result);
  }
 
  const totalProcessed = batchResults.reduce((s, b) => s + b.processed, 0);
  const totalSkipped = batchResults.reduce((s, b) => s + b.skipped, 0);
 
  return {
    totalRecords: records.length,
    totalProcessed,
    totalSkipped,
    batchCount: batches.length,
    batches: batchResults,
  };
}
// from example-batch-processor-ts/src/workflow.ts:37-64

The workflow is a generator function (function*), not async. yield* ctx.run(processBatchChunk, ...) creates a durable child promise for that batch, suspends the workflow until the child resolves, and resumes with the returned BatchResult. The loop index i is not stored anywhere by user code — it is rediscovered by replay, because each completed iteration's child promise is already resolved in the promise store.
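
The replay behavior described above can be illustrated without the SDK. The following is a minimal sketch under stated assumptions, not Resonate's implementation: a hypothetical in-memory Map stands in for the promise store, child ids are derived from call order, and a small driver (`drive`) always re-runs the generator from the top. Every name in it (`Step`, `drive`, `sumOfSquares`, `failAt`) is invented for illustration.

```typescript
// Toy model of replay. NOT the Resonate SDK: `Step`, `drive`, and the
// Map-based store are hypothetical stand-ins used only for illustration.
type Step = { fn: (n: number) => number; arg: number };

function* sumOfSquares(inputs: number[]): Generator<Step, number, number> {
  let total = 0;
  for (const n of inputs) {
    // Each yield is one checkpoint; the driver decides whether to execute
    // the step or answer it with a previously stored result.
    const squared: number = yield { fn: (x: number) => x * x, arg: n };
    total += squared;
  }
  return total;
}

// Drives the generator from the top. Steps whose id is already in `store`
// are answered from the store; `failAt` simulates a crash before that step runs.
function drive(
  inputs: number[],
  store: Map<string, number>,
  executed: number[],
  failAt: number = -1,
): number {
  const gen = sumOfSquares(inputs);
  let seq = 0;
  let next = gen.next();
  while (!next.done) {
    const id = `root.${seq}`; // child id derived from call order
    const step = next.value;
    if (!store.has(id)) {
      if (seq === failAt) throw new Error(`crash before step ${seq}`);
      executed.push(seq);
      store.set(id, step.fn(step.arg));
    }
    next = gen.next(store.get(id)!);
    seq++;
  }
  return next.value;
}

const resultStore = new Map<string, number>();
const firstRun: number[] = [];
const secondRun: number[] = [];

try {
  drive([1, 2, 3, 4], resultStore, firstRun, 2); // crashes before step 2
} catch {
  // The "process" died; the store survives.
}
const replayTotal = drive([1, 2, 3, 4], resultStore, secondRun); // replay from the top
console.log(firstRun, secondRun, replayTotal);
```

In this model the second call re-enters the loop at the first input, but only the steps that were never stored are executed. No loop index is saved anywhere; the position falls out of which ids already have results.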

The durable primitives in play

  • new Resonate() — constructs an embedded-mode client. With no constructor arguments, the SDK runs against an in-process promise store; no external server is required. src/index.ts:9.
  • resonate.register(importRecords) — registers the top-level workflow so the SDK can claim and execute it (and so a replay can look it up by name + version). src/index.ts:10.
  • resonate.run(id, fn, ...args) — starts the workflow under a caller-supplied promise id (`import/${Date.now()}`). The id is the durable handle for the whole run. src/index.ts:40-46.
  • ctx.run(fn, ...args) — alias for ctx.lfc (Local Function Call). Creates a durable promise for the child call and yields back the resolved value T, not a Future<T>. Each call is checkpointed independently. SDK definition: resonate-sdk-ts/src/context.ts:214-215, alias at :283. Used in: src/workflow.ts:50.
  • yield* ctx.run(...) — the iterator protocol on LFC<T> yields the LFC itself once and is fed back the resolved T. SDK: resonate-sdk-ts/src/context.ts:72-76. The rejection path is driven from outside the iterator: the coroutine decorator calls this.generator.throw(value.error) when a child promise rejected, which propagates the error out of the user's yield* ctx.run(...) site. SDK: resonate-sdk-ts/src/decorator.ts:218-219.
  • Default retry policy on the batch step: processBatchChunk is an async function, so its LFC is created with new Exponential() as the retry policy. SDK: resonate-sdk-ts/src/context.ts:432 (opts.retryPolicy ?? (util.isGeneratorFunction(func) ? new Never() : new Exponential())). The example does not pass a custom policy to ctx.run, so the default applies.
  • SDK surface: package.json:11 pins "@resonatehq/sdk": "^0.10.0". On the 0.10 line, Context exposes run / beginRun / rpc / beginRpc as aliases of the underlying lfc / lfi / rfc / rfi operations (resonate-sdk-ts/src/context.ts:283-286); detached is a separate verb that returns an RFI in "detached" mode (:226-227, implementation at :545-567).
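
The "yields itself once and is fed back the resolved value" protocol, and the way a rejection surfaces at the yield* site, can be sketched in plain TypeScript. This is an illustrative stand-in and not the SDK's LFC class: `Call`, `demoWorkflow`, and the hand-written driver calls are hypothetical, and the only language features relied on are generator delegation and Generator.prototype.throw.

```typescript
// Illustrative stand-in for an LFC-like object. This is NOT the SDK class;
// `Call` and `demoWorkflow` are hypothetical names.
class Call<T> {
  readonly label: string;

  constructor(label: string) {
    this.label = label;
  }

  // Yield this object once; whatever the driver passes to next() becomes
  // the value of the surrounding `yield*` expression.
  *[Symbol.iterator](): Generator<Call<T>, T, T> {
    const resolved: T = yield this;
    return resolved;
  }
}

function* demoWorkflow(): Generator<any, string, any> {
  const a = yield* new Call<number>("first");
  try {
    yield* new Call<number>("second");
    return `a=${a}, second ok`;
  } catch (err) {
    // An error injected by the driver surfaces here, at the yield* site.
    return `a=${a}, second failed: ${(err as Error).message}`;
  }
}

// A hand-driven run: resolve the first call, reject the second.
const demoGen = demoWorkflow();
const firstYield = demoGen.next(); // yields the "first" Call
const secondYield = demoGen.next(42); // 42 becomes the value of the first yield*
const lastResult = demoGen.throw(new Error("boom")); // rejection path
console.log(firstYield.value.label, secondYield.value.label, lastResult.value);
```

Because the delegate iterator is itself a generator, the throw() issued by the driver is forwarded into it and propagates out of the yield* expression, where ordinary try/catch in the workflow can handle it.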

What the SDK handles vs. what you write

SDK handles | You write
Creating one durable promise per ctx.run(...) call and persisting it in the promise store | The ctx.run(processBatchChunk, i, batch, crashAtBatch) call inside the loop
Suspending the generator on yield* ctx.run(...) and resuming with the stored value on replay | The for-loop body that consumes each batch result
Storing each batch's return value as soon as it resolves so a replay does not re-execute the batch function | The batch function processBatchChunk that produces that return value
Retrying an async function passed to ctx.run that threw, under the default Exponential policy | The actual failure mode (throw new Error("Batch 3 failed — simulated DB timeout") on attempt 1)
Discovering "where the loop is" on resume, by reading each ctx.run child promise and skipping completed ones | Nothing — there is no progress table, no cursor, no resume-from-N parameter

The workflow body reads like a plain for-loop. The progress tracking, persistence, per-batch retry, and replay-skip logic are not in the code you write — they are in the SDK.

Failure modes covered

  • processBatchChunk throws on its first attempt at the target batch. src/processor.ts:70-73 throws new Error(`Batch ${batchIndex} failed — simulated DB timeout`) when crashAtBatch === batchIndex && attempt === 1. The SDK retries the async function passed to ctx.run under the default Exponential policy. The README's crash-mode runtime log captures the SDK message verbatim: Runtime. Function 'processBatchChunk' failed with 'Error: Batch 3 failed — simulated DB timeout' (retrying in 2 secs) (README:94).
  • The workflow function suspends and resumes mid-loop. Because every iteration's ctx.run(...) is its own durable promise, a resume of the workflow re-enters the for-loop from i = 0, but each yield* for an already-completed batch returns its stored BatchResult from the promise store instead of re-invoking processBatchChunk. The loop advances to the first uncompleted batch with no batch re-processing.
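
The first failure mode can be modeled in a few lines. The sketch below is an assumption-laden illustration, not the example's processor or the SDK's retry code: `flakyBatch` imitates the "throw on attempt 1 at the target batch" rule with an attempts Map, and `retryWithBackoff` is a hypothetical loop whose doubling delays are made up for the demo (the SDK's actual Exponential parameters are not stated in this post). To keep it instant, the loop records the delay it would wait instead of sleeping.

```typescript
// Hypothetical model of the crash injection plus a retry loop.
// Delay values are illustrative; they are not the SDK's Exponential defaults.
const attemptCounts = new Map<number, number>();

function flakyBatch(batchIndex: number, crashAtBatch: number): string {
  const attempt = (attemptCounts.get(batchIndex) ?? 0) + 1;
  attemptCounts.set(batchIndex, attempt);
  if (batchIndex === crashAtBatch && attempt === 1) {
    throw new Error(`Batch ${batchIndex} failed (simulated)`);
  }
  return `batch ${batchIndex} ok on attempt ${attempt}`;
}

function retryWithBackoff<T>(
  fn: () => T,
  maxAttempts: number,
  baseDelayMs: number,
  plannedWaits: number[],
): T {
  for (let attempt = 1; ; attempt++) {
    try {
      return fn();
    } catch (err) {
      if (attempt >= maxAttempts) throw err;
      // Delay doubles after each failure: base, 2*base, 4*base, ...
      plannedWaits.push(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

const plannedWaits: number[] = [];
const healthy = retryWithBackoff(() => flakyBatch(0, 3), 5, 1000, plannedWaits);
const recovered = retryWithBackoff(() => flakyBatch(3, 3), 5, 1000, plannedWaits);
console.log(healthy, recovered, plannedWaits);
```

Only the target batch incurs a retry; every other batch succeeds on its first attempt and schedules no wait.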

The example runs in embedded mode (no Resonate server). Claims about cross-process replay against a shared promise store are properties of the SDK + server, not features the example itself wires up. The outer promise id is `import/${Date.now()}` (src/index.ts:41), so every invocation of the script produces a fresh root id — the example never demonstrates two runs sharing an id, and run-deduplication is a separate Resonate property the post does not claim. Database idempotency is also out of scope — processBatchChunk simulates a DB write with await sleep(150) and a counted-attempts Map (src/processor.ts:30, 68); a real implementation would need provider-side idempotency keys to make re-issued batches safe end-to-end.
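
To make the idempotency point concrete, here is one way a batch write could carry a deterministic key. This is a sketch under assumptions and none of it comes from the example repo: `FakeProvider`, `writeBatch`, `batchKey`, and the `Row` type are hypothetical, and the in-memory Map stands in for whatever deduplication a real provider offers.

```typescript
// Hypothetical sketch: a deterministic idempotency key per batch plus a fake
// provider that honors it. None of these names come from the example repo.
type Row = { id: number; value: string };

class FakeProvider {
  seenKeys = new Map<string, number>();
  rowsWritten = 0;

  // Returns the number of rows written for this key. A repeated key is a
  // no-op that returns the originally recorded count.
  writeBatch(idempotencyKey: string, rows: Row[]): number {
    const prior = this.seenKeys.get(idempotencyKey);
    if (prior !== undefined) return prior;
    this.rowsWritten += rows.length;
    this.seenKeys.set(idempotencyKey, rows.length);
    return rows.length;
  }
}

// The key must be stable across retries, so it is built only from values that
// do not change between attempts (never from timestamps or attempt counters).
function batchKey(importId: string, batchIndex: number): string {
  return `${importId}/batch/${batchIndex}`;
}

const provider = new FakeProvider();
const demoRows: Row[] = [
  { id: 1, value: "a" },
  { id: 2, value: "b" },
];
const demoKey = batchKey("import/demo", 3);

const firstWrite = provider.writeBatch(demoKey, demoRows);
const reissued = provider.writeBatch(demoKey, demoRows); // same batch, re-sent after a retry
console.log(firstWrite, reissued, provider.rowsWritten);
```

The key is only as stable as the import id it is built from, so a scheme like this presumes the caller reuses the same id when re-driving an import rather than minting a new one per invocation.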

When to reach for this pattern

  • If you're importing a large record set in fixed-size chunks and a crash anywhere during the import must not restart from record 0.
  • If each chunk is independently committable (writing a chunk does not require the others to have finished) — this is the prerequisite for treating each chunk as its own checkpoint.
  • If you want straight-line for-loop code instead of an external progress table, cursor, or resume-from-N parameter — the loop index is rediscovered by replay rather than stored by you.
  • If the batch size is a tuning knob you want to control directly — more records per batch means fewer durable promises (less storage) but more re-work on a mid-batch crash.
  • If a single retry policy per batch (the SDK's default Exponential on async functions) is acceptable; if you need a bespoke policy per batch type, pass it via ctx.run's options.
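
The batch-size trade-off in the list above can be put into numbers. The sketch below is illustrative only: `chunk` is a hypothetical helper (the example's own chunkArray is not shown in this post), and `tradeoff` assumes one checkpoint per batch and that a crash late in a batch redoes at most that one batch.

```typescript
// Hypothetical helper; the example's own chunkArray is not shown in the post.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// For a given batch size: how many checkpoints are created, and how many
// records at most are redone if a crash hits just before a batch completes.
function tradeoff(totalRecords: number, batchSize: number) {
  return {
    checkpoints: Math.ceil(totalRecords / batchSize),
    worstCaseRework: Math.min(batchSize, totalRecords),
  };
}

const sample = Array.from({ length: 10 }, (_, i) => i);
const sampleBatches = chunk(sample, 4); // three batches: 4, 4, and 2 items
const smallBatches = tradeoff(1000, 10);
const largeBatches = tradeoff(1000, 250);
console.log(sampleBatches, smallBatches, largeBatches);
```

Under these assumptions, 1000 records at a batch size of 10 produce 100 checkpoints with at most 10 records redone, while a batch size of 250 produces 4 checkpoints with up to 250 records redone.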

Sources

  • Example repo: https://github.com/resonatehq-examples/example-batch-processor-ts
  • TypeScript SDK repo: https://github.com/resonatehq/resonate-sdk-ts
  • SDK source for the primitives used:
    • resonate-sdk-ts/src/context.ts:214-215 — run declaration
    • resonate-sdk-ts/src/context.ts:283-286 — run / rpc / beginRun / beginRpc alias bindings
    • resonate-sdk-ts/src/context.ts:226-227, 545-567 — detached (separate verb, returns RFI in "detached" mode — not an alias)
    • resonate-sdk-ts/src/context.ts:45-77 — LFC<T> and its iterator (yield* returns T)
    • resonate-sdk-ts/src/context.ts:403-436 — lfc implementation, default retry policy
    • resonate-sdk-ts/src/decorator.ts:213-237 — safeGeneratorNext (rejection propagation: this.generator.throw(value.error) at :219)
  • Resonate documentation: https://docs.resonatehq.io
  • Files cited in this post:
    • src/workflow.ts:37-64 — the importRecords generator
    • src/processor.ts:52-87 — the processBatchChunk step (async function, retried by SDK)
    • src/processor.ts:30, 68, 70-73 — attempts counter, simulated DB latency, crash injection
    • src/index.ts:9-10, 40-46 — embedded client, registration, run invocation
    • package.json:11 — SDK pin
    • README.md:94 — crash-mode runtime log line