5 min read · Resonate HQ · Just published

Load balancing across worker instances in TypeScript on Resonate

How a worker group plus a `poll://any@workers` target collapses service discovery, load balancing, and crash recovery into a constructor arg and an option string.


A single worker process saturates under load and disappears when it crashes. Scaling to N instances surfaces three new problems that application code is otherwise forced to solve: service discovery, balanced dispatch, and crash recovery. Resonate moves those concerns into the server: workers register into a named group, callers target the group rather than a specific process, and the server picks an available worker and reassigns the task if that worker dies before the function returns. The example-load-balancing-ts repo demonstrates this with a worker script that registers one function and a client script that fires a Remote Function Invocation (RFI) against poll://any@workers.

The shape of the solution

import { Resonate } from "@resonatehq/sdk";
import type { Context } from "@resonatehq/sdk";
 
const resonate = new Resonate({
  url: "http://localhost:8001",
  group: "workers",
});
 
function computeSomething(context: Context, args: any): void {
  const id = args.id;
  const computeCost = args.computeCost;
  console.log(`${id} starting computation`);
  setTimeout(() => {
    console.log(`${id} computed something that cost ${computeCost} seconds`);
  }, computeCost * 1000); // Simulate computation time
  return;
}
 
resonate.register("computeSomething", computeSomething);
 
console.log("worker is running...");
// from example-load-balancing-ts/worker.ts:1-21

Every process started from this script joins the workers pool. Each instance, on boot, polls the server for work targeted at its group.

The client is symmetrical. It joins a different group ("client") and fires the RFI with a target string that names the worker group, not a specific worker:

import { Resonate } from "@resonatehq/sdk";
import { v4 as uuid } from "uuid";
 
const resonate = new Resonate({
  url: "http://localhost:8001",
  group: "client",
});
 
async function main() {
  try {
    const id = uuid();
    const computeCost = randint(1, 10);
    await resonate.beginRpc(
      id,
      "computeSomething",
      { id: id, computeCost: computeCost },
      resonate.options({
        target: "poll://any@workers",
      })
    );
    resonate.stop();
  } catch (e) {
    console.log(e);
  }
}
 
function randint(min: number, max: number): number {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}
 
main();
// from example-load-balancing-ts/client.ts:1-31

Awaiting beginRpc yields a handle once the durable promise has been created; it does not wait for the function's result, so the client process can call stop() and exit immediately after dispatching one invocation. The Resonate Server holds the durable promise until a worker in the workers group claims it.
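To make the "hold until claimed" behavior concrete, here is a minimal in-memory sketch of group-addressed dispatch. It is a toy model, not Resonate's server code: the GroupQueues class, its method names, and the first-in-first-out ordering are all assumptions made for illustration.

```typescript
// Hypothetical toy model of group-addressed dispatch; not Resonate's implementation.
type Task = { id: string; func: string; args: unknown };

class GroupQueues {
  private queues = new Map<string, Task[]>();

  // Dispatch enqueues against the group whether or not any worker is online.
  dispatch(group: string, task: Task): void {
    const queue = this.queues.get(group) ?? [];
    queue.push(task);
    this.queues.set(group, queue);
  }

  // A worker polling for its group claims the oldest waiting task, if any
  // (FIFO ordering is an assumption of this sketch).
  poll(group: string): Task | undefined {
    return this.queues.get(group)?.shift();
  }

  backlog(group: string): number {
    return this.queues.get(group)?.length ?? 0;
  }
}

const server = new GroupQueues();
server.dispatch("workers", { id: "a", func: "computeSomething", args: { computeCost: 3 } });
server.dispatch("workers", { id: "b", func: "computeSomething", args: { computeCost: 7 } });

// No worker has polled yet, so both tasks wait in the backlog.
const backlogBeforeWorkers = server.backlog("workers");

// Two worker instances come online later and each claims one task.
const claimedByWorker1 = server.poll("workers");
const claimedByWorker2 = server.poll("workers");
console.log(backlogBeforeWorkers, claimedByWorker1?.id, claimedByWorker2?.id);
```

The caller never names a worker: it only names the group, and whichever instance polls next receives the task.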

The durable primitives in play

  • Worker groups via new Resonate({ group }) — every process that constructs new Resonate({ group: "workers" }) joins the same dispatch pool. When the constructor is given a url (as in this example), the SDK opens a poll subscription against /poll/workers/<pid> on the server; with no url, no poll URL is opened. worker.ts:4-7 (worker joins workers), client.ts:4-7 (client joins client); SDK resonate-sdk-ts/src/resonate.ts:103 (group default "default"), :155-160 (poll URL is /poll/{group}/{pid} when resolvedUrl is set), :171 (no poll URL for LocalNetwork).
  • resonate.register(name, func) — registers computeSomething under its string name so any client can dispatch it without a function reference. worker.ts:19; SDK resonate-sdk-ts/src/resonate.ts:253-294.
  • resonate.beginRpc(id, name, args, options) — creates a root durable promise keyed on id, attaches resonate:target to its tags, returns a handle without blocking on the result. client.ts:13-20; SDK resonate-sdk-ts/src/resonate.ts:380-427.
  • resonate.options({ target: "poll://any@workers" }) — the routing primitive: scheme poll://, selector any, group workers. The server uses this to pick whichever process in workers claims the work first. client.ts:17-19; SDK resonate-sdk-ts/src/resonate.ts:421 (target written into resonate:target tag), :186-197 (match function used when the target is a bare group name).
  • resonate.stop() — releases the network and heartbeat so the client process can exit cleanly after dispatch without leaving the durable promise's fate dependent on the caller. client.ts:21.
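The three parts of the target string (scheme, selector, group) can be pulled apart with a few lines of string handling. The sketch below is illustrative only: parseTarget and the Target interface are hypothetical names, not part of the SDK, and the sketch handles only the full scheme://selector@group form, not the bare-group-name case mentioned above.

```typescript
// Hypothetical helper, not an SDK function: splits "poll://any@workers" into its parts.
interface Target {
  scheme: string;
  selector: string;
  group: string;
}

function parseTarget(target: string): Target {
  const schemeEnd = target.indexOf("://");
  const at = target.lastIndexOf("@");
  const selectorStart = schemeEnd + 3;
  // Require a non-empty scheme, selector, and group.
  if (schemeEnd <= 0 || at <= selectorStart || at === target.length - 1) {
    throw new Error(`unrecognized target: ${target}`);
  }
  return {
    scheme: target.slice(0, schemeEnd),
    selector: target.slice(selectorStart, at),
    group: target.slice(at + 1),
  };
}

const parsed = parseTarget("poll://any@workers");
console.log(parsed.scheme, parsed.selector, parsed.group); // poll any workers
```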

What the SDK handles vs. what you write

You write: one function, one new Resonate({ group: "workers" }), one resonate.register(...) line on the worker; and on the client one resonate.beginRpc(...) call with a target option. That is the entire surface.

The SDK handles: subscribing each worker process to its group's poll queue via /poll/{group}/{pid} so the server knows which workers are alive in workers; constructing an AsyncHeartbeat at ttl/2 for any networked instance and sending task.heartbeat messages to the server (SDK heartbeat.ts:33-49, resonate.ts:177-181); creating the durable promise on the server keyed on the caller-supplied id and writing the target into its resonate:target tag (SDK resonate.ts:421); and reconnecting a re-beginRpc call with an already-PENDING id to the existing durable promise rather than starting a parallel execution (the standard durable-promise create-on-existing-id semantics, not a separate dedup mechanism).

The Resonate Server handles: picking an available worker in the target group when a new invocation is dispatched against poll://any@workers; and releasing a claimed task back to the queue if the claiming worker's heartbeats lapse past its TTL, so another worker in the group can claim it.
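The heartbeat-and-TTL interplay can be sketched as a small lease table. This is a hypothetical model for intuition, not the server's actual data structures: LeaseTable, its method names, and the explicit now parameter (used so the example is deterministic) are assumptions.

```typescript
// Hypothetical lease model; illustrative only, not Resonate's server code.
type Claim = { workerId: string; lastHeartbeat: number; ttlMs: number };

class LeaseTable {
  private claims = new Map<string, Claim>();

  // A task can be claimed only while no other worker holds it.
  claim(taskId: string, workerId: string, ttlMs: number, now: number): boolean {
    if (this.claims.has(taskId)) return false;
    this.claims.set(taskId, { workerId, lastHeartbeat: now, ttlMs });
    return true;
  }

  // A heartbeat from the holder refreshes the lease.
  heartbeat(taskId: string, workerId: string, now: number): boolean {
    const claim = this.claims.get(taskId);
    if (claim === undefined || claim.workerId !== workerId) return false;
    claim.lastHeartbeat = now;
    return true;
  }

  // Release every claim whose last heartbeat is older than its TTL.
  releaseExpired(now: number): string[] {
    const released: string[] = [];
    this.claims.forEach((claim, taskId) => {
      if (now - claim.lastHeartbeat > claim.ttlMs) {
        this.claims.delete(taskId);
        released.push(taskId);
      }
    });
    return released;
  }
}

const leases = new LeaseTable();
const ttlMs = 10000;
leases.claim("task-1", "worker-a", ttlMs, 0);
leases.heartbeat("task-1", "worker-a", 5000); // heartbeat at ttl/2

// 7 s after the last heartbeat the lease is still valid.
const releasedEarly = leases.releaseExpired(12000);

// worker-a has crashed and sends nothing more; the lease lapses past its TTL.
const releasedLate = leases.releaseExpired(15001);
const reclaimed = leases.claim("task-1", "worker-b", ttlMs, 15001);
console.log(releasedEarly, releasedLate, reclaimed);
```

Heartbeating at half the TTL leaves room for one missed beat before the lease lapses, which is one plausible reason for the ttl/2 interval.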

Nothing in computeSomething mentions service discovery, leader election, heartbeats, locks, or recovery. Nothing in main() does either.

Failure modes covered

  • All workers down when the client dispatches. beginRpc only creates the durable promise on the server; the dispatch enqueues against the workers group and waits for any worker to come online. The client exits cleanly via resonate.stop() (client.ts:21) regardless. Starting workers later drains the backlog.
  • A worker claims a task and then dies before completing it. The SDK constructs an AsyncHeartbeat at ttl/2 for any networked instance (SDK resonate.ts:177-181), so a dead worker stops sending heartbeats; once they lapse past the TTL, the Resonate Server releases the claimed task and another worker in the workers group can claim it. The README invites this test directly: "If you kill one of the workers while it is in the middle of handling executions, you will see the executions recover on another worker." (README.md:81).
  • Burst load against a single worker. target: "poll://any@workers" routes each invocation to a worker that claims it; running additional worker instances absorbs the burst without changing application code. The README invites the test: "As you invoke more and more executions, you will see them start to spread across the multiple worker instances." (README.md:79).
  • Same id dispatched twice. Each new invocation in this example mints a fresh UUID (client.ts:11), but the durable-promise model means a re-dispatch with an already-PENDING id reconnects to the existing execution rather than starting a parallel one — useful if a higher layer retries with a stable idempotency key.
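The create-on-existing-id behavior in the last bullet can be modeled with a keyed store. The sketch below is a hypothetical in-memory stand-in, not the server's implementation: PromiseStore and its return shape are invented for illustration, and only the PENDING case described above is modeled.

```typescript
// Hypothetical in-memory model of create-on-existing-id; not Resonate's server code.
type DurablePromise = { id: string; state: "PENDING" };

class PromiseStore {
  private promises = new Map<string, DurablePromise>();

  // Returns the promise for `id`, creating it only if none exists yet.
  create(id: string): { promise: DurablePromise; created: boolean } {
    const existing = this.promises.get(id);
    if (existing !== undefined) {
      return { promise: existing, created: false };
    }
    const promise: DurablePromise = { id, state: "PENDING" };
    this.promises.set(id, promise);
    return { promise, created: true };
  }

  size(): number {
    return this.promises.size;
  }
}

const store = new PromiseStore();
const first = store.create("order-42");
const retry = store.create("order-42"); // same id, e.g. a higher layer retrying

// The retry reconnects to the same promise instead of creating a second one.
console.log(first.created, retry.created, first.promise === retry.promise, store.size());
```

With a fresh UUID per call, as in client.ts, every dispatch takes the "created" path; a stable idempotency key is what makes the second path reachable.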

A caveat the code makes explicit: computeSomething returns synchronously after scheduling a setTimeout (worker.ts:13-17). The body of the simulated computation is plain JavaScript, not a durable step. Crash recovery applies to the dispatch and claim lifecycle managed by the Resonate Server; durability inside the function body would require additional checkpoints (for example, ctx.sleep for the timer, or ctx.run around side-effecting work).
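The synchronous-return behavior is easy to see in plain TypeScript with no SDK involved. In the sketch below, scheduleAndReturn mirrors the shape of computeSomething, while waitThenFinish is a hypothetical alternative whose promise settles only when the timer fires. The function names are invented for illustration, and the sketch says nothing about how Resonate treats a returned promise.

```typescript
const events: string[] = [];

// Mirrors the shape of computeSomething: schedule a timer, then return immediately.
function scheduleAndReturn(label: string, delayMs: number): void {
  setTimeout(() => {
    events.push(`${label} finished`);
  }, delayMs);
  events.push(`${label} returned`);
}

// Alternative shape: the returned promise settles only when the timer fires,
// so a caller can observe completion.
function waitThenFinish(label: string, delayMs: number): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`${label} finished`), delayMs);
  });
}

scheduleAndReturn("job-1", 50);
// At this point the function has returned but the timer has not fired.
const eventsAtReturn = events.slice();
console.log(eventsAtReturn); // [ 'job-1 returned' ]

const pending = waitThenFinish("job-2", 50);
pending.then((message) => console.log(message)); // job-2 finished
```

From the caller's point of view, the first function is complete as soon as it returns, so anything observing that return cannot tell whether the timer callback ever ran.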

When to reach for this pattern

  • If you need to scale a single function out to many workers without writing service-discovery, load-balancer, or registry code in the application.
  • If you want clients (or other workflows) to address a pool by name (poll://any@workers) rather than holding references to specific processes or addresses.
  • If invocations are independent units of work and any worker in the pool is equally capable of running any one of them.
  • If a worker that crashes between claiming and completing a task must have that task reassigned to another worker without operator action.
  • If you want a fire-and-forget dispatch from the client (beginRpc returns a handle, does not block on the result) while keeping the durable promise on the server for the work to land against.

Sources