A single worker process saturates under load and disappears when it crashes. Scaling out to N instances raises three new problems that application code is otherwise forced to solve: service discovery, balanced dispatch, and recovery. Resonate moves all three concerns into the server: workers register into a named group, callers target the group rather than a specific process, and the server picks an available worker and reassigns the work if that worker dies. The example-load-balancing-rs repo demonstrates this with a worker binary that registers one function and a client binary that fires spawn() RPCs against poll://any@workers.
The shape of the solution
#[resonate::function]
async fn compute_something(ctx: &Context, id: String, compute_cost: u64) -> Result<()> {
    println!("{id} starting computation");
    // Durable sleep simulates a time-consuming task. It survives restarts:
    // if this worker crashes mid-sleep, another worker resumes it on the
    // remaining time.
    ctx.sleep(Duration::from_secs(compute_cost)).await?;
    println!("{id} computed something that cost {compute_cost} seconds");
    Ok(())
}
// from example-load-balancing-rs/src/bin/worker.rs:11
The function is plain async Rust. The group affiliation lives in the Resonate::new config, not in the function body:
let resonate = Resonate::new(ResonateConfig {
    url: Some("http://localhost:8001".into()),
    group: Some("workers".into()),
    ..Default::default()
});
resonate.register(compute_something).unwrap();
// from example-load-balancing-rs/src/bin/worker.rs:27
Every process started from this binary joins the workers pool. Run three, run thirty; the only thing each instance does on boot is poll the server for work targeted at its group.
The client is symmetrical: it joins a different group ("client") and fires the RPC with a target string that names the worker group, not a specific worker:
let _handle = resonate
    .rpc::<_, ()>(&id, "compute_something", (id.clone(), compute_cost))
    .target("poll://any@workers")
    .spawn()
    .await
    .expect("rpc spawn failed");
// from example-load-balancing-rs/src/bin/client.rs:23
spawn() returns a handle without awaiting the result, so the client process exits immediately after dispatching one invocation. The Resonate Server holds the work until a worker in the workers group claims it.
The durable primitives in play
- Worker groups via ResonateConfig { group: ... }: every process that starts with group: Some("workers".into()) joins the same dispatch pool. src/bin/worker.rs:27-31 (worker joins workers), src/bin/client.rs:9-13 (client joins client); SDK at resonate-sdk-rs/resonate/src/resonate.rs:121 (group resolution) and :215 (group used as the default target resolver).
- #[resonate::function] + resonate.register(compute_something): registers compute_something under its name so any client can dispatch it by string. src/bin/worker.rs:11, src/bin/worker.rs:33.
- resonate.rpc(id, name, args).target("poll://any@workers").spawn(): durable, fire-and-forget RPC. It creates a root promise keyed on id, routes the dispatch to any worker in the named group, and returns a handle without blocking on the result. src/bin/client.rs:23-28; SDK at resonate-sdk-rs/resonate/src/resonate.rs:361 (rpc), :940 (target on ResRpcTask), :948 (spawn).
- ctx.sleep(Duration): durable timer promise used here to simulate a multi-second compute. It survives worker restarts; on resumption another worker waits out only the remaining time. src/bin/worker.rs:17.
- poll://any@workers target string: the routing primitive, made up of scheme poll://, selector any, and group workers. The server uses this to pick whichever process in workers claims the work first. src/bin/client.rs:25.
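To make the three parts of the target string concrete, here is a minimal sketch that splits a string shaped like poll://any@workers into scheme, selector, and group. The Target type and parse_target function are hypothetical names invented for this illustration; they are not part of the Resonate SDK, and the SDK or server may parse and validate targets differently.

```rust
/// Parsed form of a target such as "poll://any@workers".
/// Hypothetical type for illustration; not part of the Resonate SDK.
#[derive(Debug, PartialEq, Eq)]
pub struct Target {
    pub scheme: String,
    pub selector: String,
    pub group: String,
}

/// Splits `scheme://selector@group` into its three parts.
/// Returns None when a separator is missing or any part is empty.
pub fn parse_target(raw: &str) -> Option<Target> {
    // Everything before "://" is the scheme.
    let (scheme, rest) = raw.split_once("://")?;
    // The remainder is "selector@group".
    let (selector, group) = rest.split_once('@')?;
    if scheme.is_empty() || selector.is_empty() || group.is_empty() {
        return None;
    }
    Some(Target {
        scheme: scheme.to_string(),
        selector: selector.to_string(),
        group: group.to_string(),
    })
}
```

Under these assumptions, parse_target("poll://any@workers") yields scheme "poll", selector "any", and group "workers", while a bare "workers" is rejected because it names no scheme or selector.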
What the SDK handles vs. what you write
You write: one #[resonate::function], one Resonate::new with group: "workers", one resonate.register(...) line on the worker; and on the client one resonate.rpc(...).target("poll://any@workers").spawn() call. That is the entire surface.
The SDK and Resonate Server handle: subscribing each worker process to the workers group's poll queue, persisting every invocation as a durable promise keyed on the caller-supplied id, picking an available worker when a new invocation is dispatched, transferring ownership of an in-flight execution to another worker in the group when the original dies, replaying durable checkpoints (here, the ctx.sleep timer) on the recovering worker so it resumes from the remaining wait rather than from the top of the function, and deduplicating concurrent dispatches that share an id so a retried client doesn't double-run the work. Nothing in compute_something mentions service discovery, leader election, heartbeats, locks, or recovery, and nothing in main() does either.
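The bookkeeping described above can be pictured with a toy in-memory model. This is only a sketch under simplifying assumptions: GroupQueue, State, and their methods are hypothetical names, everything lives in one process, and the real Resonate Server's storage, claiming, and failure detection are not shown in the source and will differ.

```rust
use std::collections::{HashMap, VecDeque};

/// Where an invocation currently lives in this toy model.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum State {
    Queued,
    /// Holds the id of the worker that currently owns the invocation.
    Claimed(String),
}

/// In-memory stand-in for one group's dispatch queue.
/// Hypothetical; not the Resonate Server's actual design.
#[derive(Default)]
pub struct GroupQueue {
    states: HashMap<String, State>,
    pending: VecDeque<String>,
}

impl GroupQueue {
    /// Records an invocation. Returns false if the id is already known,
    /// in which case nothing new is enqueued (deduplication by id).
    pub fn dispatch(&mut self, id: &str) -> bool {
        if self.states.contains_key(id) {
            return false;
        }
        self.states.insert(id.to_string(), State::Queued);
        self.pending.push_back(id.to_string());
        true
    }

    /// Hands the oldest queued invocation to whichever worker polls first.
    pub fn claim(&mut self, worker: &str) -> Option<String> {
        let id = self.pending.pop_front()?;
        self.states
            .insert(id.clone(), State::Claimed(worker.to_string()));
        Some(id)
    }

    /// Puts everything owned by a dead worker back on the queue.
    pub fn worker_died(&mut self, worker: &str) {
        let mut orphaned: Vec<String> = self
            .states
            .iter()
            .filter(|(_, s)| matches!(s, State::Claimed(w) if w.as_str() == worker))
            .map(|(id, _)| id.clone())
            .collect();
        // HashMap iteration order is arbitrary; sort so requeue order is stable.
        orphaned.sort();
        for id in orphaned {
            self.states.insert(id.clone(), State::Queued);
            self.pending.push_back(id);
        }
    }

    /// Current state of an invocation, if the id has been dispatched.
    pub fn state(&self, id: &str) -> Option<&State> {
        self.states.get(id)
    }
}
```

In this model a second dispatch with the same id is a no-op, and an invocation claimed by a worker that later dies becomes claimable again by any other worker polling the same group.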
Failure modes covered
- One worker crashes mid-execution. Killing a worker while compute_something is in its ctx.sleep releases the in-flight durable promise back to the workers group's queue; another worker claims it and resumes from the remaining sleep duration, not from the top of the function. The README states this explicitly: "If you kill one of the workers while it is in the middle of handling executions, you will see the executions recover on another worker. The durable ctx.sleep survives the crash, so the recovered execution waits out only the remaining time." (README.md:76).
- All workers down when the client dispatches. spawn() only creates the durable promise on the server; the dispatch enqueues against the workers group and waits for any worker to come online. Restarting workers is enough to drain the backlog. The client exits regardless (src/bin/client.rs:31: resonate.stop().await.ok();).
- A single worker overloaded by burst traffic. target("poll://any@workers") routes each invocation to a worker that has capacity to claim it; running additional worker instances absorbs the burst without changing application code. The README invites the test directly: "As you invoke more and more executions, you will see them start to spread across the multiple worker instances." (README.md:74).
- Same id dispatched twice. The RPC is keyed on the promise id; a second rpc with an already-PENDING id attaches to the existing execution rather than starting a parallel one. In this example the client mints a fresh id per run (uuid::Uuid::new_v4().to_string() in src/bin/client.rs:15), so two runs of the client are two distinct executions. Deduplication applies when a higher layer retries with the same id.
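One way to see why a recovered execution waits out only the remaining time is to model the timer as an absolute deadline recorded once, when the sleep is first requested. The sketch below is an assumption about the general technique, not a description of how Resonate stores timers; DurableTimer and its methods are hypothetical names, and times are plain seconds on whatever clock the store would use.

```rust
use std::time::Duration;

/// A timer recorded as an absolute deadline.
/// Hypothetical model; not the Resonate SDK's timer type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DurableTimer {
    /// Deadline in seconds on the store's clock.
    pub fires_at: u64,
}

impl DurableTimer {
    /// Created once, when the sleep is first requested at time `now`.
    pub fn start(now: u64, sleep_for: Duration) -> Self {
        DurableTimer {
            fires_at: now + sleep_for.as_secs(),
        }
    }

    /// How long a worker that picks the execution up at `now` still has to wait.
    /// Saturates at zero if the deadline has already passed.
    pub fn remaining(&self, now: u64) -> Duration {
        Duration::from_secs(self.fires_at.saturating_sub(now))
    }
}
```

With a 10-second sleep started at t=100, a worker that takes over at t=104 computes 6 seconds remaining, and one that takes over after the deadline computes zero and continues immediately. Because the deadline is stored rather than the duration, the takeover never restarts the clock.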
When to reach for this pattern
- If you need to scale a single function out to many workers without writing service-discovery, load-balancer, or registry code in the application.
- If a worker that crashes mid-task must have its in-flight work recovered onto another worker without re-executing already-completed durable steps.
- If you want clients (or other workflows) to address a pool by name (poll://any@workers) rather than holding references to specific processes or addresses.
- If invocations are independent units of work (here, a single compute_something per RPC) and any worker in the pool is equally capable of running any one of them.
- If you want a fire-and-forget dispatch pattern on the client (.spawn() returns a handle, does not block on the result) while still getting durable execution on the worker side.
Sources
- Example repo: github.com/resonatehq-examples/example-load-balancing-rs
- Rust SDK repo: github.com/resonatehq/resonate-sdk-rs
- SDK primitives cited:
  - resonate/src/resonate.rs: Resonate::new, ResonateConfig.group, Resonate::register, Resonate::rpc, ResRpcTask::target, ResRpcTask::spawn
  - resonate/src/context.rs: Context::sleep
- Docs:
