An HTTP API that accepts long-running work needs to survive process restarts and worker crashes without losing the request. The shape of the Resonate solution is to split the system in two: a stateless FastAPI gateway that dispatches a durable promise to a separate worker group and returns immediately, plus a polling endpoint that reads completion state from the Resonate Server rather than from gateway memory. The example-async-http-api-py repo shows this with FastAPI and the Python SDK in roughly 80 lines across two files.
The shape of the solution
    # Initialize Resonate - for production, configure with external store
    resonate = Resonate().remote(group="gateway")

    @app.post("/begin")
    def begin(data=None, id=None):
        # IMPORTANT: Provide your own ID for deduplication and retries
        # Without a client-provided ID, retries will create duplicate work
        if id is None:
            id = str(uuid.uuid4())

        # Set reasonable defaults for your use case
        if data is None:
            data = {"foo": "bar"}

        # This starts durable execution remotely on any node registered under the worker group - the function will complete even if this process dies
        handle = resonate.options(target="poll://any@worker").begin_rpc(func="foo", id=id, data=data)

        return {
            "promise": handle.id,
            "status": "pending",
            "wait": f"/wait?id={handle.id}"
        }

    # from example-async-http-api-py/main.py:10-32

The gateway never awaits the workflow. begin_rpc returns a Handle as soon as the durable promise is created on the Resonate Server, and the HTTP response goes back to the client with a promise id and a polling URL.
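From the client's side, that flow is one POST followed by repeated GETs against the returned polling URL. A minimal sketch of the loop, under two assumptions: the /wait body carries a "status" field that reads "pending" until the work finishes, and the HTTP call sits behind a hypothetical fetch callable (not part of the example repo) so the loop can be exercised without a server.

```python
import time
from typing import Callable


def poll_until_done(
    fetch: Callable[[str], dict],
    wait_url: str,
    interval: float = 0.5,
    max_interval: float = 8.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll wait_url until its status is no longer "pending".

    fetch is a hypothetical callable that performs the GET and returns the
    decoded JSON body; plug in urllib, requests, or httpx as needed.
    """
    waited = 0.0
    while True:
        body = fetch(wait_url)
        if body.get("status") != "pending":
            return body
        if waited >= timeout:
            raise TimeoutError(f"{wait_url} still pending after {waited:.1f}s")
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)  # exponential backoff, capped
```

Because each poll is an independent request, it does not matter which gateway instance answers it or whether the gateway restarted between polls.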
The worker side is a single registered function:
    # Initialize Resonate under worker group - for production, configure with external store
    resonate = Resonate().remote(group="worker")

    # Register your durable functions with @resonate.register
    # IMPORTANT: All parameters must be serializable
    @resonate.register
    def foo(context: Context, data):
        # Add your processing, external API calls, database operations, etc.
        # IMPORTANT: Return values must be serializable
        print("resolved at worker node")
        return {"result": f"Processed: {data}", "timestamp": time.time()}

    if __name__ == "__main__":
        resonate.start()
        Event().wait()

    # from example-async-http-api-py/worker.py:7-21

The gateway runs in group "gateway" and the worker in group "worker"; the target="poll://any@worker" option on begin_rpc routes the work to any process in the worker group (main.py:11, worker.py:8, main.py:26).
The durable primitives in play
- resonate.options(target=...).begin_rpc(id, func, *args, **kwargs): creates a durable promise with the supplied id, enqueues the function-name dispatch for the target group, and returns a Handle without awaiting completion. id is the first positional parameter and func is the second; the example passes them as keyword arguments, begin_rpc(func="foo", id=id, data=data) (main.py:26). Deduplication is keyed on id: a second call with the same id subscribes to the existing promise rather than starting new work. SDK definition at resonate-sdk-py v0.6.2, resonate/resonate.py:477.
- resonate.get(id): re-attaches to an existing durable promise from a cold start. The /wait handler uses this so the gateway holds no in-memory state between requests. main.py:38; SDK definition at resonate-sdk-py v0.6.2, resonate/resonate.py:525.
- handle.done(): non-blocking check on completion state; returns bool. main.py:41; SDK definition at resonate-sdk-py v0.6.2, resonate/models/handle.py:18.
- handle.result(): returns the resolved value or raises the rejection error, so the /wait handler can branch on resolved vs not-found via try/except. main.py:42; SDK definition at resonate-sdk-py v0.6.2, resonate/models/handle.py:21.
- @resonate.register: registers a function under its own name so the gateway can dispatch it by the string "foo". worker.py:12; SDK definition at resonate-sdk-py v0.6.2, resonate/resonate.py:270 (decorator-form overloads).
- Resonate.remote(group=...): classmethod factory that wires up the remote store and poller against the Resonate Server and tags this process with a worker group. main.py:11, worker.py:8; SDK definition at resonate-sdk-py v0.6.2, resonate/resonate.py:173.
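The three read-side primitives compose into the /wait handler. The sketch below restates that branching as a plain function so it runs without FastAPI or a Resonate Server. It is not a copy of main.py: get_handle stands in for resonate.get, StubHandle is a hypothetical stand-in for the SDK's Handle, and the "resolved" response shape is an assumption modeled on the statuses this article quotes.

```python
from typing import Any, Callable


class StubHandle:
    """Hypothetical stand-in exposing the two Handle methods used here."""

    def __init__(self, done: bool, value: Any = None) -> None:
        self._done = done
        self._value = value

    def done(self) -> bool:
        return self._done

    def result(self) -> Any:
        return self._value


def wait(promise_id: str, get_handle: Callable[[str], Any]) -> tuple[int, dict]:
    """Return (http_status, body) for a poll on promise_id."""
    try:
        handle = get_handle(promise_id)  # resonate.get(id) in the real gateway
        if not handle.done():            # non-blocking completion check
            return 200, {"status": "pending"}
        return 200, {"status": "resolved", "result": handle.result()}
    except Exception:
        # Unknown id or any other lookup failure maps to a 404.
        return 404, {"detail": f"{promise_id} not found"}
```

Note that in this sketch an exception raised by result() also lands in the except branch, so a handler that needs to distinguish a rejected promise from an unknown id would have to catch the two cases separately.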
What the SDK handles vs. what you write
You write: the FastAPI routes, the function body (foo), the choice of promise id for deduplication, and the group strings ("gateway" / "worker").
The SDK and Resonate Server handle: persisting the durable promise the moment begin_rpc returns, routing the dispatch to a process in the worker group via the poll://any@worker target, redispatching to another worker if the first one disappears, surfacing completion state to any process that later calls resonate.get(id), and decoding the resolved value through handle.result(). The gateway process never tracks which worker took the job, and the worker process never tracks which gateway requested it — both sides exchange only (id, function_name, kwargs) through the server.
The worker function in this minimal example is a single straight-line block — it does not use ctx.run(...). Real workloads would wrap each side-effecting step (DB write, external API call, long computation) in a durable step call so that a crash mid-function resumes from the last successful step rather than from the top. The example demonstrates the dispatch-and-poll shell; per-step checkpointing is the next layer.
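To make the difference concrete, here is a toy model of per-step checkpointing. It is not the Resonate SDK: Journal and its run method are hypothetical names, and an in-memory dict stands in for durable storage. It only illustrates why recording each step's result lets a retry skip work that already succeeded.

```python
from typing import Any, Callable


class Journal:
    """Hypothetical step journal; a dict stands in for durable storage."""

    def __init__(self) -> None:
        self.results: dict[str, Any] = {}

    def run(self, step: str, fn: Callable[[], Any]) -> Any:
        # Replay the recorded result if this step already succeeded.
        if step in self.results:
            return self.results[step]
        value = fn()
        self.results[step] = value  # checkpoint only after success
        return value


calls: list[str] = []  # records every time a step body actually executes


def charge() -> str:
    calls.append("charge")
    return "charged"


def make_notifier(fail: bool) -> Callable[[], str]:
    def notify() -> str:
        calls.append("notify")
        if fail:
            raise RuntimeError("simulated crash")
        return "notified"
    return notify


def workflow(journal: Journal, fail_notify: bool) -> list[str]:
    first = journal.run("charge", charge)
    second = journal.run("notify", make_notifier(fail_notify))
    return [first, second]


journal = Journal()
try:
    workflow(journal, fail_notify=True)  # first attempt dies in step two
except RuntimeError:
    pass
outcome = workflow(journal, fail_notify=False)  # retry with the same journal
```

On the retry, the charge step is replayed from the journal instead of executing again, so calls records it once while the failed notify step runs a second time.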
Failure modes covered
- Worker crashes mid-execution. The durable promise lives on the Resonate Server, not in worker memory. When the crashed worker stops heartbeating, the server re-dispatches the work to another process in the worker group. The dispatch target poll://any@worker (main.py:26) is what makes load-balanced redelivery work.
- Gateway crashes between /begin and the client's first /wait poll. The gateway holds no in-memory map of id to handle. On restart, /wait calls resonate.get(id) (main.py:38) and reads the current state from the server.
- Client retries /begin with the same id. Resonate deduplicates by promise id. The second begin_rpc call attaches to the in-flight (or already-resolved) promise rather than starting duplicate work. The id query parameter on /begin (main.py:15, main.py:18-19) is what gives clients a stable idempotency key; if it's omitted, the gateway generates a uuid.uuid4() for fresh work.
- Client polls /wait against a workflow that hasn't completed yet. handle.done() returns False without blocking the gateway thread (main.py:41). The handler responds with {"status":"pending"} and frees the connection.
- /wait called with an unknown id, or any other lookup failure. resonate.get(id) raises; the try/except block at main.py:36-57 returns a 404 with {"detail":"<id> not found"}.
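The retry case in the list above depends on create-or-attach semantics keyed by promise id. A toy in-memory model of that behavior follows. It is purely illustrative: PromiseStore, Promise, and their methods are hypothetical names, and the real server persists this state rather than holding it in a dict.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Promise:
    id: str
    func: str
    state: str = "pending"
    value: Any = None


class PromiseStore:
    """Hypothetical stand-in for the server's promise table."""

    def __init__(self) -> None:
        self._promises: dict[str, Promise] = {}
        self.dispatched: list[str] = []  # ids for which work was enqueued

    def begin(self, id: str, func: str) -> Promise:
        existing = self._promises.get(id)
        if existing is not None:
            return existing            # attach: no second dispatch
        promise = Promise(id=id, func=func)
        self._promises[id] = promise
        self.dispatched.append(id)     # create: enqueue exactly once
        return promise

    def resolve(self, id: str, value: Any) -> None:
        promise = self._promises[id]
        promise.state, promise.value = "resolved", value


store = PromiseStore()
first = store.begin("order-42", "foo")
retry = store.begin("order-42", "foo")   # client retry with the same id
store.resolve("order-42", {"ok": True})
late = store.begin("order-42", "foo")    # retry after completion
```

Both retries return the original promise, and the dispatch list holds a single entry, which is the property that makes a client-supplied id safe to resend.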
When to reach for this pattern
- If you're exposing an HTTP endpoint that triggers work taking longer than a reasonable HTTP timeout (seconds to hours) and the client should not hold a connection open.
- If the same logical request might be sent more than once (network retries, client-side retry loops) and you want the second request to attach to the first rather than duplicate the work.
- If the HTTP frontend and the work-doing process should scale and crash independently — gateway pods can be replaced without dropping in-flight work, and worker pods can be replaced without dropping in-flight requests.
- If a status/result endpoint must keep working across full restarts of every process in the system — recovery cannot depend on any specific process being up.
- If you eventually want each step inside the worker function to be individually checkpointed (DB writes, external API calls, long computations), the same registration shape extends with durable step calls inside foo.
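For the duplicate-request case in particular, the client needs an id that is identical on every attempt. One option, sketched here as an assumption rather than anything the example repo does, is to derive the id from the request payload. request_id and its prefix parameter are hypothetical names.

```python
import hashlib
import json


def request_id(payload: dict, prefix: str = "job") -> str:
    """Derive a stable promise id from a JSON-serializable payload."""
    # sort_keys and fixed separators make the encoding independent of key order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:16]}"
```

The trade-off is that two intentionally separate requests with identical payloads collapse into one promise. If that is not wanted, include a client-chosen nonce in the payload before hashing.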
Sources
- Example repo: github.com/resonatehq-examples/example-async-http-api-py
- Python SDK repo: github.com/resonatehq/resonate-sdk-py
- SDK primitives cited (pinned to v0.6.2, matching this repo's uv.lock):
  - resonate/resonate.py @ v0.6.2: Resonate.remote (L173), Resonate.register (L270), Resonate.begin_rpc (L477), Resonate.get (L525)
  - resonate/models/handle.py @ v0.6.2: Handle.done (L18), Handle.result (L21)
- Docs:
