A multi-service request flow that crosses process boundaries needs to survive any node crashing mid-flight without re-doing completed work. Resonate models the entire cross-service call graph as a durable promise rooted at the gateway, so each remote invocation is a checkpoint and any service in the same group can resume an in-flight call. This example exposes three HTTP routes on a Flask gateway and routes each to one of three durable remote-invocation shapes — ctx.rfc (await chain), ctx.detached (fire-and-forget chain), and ctx.rfi (fan-out with promises) — backed by nine Python services.
The shape of the solution
The gateway is the ephemeral-to-durable boundary. Every Flask route uses resonate.options(target=...).rpc(promise_id, func, *args) with a hard-coded promise_id and a target of the form poll://<group>:

    @app.route("/await-chain", methods=["POST"])
    def await_chain_route_handler():
        try:
            print("running await_chain_route_handler")
            promise_id = "await-chain"
            handle = resonate.options(target="poll://service-a").rpc(promise_id, "foo")
            print("waiting on result")
            message = handle.result()
            return jsonify({"message": message}), 200
        except Exception as e:
            print(e)
            return jsonify({"error": str(e)}), 500
    # from example-async-rpc-py/src/gateway.py:20

Inside the durable call graph, the three flows differ only in which Context method they call. The fan-out workflow uses two rfi calls back-to-back to overlap the remote invocations of rax and dop:

    @resonate.register
    def zim(ctx, arg):
        print("running function zim")
        promise_bar = yield ctx.rfi("rax").options(target="poll://service-h")
        promise_baz = yield ctx.rfi("dop").options(target="poll://service-i")
        result_bar = yield promise_bar
        result_baz = yield promise_baz
        return result_bar + result_baz + arg
    # from example-async-rpc-py/src/service_g.py:22

The durable primitives in play
- resonate.options(target=...).rpc(promise_id, func, *args): the ephemeral-to-durable entry point. Returns a handle whose .result() blocks on the entire durable call graph. The static promise_id ("await-chain", "detached-chain", "fan-out-workflow") means a re-sent request reconnects to the same in-flight invocation. gateway.py:25, gateway.py:39, gateway.py:52.
- ctx.rfc(func): Remote Function Call. The generator yields and is paused until the remote function returns, then receives the result inline. Used to chain foo → bar → baz. service_a.py:22, service_b.py:22.
- ctx.detached(func, arg): invokes a remote function in a new Call Graph detached from the caller's. Returns an RFI (mode="detached") whose yield resolves to a durable promise; the caller can either yield on that promise to retrieve the callee's return value (the same shape as ctx.rfi) or discard it for fire-and-forget. Used to chain qux → quz → cog, fire-and-forget style: service_d.py:21 discards the promise (unassigned yield); service_e.py:22 captures it as result but never yields on it (so result is a Promise handle, and return result + 1 would only matter if anyone awaited quz's return; nothing does). The final chain value is therefore printed by cog itself on service-f (service_f.py:22) rather than returned up the chain.
- ctx.rfi(func): Remote Function Invocation. Returns a durable promise that can be yielded on at any later point in the generator. zim issues two rfi calls back-to-back and then yields on both promises, producing in-parallel execution of rax and dop. service_g.py:25, service_g.py:26.
- Application Node identity (group + id): each service constructs Resonate with a hard-coded app_node_group (e.g., "service-a") and a fresh uuid.uuid4() app_node_id so multiple instances can share a group for anycast routing via target="poll://<group>". service_a.py:8–16 (pattern repeated across service_b..service_i). The gateway is the exception: gateway.py:7–8 hard-codes both app_node_id = "gateway" and app_node_group = "gateway" because it is a single unicast node that initiates calls but does not receive them.
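
The difference between the rfc and rfi yield shapes can be sketched with a toy, in-process driver. This is not the Resonate SDK: Rfc, Rfi, ToyPromise, REMOTE, and drive are hypothetical names invented for illustration, and the "remote" functions are plain local lambdas. The sketch only records the order in which invocations and awaits happen for each shape.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for what a durable generator yields.
# None of these classes belong to the Resonate SDK.
@dataclass(frozen=True)
class Rfc:
    func: str  # "call and wait": the yield evaluates to the result


@dataclass(frozen=True)
class Rfi:
    func: str  # "invoke now, wait later": the yield evaluates to a promise


@dataclass(frozen=True)
class ToyPromise:
    func: str  # yielding this later evaluates to the result


# Pretend remote services; in this toy they are local functions.
REMOTE = {"rax": lambda: 1, "dop": lambda: 2}


def drive(gen, log):
    """Run a generator to completion, answering each yielded command."""
    to_send = None
    while True:
        try:
            cmd = gen.send(to_send)
        except StopIteration as done:
            return done.value
        if isinstance(cmd, Rfc):
            log.append(f"invoke {cmd.func}")
            log.append(f"await {cmd.func}")
            to_send = REMOTE[cmd.func]()
        elif isinstance(cmd, Rfi):
            log.append(f"invoke {cmd.func}")
            to_send = ToyPromise(cmd.func)
        elif isinstance(cmd, ToyPromise):
            # The toy computes the value lazily here; a real system would
            # already have the remote work in flight since the invoke.
            log.append(f"await {cmd.func}")
            to_send = REMOTE[cmd.func]()
        else:
            raise TypeError(f"unexpected command: {cmd!r}")


def sequential(arg):
    a = yield Rfc("rax")  # paused until rax returns
    b = yield Rfc("dop")  # dop is not invoked until rax is done
    return a + b + arg


def fan_out(arg):
    p1 = yield Rfi("rax")  # both invocations are issued up front
    p2 = yield Rfi("dop")
    a = yield p1           # then each promise is awaited
    b = yield p2
    return a + b + arg


seq_log, fan_log = [], []
seq_result = drive(sequential(10), seq_log)
fan_result = drive(fan_out(10), fan_log)
print(seq_log)
print(fan_log)
```

Both workflows return the same value; only the ordering differs. In the fan-out log both invocations appear before the first await, which is the window in which two remote services can work at the same time.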
What the SDK handles vs. what you write
You write three things: the Flask route handlers that call resonate.options(target=...).rpc(...), the generator functions registered with @resonate.register, and the per-node identity (app_node_group, app_node_id). Each durable function is a plain Python generator that yields on a Context method to invoke remote work — there is no transport code, no message broker setup, no per-call retry wrapper, no shared correlation IDs to thread through requests, and no try/except inside durable functions (README.md:71).
The SDK handles the rest: durably persisting the call graph and per-invocation arguments to the Resonate Server, routing each call to a node in the target group via the poll://<group> address, awaiting and resolving cross-process promises, replaying the generator from the last checkpoint after a crash, automatically retrying functions that raise, and reconnecting a re-sent top-level invocation (same promise_id) to the existing in-flight durable promise.
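
Replaying a generator from its last checkpoint can be illustrated with a minimal in-memory model. The sketch below is an assumption-laden toy, not how the SDK is implemented: the journal is a plain Python list rather than state persisted to the Resonate Server, and run_with_journal, Crash, STEPS, and the step names "a", "b", "c" are all hypothetical.

```python
class Crash(Exception):
    """Simulates the process dying mid-workflow."""


# Pretend remote steps; each returns a value when executed.
STEPS = {"a": lambda: 1, "b": lambda: 2, "c": lambda: 3}


def workflow():
    x = yield "a"
    y = yield "b"
    z = yield "c"
    return x + y + z


def run_with_journal(make_gen, journal, executed, crash_before=None):
    """Replay a generator against a journal of completed step results.

    Steps already in the journal are answered from it without re-running;
    only steps past the end of the journal are executed and recorded.
    """
    gen = make_gen()
    to_send = None
    step = 0
    while True:
        try:
            name = gen.send(to_send)
        except StopIteration as done:
            return done.value
        if step < len(journal):
            to_send = journal[step]   # checkpoint hit: reuse the stored result
        else:
            if name == crash_before:
                raise Crash(name)     # die before this step runs
            executed.append(name)     # the step's side effect happens here
            to_send = STEPS[name]()
            journal.append(to_send)   # checkpoint the result
        step += 1


journal, executed = [], []

# First attempt: the process "crashes" before step c runs.
try:
    run_with_journal(workflow, journal, executed, crash_before="c")
except Crash:
    pass
journal_after_crash = list(journal)

# Second attempt (a new node picking the work up): a and b are replayed
# from the journal, and only c is actually executed.
result = run_with_journal(workflow, journal, executed)
print(journal_after_crash, executed, result)
```

The executed list ends with each step appearing once, even though the generator body ran from the top twice. That is the property the text describes as not re-doing completed work.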
Failure modes covered
- A service node crashes mid-call. Each ctx.rfc/ctx.rfi/ctx.detached call is a durable checkpoint. When a new node joins the same group, it claims the durable promise and resumes from the last completed step. Documented in README.md:82–99; demonstrable by injecting yield ctx.sleep(10) into any function (README.md:91–96) and killing the process during the sleep.
- The gateway crashes after handing off to rpc(...). The durable call graph continues to make progress in the service groups. Because the route handler uses a static promise_id (gateway.py:24, :38, :51), a re-sent cURL request invokes rpc with the same id and the SDK returns a handle attached to the existing in-flight promise rather than starting a new one (README.md:98–99).
- A durable function raises. Inside the durable call graph the SDK catches the error and retries the function automatically, so no try/except is needed in foo, bar, baz, qux, quz, cog, zim, rax, or dop (README.md:71). The only try/except blocks in the codebase are in the three Flask route handlers (gateway.py:22, :36, :49), where the ephemeral HTTP request can't be resumed.
- The Flask process crashes before the detached chain completes. The detached chain does not depend on the gateway for completion: qux detaches quz, quz detaches cog, and cog prints the value on service-f (service_f.py:22). The /detached-chain handler discards the rpc handle without calling .result() (gateway.py:39–40) and returns "detached-chain started" immediately, so the gateway is not in the result path.
- Two service-a nodes are running at the same time. The poll://service-a target is anycast: only one node in the group claims a given invocation (README.md:80). Adding nodes to a group is the horizontal-scaling and high-availability story; killing nodes is the recovery story.
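
Two of the behaviours above, reconnecting by static promise_id and single-claim anycast within a group, can be modelled with a small in-memory store. This is a sketch under stated assumptions: ToyPromiseStore and its create_or_get, claim, and release methods are hypothetical, the node ids are made up, and the real server's claiming mechanics are not modelled here.

```python
class ToyPromiseStore:
    """In-memory stand-in for a durable promise store (hypothetical)."""

    def __init__(self):
        self._promises = {}

    def create_or_get(self, promise_id, func):
        """Return (promise, created); a repeated id attaches to the existing promise."""
        if promise_id in self._promises:
            return self._promises[promise_id], False
        promise = {"id": promise_id, "func": func, "claimed_by": None}
        self._promises[promise_id] = promise
        return promise, True

    def claim(self, promise_id, node_id):
        """Only one node may hold a given invocation at a time."""
        promise = self._promises[promise_id]
        if promise["claimed_by"] is None:
            promise["claimed_by"] = node_id
            return True
        return False

    def release(self, promise_id, node_id):
        """Model the owning node going away so that another node can claim."""
        promise = self._promises[promise_id]
        if promise["claimed_by"] == node_id:
            promise["claimed_by"] = None


store = ToyPromiseStore()

# A re-sent request with the same static id does not start a second invocation.
first, created_first = store.create_or_get("await-chain", "foo")
second, created_second = store.create_or_get("await-chain", "foo")

# Two nodes in the same group race for the invocation; exactly one wins.
claims = [store.claim("await-chain", node) for node in ("service-a-1", "service-a-2")]

# The winner "crashes"; the surviving node can now take over.
store.release("await-chain", "service-a-1")
takeover = store.claim("await-chain", "service-a-2")
print(created_first, created_second, claims, takeover)
```

The same lookup-by-id step serves both the re-sent gateway request and the recovering service node, which is why a hard-coded promise_id is enough to make the HTTP entry point resumable in this example.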
When to reach for this pattern
- If you have an HTTP gateway that fans work out to multiple downstream services and you want crashes anywhere in the chain to be recoverable, not lost.
- If you need a request flow that can be resumed by a re-sent client request (same promise_id in, same durable promise out).
- If the work fans out in parallel across services and the caller needs to combine results, use ctx.rfi to grab promises up front and yield each promise when you need its value.
- If the work is a chain where each step hands off to the next without anyone waiting on the tail, use ctx.detached and let the final node print, persist, or notify.
- If the work is a synchronous service-to-service chain where each step needs the next step's return value, use ctx.rfc.
- If you are running multiple instances of the same service for HA and want anycast routing without standing up a load balancer, set group on each node and target poll://<group>.
Sources
- Example repo: https://github.com/resonatehq-examples/example-async-rpc-py
- Resonate Python SDK: https://github.com/resonatehq/resonate-sdk-py
- SDK version pinned: resonate-sdk>=0.6.7 (pyproject.toml:10).
- README (Async RPC explanation): https://github.com/resonatehq-examples/example-async-rpc-py/blob/main/README.md
- Resonate docs (Async RPC pattern): https://docs.resonatehq.io/get-started/examples/async-rpc
- Resonate docs (Python SDK guide): https://docs.resonatehq.io/develop/python
