A single order.confirmed event has to notify the customer over four independent channels — email, SMS, Slack, push — and total time should be the slowest channel, not the sum, while a transient failure on any one channel must not cause the others to re-send. The Resonate shape of the solution is to start each channel as a Remote Function Invocation (ctx.rfi) keyed by a deterministic per-channel id, then yield each handle in turn to fan back in; every branch is a durable promise checkpointed independently. The repo pins resonate-sdk>=0.6.7 and runs against resonate serve from resonatehq/resonate-legacy-server (README:48-50), so the surface shown here is the pre-0.10 Python SDK API. The example demonstrates the workflow under the happy path and under a forced first-attempt failure on the push channel.
The shape of the solution
def notify_all(
ctx: Context,
event: dict[str, Any],
simulate_crash: bool,
) -> Generator[Any, Any, dict[str, Any]]:
start = time.time()
# Fan-out: start all 4 channels simultaneously.
# ctx.rfi(...) returns a handle immediately — no blocking.
email_p = yield ctx.rfi(send_email, event).options(
id=f"{ctx.id}.email",
)
sms_p = yield ctx.rfi(send_sms, event).options(
id=f"{ctx.id}.sms",
)
slack_p = yield ctx.rfi(send_slack, event).options(
id=f"{ctx.id}.slack",
)
push_p = yield ctx.rfi(send_push, event, simulate_crash).options(
id=f"{ctx.id}.push",
)
# Fan-in: await each result.
# If push fails and retries, the other channels are already checkpointed.
email = yield email_p
sms = yield sms_p
slack = yield slack_p
push = yield push_p
results = [email, sms, slack, push]
return {
"order_id": event["order_id"],
"channels_notified": sum(1 for r in results if r.get("success")),
"total_ms": int((time.time() - start) * 1000),
"results": results,
}
# from example-fan-out-fan-in-py/workflow.py:29-65The workflow is a generator function, not async def. yield ctx.rfi(...) returns a handle to the durable promise the SDK created for that branch; yield <handle> later in the function is where the workflow blocks on the result. The body has eight yield keywords total — four on the rfi-call lines, four on the fan-in lines.
The durable primitives in play
Resonate.remote()— constructs a client that talks to a Resonate server (the legacy server, per the SDK pin in the opener).main.py:25.resonate.register(fn)— registers a top-level workflow or a child function so the worker can claim and execute it. The wrapper returned for the workflow exposes.run(id, *args).main.py:29-33.workflow.run(id, *args)— starts the workflow under a caller-supplied promise id (f"notify/{event['order_id']}"), giving the run a stable handle for replay.main.py:58-62.ctx.rfi(fn, *args)— Remote Function Invocation. Creates a durable promise for the child branch and returns a handle without blocking.workflow.py:38-49..options(id=...)— pins the child promise to a deterministic id (f"{ctx.id}.email", etc.), so on replay the SDK looks up the existing branch rather than starting a new one.workflow.py:38-49.yield <handle>— the fan-in. The workflow suspends here until the branch's durable promise resolves; the SDK then feeds the resolved value back into the generator.workflow.py:53-56.
What the SDK handles vs. what you write
| SDK handles | You write |
|---|---|
Creating one durable promise per ctx.rfi(...) and persisting it in the promise store | The four ctx.rfi(...) calls (one per channel) |
Suspending the workflow on yield <handle> and resuming when the promise resolves | The yield <handle> fan-in points |
| Storing each branch's return value as soon as it resolves so that a retry does not re-execute it | The channel functions that produce that return value (send_email, send_sms, send_slack, send_push) |
| Retrying a registered function that raised, while leaving sibling branches untouched | The actual failure mode in send_push (a RuntimeError on attempt 1 when simulate_crash is set) |
| Mapping the deterministic per-branch id to its stored result on replay | The id template f"{ctx.id}.<channel>" passed to .options(id=...) |
The workflow body reads like straight-line Python. The branching, persistence, and per-step retry are not in the code you write — they are in the SDK + server.
Failure modes covered
- The push channel raises on its first attempt.
channels.py:131-134throwsRuntimeError("Push service unavailable — will retry")whensimulate_crashis set andattempt == 1. Push has the shortest latency (time.sleep(0.12)atchannels.py:129), so on the recorded crash run the push attempt fails before the other three channels finish: the README crash block (lines 110-121) shows push attempt 1 fails (line 115), then slack/sms/email complete on their own branches (the three completion lines at 116-118 — twoSent — msg_...lines plus onePosted — msg_slack_...), then push attempt 2 fires (lines 119-120). Email, SMS, and Slack are not re-invoked when push retries because each branch is keyed by its own deterministic id (f"{ctx.id}.<channel>",workflow.py:38-49) and lives in its own durable promise; the SDK retries only the failedsend_pushstep. - The worker crashes mid-workflow. Because each branch promise carries a deterministic id (
f"{ctx.id}.<channel>") and the workflow itself runs under a caller-supplied id (f"notify/{event['order_id']}",main.py:59), a replay of the workflow looks up the existing per-channel promises rather than issuing new sends. Branches that had already resolved feed their stored results back into the generator; branches still in flight resume on whichever worker claims them. - The workflow is invoked twice with the same order id. The outer promise id is keyed on
order_id(main.py:59); the second call resolves against the existing promise rather than starting a parallel run.
The example does not implement provider-level idempotency on the channel side — that lives outside the workflow and is not claimed.
When to reach for this pattern
- If a single trigger has to drive N independent side effects and total latency should track the slowest one, not the sum.
- If partial failure must be tolerated — one branch failing and retrying without re-driving the successful branches.
- If the branches are heterogeneous (different providers, different latencies, different retry profiles) but the orchestration is uniform.
- If you want straight-line workflow code rather than a DAG declaration or a child-workflow framework.
- If each branch's result needs to be durably captured so replays after a crash do not re-issue calls already acknowledged by the provider.
Sources
- Example repo: https://github.com/resonatehq-examples/example-fan-out-fan-in-py
- Python SDK repo: https://github.com/resonatehq/resonate-sdk-py (this example pins
resonate-sdk>=0.6.7, the pre-0.10 surface) - Legacy Resonate server: https://github.com/resonatehq/resonate-legacy-server
- Resonate documentation: https://docs.resonatehq.io
- Files cited in this post:
workflow.py:29-65— thenotify_allgeneratormain.py:25-62— client construction, registration, and run invocationchannels.py:113-143— thesend_pushfunction and its failure injectionpyproject.toml:9— SDK pin
