May 25, 20264 min readResonate HQ

Fan-out / fan-in across notification channels in Python on Resonate

How the fan-out / fan-in pattern collapses to four `yield ctx.rfi(...)` calls plus four `yield <handle>` fan-in points when every branch is a durable promise.

python fan-out-fan-in for-agents

A single order.confirmed event has to notify the customer over four independent channels — email, SMS, Slack, push — and total time should be the slowest channel, not the sum, while a transient failure on any one channel must not cause the others to re-send. The Resonate shape of the solution is to start each channel as a Remote Function Invocation (ctx.rfi) keyed by a deterministic per-channel id, then yield each handle in turn to fan back in; every branch is a durable promise checkpointed independently. The repo pins resonate-sdk>=0.6.7 and runs against resonate serve from resonatehq/resonate-legacy-server (README:48-50), so the surface shown here is the pre-0.10 Python SDK API. The example demonstrates the workflow under the happy path and under a forced first-attempt failure on the push channel.

The shape of the solution

def notify_all(
    ctx: Context,
    event: dict[str, Any],
    simulate_crash: bool,
) -> Generator[Any, Any, dict[str, Any]]:
    start = time.time()
 
    # Fan-out: start all 4 channels simultaneously.
    # ctx.rfi(...) returns a handle immediately — no blocking.
    email_p = yield ctx.rfi(send_email, event).options(
        id=f"{ctx.id}.email",
    )
    sms_p = yield ctx.rfi(send_sms, event).options(
        id=f"{ctx.id}.sms",
    )
    slack_p = yield ctx.rfi(send_slack, event).options(
        id=f"{ctx.id}.slack",
    )
    push_p = yield ctx.rfi(send_push, event, simulate_crash).options(
        id=f"{ctx.id}.push",
    )
 
    # Fan-in: await each result.
    # If push fails and retries, the other channels are already checkpointed.
    email = yield email_p
    sms = yield sms_p
    slack = yield slack_p
    push = yield push_p
 
    results = [email, sms, slack, push]
 
    return {
        "order_id": event["order_id"],
        "channels_notified": sum(1 for r in results if r.get("success")),
        "total_ms": int((time.time() - start) * 1000),
        "results": results,
    }
# from example-fan-out-fan-in-py/workflow.py:29-65

The workflow is a generator function, not async def. yield ctx.rfi(...) returns a handle to the durable promise the SDK created for that branch; yield <handle> later in the function is where the workflow blocks on the result. The body has eight yield keywords total — four on the rfi-call lines, four on the fan-in lines.

The durable primitives in play

Resonate.remote() — constructs a client that talks to a Resonate server (the legacy server, per the SDK pin in the opener). main.py:25.
resonate.register(fn) — registers a top-level workflow or a child function so the worker can claim and execute it. The wrapper returned for the workflow exposes .run(id, *args). main.py:29-33.
workflow.run(id, *args) — starts the workflow under a caller-supplied promise id (f"notify/{event['order_id']}"), giving the run a stable handle for replay. main.py:58-62.
ctx.rfi(fn, *args) — Remote Function Invocation. Creates a durable promise for the child branch and returns a handle without blocking. workflow.py:38-49.
.options(id=...) — pins the child promise to a deterministic id (f"{ctx.id}.email", etc.), so on replay the SDK looks up the existing branch rather than starting a new one. workflow.py:38-49.
yield <handle> — the fan-in. The workflow suspends here until the branch's durable promise resolves; the SDK then feeds the resolved value back into the generator. workflow.py:53-56.

What the SDK handles vs. what you write

SDK handles	You write
Creating one durable promise per `ctx.rfi(...)` and persisting it in the promise store	The four `ctx.rfi(...)` calls (one per channel)
Suspending the workflow on `yield <handle>` and resuming when the promise resolves	The `yield <handle>` fan-in points
Storing each branch's return value as soon as it resolves so that a retry does not re-execute it	The channel functions that produce that return value (`send_email`, `send_sms`, `send_slack`, `send_push`)
Retrying a registered function that raised, while leaving sibling branches untouched	The actual failure mode in `send_push` (a `RuntimeError` on attempt 1 when `simulate_crash` is set)
Mapping the deterministic per-branch id to its stored result on replay	The id template `f"{ctx.id}.<channel>"` passed to `.options(id=...)`

The workflow body reads like straight-line Python. The branching, persistence, and per-step retry are not in the code you write — they are in the SDK + server.

Failure modes covered

The push channel raises on its first attempt. channels.py:131-134 throws RuntimeError("Push service unavailable — will retry") when simulate_crash is set and attempt == 1. Push has the shortest latency (time.sleep(0.12) at channels.py:129), so on the recorded crash run the push attempt fails before the other three channels finish: the README crash block (lines 110-121) shows push attempt 1 fails (line 115), then slack/sms/email complete on their own branches (the three completion lines at 116-118 — two Sent — msg_... lines plus one Posted — msg_slack_...), then push attempt 2 fires (lines 119-120). Email, SMS, and Slack are not re-invoked when push retries because each branch is keyed by its own deterministic id (f"{ctx.id}.<channel>", workflow.py:38-49) and lives in its own durable promise; the SDK retries only the failed send_push step.
The worker crashes mid-workflow. Because each branch promise carries a deterministic id (f"{ctx.id}.<channel>") and the workflow itself runs under a caller-supplied id (f"notify/{event['order_id']}", main.py:59), a replay of the workflow looks up the existing per-channel promises rather than issuing new sends. Branches that had already resolved feed their stored results back into the generator; branches still in flight resume on whichever worker claims them.
The workflow is invoked twice with the same order id. The outer promise id is keyed on order_id (main.py:59); the second call resolves against the existing promise rather than starting a parallel run.

The example does not implement provider-level idempotency on the channel side — that lives outside the workflow and is not claimed.

When to reach for this pattern

If a single trigger has to drive N independent side effects and total latency should track the slowest one, not the sum.
If partial failure must be tolerated — one branch failing and retrying without re-driving the successful branches.
If the branches are heterogeneous (different providers, different latencies, different retry profiles) but the orchestration is uniform.
If you want straight-line workflow code rather than a DAG declaration or a child-workflow framework.
If each branch's result needs to be durably captured so replays after a crash do not re-issue calls already acknowledged by the provider.

Sources

Example repo: https://github.com/resonatehq-examples/example-fan-out-fan-in-py
Python SDK repo: https://github.com/resonatehq/resonate-sdk-py (this example pins resonate-sdk>=0.6.7, the pre-0.10 surface)
Legacy Resonate server: https://github.com/resonatehq/resonate-legacy-server
Resonate documentation: https://docs.resonatehq.io
Files cited in this post:
- workflow.py:29-65 — the notify_all generator
- main.py:25-62 — client construction, registration, and run invocation
- channels.py:113-143 — the send_push function and its failure injection
- pyproject.toml:9 — SDK pin