Resonate HQ · 5 min read

Recursive distributed calculator in Python on Resonate

How a self-calling calculator function distributes its sub-tasks across task groups while keeping each step a durable checkpoint.


A calculator that parses (1 + 2) * (3 - 4) into a tree and dispatches each sub-expression to a different worker pool is straightforward to write until a worker crashes mid-evaluation. At that point you need durable state for every partial result and a dispatch mechanism that redelivers work that was claimed but never finished (at-least-once delivery). Resonate handles both by making every function invocation a durable promise, so the evaluator can recurse into itself across worker groups with no separate workflow layer and no manual checkpointing. This example evaluates an arithmetic expression recursively: each sub-expression is yielded as a remote invocation routed to the exp task group, and each operator call is routed to the ops task group.

The shape of the solution

@resonate.register(name="=")
def clc(ctx: Context, expr: parser.Expr) -> Generator[Any, Any, int]:
    if grp:
        print(f"{grp}/{pid}: {expr}")
 
    match expr:
        case (op, lhs, rhs):
            # Send the expressions to the exp task queue with
            # preference.
            #
            # The expressions are sent to the task queue as invocations
            # which return a handle so we can wait for the result later.
            px = yield ctx.rfi(clc, lhs).options(send_to="poll://exp/lhs")
            py = yield ctx.rfi(clc, rhs).options(send_to="poll://exp/rhs")
 
            # Wait for results from the lhs and rhs tasks.
            vx = yield px
            vy = yield py
 
            # Send the operation to the ops task queue.
            #
            # The operation is sent to the task queue as a call which returns
            # the result directly.
            return (yield ctx.rfc(op, vx, vy).options(send_to="poll://ops"))
 
        case x:
            return x
 
# from example-distributed-calculator-py/resonator/resonator.py:35

The top-level entry point parses the input string and runs clc with a fresh UUID as the promise ID:

if expr := input("❯ "):
    # calculate the expression
    h = clc.run(str(uuid.uuid4()), parser.parse(expr))
 
    # print the result
    print(f"""
{expr}
= {h.result()}
""")
 
# from example-distributed-calculator-py/resonator/resonator.py:94

The parser produces a recursive tuple type ("+" | "-" | "*", Expr, Expr) | int:

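import re
from typing import Literal

# An expression is either an (operator, lhs, rhs) tuple or a bare integer.
Expr = tuple[Literal["+", "-", "*"], "Expr", "Expr"] | int
 
def parse(expr: str) -> Expr:
    # Split the input into integer, parenthesis, and operator tokens.
    tokens = re.findall(r'\d+|[()+\-*]', expr)
    return parse_expr(tokens)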
 
# from example-distributed-calculator-py/resonator/parser.py:7

The durable primitives in play

  • Resonate(pid=..., task_source=Poller(group=grp)) — constructs a process that polls the Resonate server for tasks routed to the named group, identifying itself with pid. (resonator/resonator.py:18)
  • @resonate.register(name="=") — registers clc as a durable function under the name "=" so any process can claim an invocation by name. The same registration pattern names the three operator functions "+", "-", "*" so they too are addressable by symbol; the four registrations together cover clc plus the three operators. (resonator/resonator.py:20, :25, :30, :35)
  • ctx.rfi(clc, lhs) — a Remote Function Invocation: creates a durable promise for the sub-expression, dispatches it asynchronously, and yields a handle the parent can yield on later to retrieve the value. Used here to fan out the left and right sub-expressions in parallel. (resonator/resonator.py:47, :48)
  • .options(send_to="poll://exp/lhs") — routes the invocation to the exp task queue, optionally to a specific worker pid (lhs or rhs). The exp workers are launched with GRP=exp PID=lhs and GRP=exp PID=rhs, giving the orchestrator preference over which worker each sub-tree lands on. (resonator/resonator.py:47, :48; README.md:41–43)
  • yield px / yield py — blocks on the previously-dispatched RFI handles. Resonate suspends the generator until each child promise resolves; on crash, the parent replays and these yields short-circuit on already-resolved children. (resonator/resonator.py:51, :52)
  • ctx.rfc(op, vx, vy) — a Remote Function Call: creates a durable promise, dispatches it, and yields the resolved value directly in one step (no separate handle). Used for the terminal operator step routed to poll://ops. (resonator/resonator.py:58)
  • clc.run(str(uuid.uuid4()), parser.parse(expr)) — the client-side entry point. The first argument is the durable promise ID for the top-level computation; passing a fresh UUID makes each prompt a distinct durable invocation. (resonator/resonator.py:96)

What the SDK handles vs. what you write

You write a plain Python generator that pattern-matches on the parsed expression. If it's a tuple, you yield two rfi calls for the children, yield on their handles to collect results, and yield one rfc for the operator. If it's an int, you return it. The send_to= option is the only routing logic; there is no queue client, no result correlation table, no retry loop.

The SDK and server handle:

  • minting a durable promise for every rfi and rfc invocation;
  • persisting the promise as PENDING until a worker resolves it;
  • routing each invocation to a worker that polls the matching group/pid via Poller(group=grp);
  • replaying the parent generator on worker crash and skipping yields whose child promises are already terminal;
  • propagating each child's resolved integer back into the parent generator so vx = yield px resumes with the value;
  • deduplicating top-level invocations by promise ID (the UUID supplied to clc.run).

The body of clc contains no checkpoint calls, no retry annotations, and no workflow/activity split. The function is the workflow.
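The replay behavior described in this section can be sketched in a few lines of plain Python. This is a toy model, not Resonate's implementation: the dict stands in for the durable promise store, the child ID scheme (parent ID plus a yield counter) is an assumption made for the sketch, and Crash, crash_after, and workflow are invented names. The point it illustrates is that a generator restarted from the top does not redo steps whose results are already stored.

```python
class Crash(Exception):
    """Simulated process crash."""


calls = []  # every real execution of a step is recorded here


def add(x, y):
    calls.append(("add", x, y))
    return x + y


def mul(x, y):
    calls.append(("mul", x, y))
    return x * y


def workflow():
    # Flat equivalent of (1 + 2) * (3 + 4): each yield is one durable step.
    a = yield (add, 1, 2)
    b = yield (add, 3, 4)
    return (yield (mul, a, b))


def run(store, root_id, gen_fn, crash_after=None):
    """Drive gen_fn, storing each step's result under a deterministic child ID."""
    gen = gen_fn()
    send = None
    seq = 0        # position of the current yield within the generator
    executed = 0   # steps actually executed during this run
    while True:
        try:
            fn, *args = gen.send(send)
        except StopIteration as stop:
            store[root_id] = stop.value
            return stop.value
        seq += 1
        child_id = f"{root_id}.{seq}"  # assumed ID scheme, for illustration
        if child_id not in store:
            if crash_after is not None and executed == crash_after:
                raise Crash(child_id)
            store[child_id] = fn(*args)
            executed += 1
        send = store[child_id]  # replayed steps resolve from the store


store = {}
try:
    run(store, "calc", workflow, crash_after=2)  # dies before the multiply
except Crash:
    pass

result = run(store, "calc", workflow)  # replay from the top
print(result)  # 21
print(calls)   # [('add', 1, 2), ('add', 3, 4), ('mul', 3, 7)]
```

Both additions run exactly once even though the generator body executes twice. That works only because the generator yields the same steps in the same order on every run, which is why the child IDs line up on replay.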

Failure modes covered

  • An exp worker crashes after claiming a sub-expression. The durable promise for that sub-tree stays PENDING; the task is addressed to a specific worker PID (lhs or rhs per resonator/resonator.py:47–48), so the server holds the task until that named worker is restarted and reclaims it. When it finishes, the resolved value propagates back into the parent's yield px or yield py. (resonator/resonator.py:47–52; worker topology at README.md:42–43)
  • The orchestrator process crashes between dispatching rfi and receiving the result. On restart, the parent invocation is reclaimed by a worker registered under name "=". The generator replays from the top; the two ctx.rfi(...) yields short-circuit on the existing durable promises rather than dispatching new ones, and execution resumes at whichever yield was outstanding. (resonator/resonator.py:47–58)
  • The ops worker crashes mid-operation. The rfc durable promise remains PENDING until an ops-group worker picks it up and computes x + y, x - y, or x * y, depending on the operator (the README starts one such worker at README.md:46). The parent's terminal return (yield ctx.rfc(...)) waits on that promise; replay is a no-op once the operator promise is terminal. (rfc call site at resonator/resonator.py:58; operator bodies at resonator/resonator.py:21, :27, :32)
  • Duplicate dispatch of the same sub-expression. Resonate's durable-promise store deduplicates on promise ID. Each clc.run(str(uuid.uuid4()), ...) call supplies a fresh top-level ID, so every prompt is a distinct computation. The example sets no explicit id= on the rfi/rfc options and relies on Resonate-generated child IDs, so replay of a single top-level invocation reuses the same child IDs and short-circuits to stored results. (resonator/resonator.py:96, :47–58)
  • A worker is removed from the pool mid-run. Surviving members of the same group continue polling for tasks at poll://exp/... or poll://ops; the server reissues unclaimed work. The CLI flow in the README spins up two exp workers (PID=lhs, PID=rhs) and one ops worker plus the prompt process, but any subset that covers the addressed targets keeps the system live. (README.md:39–50)

When to reach for this pattern

  • If you have a recursive computation over a tree where each node may dispatch to a different worker pool (parsed expressions, query plans, dependency graphs, hierarchical aggregations).
  • If you want to fan sub-tasks out in parallel across named worker groups without writing your own queue, correlation IDs, or result-join logic.
  • If you need each sub-result to survive a worker crash without writing per-step checkpoint code — the durable promise per rfi/rfc is the checkpoint.
  • If you want routing preference (e.g., lhs vs rhs workers, or a dedicated ops pool) expressed as a one-line .options(send_to=...) rather than a separate dispatcher service.
  • If you want the recursion written as ordinary Python — match, yield, return — instead of an external workflow DSL.

Sources