Resonate HQ · Just published · 4 min read

Saga with compensating action in Python on Resonate

How a two-step money transfer with a compensating reversal is written as straight-line Python when each step is a Resonate durable checkpoint.

Moving funds between two accounts takes two writes that, in a real deployment, can't share a transaction (different account systems, different services), so a partial failure can leave the ledger short or double-credited. Resonate's answer is the saga pattern written as straight-line Python: each step is a ctx.run(...) durable checkpoint, and compensation is an except branch around the credit step. This example demonstrates that with a SQLite ledger, deterministic operation ids for idempotency, and an explicit compensating reversal when the credit leg is rejected.

The shape of the solution

def transfer_money(
    ctx: Context,
    transfer_id: str,
    source: str,
    target: str,
    amount: float,
    *,
    simulate_credit_failure: bool = False,
) -> Generator[Any, Any, dict]:
    # ...
    debit_id = f"{transfer_id}-debit"
    credit_id = f"{transfer_id}-credit"
    reversal_id = f"{transfer_id}-reversal"
 
    # Step 1 — debit the source (durable checkpoint).
    yield ctx.run(apply_entry, debit_id, source, -amount, "debit")
 
    # Step 2 — credit the target (durable checkpoint). On failure,
    # compensate by reversing the debit, then return a "compensated"
    # result so the caller sees the saga aborted.
    #
    # `retry_policy=Never()` is intentional here: this saga's compensation
    # IS the response to a credit-side failure. In production you might
    # use a few retries first (network blips happen) and only compensate
    # once the upstream has clearly rejected the credit.
    try:
        yield ctx.run(
            credit_target,
            credit_id,
            target,
            amount,
            fail=simulate_credit_failure,
        ).options(retry_policy=Never())
    except Exception as err:
        print(f"[saga] credit failed: {err}. Compensating...")
        # Compensating action — also durable + idempotent.
        yield ctx.run(apply_entry, reversal_id, source, amount, "reversal")
        return {
            "transfer_id": transfer_id,
            "status": "compensated",
            "error": str(err),
        }
 
    print(f"[saga] transfer {transfer_id} committed")
    return {
        "transfer_id": transfer_id,
        "status": "committed",
        "source": source,
        "target": target,
        "amount": amount,
    }
# from example-money-transfer-py/main.py:105-167 (docstring + opening print elided as `# ...`)

The workflow is a Python generator. Each yield ctx.run(...) is a suspension point the SDK awaits and durably records. The saga's compensation is an ordinary try/except on the awaited credit step; there is no saga DSL, no rollback registry, and no state column on the transfer.
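
To make the generator mechanics concrete, here is a minimal sketch of a driver loop in plain Python. It is not the Resonate SDK: Step, drive, and the three leaf functions are hypothetical stand-ins, and nothing is persisted. It only illustrates why a try/except around a yield is enough to hook compensation: the driver runs each yielded step and throws any failure back into the generator at the suspension point.

```python
from typing import Any, Callable, Generator, List


class Step:
    """Hypothetical stand-in for what ctx.run(...) yields: a deferred call."""

    def __init__(self, fn: Callable[..., Any], *args: Any) -> None:
        self.fn = fn
        self.args = args


def drive(workflow: Generator[Step, Any, Any]) -> Any:
    """Run a generator workflow to completion (toy driver, no durability)."""
    try:
        step = next(workflow)
        while True:
            try:
                result = step.fn(*step.args)
            except Exception as err:
                # Re-raise the failure inside the generator at its yield.
                step = workflow.throw(err)
            else:
                # Hand the step's result back as the value of the yield.
                step = workflow.send(result)
    except StopIteration as done:
        return done.value  # the generator's return value


log: List[str] = []


def debit(amount: int) -> str:
    log.append(f"debit {amount}")
    return "ok"


def credit(amount: int) -> str:
    raise RuntimeError("target rejected credit")


def reversal(amount: int) -> str:
    log.append(f"reversal {amount}")
    return "ok"


def transfer(amount: int) -> Generator[Step, Any, dict]:
    yield Step(debit, amount)
    try:
        yield Step(credit, amount)
    except Exception as err:
        yield Step(reversal, amount)  # compensating action
        return {"status": "compensated", "error": str(err)}
    return {"status": "committed"}


outcome = drive(transfer(25))
```

Here the credit always fails, so the driver throws the error into transfer, the except branch yields the reversal, and outcome carries the "compensated" status.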

The durable primitives in play

  • resonate.register(transfer_money) — registers the generator as a workflow callable. Returns a handle that can be invoked with transfer.run(id, *args). Source: main.py:183.
  • ctx.run(fn, *args) — executes fn as a durable step. The result is checkpointed on the Resonate server (or in an in-memory store in local mode). On replay, completed steps return their stored result instead of re-running. The workflow has three such invocations, at main.py:132, :143-149, :153.
  • .options(retry_policy=Never()) — disables the SDK's default retry for the credit step so the saga compensates immediately on the first failure rather than retrying. Never is imported from resonate.retry_policies. Source: main.py:21, :149.
  • ctx.get_dependency("db") — retrieves a worker-scoped dependency (here, the sqlite3.Connection) that the workflow code shouldn't construct itself. Registered with resonate.set_dependency("db", db). Source: main.py:61, :181.
  • Resonate.local() — runs the SDK with an in-memory store, no server process. Swap for Resonate() and run resonate serve to back the same workflow with durable storage that survives a process restart. Source: main.py:180; README:81-100.

What the SDK handles vs. what you write

The SDK handles: durable invocation records for the workflow and each ctx.run leaf, replay from the last completed step after a worker crash, scheduling and execution of the generator, retry policy enforcement (including Never), and dependency injection through ctx.get_dependency.

You write: the generator that yields the steps in order, the leaf functions that perform the actual work (here, SQL inserts and a balance sum), the deterministic op_ids that make the leaves idempotent ({transfer_id}-debit, -credit, -reversal), and the try/except that decides when to run the compensating reversal. The ledger schema and INSERT OR IGNORE semantics are also yours — the SDK does not provide storage; it provides durable execution of the steps that talk to your storage.
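
The idempotency half of that split can be sketched with the standard library alone. The schema, the helper names, and the explicit conn parameter below are assumptions for illustration (the example's own leaf obtains its connection through ctx.get_dependency("db")); the point is only that a primary key on the deterministic op_id plus INSERT OR IGNORE turns a repeated write into a no-op.

```python
import sqlite3

# Hypothetical ledger schema: one row per operation, keyed by op_id.
db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE ledger (
        op_id   TEXT PRIMARY KEY,
        account TEXT NOT NULL,
        delta   REAL NOT NULL,
        kind    TEXT NOT NULL
    )
    """
)


def apply_entry_once(
    conn: sqlite3.Connection, op_id: str, account: str, delta: float, kind: str
) -> bool:
    """Insert a ledger row; return False if op_id was already recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO ledger (op_id, account, delta, kind) "
        "VALUES (?, ?, ?, ?)",
        (op_id, account, delta, kind),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed means the insert was ignored


def balance(conn: sqlite3.Connection, account: str) -> float:
    """Sum all ledger deltas for an account (0 if it has no rows)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM ledger WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]


transfer_id = "transfer-001"
first = apply_entry_once(db, f"{transfer_id}-debit", "alice", -25.0, "debit")
second = apply_entry_once(db, f"{transfer_id}-debit", "alice", -25.0, "debit")
alice_balance = balance(db, "alice")
```

The second call uses the same op_id, so it is ignored and the balance reflects a single debit.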

Failure modes covered

  • Worker crashes between the debit and the credit. On restart, Resonate replays transfer_money. The debit ctx.run returns its stored result without re-executing. Execution resumes at the credit step. The apply_entry leaf would also be safe to re-run because of INSERT OR IGNORE on the deterministic op_id (main.py:62-65).
  • Credit step raises (target rejects the credit). The try/except at main.py:142-158 catches it and runs the reversal ctx.run(apply_entry, reversal_id, source, amount, "reversal"). The function returns status: "compensated". simulate_credit_failure=True exercises this path via TransferRejected at main.py:87-101.
  • Worker crashes after compensation has started but before it completes. The reversal's op_id is deterministic ({transfer_id}-reversal); replay re-enters the except branch and the INSERT OR IGNORE makes a re-applied reversal a no-op (main.py:62-65).
  • The workflow is invoked twice with the same id. transfer.run("transfer-001", ...) keys the durable invocation on "transfer-001"; if a promise with that id already exists, the SDK subscribes to its result or returns it immediately when it has completed, so duplicate executions for the same id are prevented (resonate-sdk-py/resonate/resonate.py:1368-1391, Function.run docstring). The demo uses the transfer id as both the invocation id and the first positional arg (main.py:194).
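
The first failure mode above relies on completed steps returning stored results on replay. A toy model of that behavior, assuming a plain dict as the checkpoint store and a hypothetical durable_step helper (neither is how the SDK is implemented), looks like this:

```python
from typing import Any, Callable, Dict, List


class SimulatedCrash(BaseException):
    """Stands in for the worker process dying mid-workflow."""


def durable_step(store: Dict[str, Any], step_id: str, fn: Callable[[], Any]) -> Any:
    """Run fn once per step_id; on replay, return the recorded result instead."""
    if step_id in store:
        return store[step_id]
    result = fn()
    store[step_id] = result  # checkpoint only after the step has completed
    return result


executed: List[str] = []


def debit() -> str:
    executed.append("debit")
    return "debited"


def credit() -> str:
    executed.append("credit")
    return "credited"


def transfer(store: Dict[str, Any], crash_before_credit: bool) -> str:
    durable_step(store, "transfer-001-debit", debit)
    if crash_before_credit:
        raise SimulatedCrash()
    durable_step(store, "transfer-001-credit", credit)
    return "committed"


store: Dict[str, Any] = {}
try:
    transfer(store, crash_before_credit=True)  # first run dies after the debit
except SimulatedCrash:
    pass
status = transfer(store, crash_before_credit=False)  # replay from the top
```

The replay starts from the top of transfer, but the debit is served from the store, so each leaf executes once. The gap between fn() finishing and the checkpoint being written is why the leaves are also made idempotent at the storage layer.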

The example does not retry the credit before compensating — retry_policy=Never() is intentional. In production you would typically allow a small number of retries first and only compensate after the upstream has clearly rejected the credit (README:146; main.py:138-149).
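
One way to sketch that production variant, assuming the credit leaf can distinguish a transient fault from an outright rejection. TransientCreditError, credit_with_retries, and the local TransferRejected class are hypothetical names for this illustration, and the loop is hand-rolled rather than an SDK retry policy:

```python
from typing import Callable, List


class TransferRejected(Exception):
    """Local stand-in: the target has definitively refused the credit."""


class TransientCreditError(Exception):
    """Hypothetical: a fault worth retrying, such as a timeout."""


def credit_with_retries(credit: Callable[[], None], max_attempts: int = 3) -> bool:
    """Return True if the credit committed, False if the saga should compensate."""
    for _ in range(max_attempts):
        try:
            credit()
            return True
        except TransferRejected:
            return False  # clear rejection: retrying cannot help
        except TransientCreditError:
            continue  # transient fault: try again (no backoff in this sketch)
    return False  # retries exhausted: fall back to compensation


attempts: List[str] = []


def flaky_credit() -> None:
    """Fails with a transient error twice, then succeeds."""
    attempts.append("flaky")
    if len(attempts) < 3:
        raise TransientCreditError("timeout")


def rejected_credit() -> None:
    """Always rejected by the target."""
    attempts.append("rejected")
    raise TransferRejected("account closed")


committed = credit_with_retries(flaky_credit)
compensate = not credit_with_retries(rejected_credit)
```

Retrying is only safe here because the credit is keyed on a deterministic operation id, so a retry that follows a write which actually landed does not double-credit the target.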

When to reach for this pattern

  • If you are coordinating two or more writes across systems with no transaction spanning them and need a compensating action when a later write fails.
  • If a multi-step business operation must survive worker restarts mid-flight without losing its position.
  • If you want a saga without a state-machine framework, a job table, or a separately maintained rollback registry.
  • If your steps can be made idempotent with a deterministic operation id and an INSERT OR IGNORE-style upsert (or equivalent dedupe at the storage layer).
  • If you want to start in-process (Resonate.local()) and later move the same code to a server-backed deployment without changing the workflow.

Sources