5 min read · Resonate HQ · Just published

Durable Hacker News research agent in Python on Resonate

How a long-running keyword-monitoring agent collapses to a generator workflow when every yield is a durable checkpoint and the promise store IS the state store.

A long-running monitor that polls an external API, scores each result with an LLM, and emits notifications has two restart hazards: it must not re-analyze stories after a crash, and it must not run an immediate redundant scan when restarted mid-interval. On Resonate, the solution is a generator workflow in which every yield ctx.run(...) is a durable checkpoint and yield ctx.sleep(...) is a durable timer, so the dedup set rebuilds from cached step results on replay and the inter-round interval survives restarts. The example registers two generator functions, scan_keyword (one round for one keyword) and monitor_hackernews (the infinite loop that owns the dedup set), and uses Resonate dependencies to inject the OpenAI client and config into worker functions.

The shape of the solution

@resonate.register
def monitor_hackernews(ctx: Context):
    """
    Continuous monitoring loop.
 
    Owns the `seen_ids` set. On crash-recovery Resonate replays this generator
    and returns cached results for completed `scan_keyword` calls, so `seen_ids`
    rebuilds deterministically — the promise store IS the state store.
 
    `yield ctx.sleep(...)` between rounds is a durable timer: a restart during
    sleep resumes the sleep rather than triggering an immediate redundant scan.
    """
    config: AgentConfig = ctx.get_dependency("config")
    keywords = config["keywords"]
    scan_interval_secs = config["scan_interval_secs"]
    # ...
 
    seen_ids: set[str] = set()
 
    while True:
        for keyword in keywords:
            try:
                result = yield ctx.run(scan_keyword, keyword, list(seen_ids))
                for a in result["newly_analyzed"]:
                    seen_ids.add(a["story_id"])
            except Exception as e:
                print(f"❌ Error scanning '{keyword}': {e}")
 
        yield ctx.sleep(scan_interval_secs)
# from example-hackernews-research-agent-py/src/agent.py:220-253

The inner scan_keyword workflow is the same shape — three yield ctx.run(...) calls (fetch, per-story analyze, notify) under @resonate.register:

@resonate.register
def scan_keyword(
    ctx: Context,
    keyword: str,
    seen_ids: Optional[list[str]] = None,
):
    # ...
    config: AgentConfig = ctx.get_dependency("config")
    relevance_threshold = config["relevance_threshold"]
 
    stories = yield ctx.run(search_hackernews, keyword)
    seen = set(seen_ids or [])
    new_stories = [s for s in stories if s["objectID"] not in seen]
 
    # ...
 
    newly_analyzed = []
    for story in new_stories:
        analysis = yield ctx.run(analyze_story, story, keyword)
        newly_analyzed.append(analysis)
 
    interesting = [
        a for a in newly_analyzed
        if a["is_interesting"] and a["relevance_score"] >= relevance_threshold
    ]
 
    yield ctx.run(notify_findings, interesting, keyword)
 
    # ...
 
    return {
        "keyword": keyword,
        "stories_found": len(stories),
        "newly_analyzed": newly_analyzed,
    }
# from example-hackernews-research-agent-py/src/agent.py:172-217

Both workflows are plain Python generator functions, not async def. Each yield ctx.run(...) runs a child step under a durable promise and suspends the generator until that step's result is checkpointed.
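To make the replay behaviour concrete, here is a minimal toy model of a generator driver that checkpoints each yielded step and reuses recorded results on a second run. It is a sketch only, not Resonate's implementation: `Step`, `drive`, and the in-memory `log` list are hypothetical stand-ins for `ctx.run`, the SDK's scheduler, and the promise store.

```python
from typing import Any, Callable, Generator


class Step:
    """Toy stand-in for ctx.run: a request to run fn(*args) as a checkpointed step."""

    def __init__(self, fn: Callable[..., Any], *args: Any) -> None:
        self.fn = fn
        self.args = args


def drive(workflow: Callable[[], Generator[Step, Any, Any]], log: list) -> Any:
    """Run `workflow` to completion, reusing results already recorded in `log`.

    `log` plays the role of the promise store: entry i holds the result of
    the i-th yielded step. Steps with a recorded result are not re-executed.
    """
    gen = workflow()
    position = 0
    try:
        step = next(gen)
        while True:
            if position < len(log):
                result = log[position]        # replay: return the cached result
            else:
                result = step.fn(*step.args)  # first execution of this step
                log.append(result)            # checkpoint before resuming
            position += 1
            step = gen.send(result)           # resume the generator with the result
    except StopIteration as done:
        return done.value


calls = []  # records which step bodies actually ran


def analyze(story_id: str) -> dict:
    calls.append(story_id)
    return {"story_id": story_id}


def scan():
    seen = set()
    for story_id in ["a", "b", "c"]:
        result = yield Step(analyze, story_id)
        seen.add(result["story_id"])
    return sorted(seen)


# Two steps finished before the simulated crash; only "c" still has to run.
log = [{"story_id": "a"}, {"story_id": "b"}]
recovered = drive(scan, log)
print(recovered, calls)  # ['a', 'b', 'c'] ['c']
```

The local `seen` set is rebuilt purely from the recorded results, which is the sense in which the promise store doubles as the state store. The model only works if the workflow yields the same steps in the same order on every run.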

The durable primitives in play

  • Resonate() — constructs the Resonate client embedded in the worker process. src/agent.py:25.
  • @resonate.register — registers a generator function as a top-level workflow the worker can claim and execute under a caller-supplied promise id. Applied to scan_keyword at src/agent.py:172 and monitor_hackernews at src/agent.py:220.
  • ctx.run(fn, *args) — runs fn as a durable child step. The return value is persisted; on replay the SDK returns the cached value rather than re-invoking the function. Used at src/agent.py:192 (fetch), :201 (per-story analyze), :209 (notify), and :247 (parent invokes the registered scan_keyword workflow as a step).
  • ctx.sleep(seconds) — durable timer. The pending wake-up is stored on the server, so a worker restart during the wait resumes the remainder of the sleep instead of restarting it from zero. src/agent.py:253.
  • resonate.set_dependency(name, value) — registers a worker-process value (OpenAI client, config dict) under a string key so durable functions retrieve it from ctx instead of closing over module-level state. src/agent.py:273-274.
  • ctx.get_dependency(name) — fetches a registered dependency inside a workflow or step function. src/agent.py:58 (openai inside analyze_story), :117 (config inside notify_findings), :189 (config inside scan_keyword), :232 (config inside monitor_hackernews).
  • resonate.start() / resonate.stop() — start and stop the worker's polling loop against the Resonate server. src/agent.py:284, :290.
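The durable-timer behaviour described for ctx.sleep can be pictured as persisting an absolute wake-up time rather than a relative duration. The sketch below is a toy model under that assumption: `remaining_sleep`, the `store` dict, and the timer id are hypothetical and say nothing about how the Resonate server actually stores timers.

```python
def remaining_sleep(store: dict, timer_id: str, seconds: float, now: float) -> float:
    """Return how long the caller still has to wait for `timer_id`.

    The first call records an absolute deadline in `store`. Later calls with
    the same id (for example after a restart) find that deadline and do not
    reset it, so only the remainder of the interval is waited.
    """
    wake_at = store.setdefault(timer_id, now + seconds)
    return max(0.0, wake_at - now)


store = {}  # stands in for server-side storage that outlives the worker

first = remaining_sleep(store, "round-1", 300, now=1000.0)
after_restart = remaining_sleep(store, "round-1", 300, now=1120.0)
after_deadline = remaining_sleep(store, "round-1", 300, now=1400.0)
print(first, after_restart, after_deadline)  # 300.0 180.0 0.0
```

A restart 120 seconds into a 300-second interval waits the remaining 180 seconds, and a restart after the deadline proceeds immediately.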

What the SDK handles vs. what you write

SDK handles | You write
--- | ---
Checkpointing each ctx.run(...) return value in the durable promise store | The four yield ctx.run(...) calls (search_hackernews, analyze_story, notify_findings, scan_keyword) and the step function bodies
Replaying the generator after a crash and returning cached results for completed steps, so seen_ids rebuilds in the same order | A plain Python set[str] named seen_ids, mutated normally inside monitor_hackernews (src/agent.py:242, 249)
Holding the inter-round wait as a durable timer that survives restarts | A single yield ctx.sleep(scan_interval_secs) (src/agent.py:253)
Routing the OpenAI client and config into each function call via the dependency registry | Two resonate.set_dependency(...) lines in main() (src/agent.py:273-274) and ctx.get_dependency(...) lookups in each function
Holding the workflow's identity under the caller-supplied promise id (monitor-1, scan-1) so a re-invoke resolves the existing promise rather than starting a duplicate | The resonate invoke <id> --func <fn> commands in the README (README.md:77, :83)
Walking the durable log on restart to rebuild generator state | No replay code: the workflow is written as if it never crashes

The author writes a while True loop, a for loop over keywords, a Python set, and a time.sleep-shaped ctx.sleep. Everything that makes the workflow durable — the checkpoints, the replay, the timer, the dependency wiring — is in the SDK and server.

Failure modes covered

  • Worker crash mid-scan. Each completed step's return value lives in the promise store; on restart the generator replays and the SDK returns cached results for the steps that already finished. Only the unfinished step re-enters its body. Code: src/agent.py:192, 201, 209. README: README.md:17-19.
  • Worker crash mid-interval. yield ctx.sleep(scan_interval_secs) (src/agent.py:253) is a durable timer. A restart during the wait resumes the remaining sleep rather than triggering an immediate redundant scan. README: README.md:25.
  • Re-processing a story across rounds. seen_ids is a set[str] local to monitor_hackernews (src/agent.py:242). After each scan_keyword call, every story_id from result["newly_analyzed"] is added (src/agent.py:248-249). On replay, the cached result dicts come back from the promise store and the same IDs land in seen_ids in the same order. The next scan_keyword call receives list(seen_ids) and filters fetched stories at src/agent.py:194. No external dedup database.
  • One keyword's scan raises. monitor_hackernews wraps the per-keyword ctx.run(scan_keyword, ...) in try/except Exception (src/agent.py:246-251) and continues to the next keyword. The inter-round ctx.sleep still runs.
  • Re-invoking with the same promise id. resonate invoke monitor-1 --func monitor_hackernews (README.md:83) supplies the promise id at the call site. A second invoke with the same id resolves against the existing promise instead of starting a parallel run.
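The last failure mode, re-invoking with the same promise id, amounts to create-or-get semantics keyed on that id. The following is a toy in-memory model of that idea, not the Resonate server's implementation: `PromiseStore`, its `invoke` method, and the promise dict layout are all hypothetical.

```python
class PromiseStore:
    """Toy promise store: at most one promise per caller-supplied id."""

    def __init__(self) -> None:
        self._promises = {}

    def invoke(self, promise_id: str, func_name: str):
        """Return (promise, created).

        A second invoke with an id that already exists returns the existing
        promise and does not start another run.
        """
        existing = self._promises.get(promise_id)
        if existing is not None:
            return existing, False
        promise = {"id": promise_id, "func": func_name, "state": "pending"}
        self._promises[promise_id] = promise
        return promise, True


promises = PromiseStore()
first, created_first = promises.invoke("monitor-1", "monitor_hackernews")
second, created_second = promises.invoke("monitor-1", "monitor_hackernews")
print(created_first, created_second, first is second)  # True False True
```

Because the id is chosen by the caller, an operator or script can safely repeat the invoke command after an uncertain outcome.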

The example does not deduplicate against OpenAI or Slack. The README is explicit at README.md:125-127 that Slack delivery is at-least-once: notify_findings is one ctx.run, so a retry can re-POST. An analyze_story retry re-issues the OpenAI request. Both are accepted trade-offs, not bugs.

The README also flags at README.md:118-123 that seen_ids lives in worker memory and grows unbounded; Resonate replay cost on restart grows with it, so production monitors should bound the window or snapshot externally.

Finally, pyproject.toml:11 and uv.lock pin resonate-sdk at 0.6.7, on the 0.6.x Python SDK line. The TypeScript and Rust SDKs underwent post-0.10 / post-0.4 brand-aligned API renames; agents mapping this Python pattern onto a newer SDK release should re-verify the surface of @resonate.register, ctx.run, ctx.sleep, ctx.get_dependency, and resonate.set_dependency against current Python SDK docs before assuming the call shapes carry over unchanged.
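One way to bound the dedup window mentioned above is to keep only the most recently added IDs and evict the oldest first. This is a sketch, not code from the example repository: `BoundedSeen` and its `max_size` parameter are hypothetical, and the right window size depends on how far back the search can resurface stories.

```python
from collections import OrderedDict


class BoundedSeen:
    """Remember at most `max_size` recently added IDs, evicting the oldest first."""

    def __init__(self, max_size: int) -> None:
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._ids = OrderedDict()  # keys are story IDs, values are unused
        self._max_size = max_size

    def add(self, story_id: str) -> None:
        self._ids[story_id] = None
        self._ids.move_to_end(story_id)    # re-adding refreshes recency
        while len(self._ids) > self._max_size:
            self._ids.popitem(last=False)  # drop the oldest entry

    def __contains__(self, story_id: object) -> bool:
        return story_id in self._ids

    def to_list(self) -> list:
        """IDs in insertion order, oldest first."""
        return list(self._ids)


seen = BoundedSeen(max_size=2)
for story_id in ["101", "102", "103"]:
    seen.add(story_id)
print("101" in seen, seen.to_list())  # False ['102', '103']
```

The trade-off is that an evicted ID can be analyzed again if the search returns it later, so the window should be comfortably larger than one round's results across all keywords.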

When to reach for this pattern

  • If you are running a continuous monitor or polling agent that scores each result with an LLM and must survive worker restarts without re-scoring previously-seen items.
  • If you want per-step retry on a long-running loop without retry decorators or external cursor storage — cached step results plus a local Python collection are enough.
  • If the interval between rounds matters (rate limits, cost) and a restart mid-wait must not collapse the interval — ctx.sleep is the durable timer.
  • If you want the inner scan unit to be invocable on its own, from the CLI or via RFI (Remote Function Invocation), alongside the infinite outer loop. Both scan_keyword and monitor_hackernews are registered with @resonate.register, and scan_keyword's optional seen_ids arg makes it standalone-callable via resonate invoke ... --func scan_keyword --arg "<keyword>" (README.md:77).
  • If you need clean separation between ephemeral (OpenAI client, config) and durable state so the workflow stays free of unserializable closures — resonate.set_dependency / ctx.get_dependency is the seam.

Sources