Resonate HQ · 4 min read

Recursive social-graph scraper in TypeScript on Resonate

How a self-spawning workflow walks a social graph using context.detached, with each step durably checkpointed.


Walking a social graph to depth N means fetching one user, paging through their followers, and then recursing into each follower. The traversal is tree-shaped: any node can fail or be throttled, and the worker may restart partway through. On Resonate, the solution is a single recursive workflow function that wraps each network call in context.run and uses context.detached to spawn each follower as an independent durable child. This example scrapes Bluesky profiles and followers through the public AtProto API and demonstrates the recursive pattern in about 30 lines of TypeScript.

The shape of the solution

export function* scrape(
  context: Context,
  actor: string,
  depth: number,
): Generator<any, void, any> {
  // Fetch profile
  const profile: ProfileData = yield* context.run(async (context: Context) => {
    return await getProfile(actor);
  });
  // Record the profile (this example only logs the handle; a real scraper would persist it here)
  yield* context.run((context: Context) => {
    console.log(`@${profile.handle}`);
  });
  // Iterate through followers page by page
  if (depth > 0) {
    let cursor: string | undefined = undefined;
    do {
      // Fetch one page of followers
      const page: FollowersPage = yield* context.run(async (_ctx: Context) => {
        return await getFollowersPage(profile.did, cursor);
      });
      // Schedule scrapes for this page's followers
      for (const follower of page.followers) {
        yield* context.detached(scrape, follower.did, depth - 1);
      }
      // advance
      cursor = page.cursor;
    } while (cursor !== undefined);
  }
}
// from example-bluesky-scraper-ts/src/scraper.ts:9-38

The function is registered as a top-level workflow at startup:

const resonate = new Resonate({ url: "http://localhost:8001" });
resonate.register("scrape", scrape);
// from example-bluesky-scraper-ts/src/index.ts:4-6

A scrape is kicked off from the CLI: resonate invoke <id> --func scrape --arg <handle> --arg <depth>.

The durable primitives in play

  • context.run(fn) — wraps a sync or async step (getProfile, the console log, getFollowersPage) as a durable checkpoint. On replay after a worker crash the SDK skips already-resolved checkpoints and resumes at the next pending one. Used at src/scraper.ts:15, :19, and :27.
  • context.detached(scrape, follower.did, depth - 1) — spawns a new top-level durable promise for each follower and returns immediately. The child is independent of the parent's lineage: the parent does not block on it, the parent's lifetime does not bound the child's lifetime, and the child gets its own retry and durability guarantees. Used at src/scraper.ts:32.
  • resonate.register("scrape", scrape) — exposes the generator function as a remotely-invocable workflow so that both the CLI invocation and every recursive context.detached(scrape, …) resolve to the same worker code path. Used at src/index.ts:6.
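The replay behaviour described for context.run can be illustrated with a toy in-memory model. This is a minimal sketch, not the Resonate SDK: CheckpointStore, runStep, and the step IDs are hypothetical names, and a real checkpoint lives on the Resonate server rather than in a Map.

```typescript
// Toy model of checkpoint replay. NOT the Resonate SDK: `CheckpointStore` and
// `runStep` are hypothetical names used only to illustrate the idea.
type CheckpointStore = Map<string, unknown>;

// Run `fn` once per step ID; on later executions return the recorded result.
async function runStep<T>(
  store: CheckpointStore,
  stepId: string,
  fn: () => Promise<T> | T,
): Promise<T> {
  if (store.has(stepId)) {
    return store.get(stepId) as T; // already resolved: skip the side effect
  }
  const result = await fn();
  store.set(stepId, result); // record the checkpoint before moving on
  return result;
}

// A two-step workflow that can be told to "crash" between its steps.
async function workflow(
  store: CheckpointStore,
  calls: string[],
  crashAfterFirstStep: boolean,
): Promise<string> {
  const profile = await runStep(store, "step-0", () => {
    calls.push("getProfile");
    return "alice.example";
  });
  if (crashAfterFirstStep) throw new Error("simulated worker crash");
  return runStep(store, "step-1", () => {
    calls.push("getFollowersPage");
    return `followers-of-${profile}`;
  });
}

// First attempt dies after step-0; the second attempt replays the workflow.
async function demo(): Promise<string[]> {
  const store: CheckpointStore = new Map();
  const calls: string[] = [];
  await workflow(store, calls, true).catch(() => undefined);
  await workflow(store, calls, false);
  return calls;
}
```

In this model the second attempt re-enters the function from the top, but the profile fetch is served from the store, so only the followers fetch actually runs again.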

The IDs Resonate assigns to detached children are deterministic on replay. When no { id: ... } option is passed, the SDK derives the child ID via util.detachedId(originId, seqid()), which returns ${originId}.${cyrb53(seqid).toString(16).padStart(14, "0")}. Here originId is the root of the current lineage and seqid() returns ${parent.id}.${seq}. A parent that re-executes after a crash therefore hashes the same (parent.id, seq) inputs and addresses the same children rather than spawning a new set.
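The derivation above can be reproduced in a few lines. This is a sketch under assumptions: cyrb53 here is the widely circulated public-domain version of that hash, which may differ from the SDK's own copy, and seqid and detachedId are standalone stand-ins that take their inputs as explicit arguments instead of reading them from the workflow context.

```typescript
// Public-domain cyrb53 string hash (53-bit result).
// Assumption: the SDK's internal copy may differ in seed or details.
function cyrb53(str: string, seed = 0): number {
  let h1 = 0xdeadbeef ^ seed;
  let h2 = 0x41c6ce57 ^ seed;
  for (let i = 0; i < str.length; i++) {
    const ch = str.charCodeAt(i);
    h1 = Math.imul(h1 ^ ch, 2654435761);
    h2 = Math.imul(h2 ^ ch, 1597334677);
  }
  h1 = Math.imul(h1 ^ (h1 >>> 16), 2246822507);
  h1 ^= Math.imul(h2 ^ (h2 >>> 13), 3266489909);
  h2 = Math.imul(h2 ^ (h2 >>> 16), 2246822507);
  h2 ^= Math.imul(h1 ^ (h1 >>> 13), 3266489909);
  return 4294967296 * (2097151 & h2) + (h1 >>> 0);
}

// Hypothetical stand-in: the parent's ID plus its per-execution call counter.
function seqid(parentId: string, seq: number): string {
  return `${parentId}.${seq}`;
}

// Hypothetical stand-in for util.detachedId, following the format in the text:
// lineage root, a dot, then the hash as 14 zero-padded hex digits.
function detachedId(originId: string, seqId: string): string {
  return `${originId}.${cyrb53(seqId).toString(16).padStart(14, "0")}`;
}

// The same (parent.id, seq) pair always yields the same child ID.
const firstRun = detachedId("root", seqid("root.3", 0));
const replay = detachedId("root", seqid("root.3", 0));
const nextChild = detachedId("root", seqid("root.3", 1));
console.log(firstRun === replay, firstRun === nextChild);
```

Because the ID depends only on the lineage root, the parent ID, and the call position, nothing about wall-clock time or the worker instance leaks into it.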

What the SDK handles vs. what you write

What you write: the generator function scrape, the async helpers getProfile and getFollowersPage that hit the AtProto API, the depth bound, and the cursor loop that pages through getFollowers 50 records at a time (src/bluesky/client.ts:29, 62).
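For orientation, the two helpers might look roughly like the following. This is a hedged sketch rather than the contents of src/bluesky/client.ts: the makeClient factory, the FetchLike type, the field subsets in ProfileData and FollowersPage, and the choice of the public.api.bsky.app host are all assumptions made for illustration. The endpoint names app.bsky.actor.getProfile and app.bsky.graph.getFollowers are the public AtProto XRPC methods.

```typescript
// Minimal subset of the fetch API this sketch needs, so a stub can be injected.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Assumed field subsets; the real responses carry more fields.
interface ProfileData {
  did: string;
  handle: string;
  displayName?: string;
}
interface FollowersPage {
  followers: ProfileData[];
  cursor?: string; // absent on the last page
}

// Assumed host for unauthenticated reads; the example repo may use another.
const BASE = "https://public.api.bsky.app/xrpc";

function makeClient(fetchFn: FetchLike) {
  // Issue one XRPC GET and fail loudly on non-2xx so the caller can retry.
  async function call<T>(method: string, params: Record<string, string>): Promise<T> {
    const query = Object.entries(params)
      .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
      .join("&");
    const res = await fetchFn(`${BASE}/${method}?${query}`);
    if (!res.ok) throw new Error(`${method} failed with HTTP ${res.status}`);
    return (await res.json()) as T;
  }

  return {
    getProfile: (actor: string) =>
      call<ProfileData>("app.bsky.actor.getProfile", { actor }),
    // One page of up to 50 followers; pass the previous page's cursor to continue.
    getFollowersPage: (actor: string, cursor?: string) => {
      const params: Record<string, string> = { actor, limit: "50" };
      if (cursor !== undefined) params.cursor = cursor;
      return call<FollowersPage>("app.bsky.graph.getFollowers", params);
    },
  };
}
```

In a real worker the client would be built once with the runtime's fetch, for example makeClient((url) => fetch(url)). Throwing on a non-OK status matters here: it is what lets a throttled or failed request surface as a failed step instead of a silently empty page.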

What the SDK handles: persisting a durable promise for the top-level scrape and one for every detached child; recording each context.run step as a checkpoint so a crashed worker resumes at the next unfinished step rather than re-fetching pages that already succeeded; assigning deterministic IDs to detached children so replay produces the same fan-out; and routing every recursive context.detached(scrape, …) call back into the worker pool registered under "scrape". None of the queue plumbing, retry bookkeeping, or replay logic appears in the application code — there are three primitives (run, detached, register) and ~30 lines of business logic.

Failure modes covered

  • Worker crashes mid-page. If the worker dies after getProfile resolves but before the first getFollowersPage call completes, the durable promise for the getProfile checkpoint is already recorded. On replay the SDK skips it and resumes at the next pending checkpoint. Cited path: src/scraper.ts:15 (checkpoint 1, profile fetch) → :19 (checkpoint 2, log) → :27 (checkpoint 3, followers fetch inside the do/while).
  • Worker crashes mid-fan-out. Each context.detached call at src/scraper.ts:32 creates a durable promise for the child on the server before returning. If the parent crashes after spawning half of a page's children, the already-spawned children continue independently; on parent replay, the deterministic child IDs derived via util.detachedId(originId, seqid()) cause the already-spawned children to be recognised as existing rather than re-spawned.
  • AtProto API throws on a single follower. The exception surfaces inside that follower's own detached workflow. The parent does not see it because context.detached returns immediately without awaiting the child. The failed child can be retried per its own retry policy without re-fetching the parent's data.
  • Long-running deep traversals. Because each follower is a detached promise, no single process must stay alive for the duration of the full graph walk. Workers can come and go between levels of recursion; the tree continues across worker restarts and across multiple worker instances.

What this example does not handle: cross-branch deduplication. Two different parents that both list the same user as a follower will spawn two different detached children with different IDs, scraping the user twice. Deduplication by user DID would require passing an explicit { id: follower.did } options object to context.detached, which the current code does not do.
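The difference between position-derived IDs and DID-keyed IDs can be seen in a toy model. This is an illustration only: ToyStore is a hypothetical stand-in for the promise store, the DIDs are made up, and the position-based scheme is simplified to parent plus sequence number without the hash.

```typescript
// Toy promise store, NOT the Resonate server: creating an ID that already
// exists is a no-op, which is what lets explicit IDs deduplicate work.
class ToyStore {
  private ids = new Set<string>();

  // Returns true when a new child was created, false when the ID already existed.
  createIfAbsent(id: string): boolean {
    if (this.ids.has(id)) return false;
    this.ids.add(id);
    return true;
  }

  get size(): number {
    return this.ids.size;
  }
}

// Two parents that share one follower (hypothetical DIDs).
const parents: Record<string, string[]> = {
  "did:plc:parentA": ["did:plc:carol", "did:plc:dave"],
  "did:plc:parentB": ["did:plc:carol"],
};

// Spawn every follower under a given child-ID scheme and count distinct children.
function spawnAll(
  childId: (parent: string, seq: number, follower: string) => string,
): number {
  const store = new ToyStore();
  for (const [parent, followers] of Object.entries(parents)) {
    followers.forEach((follower, seq) => {
      store.createIfAbsent(childId(parent, seq, follower));
    });
  }
  return store.size;
}

// Position-derived IDs: carol is spawned once per parent.
const positional = spawnAll((parent, seq) => `${parent}.${seq}`);
// DID-keyed IDs: the second attempt to spawn carol is a no-op.
const byDid = spawnAll((_parent, _seq, follower) => follower);
console.log(positional, byDid); // prints "3 2"
```

One trade-off worth weighing before keying on the DID alone: a user first reached with little remaining depth would not be expanded further when reached again through a shallower path, so the ID scheme may need to account for depth as well.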

When to reach for this pattern

  • If you're walking a tree, graph, or hierarchical resource where each node triggers more work and the total fan-out is unknown at start time.
  • If a single traversal can outlive a single process — worker restarts, deploys, or the orchestrator wanting to scale workers up and down mid-job.
  • If individual subtrees should fail and retry independently rather than dragging the whole traversal down.
  • If you want straight-line generator code instead of an external queue plus a hand-rolled state table tracking which nodes have been visited.
  • If you need a bounded recursion (the depth parameter here) and want the bound enforced in the workflow itself rather than in queue tooling.
