Field Notes · Build in public

One founder, seventeen AI teammates, and one quiet phone

How My Storyland is built, shipped, and marketed by a team where every role except one is an AI — written for readers who have never merged a pull request.

How Storyland — a site that turns the books you love into real travel itineraries — is built, shipped, and marketed by a team where every role except one is an AI. Written for readers who have never merged a pull request.

Part 1: Three places where work happens

Most companies organize people. Storyland organizes environments. There are exactly three, and everything about the setup follows from what each one can and cannot touch.

Olga — the founder
The only human, usually on a phone. She doesn't write the code or run the deploys — she makes the calls that money and users depend on, mostly by dragging cards and clicking one merge button.
The Cowork AI team — a sealed sandbox
Two AI departments, each role with a written job description: a product side (product manager, design lead, staff engineer, engineering lead, QA, release engineer, analysts) and, new this week, a marketing side. It can write code, draft content, and update the boards — but it cannot deploy, merge anything risky, or publish without Olga's sign-off.
The Local Runner — Claude Code on Olga's Mac
The hands. It has what the sandbox doesn't: real internet, a browser, GitHub credentials, and the key to the production server. It executes — and only inside pre-approved lanes. It never merges or deploys on its own decision.

The sandbox team is deliberately cut off from the real world; the runner deliberately can't decide anything on its own. Only together — with Olga's stamps in between — can a feature reach users.

Novice corner — what are "Claude Cowork" and "Claude Code"? Every teammate here is powered by Claude, an AI assistant made by Anthropic — the same underlying intelligence hired into many different jobs, arriving through two products. Claude Cowork is Anthropic's workspace app for office-style AI work: each role is a Cowork agent with a written charter, working on its own schedule inside a sealed sandbox with only the tools it's been handed. Claude Code is Anthropic's agent that runs directly on a real computer — which is exactly why the Local Runner can push code to GitHub, open a browser, and reach the production server. Same brain in every seat; what differs is the room it works in, and what that room lets it touch.

Why split it this way? Because it makes the scary failure modes structurally impossible instead of merely discouraged. The AI that writes code physically cannot touch the production server. The AI that touches the production server only acts on work a human has explicitly approved. Neither has to be trusted to "remember the rules" — the rules are the walls of the rooms they work in.

Part 2: The board is the boss

All product coordination happens on one Linear board (Linear is a project-tracking app — think of a wall of sticky notes in columns). Every piece of work is a card like MYS-161: Redesign the Destinations page, and the column a card sits in is its status. Nobody messages anybody; they read the wall.

A card travels left to right through nine columns. Two of the moves can only be made by Olga — those are her approval gates. Three happen automatically when code events fire. The rest are done by whichever teammate finished the work.

[Figure: The Opportunity Backlog, one card's path. Columns, left to right: Idea (spec'd & mocked), Ready (design attached), Todo (build queue), In Progress (PR opened), In Review (review asked), Merged (code on main), Ready to ship (deploy queue), Shipped (live on prod), Done (measured). G1: the founder drags the card, "yes, build this." G5: the founder drags the card, "yes, put it live." Nobody touches the automatic moves; GitHub moves the card: PR opened → In Progress, review requested → In Review, PR merged → Merged. Legend: the founder and her gates, the AI team, the local runner.]
The pipeline, recolored to the site palette. White columns are moved by teammates, the dashed green ones move themselves when GitHub reports a pull request opened, reviewed, or merged, and the two brown arrows only move when Olga drags the card. A drag is the approval — there are no separate sign-off forms.
Novice corner — what's a "pull request"? Code changes aren't typed straight into the live product. An engineer bundles a change into a pull request (PR) — a proposal others can read, comment on, and test. "Merging" the PR accepts it into the main codebase. Even then it isn't live: someone still has to deploy — copy the new code onto the server users actually visit. Storyland puts an approval in front of both steps.
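For readers who do want to see the wall as code, the nine-column path described in this part can be sketched as a small lookup table. This is only an illustration, not Storyland's actual configuration: the names `Actor`, `ALLOWED_MOVER`, and `move` are made up here, and while the two founder moves and the three GitHub moves come straight from the description above, which teammate makes each of the remaining moves is an assumption.

```python
from enum import Enum


class Actor(Enum):
    FOUNDER = "founder"        # Olga, via her two board gates
    TEAM = "ai_team"           # whichever sandboxed teammate finished the work
    GITHUB = "github_event"    # automatic moves fired by pull-request events
    RUNNER = "local_runner"    # Claude Code on Olga's Mac


COLUMNS = [
    "Idea", "Ready", "Todo", "In Progress", "In Review",
    "Merged", "Ready to ship", "Shipped", "Done",
]

# Who may make each left-to-right move. The FOUNDER and GITHUB rows follow
# the article; the TEAM and RUNNER rows are assumptions for illustration.
ALLOWED_MOVER = {
    ("Idea", "Ready"): Actor.TEAM,
    ("Ready", "Todo"): Actor.FOUNDER,            # G1: "yes, build this"
    ("Todo", "In Progress"): Actor.GITHUB,       # PR opened
    ("In Progress", "In Review"): Actor.GITHUB,  # review requested
    ("In Review", "Merged"): Actor.GITHUB,       # PR merged
    ("Merged", "Ready to ship"): Actor.FOUNDER,  # G5: "yes, put it live"
    ("Ready to ship", "Shipped"): Actor.RUNNER,
    ("Shipped", "Done"): Actor.TEAM,
}


def move(current: str, target: str, actor: Actor) -> str:
    """Return the new column, or raise if this actor may not make the move."""
    # Cards only travel one column to the right at a time.
    # list.index raises ValueError for a column that does not exist.
    if COLUMNS.index(target) != COLUMNS.index(current) + 1:
        raise ValueError(f"not a single step to the right: {current} -> {target}")
    if ALLOWED_MOVER[(current, target)] is not actor:
        raise PermissionError(
            f"{actor.value} may not move a card {current} -> {target}"
        )
    return target
```

In this sketch, `move("Ready", "Todo", Actor.TEAM)` raises `PermissionError`: the approval gates are simply rows in the table that only the founder matches.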

Part 3: The four stamps only a human can give

Everything in this system is autonomous except four decisions. They're called gates, numbered like customs checkpoints, and each exists because the step after it is expensive, public, or hard to undo.

| Gate | The question | How Olga grants it |
| --- | --- | --- |
| G1 | Should we build this at all? Ideas are cheap; engineering time isn't. Olga reviews the spec and design mock first; her standing rule is "I want to see design first." | Drags the card Ready → Todo |
| G2 | Is this code good enough to accept? Once merged, a change is woven into everything built after it. The one gate she partly delegates: the AI engineering lead may merge small, clean, routine changes, but anything touching security, money, or user data waits for her. | Clicks Merge on the PR, or lets the delegation rule handle routine ones |
| G4 | May we spend real money on this test? Quality evaluations call paid AI APIs; each run costs actual dollars. | Stamps gate_g4_spend: APPROVED on the request |
| G5 | Does this go live for users, now? Deploys are the one step visitors can feel. | Drags the card Merged → Ready to ship |

Notice what's not gated: writing code, drawing designs, opening PRs, updating the board, verifying the live site, building marketing assets — even merging a small, clean, routine change. The AI team does all of that continuously, without asking. The gates sit exactly at the points of no return — irreversibility, security, money, and anything users can feel — and nowhere else.
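The G2 delegation rule is the one gate with a judgment call inside it, so it helps to see it written as a check. The sketch below is an assumption-heavy illustration: the article does not say how "small" or "clean" are measured, so the 200-line threshold, the `checks_passed` flag, and the `PullRequest` shape are all hypothetical. Only the three sensitive areas (security, money, user data) come from the text.

```python
from dataclasses import dataclass

# From the article: anything touching these waits for Olga.
SENSITIVE_AREAS = frozenset({"security", "money", "user-data"})

# Hypothetical threshold; the article does not define "small".
MAX_ROUTINE_LINES = 200


@dataclass(frozen=True)
class PullRequest:
    lines_changed: int
    checks_passed: bool                 # stand-in for "clean" (assumption)
    areas: frozenset = frozenset()      # labels for what the change touches


def needs_founder_merge(pr: PullRequest) -> bool:
    """True if the PR must wait for Olga; False if the AI lead may merge it."""
    if pr.areas & SENSITIVE_AREAS:
        return True                     # security, money, or user data
    if not pr.checks_passed:
        return True                     # not "clean"
    return pr.lines_changed > MAX_ROUTINE_LINES  # not "small"
```

The order of the checks matters in spirit: a tiny, passing change that touches user data still goes to the founder, because the sensitive-area test comes first and cannot be outvoted.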

Part 4: Handoff files: how a sandboxed team gets things done anyway

Here's the puzzle at the heart of the setup. The Cowork team writes the code — but it lives in a sandbox with no real internet, no browser, no production server. So how does its work ever become a pull request, a rendered design, or a live feature?

It writes a letter.

When a sandboxed teammate finishes something that needs real-world hands, it drops a small text file — a handoff — into a shared folder on Olga's Mac. The file says what kind of job it is, which card it belongs to, and exactly what to run. There are six inboxes, one per job type, each with hard limits the runner may never cross:

| Inbox | What it asks for | Hard limit |
| --- | --- | --- |
| staff-engineer/ | "The branch is pushed — open the pull request." Or: "run this build the sandbox couldn't." | never merges |
| release/ | "This is approved to deploy." Cross-checked against the board: the card must really be in Ready to ship. | G5-gated |
| eval/ | "Run this small quality test on the AI's itinerary output and report the scores." | G4-gated, costs capped |
| qa/ | "Check the live site actually does X": real clicks on the real product, written back to the card. | never merges or deploys |
| design/ | "Render this HTML mock into a screenshot and attach both to the card so Olga can review it." | render only, never edits |
| grow/ | "Build these marketing assets": fetches photos the sandbox can't reach, composites carousels and Reels. | never posts publicly |

Finished jobs get a written ## RESULT receipt appended and move to a shared done/ folder — an audit trail of every real-world action ever taken.
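A rough sketch of how a runner could treat one handoff file, under stated assumptions. The article does not show the file format, so the `key: value` header lines, the `action` field, and the function names below are hypothetical. What does come from the text: the six inbox names, their hard limits, the `gate_g4_spend: APPROVED` stamp, the `## RESULT` receipt, and the `done/` folder.

```python
from pathlib import Path

# Actions each lane must refuse, paraphrased from the inbox table.
# The action names themselves are made up for this sketch.
FORBIDDEN_ACTIONS = {
    "staff-engineer": {"merge"},
    "release": set(),            # may deploy, but only behind G5 (checked below)
    "eval": set(),               # may spend, but only behind G4 (checked below)
    "qa": {"merge", "deploy"},
    "design": {"edit"},
    "grow": {"post"},
}


def parse_handoff(text: str) -> dict:
    """Read 'key: value' header lines up to the first blank line (assumed format)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def refusal_reason(lane: str, fields: dict, board_column: str):
    """Return why the runner must refuse this job, or None if it may proceed."""
    if lane not in FORBIDDEN_ACTIONS:
        return f"unknown inbox: {lane}"
    if fields.get("action") in FORBIDDEN_ACTIONS[lane]:
        return f"{lane} lane never performs '{fields['action']}'"
    if lane == "release" and board_column != "Ready to ship":
        return "G5 not granted: card is not in Ready to ship"
    if lane == "eval" and fields.get("gate_g4_spend") != "APPROVED":
        return "G4 not granted: spend is not approved"
    return None


def finish(handoff: Path, done_dir: Path, result: str) -> Path:
    """Append a ## RESULT receipt, then move the file into done/."""
    with handoff.open("a", encoding="utf-8") as f:
        f.write(f"\n## RESULT\n{result}\n")
    done_dir.mkdir(parents=True, exist_ok=True)
    target = done_dir / handoff.name
    handoff.rename(target)
    return target
```

Note that the release check takes the card's column as a separate argument rather than reading it from the handoff: the letter can claim approval, but only the board can confirm it.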

Part 5: Nine minutes past every hour

The Local Runner wakes on a schedule — at :09 past each hour — with no human watching. An unattended AI with production keys sounds alarming until you see how narrow its script is.

First, it checks its own footing. Is GitHub access alive? Is the live site responding? What's actually on the board right now? It re-reads Linear fresh every run rather than trusting local notes, because notes go stale and the board is the boss.

Then it empties the six inboxes, each according to its lane rules — opening PRs, rendering mocks, running approved checks. Then it looks at the deploy queue: every card in Ready to ship is a standing instruction from Olga. For each one it confirms the code is really merged, deploys it, and then — this part matters — proves it worked by loading the live site in a headless browser and checking the page genuinely rendered, because a server can happily say "200 OK" while serving a blank white screen. (That exact failure happened once. It's now a permanent checklist item.)
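The "200 OK but blank" failure is easy to show in miniature. The sketch below is not the runner's actual check: it assumes the headless browser has already loaded the page and handed back the final HTML, and it only asks whether that HTML contains a reasonable amount of visible text. The function name, the 200-character floor, and the optional `must_contain` phrase are all illustrative assumptions.

```python
from html.parser import HTMLParser

# Tags whose contents a visitor never sees as page text.
_INVISIBLE = {"script", "style", "noscript", "template", "title"}


class _VisibleText(HTMLParser):
    """Collects text that sits outside invisible tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in _INVISIBLE:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in _INVISIBLE and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def page_rendered(status: int, html: str, min_chars: int = 200,
                  must_contain: str = None) -> bool:
    """A 200 status is necessary but not sufficient: require visible text too."""
    if status != 200:
        return False
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    if len(text) < min_chars:
        return False  # the blank-white-screen case
    return must_contain is None or must_contain in text
```

An empty application shell (a title, a script tag, and an empty `<div id="root">`) passes the status test and fails this one, which is exactly the gap the checklist item exists to close.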

Then it reports — quietly. The card moves to Shipped with a signed comment, anything that needs Olga goes on one short pending list, and her phone stays silent: the runner is only allowed to notify her for a genuine production emergency. The quiet rule is deliberate — early versions pinged her after every run, a dozen times a day, so the rule flipped: silence is the default, and a notification now means something.
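The quiet rule amounts to a three-way sort of everything a run produces. Here is a minimal sketch of that sort, assuming three severity levels; the `Severity` names and the `RunReport` class are hypothetical, and whether an emergency also lands on the pending list is not stated in the article, so this version keeps the two separate.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    INFO = "info"                               # routine: a deploy, an opened PR
    NEEDS_OLGA = "needs_olga"                   # waits for her, but not urgently
    PRODUCTION_EMERGENCY = "production_emergency"


@dataclass
class RunReport:
    card_comments: list = field(default_factory=list)   # written on the board
    pending: list = field(default_factory=list)          # the one short list
    notifications: list = field(default_factory=list)    # the only thing that pings

    def record(self, severity: Severity, message: str) -> None:
        """File a message in exactly one place, defaulting to silence."""
        if severity is Severity.PRODUCTION_EMERGENCY:
            self.notifications.append(message)
        elif severity is Severity.NEEDS_OLGA:
            self.pending.append(message)
        else:
            self.card_comments.append(message)
```

With this shape, a run that deploys three cards and finds nothing wrong leaves `notifications` empty, so an empty list and a silent phone mean the same thing.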

Part 6: One feature, end to end

Take a real card: MYS-161, a redesign of the Destinations page that Olga requested in chat. The product manager specs it and the design lead attaches a mock; Olga reads both and drags it Ready → Todo (G1). The staff engineer builds it and pushes a branch, dropping a staff-engineer/ handoff; the runner opens the PR, which auto-moves the card to In Progress, and the review request then moves it to In Review. Olga (or the delegation rule) merges it (G2), QA verifies, and she drags it Merged → Ready to ship (G5). The runner deploys, proves the page rendered, and marks it Shipped. Count the human touches: one sentence of intent and three gestures. Everything else happened without her.

Part 7: The marketing department, hired this week

Until this week, all of Storyland's marketing was one AI role that planned, wrote, designed, published, and measured. It worked, then stopped scaling — one context juggling five jobs drops balls. So the team did what human companies do: it reorganized. The one role became a five-role department — a Marketing Director (plans the week and sends Olga one daily digest), a Content Creator & Publisher (drafts, designs in Canva, and hits publish), an Editorial Reviewer (may only comment or bounce a draft — never edit, publish, or approve), a Community Manager (the one role trusted to reply and repost publicly on its own, in Olga's voice, with receipts), and a Marketing Analyst (reads the numbers weekly and files new idea cards).

The crew has its own Linear wall — a Content Pipeline with columns that fit content instead of code — but reuses every structural idea from the engineering side: a board as the only truth, narrow lanes with hard limits, and approval as a single human gesture placed exactly where things become public. Two house rules give it character: every post carries a real photograph (no text-on-a-gradient filler — which loops right back through the grow/ handoff so the runner can fetch the images), and Olga's phone stays quiet, with only a handful of allowed notification types.

Part 8: The machinery underneath

For a system with this much process, the physical footprint is almost comically small. The code lives in five GitHub repositories — storyland-web (what you see), storyland-services (accounts, saving, search), storyland-ai (the itinerary brain), storyland-e2e (automated browser tests), and storyland-infrastructure (the deploy recipes) — and production is a single small cloud server. A deploy is the runner copying fresh code to that box over SSH and rebuilding the right container, then loading the live site in a real browser to prove it renders.
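To make "copy over SSH, rebuild the right container" concrete, here is a dry-run sketch that only builds the commands and never executes them. Almost everything specific in it is an assumption: the article does not name the copy tool or the container tooling, so `rsync` and `docker compose` are stand-ins, and the service names, host, and remote path are placeholders. Only the repository names come from the text.

```python
# Which container each deployable repository rebuilds.
# The service names on the right are hypothetical.
REPO_TO_SERVICE = {
    "storyland-web": "web",
    "storyland-services": "services",
    "storyland-ai": "ai",
}


def deploy_commands(repo: str, host: str, remote_root: str) -> list:
    """Build (but do not run) the two steps of a deploy for one repository."""
    service = REPO_TO_SERVICE[repo]  # KeyError for repos that are not deployed this way
    copy = [
        "rsync", "-az", "--delete", "-e", "ssh",
        f"./{repo}/", f"{host}:{remote_root}/{repo}/",
    ]
    rebuild = [
        "ssh", host,
        f"cd {remote_root} && docker compose up -d --build {service}",
    ]
    return [copy, rebuild]
```

Returning the commands as data instead of running them is a deliberate choice for a sketch like this: the plan can be printed, logged to the card, or reviewed before anything touches the server.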

Part 9: The part nobody plans for: remembering

The unglamorous secret of running an AI team is that the system has to learn from its own incidents, or it repeats them on schedule. Storyland keeps two kinds of institutional memory. First, written job descriptions and runbooks: every role has a charter, every runner lane a routine file, and every rule ("never trust a 200 response — render the page") traces back to a specific day something broke. The rules read like scar tissue, because they are. Second, a curated memory the AI itself maintains — a couple dozen short notes on footguns, preferences, and standing decisions, pruned regularly so stale facts don't masquerade as current ones. An AI team's memory, it turns out, needs gardening exactly like a wiki does.

Part 10: Why this shape works

Strip away the specifics and three design choices carry the whole thing. The board is the only truth — a card's column is its state, and every actor re-reads the board before acting, so anyone (human or AI) can be dropped in cold and know exactly where things stand. Approval is a gesture, not a meeting — Olga's entire management overhead is dragging cards and clicking merge, each gesture unambiguous, logged, and placed precisely where irreversibility begins. And capability is separated from authority — the team that can write code can't ship it; the runner that can ship can't decide to. Safety lives in the architecture, not in anyone's good behavior — the only kind of safety that survives 3 a.m. cron runs.

The result: a one-person company where the human does perhaps fifteen minutes of gestures a day, and wakes up to rendered design mocks, opened pull requests, deployed features, and a quiet phone — because silence, here, is engineered to mean "all is well."