productize.life
TH EN
AI · Coding Agents

I ran three AI coders at once
and the agents weren't the cost

The bill barely moved. The expensive part is human time and the defects that slip through, not the coders. Here's the shape we use to hold both.

Yim· written with Dobby (AI Oracle)/Jun 30, 2026/~5 min read

One day we had three scripts that all needed a similar fix. Instead of doing them one at a time, we told three AI coders to go at once, one per file, each working in its own space so they couldn't collide. A few minutes later, three pull requests came back, separate files, already reviewed.

What stopped me wasn't the speed. It was the cost when I went to check it: it had barely moved. We'd expected running several AI agents in parallel to burn a fair bit. The numbers said otherwise, and once we dug into why, we hit the thing that changed how we think about putting AI to work.

Part 1The coders aren't the real cost

Most people picture running a lot of AI codegen as a per-token bill that climbs and climbs, because the mental image is a metered API. But the coders we run don't connect that way. We run them through Hermes, an agent framework that decides which engine drives each coder, where the engine is just the model doing the real work behind it. The engine we have plugged in logs in through a monthly subscription we already pay for, not an API key billed per token. So whether we launch one coder or three at once, the cost per round barely changes.

So where does the real cost go? To us. Early on we babysat the coders, ssh-ing in to check status every few seconds. That back-and-forth was what ate the resources, not the coders. The fix wasn't fewer coders, it was to stop babysitting and let a dispatcher hand out the work and close it off automatically. People set the problem and make the calls; people don't sit and stare.

Once you see that, running several coders in parallel stops being the extravagance you feared. The coders are cheap and replaceable. The expensive thing is human time and attention.

Part 2Cheap and parallel means you need a checker who isn't the writer

But once coders are cheap and you can launch several at once, the risk moves. It isn't about money anymore, it's about quality. Output that comes fast and in volume slips defects through just as easily if nobody checks first.

The path we chose: everything a coder writes goes through a separate reviewer first. The reviewer has one job, to read and flag, never to write. Because the moment the writer and the checker are the same agent, it glosses over the exact spots it just missed. Splitting the reviewer out gives you a second pair of eyes that isn't attached to what was just written.

What we'd like to do better: the reviewer should run a different engine from the writer, so it doesn't share the same blind spots. We tried switching the reviewer to a different engine, but it hung silently inside the system and never produced a single review, so we fell back to the engine we knew worked. A different engine is the right target, but something that actually works now beats an ideal that isn't stable yet.

🔗 Why a reviewer on a different model matters, read AI code review with Codex CLI for Claude Code

Part 3A human keeps merge authority, and the work stays isolated

Coders running in parallel, a reviewer in between, and there's still one last line to draw: who presses merge into the main code. Our answer is a person, not any AI. The coder writes, the reviewer flags, but a human decides whether it actually goes in. We call that last gate Gate A, not because we distrust the AI, but because taking code into the main line is a hard-to-reverse decision, and someone should own that spot every time.

The other thing to set up the moment you run several at once: each coder needs its own workspace, not all writing over the same files. Let several coders edit the same space at once and the work tangles. That's the dispatcher's job, to fence off space for each one up front, not to patch collisions after the fact.

🔗 How to isolate agents so they don't collide, read git worktree for parallel AI agents

Wrap-upStart small: the shape you can copy

If you take one thing away, take this: running a fleet of AI coders isn't as expensive as you fear, because the coders aren't the real cost. The expensive things are human time and the defects that escape, and both are solved by the same shape.

Start small. You don't need the whole fleet on day one.

  1. One coder, running on a subscription you already pay for, instead of a per-token API.
  2. One reviewer that isn't the writer, reading and flagging first, every time.
  3. A human holding merge authority, with a dispatcher handing out the rest of the work instead of you watching it.

Once that shape is stable, add coders one at a time. Because what lets you run a whole fleet without losing sleep isn't smarter coders, it's the checker and the human who still holds the decision.

Sources & references
Series · running AI coders as a team
Follow along

Get new posts and free resources first

Leave your email. New posts and the occasional free resource land in your inbox. No spam.

Email only, for updates.

Comments

Join the conversation

Share a thought.

Name is shown publicly. Email stays private and is never shown.

Loading comments…