You have two ways to work with an AI agent, and both wear you down. Sign off on every move it makes, and you become the bottleneck. Let it run loose, and you are one bad call away from the demo that deleted the wrong file at 2 a.m. Most writing about "human in the loop" treats it as exactly that switch: on, or off.
That on/off framing is why so much AI work feels exhausting. An approval gate on everything turns the AI into a very expensive autocomplete; no gate at all turns it into a liability you can't leave alone.
For context: I build software with an AI assistant most days, often as a team of one. What follows is the rule we actually run by, not theory.
The demo where an AI does a task is trivial. Anyone can wire that up in an afternoon. The production-grade part, the part almost nobody writes about, is the guardrail underneath: knowing which decisions the AI is allowed to make on its own, and which ones it has to hand back to you.
The insight is small and it changes everything: not every action needs a human. You tier them.
The model: The three tiers
When I work with my AI assistant, every action falls into one of three tiers. The sorting question is always the same: how big and how reversible is this?
🟥 Upstream: big or irreversible. The AI gathers the facts, lays out the options, and makes a recommendation. Then it stops. I decide. These are the choices where being wrong is expensive and hard to undo: deleting data permanently, picking an architecture you'll live with for a year, anything with money or a public footprint. The AI's job here is to make my decision well-informed, not to make it for me.
🟨 Midstream: moderate, and the right answer isn't obvious. We decide together. The AI surfaces a real tradeoff it noticed, I weigh in, and we pick. This is the tier people forget exists, and it's the most interesting one, because it's where the AI's instinct to "just handle it" is most dangerous: the stakes are high enough that a wrong default costs you, but low enough that the AI is tempted to guess.
🟩 Downstream: small and reversible. The AI decides, executes, and reports. Renaming a variable, moving a file, running a formatter. Things where if it's wrong, you notice and undo it in seconds. Routing these to a human is pure friction. The report matters more than the permission: I don't need to approve it, I need to be told it happened.
The whole skill is sorting fast and sorting honestly. Most failures aren't "the AI did the wrong thing." They're "the AI put a 🟥 decision in the 🟩 bucket": it executed something irreversible as if it were trivial.
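The sorting question can be written down as a few lines of code. This is a minimal sketch, not a full rubric: the `Action` fields, the boolean simplification of "big" and "obvious", and the rule order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    UPSTREAM = "🟥 human decides"
    MIDSTREAM = "🟨 decide together"
    DOWNSTREAM = "🟩 AI decides and reports"


@dataclass(frozen=True)
class Action:
    # Every field here is a hypothetical label, not a real agent API.
    name: str
    reversible: bool      # can a mistake be noticed and undone in seconds?
    big: bool             # money, public footprint, or a long-lived choice
    obvious: bool = True  # is the right answer clear from where the AI sits?


def classify(action: Action) -> Tier:
    """Sort an action by size and reversibility."""
    # Big OR irreversible is enough to hand the decision back.
    if action.big or not action.reversible:
        return Tier.UPSTREAM
    # Small enough to undo, but a real tradeoff: surface it, don't guess.
    if not action.obvious:
        return Tier.MIDSTREAM
    return Tier.DOWNSTREAM
```

Under these assumptions, renaming a variable sorts to 🟩, permanently deleting data sorts to 🟥, and a reversible setting with no obvious right answer sorts to 🟨. The upstream check runs first on purpose, so an irreversible action can never fall through to the downstream bucket.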
In practice: What it looks like when it's running
Tiers sound clean on a slide. Here's what they actually look like in a real working session, including the part where the AI was wrong and the framework caught it.
Before the dev stories, map it to everyday work:
- An AI drafting a customer email and hitting send is 🟥: it reaches an outsider and is hard to walk back, so it waits for you.
- An AI picking a vendor from three quotes is 🟨: it's borderline, so it surfaces the tradeoff and you decide together.
- An AI filing receipts into the right accounting category is 🟩: it's small and reversible, so it does it and reports.
Same rule, different domain. The two stories below come from our own dev work, but read them with your own work in mind.
🟩 The script it almost deleted
While cleaning up, the AI found a small script that looked, from its name, like a leftover duplicate, and was about to delete it. Small, reversible-looking, clearly downstream: the kind of thing it should handle alone without bothering me.
But there's a habit baked in before any delete: read it first. That read showed it was no duplicate at all. It was the exact script that had produced the project's one clean result, using a smarter approach than the version we'd kept. The AI left it, did the cleanup it was actually asked to do, and reported what it found.
This is downstream working correctly. The AI handled it alone and told me afterward. The "read before you delete" habit isn't a human-approval gate. It's a guardrail the AI runs itself, on its own actions, before committing. That's the whole point: downstream doesn't mean careless, it means the care is built into the AI's own process instead of routed through me.
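A self-run guardrail like "read before you delete" might look roughly like this. It is a sketch under stated assumptions: `delete_after_reading` and the `is_disposable` callback are hypothetical names, and how the AI actually judges the content is left to the caller.

```python
from pathlib import Path
from typing import Callable


def delete_after_reading(path: Path, is_disposable: Callable[[str], bool]) -> str:
    """Read a file before deleting it and return a one-line report either way.

    `is_disposable` is a hypothetical check supplied by the caller, for
    example "is this content identical to a file we already keep?".
    """
    # The guardrail: the content is always read before anything is removed.
    content = path.read_text(encoding="utf-8")
    if not is_disposable(content):
        return f"kept {path.name}: content is not a disposable duplicate"
    path.unlink()
    return f"deleted {path.name}: confirmed disposable after reading"
```

No human approval appears anywhere in the function. The check and the report are both part of the action itself, which is what keeps it downstream.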
🟨 The setting it refused to guess
Then a harder one. An audio step had a chunk-size setting: 180 seconds or 30 seconds. There were real arguments both ways: bigger chunks meant fewer calls, smaller chunks meant tighter handling but more overhead. The "right" answer genuinely wasn't obvious from where the AI sat.
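The "fewer calls" side of that tradeoff is simple arithmetic. The sketch below assumes one call per chunk and a hypothetical one-hour recording; neither number comes from the actual project.

```python
import math


def calls_needed(duration_s: float, chunk_s: float) -> int:
    """Number of chunks needed to cover the audio, assuming one call per chunk."""
    return math.ceil(duration_s / chunk_s)


# Hypothetical one-hour recording, used only to make the comparison concrete.
duration_s = 60 * 60
for chunk_s in (180, 30):
    print(f"{chunk_s:>3}s chunks -> {calls_needed(duration_s, chunk_s)} calls")
```

For that hour of audio, 180-second chunks need 20 calls and 30-second chunks need 120. The call count is the easy half to compute; how well each chunk size is handled is the half that arithmetic can't settle.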
The tempting move, the one an all-or-nothing setup invites, is for the AI to just pick one and move on. Instead it did the midstream thing: it surfaced the tradeoff. Here's the choice, here's what I lean toward, here's why I'm not sure.
I picked the answer most product people will recognize: prove it first. Don't argue the hypothesis, test it. We ran a small controlled experiment, and it overturned the AI's own leading guess. The option it would have silently chosen, if I'd let it run downstream, was the wrong one.
That's the midstream tier earning its keep. Not because the AI was incapable, but because the decision sat in the zone where a confident guess is exactly the failure mode, and "let's check" beats "let's assume." Most actions never reach this tier: they're the boring downstream majority, handled and reported and never escalated. You only feel the framework on the edges: the 🟥 you're glad it didn't touch, and the 🟨 it was right to ask about.
The point: Why this is the production-grade part
Notice what both stories have in common. The interesting failure was never "the AI is dumb." In the script case, a filename made a valuable file look like a leftover, and the AI's own read-first check caught it. In the chunk-size case the AI was confidently wrong, and the only reason that didn't ship is that the decision was correctly tiered as midstream and tested instead of assumed.
That's the guardrail. The demo is "AI does a task." The product is "AI knows the ceiling on what it's allowed to decide alone, and a wrong guess at a real decision gets caught before it costs you." One is a parlor trick. The other is the difference between an assistant you can leave running and one you have to babysit.
This is a different guardrail from access, from what is even allowed to trigger the AI in the first place. Just because a message shows up doesn't make it a command to act on; that's the access boundary, and it's its own post. This one assumes the task is legitimate and asks the next question: given a real task, who owns the outcome? Access controls the front door. Tiering controls the steering wheel.
Takeaways: What to carry out
- Human-in-the-loop is not all-or-nothing. The useful question is never "human or no human": it's "which decisions the human owns, and which ones the AI owns."
- Sort by size and reversibility. Big or irreversible → you decide (🟥). Moderate and non-obvious → decide together (🟨). Small and reversible → the AI decides and reports (🟩).
- The dangerous tier is the middle one. A confident guess on a real tradeoff is the classic failure. When the right answer isn't obvious, surface it, don't default it.
- Downstream isn't careless. "The AI decides" still carries guardrails the AI runs on itself: look before you delete, verify before you claim done. The care moves into the AI's process; it doesn't disappear.
- The report replaces the permission. For downstream work you don't need to approve, you need to be told. Proactive reporting is what makes autonomy safe to leave running.
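The takeaways above can be compressed into one small dispatcher. Treat it as an illustration of the idea rather than a production hook: `handle`, `run`, and `report` are hypothetical names, and the tier is assumed to have been decided before this function is called.

```python
from enum import Enum
from typing import Callable


class Tier(Enum):
    UPSTREAM = "upstream"
    MIDSTREAM = "midstream"
    DOWNSTREAM = "downstream"


def handle(
    tier: Tier,
    description: str,
    run: Callable[[], None],
    report: Callable[[str], None],
) -> bool:
    """Execute only downstream work; hand everything else back.

    Returns True if the action ran. `run` performs the action and
    `report` tells the human; both are hypothetical callbacks.
    """
    if tier is Tier.DOWNSTREAM:
        run()
        # The report replaces the permission: no approval, but always a notice.
        report(f"done: {description}")
        return True
    if tier is Tier.MIDSTREAM:
        report(f"needs a joint decision: {description}")
    else:
        report(f"recommendation only, waiting for your call: {description}")
    return False
```

Every branch calls `report`, so the human hears about the action whether or not it ran. Only the downstream branch ever calls `run`.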
🔒 Gated: Get the working method
The principle is here, free: tier your actions, and the loop stops being a bottleneck.
What I haven't put in this post is the part that makes it operational: the actual rubric I use to sort a fresh action into a tier in seconds, the wording that makes an AI surface a tradeoff instead of guessing, and the hook that enforces the tiers at runtime so "the AI decides" can't quietly leak into "the AI decided something irreversible." That's the difference between a nice idea and a guardrail your AI actually obeys.
The first time the framework actually paid off, it wasn't because the AI did something brilliant. It was because the AI was about to be confidently wrong, the decision was sitting in the right tier, and "prove it first" beat "trust the guess." That's the whole series in one moment: the demo is easy, the guardrail is the work.
Evaluating Claude Code plugins · the test for which plugins are worth installing before you let them run
You can still run agents on a subscription (for now) · subscription vs API for an always-on agent
Bring your AI into Discord without handing over the keys · the access boundary: what's even allowed to trigger the AI in the first place
Turn speech into trustworthy notes, without letting AI make things up · the first post, build a second brain you can trust
Your site is live, but who can see it · make AI and Google see your site, solo