AI Data Privacy: A Three-Layer Defense for Using AI Without Leaking Secrets

One time we opened a daemon's log on the box, saw that some secret values were mixed into it, and decided to scrub them first with a short command that matched the pattern KEY= and TOKEN= and replaced each with the word redacted. We thought that was the end of it.

Then the text came out and two real key values were sitting right there in the transcript. The ones that leaked were named ANTHROPIC_API_KEY_FALLBACK= and CLAUDE_CODE_OAUTH_TOKEN_BAD=. Their names end in _FALLBACK= and _BAD=, not KEY= or TOKEN=, so the pattern never caught them.

The secret did not leak to some hacker. It leaked into our own transcript, a place we were not counting as "outside" at that moment, even though that is exactly what it is. A secret printed into a log has to be treated as leaked right away, and the key has to be rotated.

The lesson here is not that the command was written wrong. It is that data leaks at the seams you forgot to label, not at the spot you were already watching.

Part 1A three-layer defense: data can flow down, but never up

The AI we all use day to day keeps its knowledge base on someone else's cloud. You ask it something and the model answers from the broad knowledge of the world. No problem there. Public knowledge flowing down to you is fine.

The trouble starts when data flows the other way. The moment you feed your private data up, the model is not just reading it to answer you. That chunk has already left your machine and now lives in someone else's system. So how do you know what is safe to send up and what is not? The approach we use is to split data into three layers.

The outer layer is the cloud, the world knowledge the model already has. Ask it general things freely. None of your own data is in there.
The middle layer is shareable work data: public documents, project context that is already open. If it flows up to the cloud, that is still acceptable.
The inner layer is secrets: financial data, health data, client data, credentials. This layer has one rule. It never flows up to the outer layer, period.

The whole point of these layers is one-way flow. Knowledge from the outer layer can flow down and help your work all it wants, but anything in the inner layer never flows back up. Once you set the direction like this, a hard question such as "can I feed this to the AI?" becomes a much easier one: which layer does this belong to?

Part 2The leaks you don't think about

The most obvious leak is pasting secret text straight into an AI chat. Most people already guard against that. The sneakier leaks are the automatic "processing" steps you never see.

Take one close to home: the memory system our own agent actually runs, Graphiti. Each time it records a chunk of work, a small model reads the raw contents of that chunk to extract the facts. That "read to extract" step is exactly where data leaves the machine (egress). In the stack we actually run, that small model does its reading in the cloud, so your raw content has already gone up at that moment, even though all you did was tell it to "remember." You never meant to send anything up. That is why we treat "which model does the reading, and where it runs" as an inner-layer question.

The last one is logs and transcripts, like the leaked key at the start. The data does not escape to a criminal. It just pools in a place you never classified as the outer layer, even though that is what it is. Anyone who can read that log can see everything in your inner layer.

Part 3Make it a real defense, not just an intention

Judgment that runs on instinct does not count as a defense. Three things turn it into a real one.

Separate data by layer at design time, not afterward. Building a system that holds real personal data taught us that separating data by purpose has to be structural from day one, because each kind of data has different rules. Some can be deleted the instant the owner withdraws consent. Some the law requires you to keep for 5 to 7 years. Pile them all on one table and the moment you have to delete, the whole pile breaks.
Set it to fail closed. If you cannot prove where a chunk of data will end up, assume it is going to flow up to the outer layer and block it. Do not wave it through because "no problem has shown up yet." It is the same rule as access control: if you cannot confirm who has permission, deny first. Safer than guessing and opening the door.
Filter data by what it is, not what it looks like. Go back to the leaked key. The root cause was filtering by the character pattern KEY= instead of by the meaning "this variable is a secret." Once you classify things by what they actually are, an odd name like _FALLBACK can no longer slip through the net.

The actual wiring that makes the three layers enforce themselves automatically is another part of the implementation we are holding back for now. But the three principles above you can use today, with no special tooling required.

Part 4One question that works on any tool

Before you wire a new AI tool into your work, ask it a short question: where does the text I type end up, and which layer is that? If it cannot answer, do not feed anything from your inner layer into it yet. This question works on everything, from a plain chat to a memory system humming quietly in the background. Try it on the tools you already use every day.

Sources

The three-layer frame and every case here come from our own real work on productize.life and our fleet (June 2026).
The leaked key, caught by filtering on the character pattern KEY= rather than the variable name, comes from a real incident while scrubbing a daemon's log (June 2026).
Separating data by purpose from day one comes from building a system under PDPA, where different kinds of data carry different retention and erasure rules.
The "read to extract" step references the memory stack we actually run (Graphiti). More on it in the agent memory stack we really run.

The data that should never flow up

Part 1A three-layer defense: data can flow down, but never up

Part 2The leaks you don't think about

Part 3Make it a real defense, not just an intention

Part 4One question that works on any tool

Get new posts and free resources first

Join the conversation