On the night of June 13 our 16GB Mac, the machine that ran everything, started acting strange. The screen froze in bursts and the cursor spun. The numbers told the whole story: RAM full, another 4.1GB in the memory compressor, 3.8GB spilled into swap (disk space the OS uses when RAM runs out). macOS's own window renderer hung long enough that the system had to step in.
That machine carried a full agent stack: daemons that must stay online, a memory service, a local embedding model for memory search, and an investment-analysis workload, plus every app we personally had open. Our first thought went straight to a purchase order: time to buy the 64GB machine, estimated at around $2,000.
But before ordering, one question had no answer yet: of the 8GB we believed the stack was eating, how much actually belongs to the agents? On a dev machine the numbers blur together: workload, tooling, and desktop all in one pot. This post is the homework we did to answer that question, using the cheapest method we could think of: rent a box and measure.
Part 1. Rent before you buy: the cheapest measurement is one month of rent
Our problem was never "get a stronger machine." It was "find out what the workload actually needs." Those are different problems with different price tags. The first one ends in a $2,000 purchase order. The second one starts for a few dollars.
The method: rent a VPS with the same specs as the current machine (8 vCPU / 16GB RAM), move only the always-on parts onto it, and leave everything else where it was. We picked Hetzner, the German provider whose price-per-spec makes you double-check the page. The cx43 plan (8 vCPU / 16GB / 160GB disk) was $13.99 per month, billed by the hour. If the measurement failed, we could delete the server and pay only a fraction of a month. The whole experiment cost less than a lunch.
One trade-off had to be accepted at checkout: Hetzner's datacenters are Europe and US only, nothing in Asia. Ping from Thailand is roughly 250-300ms. That sounds slow, but for outbound agent work, meaning bots that push messages out, wake on schedules, and answer webhooks, the number is imperceptible. It would matter for a user-facing server in Thailand, and that is simply not this machine's job.
Part 2. The rented box tells a different story than the dev machine
Once the same workload ran on a headless box with nothing else on it, the first reading answered the $2,000 question on the spot: resting RAM was 2.7GB, for a stack we believed ate 8GB.
We didn't trust an idle number, so we loaded it: 24 concurrent threads firing embedding requests at bge-m3 for 75 straight seconds, as hard as the real workload could ever push. RAM peaked at 3.4GB, or 21% of the machine. Available memory never dropped below 78%, and memory PSI (the Linux kernel's measure of how long tasks stall waiting for memory) stayed at 0.00 for the entire test. In plain words: not one second of work ever waited for memory.
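For readers who want to take the same two readings on a Linux box, here is a minimal sketch. It is illustrative only and not the sampler behind the numbers in this post; the function names are made up for this example. It parses the kernel's `/proc/meminfo` (for the available-memory share) and `/proc/pressure/memory` (for PSI), which assumes a kernel with PSI enabled.

```python
"""Read available-memory share and memory PSI on Linux (illustrative sketch)."""
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Return /proc/meminfo fields as {name: value}; sizes are in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_fraction(meminfo: dict[str, int]) -> float:
    """Share of total RAM the kernel estimates is available to new work."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


def parse_psi(text: str) -> dict[str, dict[str, float]]:
    """Return PSI lines as {"some": {"avg10": ...}, "full": {...}}."""
    out = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        kind, *pairs = line.split()
        out[kind] = {k: float(v) for k, v in (p.split("=") for p in pairs)}
    return out


def read_sample() -> dict[str, float]:
    """Take one reading from the live kernel files (Linux only)."""
    mem = parse_meminfo(Path("/proc/meminfo").read_text())
    psi = parse_psi(Path("/proc/pressure/memory").read_text())
    return {
        "available_fraction": available_fraction(mem),
        "psi_some_avg10": psi["some"]["avg10"],
        "psi_full_avg10": psi["full"]["avg10"],
    }
```

Calling `read_sample()` in a loop during a load test, once per second for example, gives a time series of both values. A `psi_some_avg10` that stays at 0.00 means no task spent measurable time stalled on memory over the preceding ten-second window.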
So where did the Mac's 8GB come from? It was never the agents' number to begin with. That figure was the agents plus Docker Desktop plus a browser plus the entire macOS desktop, which paints the screen, animates windows, and carries every app left open. Call it the desktop tax: memory the real workload never uses but the machine has to pay. Subtract that tax and the actual workload is about 3GB (2.7GB at rest, 3.4GB at peak).
The first lesson has nothing to do with Hetzner: never size a production machine from numbers read on a dev machine. Workload and environment are inseparable there. Isolate the workload, measure it alone, and only then talk about what to buy.
Part 3. The first thing to run out is CPU, not RAM
The same test threw in a second answer we hadn't asked for. While RAM sat three-quarters empty, the embedding model pinned all 8 cores at 100% and started dropping work. Of 106 request batches fired at it, 57 were shed.
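This post does not describe how the embedding service decides which batches to drop. One common way to shed load, shown here purely as an assumption, is an admission gate: a fixed number of in-flight slots, with any batch that finds no free slot rejected immediately instead of queued. The class and method names below are hypothetical.

```python
import threading


class AdmissionGate:
    """Admit up to `capacity` in-flight batches; shed the rest immediately."""

    def __init__(self, capacity: int) -> None:
        self._slots = threading.BoundedSemaphore(capacity)
        self._lock = threading.Lock()
        self.admitted = 0
        self.shed = 0

    def try_admit(self) -> bool:
        """Claim a slot without waiting; count the batch as shed on failure."""
        ok = self._slots.acquire(blocking=False)
        with self._lock:
            if ok:
                self.admitted += 1
            else:
                self.shed += 1
        return ok

    def release(self) -> None:
        """Give the slot back once an admitted batch has finished."""
        self._slots.release()

    def shed_fraction(self) -> float:
        """Share of all offered batches that were rejected."""
        total = self.admitted + self.shed
        return self.shed / total if total else 0.0


# Usage: two slots, three batches arriving at once -> the third is shed.
gate = AdmissionGate(capacity=2)
results = [gate.try_admit() for _ in range(3)]
print(results, gate.shed_fraction())
```

Shedding instead of queuing keeps latency bounded for the batches that do get in, at the price of visible rejections. Applied to the test above, 57 shed out of 106 offered corresponds to a shed fraction of about 0.54.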
The bottleneck of this machine is not memory, the thing everyone fears. The real limiter is compute. The reason is straightforward: an agent's heavy thinking already goes out to large models over an API. What remains on the box is plumbing: receive a message, search memory, compute an embedding, call the API, pass the result on. That work barely touches RAM, but when everything arrives at once, CPU is the first gate to close.
The practical takeaway: when budgeting a VPS for agents, spend on vCPU before RAM. For this class of workload 16GB is oversized, since even under full load we used less than a quarter of it, while 8 vCPU is already where bursts visibly hit the ceiling.
Part 4. Today's prices: Hetzner vs DigitalOcean vs Fly.io
This table compares machines of the same class (16GB RAM, shared CPU), checked live on the official pricing pages on Jul 4, 2026. The Hetzner price is what we actually rented at in June 2026.
| Provider | Spec | Per month | Best for |
|---|---|---|---|
| Hetzner cx43 | 8 vCPU / 16GB / 160GB | $13.99 | 24/7 agent watch duty where cheap and steady wins; EU/US datacenters only; what we use |
| DigitalOcean Basic | 8 vCPU / 16GB / 320GB | $96 | Work that needs a datacenter near users (Singapore available); twice the disk |
| Fly.io shared-cpu-8x | 8 vCPU / 16GB | $88.88 | Deploy-near-users apps across regions; billed for actual runtime |
| Buying a 64GB machine | desktop | ~$2,000 one-time | Only after you've measured your real ceiling, and only for work that must live on your own machine (next part) |
The two middle columns are the whole point. Same vCPU count and RAM, yet the price differs by a factor of six to seven. That gap is not free: Hetzner trades away Asian locations and hands-on support, while DigitalOcean and Fly sell location and convenience Hetzner doesn't have. So the right question is not which provider is best, but whether your workload needs to pay for location. If your agents only work outbound, the answer is clearly no.
The last row is in the table as a note to ourselves. A $2,000 machine is not a wrong choice. It should just be a decision made after measuring, not one made in the panic of a frozen screen.
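The arithmetic behind the table is easy to redo when prices change. The sketch below uses only the monthly prices and the rough $2,000 estimate quoted in this post; the function names are made up for illustration, and the comparison deliberately ignores electricity, resale value, and future price changes.

```python
# Monthly prices in USD, as listed in the table above.
MONTHLY_USD = {
    "Hetzner cx43": 13.99,
    "DigitalOcean Basic": 96.00,
    "Fly.io shared-cpu-8x": 88.88,
}
DESKTOP_USD = 2000.00  # the post's rough estimate for a 64GB machine


def price_ratio(provider: str, baseline: str = "Hetzner cx43") -> float:
    """How many times more expensive `provider` is than `baseline`."""
    return MONTHLY_USD[provider] / MONTHLY_USD[baseline]


def months_of_rent(budget_usd: float, provider: str) -> float:
    """How many months of rent a one-time budget would cover."""
    return budget_usd / MONTHLY_USD[provider]


for name in MONTHLY_USD:
    print(
        f"{name}: {price_ratio(name):.2f}x baseline, "
        f"{months_of_rent(DESKTOP_USD, name):.1f} months for ${DESKTOP_USD:,.0f}"
    )
```

With these inputs the ratios come out at roughly 6.9x for DigitalOcean and 6.4x for Fly.io, and $2,000 covers about 143 months of the cx43 at its listed monthly price.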
Part 5. Which work stays home, which work moves to the VPS
At this point you might wonder why not move everything to the VPS and be done. The answer we ran into along the way: the dividing line is not machine specs, it's how your thinking tools log in.
AI tools that do heavy thinking come in two flavors by authentication. The first is monthly subscription services bound to a login on a machine you own and sit at. They work everywhere you are, but they're not designed to run unattended on a rented box somewhere. The second is key-based APIs, pay per use, callable from wherever the key lives. The two kinds of work sort themselves into separate homes naturally.
- Stays on your own machine: heavy thinking on subscription services, anything you personally supervise anyway, and everything wired to physical devices. Those can't move even if you want them to.
- Moves to the VPS: always-on daemons, the memory service, the embedding model, scheduled jobs, webhook responders, and every workload that calls models by API key.
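The two bullets above amount to a small decision rule, which can be written down as code. This is only a sketch of that rule: the `Workload` fields and the `placement` function are hypothetical names invented for this example, not part of any tool we run.

```python
from dataclasses import dataclass
from enum import Enum


class Auth(Enum):
    SUBSCRIPTION_LOGIN = "subscription_login"  # bound to a login on a machine you sit at
    API_KEY = "api_key"                        # pay per use, callable wherever the key lives
    NONE = "none"                              # no external thinking tool involved


@dataclass(frozen=True)
class Workload:
    name: str
    auth: Auth
    needs_physical_device: bool = False
    human_supervised: bool = False


def placement(w: Workload) -> str:
    """Return "home" or "vps" following the split described above."""
    if w.needs_physical_device:
        return "home"  # wired to hardware: cannot move even if you want it to
    if w.auth is Auth.SUBSCRIPTION_LOGIN:
        return "home"  # login-bound tools are not meant to run unattended
    if w.human_supervised:
        return "home"  # you are sitting at that machine anyway
    return "vps"       # always-on, key-based, or tool-free work


jobs = [
    Workload("embedding model", Auth.NONE),
    Workload("webhook responder", Auth.API_KEY),
    Workload("deep research session", Auth.SUBSCRIPTION_LOGIN, human_supervised=True),
    Workload("device bridge", Auth.API_KEY, needs_physical_device=True),
]
for job in jobs:
    print(f"{job.name}: {placement(job)}")
```

The order of the checks matters: a job wired to a physical device stays home even when it calls models by API key.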
Split along that line, the original problem untangles itself. The Mac doesn't need a 64GB upgrade because the watch-duty work that crowded it has moved out. And the small VPS does the one thing it does best: being the house where the lights never go out, for the work that must never sleep. The two machines talk over an encrypted channel, which we've written up separately (linked at the end).
Part 6. Three weeks later, the real invoice arrived
As of writing, that server has been up for 20 days without a restart. A live check just before writing showed 2.6GB of RAM in use, steadier than on measurement day. And the first Hetzner invoice has landed: $9.78 for the partial month of June, billed hourly. About ten dollars for three weeks of truth that nearly cost us two thousand.
There was one bonus we never planned for. The datacenter is in Germany, so all of the agents' memory data sits under Europe's data-protection rules, stricter than what most providers give you by default. If your agents' memory contains anything personal, that's a feature, not a footnote.
So how do you choose? Answering purely from what we measured:
- 24/7 agents doing plumbing work → the cheapest VPS with enough vCPU; 16GB RAM is more than plenty. If EU/US-only locations are acceptable, Hetzner's price is one nobody else plays at.
- Serving users in Thailand or Asia with low latency → pay the location premium for a provider with Singapore. Hetzner's cheapness can't help you if ping hurts your users.
- Running language models yourself → that's a GPU problem, a different problem from a watch-duty box. We've written that up separately (below).
- Thinking tools on a login-bound subscription → that work stays on your own machine. Don't force the move.
Before paying $2,000 for a new machine, try paying $14 for a first month of measurement. The numbers you get may rewrite the entire purchase order. Ours did.
- RAM / PSI / shed-request numbers come from logs collected by our own sampler (Jun 13-14, 2026, on the actual rented box). The 20-day / 2.6GB figures were checked live on Jul 4, 2026 before writing. The Mac-side numbers (4.1GB compressor / 3.8GB swap) were read off the machine on the day it happened, Jun 13, 2026.
- Hetzner Cloud pricing. The cx43 at $13.99/month is our provisioning price (June 2026), with the real first invoice of $9.78 dated Jul 2, 2026. Check current prices on that page.
- DigitalOcean droplet pricing, checked Jul 4, 2026: Basic 8 vCPU / 16GB / 320GB = $96/month.
- Fly.io pricing, checked Jul 4, 2026: shared-cpu-8x 16GB = $88.88/month (Amsterdam region; prices vary by region).
- PSI (pressure stall information), the Linux kernel doc for the metric we used to read memory stalls.