Why we run our servers from RAM (and what that bought us)
Diskless boot, tmpfs everything, every reboot wipes the box. The structural reason we can tell auditors there is nothing to log to — and the operational tax we pay for that property.
Most VPN providers claim a "no-logs policy." A policy is a promise. We did not want to live by promises, because promises are exactly what subpoenas are designed to puncture. So a few years ago we made a structural decision: every exit node in our fleet runs entirely from RAM. No persistent disk. No swap. No journald spool. The drive bays in our chassis are empty.
This post is the engineering side of that choice. It is also an honest accounting of what it cost us. The short version: it bought us a property our auditors can verify in fifteen minutes, and it cost us roughly three to four times the BIOS-quirk debugging budget of a normal fleet.
The boot sequence, in plain English
When a node powers on in, say, our Frankfurt rack at Interxion, here is what happens. The NIC firmware does a PXE request. A management box on a separate VLAN serves a signed iPXE script. That script pulls a versioned, hash-pinned OS image — kernel, initramfs, and a squashfs root — over HTTPS from our image origin. The image lands in RAM. The kernel boots. The root filesystem is mounted as tmpfs. Configuration is fetched from our control plane, decrypted with a key the node derives from a TPM-attested boot, and written to /etc — also in RAM. WireGuard comes up. The node joins the pool.
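The hash-pinning step in that sequence can be sketched in a few lines of Python. This is an illustration under assumptions, not the production loader: `verify_pinned_image` and the sample bytes are hypothetical, and a real pin would arrive inside the signed iPXE script rather than being computed on the spot.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned_image(image: bytes, pinned_sha256: str) -> bool:
    """True only if the image hashes to the pinned digest.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(sha256_hex(image), pinned_sha256.lower())


# Hypothetical stand-in for a downloaded image bundle.
image = b"kernel+initramfs+squashfs"
# Assumption: computed here only so the example is self-contained.
pin = sha256_hex(image)

print(verify_pinned_image(image, pin))                # True
print(verify_pinned_image(image + b"tampered", pin))  # False
```

A node that fails this check should refuse to boot the image rather than fall back to anything else; there is no older copy on disk to fall back to.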
There is no install step. There is no /var/log that survives. If you pull the power cord and put it back in, you do not get the same machine back — you get a fresh machine, identical to every other node booted from the same image, with no historical state.
What it bought us
Three things, and they are the three things we cared about most.
- Audit clarity. Our auditors at Cure53 and the team that did our last infrastructure review do not have to take our word for "we delete logs nightly." They look at the chassis, see no drives, look at the boot config, see PXE-only with tmpfs root. The conversation about logging takes about ten minutes instead of three days.
- Seizure resilience. If a datacenter operator is compelled to physically pull a server, what they get is a powered-off box whose memory has lost its contents and no drive to image. We have not had this happen. We have prepared for it.
- Configuration drift, eliminated. Every node in a region runs the same image, byte for byte. There is no "well, this one was patched manually in October." Reboot the node, you get the current image. Drift is impossible by construction.
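The software half of the audit check above can be approximated with a short script. This is a sketch under assumptions: it takes the text of `/proc/mounts` and `/proc/swaps` as input, the device-name prefixes are a hypothetical and incomplete list, and it says nothing about whether drives are physically present in the chassis.

```python
# Assumption: common Linux block-device name prefixes; not exhaustive.
BLOCK_PREFIXES = ("/dev/sd", "/dev/nvme", "/dev/vd", "/dev/mmcblk")


def audit_findings(proc_mounts: str, proc_swaps: str) -> list[str]:
    """Return a list of problems; an empty list means the node looks diskless."""
    findings = []
    root_fs = None
    for line in proc_mounts.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fs_type = fields[:3]
        if mount_point == "/":
            root_fs = fs_type  # the last mount on "/" is the visible one
        if device.startswith(BLOCK_PREFIXES):
            findings.append(f"block device mounted: {device} on {mount_point}")
    if root_fs != "tmpfs":
        findings.append(f"root filesystem is {root_fs}, expected tmpfs")
    # /proc/swaps has one header line; anything after it is an active swap area.
    swap_lines = [s for s in proc_swaps.splitlines()[1:] if s.strip()]
    if swap_lines:
        findings.append(f"{len(swap_lines)} active swap area(s)")
    return findings


SWAP_HEADER = "Filename\tType\tSize\tUsed\tPriority\n"
clean_mounts = "tmpfs / tmpfs rw,relatime 0 0\nproc /proc proc rw 0 0\n"
dirty_mounts = "tmpfs / tmpfs rw 0 0\n/dev/sda1 /var/log ext4 rw 0 0\n"
dirty_swaps = SWAP_HEADER + "/dev/sda2 partition 1048572 0 -2\n"

print(audit_findings(clean_mounts, SWAP_HEADER))  # []
print(audit_findings(dirty_mounts, dirty_swaps))
```

On a live node the two inputs would come from reading the files directly, for example `open("/proc/mounts").read()`.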
What it cost us
We do not want to pretend this was free. It was not.
BIOS-quirk debugging, multiplied
Every reboot is a clean install. That sounds great until you realize it means every reboot exercises the full PXE → iPXE → image-fetch → tmpfs-mount → service-start path. On a normal server, you patch a kernel, you reboot, and ninety-nine times out of a hundred it comes back. On our boxes, ninety-nine times out of a hundred it comes back, and the hundredth time you discover that this particular Supermicro board, with this particular firmware version, does not honor the PXE timeout we configured. The node sits at a black screen. We have a runbook.
We estimate this multiplies our hardware-debugging time by three to four. Worth it. Annoying.
Hardware sourcing is harder
Not every datacenter rents servers that PXE cleanly. Some vendors ship boxes with locked-down BMC firmware that fights us on netboot order. Some regions — we ran into this in Johannesburg and parts of Latin America — only have providers offering generation-old hardware where the NIC firmware is flaky enough that one reboot in fifty hangs. We have walked away from otherwise great regions because the only available hardware would have given us a 2% boot-failure rate, which is unacceptable for a fleet that reboots weekly.
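To see why a one-in-fifty hang rate is disqualifying at a weekly reboot cadence, here is the arithmetic as a small Python sketch. It assumes each reboot fails independently with the same probability; the 100-node fleet size is a hypothetical figure chosen for illustration.

```python
def p_at_least_one_hang(p_hang: float, reboots: int) -> float:
    """Probability that a single node hangs at least once over `reboots` boots."""
    return 1.0 - (1.0 - p_hang) ** reboots


def expected_hangs(p_hang: float, reboots: int, nodes: int) -> float:
    """Expected number of hung boots across a fleet."""
    return p_hang * reboots * nodes


P_HANG = 1 / 50          # one reboot in fifty hangs (2%)
REBOOTS_PER_YEAR = 52    # weekly reboots
FLEET_SIZE = 100         # assumption: illustrative fleet size

print(round(p_at_least_one_hang(P_HANG, REBOOTS_PER_YEAR), 2))         # 0.65
print(round(expected_hangs(P_HANG, REBOOTS_PER_YEAR, FLEET_SIZE), 1))  # 104.0
```

Under those assumptions, roughly two nodes in three would hang at least once a year, and a 100-node fleet would see about two hung boots per week.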
Secret distribution at boot is a real engineering problem
A diskless box has no place to hide a long-lived secret. So how does it authenticate to our control plane on boot? We use TPM-based remote attestation: the node proves to our control plane that it is running our exact image, on hardware we have previously enrolled, and only then does the control plane release the WireGuard private key and TLS material. This works. It also took us about four months of careful engineering to get right, and we still occasionally hit edge cases when a TPM gets into a weird state after a firmware update.
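The release decision can be sketched as follows. This is a simplified model, not the real attestation flow: it shows the PCR extend rule (new value = SHA-256 of the old value concatenated with the measurement) and the comparison against a known-good value, and it assumes the TPM quote's signature has already been verified elsewhere. All function names, hardware IDs, and component blobs are hypothetical.

```python
import hashlib
import hmac


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM-style extend: new PCR = SHA-256(old PCR || measurement digest)."""
    return hashlib.sha256(pcr + measurement).digest()


def expected_pcr(components: list[bytes]) -> bytes:
    """Replay the measured boot chain, starting from an all-zero PCR."""
    pcr = bytes(32)
    for blob in components:
        pcr = extend(pcr, hashlib.sha256(blob).digest())
    return pcr


def may_release_secrets(hardware_id: str, reported_pcr: bytes, nonce: bytes, *,
                        enrolled: set[str], golden_pcr: bytes,
                        issued_nonce: bytes) -> bool:
    """Release only for enrolled hardware, a fresh nonce, and the exact image.

    Assumption: the quote signature was checked before this function runs.
    """
    return (hardware_id in enrolled
            and hmac.compare_digest(nonce, issued_nonce)
            and hmac.compare_digest(reported_pcr, golden_pcr))


golden = expected_pcr([b"kernel", b"initramfs", b"rootfs"])
tampered = expected_pcr([b"kernel", b"initramfs-modified", b"rootfs"])
policy = dict(enrolled={"node-a"}, golden_pcr=golden, issued_nonce=b"n1")

print(may_release_secrets("node-a", golden, b"n1", **policy))    # True
print(may_release_secrets("node-a", tampered, b"n1", **policy))  # False
print(may_release_secrets("node-x", golden, b"n1", **policy))    # False
```

Because each extend folds the previous value into the next hash, changing any component in the chain changes the final PCR, which is what lets the control plane tie the secrets to one exact image.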
Regional rollouts are slower
When we want to launch a new city, we cannot just rent any box and ssh in. We have to verify the hardware supports our boot path, often by shipping a test unit and burning it in for a week. This adds two to six weeks to a regional launch versus the "rent a VPS and run an Ansible playbook" approach a normal provider uses.
The hypervisor question
We do not run a hypervisor on exit nodes. They are bare metal. The reason is simple: a hypervisor introduces a layer that has its own state, its own logs, and its own attack surface. We considered KVM with a similarly diskless host, but the operational complexity of "a tmpfs hypervisor running tmpfs guests" was not worth it for the workload, which is overwhelmingly network-bound rather than compute-bound.
On the management side (image build, control plane, billing, the boring stuff that needs persistent state) we run a small K3s cluster on dedicated hardware in a single management region. That cluster has disks. It is not in the request path for any user traffic. It is also not what law enforcement asks about, because it does not handle user sessions. The exit nodes do, and the exit nodes have nothing.
The maintenance window pattern
Reboots are interesting in this architecture because every reboot is a full reinstall. We needed a way to roll an image to, say, all nine boxes in the New York pop without dropping connections.
The pattern is straightforward. We mark a node as draining in the control plane. New connections route to its peers. Existing connections are given a soft deadline, typically thirty minutes, after which they are gently terminated at the next WireGuard handshake renegotiation (the tunnel's equivalent of a 503; WireGuard itself has no status codes), which most clients reconnect through transparently. Once the node is empty, we trigger a power cycle. It boots into the new image in roughly four minutes. We watch it serve health checks for ten more, then return it to the pool. The next node drains.
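The per-node lifecycle reads naturally as a small state machine. The sketch below is one way to model it, with hypothetical state names and a hypothetical `next_state` function; the thirty, four, and ten minute figures are the ones quoted above, and the boot step is modeled as a fixed timer rather than an observed event.

```python
from enum import Enum


class State(Enum):
    IN_POOL = "in-pool"
    DRAINING = "draining"
    REBOOTING = "rebooting"
    HEALTH_CHECK = "health-check"


def next_state(state: State, *, active_connections: int, minutes_in_state: float,
               drain_deadline_min: float = 30, boot_min: float = 4,
               health_min: float = 10) -> State:
    """Advance one node through drain -> reboot -> health check -> pool."""
    if state is State.DRAINING:
        # Leave early if the node is already empty, otherwise at the deadline.
        if active_connections == 0 or minutes_in_state >= drain_deadline_min:
            return State.REBOOTING
    elif state is State.REBOOTING:
        if minutes_in_state >= boot_min:
            return State.HEALTH_CHECK
    elif state is State.HEALTH_CHECK:
        if minutes_in_state >= health_min:
            return State.IN_POOL
    return state


print(next_state(State.DRAINING, active_connections=12, minutes_in_state=5))
print(next_state(State.DRAINING, active_connections=0, minutes_in_state=5))
```

The first call stays in `State.DRAINING` because connections remain and the deadline has not passed; the second moves to `State.REBOOTING` as soon as the node is empty.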
A nine-box city updates in about two and a half hours, fully transparent to all but the most session-sensitive users. Single-box cities (we still have three of these) get an upfront announcement and a five-minute window. We do not love this and are working to retire single-box pops by end of 2026.
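A rough check of that figure, assuming nodes are rolled strictly one at a time and using the four-minute boot and ten-minute health check quoted earlier. The function names are hypothetical, and the implied drain time is an inference from the published numbers, not a measurement.

```python
def rollout_minutes(nodes: int, drain_min: float,
                    boot_min: float = 4, health_min: float = 10) -> float:
    """Total time for a strictly sequential rollout."""
    return nodes * (drain_min + boot_min + health_min)


def implied_drain_minutes(total_min: float, nodes: int,
                          boot_min: float = 4, health_min: float = 10) -> float:
    """Average drain time per node that a given total rollout time implies."""
    return total_min / nodes - boot_min - health_min


# Worst case: every node uses the full thirty-minute soft deadline.
print(rollout_minutes(9, drain_min=30) / 60)        # 6.6 hours

# What a 2.5-hour (150-minute) rollout implies about the average drain.
print(round(implied_drain_minutes(150, 9), 1))      # 2.7 minutes
```

Under these assumptions the two-and-a-half-hour figure corresponds to nodes emptying in under three minutes on average, well ahead of the thirty-minute deadline; if every node needed the full deadline, the same rollout would take closer to six and a half hours.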
Would we do it again?
Yes. The audit conversation alone justifies the operational tax, and the audit conversation is the conversation that matters when a journalist or a researcher asks "how do you know they are not logging?" The answer "they cannot, because there is nothing to log to, and you can verify that yourself" is qualitatively different from "they say they delete logs."
Structural beats policy. It always has. It costs more, in real engineering hours and real dollars. We think the trade is worth it. If our reasoning ever changes, it will be on this blog before it is anywhere else.