Where the flower grows
Air gapped intelligence, local LLMs, and the case for private AI
Standing in the parking lot outside the rave, Yatú was agitated. The dystopian, bass-heavy soundscape emanating from the cavernous room inside sounded like two mechatronic warriors grinding, fucking, and fighting in painfully slow motion.
“This is what people think the future will sound like,” I said.
“People don’t realize the future doesn’t have to be like this,” he responded.
He took my phone out of my hand to record a video of his own phone, sliding face first down a steep concrete ramp to reveal the sticker on the back of his device — SCROLLING KILLS.
A note to the reader: “Air gapped” is a term of art used among privacy-conscious technologists to describe a device that has never been, and never will be, connected to the internet or any other external network.
Yatú and Norm work under many monikers, one of which is USB Club. Over the past month at Sanctuary Computer, we’ve been working with them to research the practical implications and possibilities of local LLMs.
In our recent post, Off Brain, on the proliferation of AI, I write:
I won’t pretend humanity’s transition through this period will be fair and just for all people. I fully expect the megacorps hoping to monetize their hundred billion dollar investments will install pervasive surveillance harnesses, biased points of view and product placements deeper into our psyche.
But we’ve learned from social media companies and their dopamine jacking strategies. We’ll need to stay hyper vigilant by developing privacy first, air gapped, and end-to-end encrypted systems running against local, offline and open source toolchains.
Around the time I was writing that article, Alibaba released their Qwen 1.5 0.5B Chat model for anyone to use.
This is one of the first open source models that can feasibly run on a small chipset, so I spun it up on my Raspberry Pi 5 (16GB), and after a couple hours of tweaking, it was explaining quantum physics with an average time to first token of ~0.67 seconds.
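Time to first token is simply the delay between submitting a prompt and receiving the first streamed token. A minimal Python sketch of that measurement, assuming the model runtime exposes its output as an iterator of tokens; `fake_stream` is a hypothetical stand-in for a real runtime, not the setup used on the Pi:

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_token(stream: Iterable[str]) -> Tuple[float, Iterator[str]]:
    """Return (seconds until the first token arrived, iterator over all tokens)."""
    start = time.perf_counter()
    iterator = iter(stream)
    first = next(iterator)  # blocks until the runtime yields its first token
    elapsed = time.perf_counter() - start

    def replay() -> Iterator[str]:
        # Hand back the first token, then the rest of the stream.
        yield first
        yield from iterator

    return elapsed, replay()


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a local model: waits, then emits tokens."""
    time.sleep(delay_s)  # simulates prompt processing before the first token
    for token in ["Quantum", " physics", " is", "..."]:
        yield token


ttft, tokens = time_to_first_token(fake_stream())
print(f"time to first token: {ttft:.3f}s")
print("".join(tokens))
```

Averaging this value over many prompts gives a figure comparable to the one quoted above.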
The device, roughly the size of a pack of cards, was casually holding a compressed record of all human thought, in a squishy stochastic database, talking back to me in plain English, completely offline.
Off Brain, Off Grid
In just a few cycles, our handheld devices will house small, local LLMs: a first line of defense for our barrage of messy human queries, responding to anything they deem simple enough to save the network a request, and routing more complex or topical queries through to the bigger, more expensive cloud-based models.
But this architecture will do little to improve our privacy under the gaze of Big Tech. It is simply a move to save money by pushing compute to the edge; our local queries and their responses will continue to be tracked and backed up like any other behavioral analytic.
To access the benefits of this magic database of all documented human knowledge without leaking our beautiful dark twisted fantasies to multiple private megacorps in the process, it’s on us as sovereign citizens (just like anyone living happily off-grid in their solar-powered passive house) to invest some time and effort in understanding and assembling our own private toolchains.
Local LLMs offer:
Control / Sovereignty: You have total control & ownership of any data the system stores as you use it
Peace of Mind: You know where your data is stored, and under what conditions it’s accessible
Auditability: Open source toolchains & hardware components allow you to clearly understand the system and its security characteristics
Observability: Hackable and ownable logic chains store, process and activate your memories in ways that best suit you
As the din of corporate tech reaches a deafening — and also largely boring — pitch, we’ve been standing in the parking lot, discussing an alternative. This article is a collage of our ideas, research, and notes as we explore the burgeoning arena of local LLMs, open, air-gapped hardware, and objects of private intelligence.
Throughout the piece, we outline how these components can form practical architectures: a home-server LLM, a hub-and-spoke network accessible over tools like Headscale, a decision-layer that selectively brokers internet access, a BYOM API endpoint compatible with existing Chat/Completions clients, and a portable memory vault built on vector embeddings. Together, they point toward a personal AI system that departs from industry paradigms in philosophy, mechanics, and user behavior: Private by default, locally anchored, and interoperable by design.
Character Traits
Unlike the cloud surveillance LLMs of 2025/2026, a sovereign private intelligence device is defined by its trustworthy, observable and timeless nature.
Trustworthy means it does what it says (and nothing else). No secret data collection, no hidden behavioral analysis, no corporate interests pulling strings behind the scenes. You know what the model is trying to do and why.
Observable means you can look under the hood. Open source code, readable logs, auditable processes. You’re not blind to what’s happening with your data. You can trace decisions, verify behavior, tweak the code.
Timeless means it’s committed to the long term. No subscription that might vanish, no startup that could pivot or sell out. Conversations and memories live on owned hardware, in formats you control.
Simply, this object supports you. You’re not being mined for data or manipulated for engagement. Trust comes from behavior, not promises — and a device that’s transparent, durable, and truly yours, earns it.
Here are the qualities that set this type of device apart from cloud-based LLMs today:
Support & Scaffolding
A device built on these principles is not designed to create dependency or lock in. Instead, it acts as scaffolding: temporary support that helps you build your own capacity, then steps back.
Think of training wheels on a bicycle. They give you confidence while you learn balance and momentum. Once you’ve internalized those skills, the wheels come off. When you don’t need them anymore, the bicycle doesn’t become less useful. It just becomes yours.
A private intelligence device should work the same way. It might help organize thoughts, surface patterns, build mental models. The goal isn’t to make you dependent on it; it’s to help you become more capable on your own. This is not engagement-driven design, where the goal is to keep you scrolling, clicking, coming back. Rather, this design measures success by its own obsolescence. The device has done its job when you’ve internalized what it taught you.
In practice, this might mean the system gradually offers fewer suggestions, asks fewer clarifying questions, or quietly archives itself as your own thinking becomes clearer. It’s not trying to be sticky — it’s trying to be useful, then invisible.
Transparency & Modularity
For a private intelligence device, transparency is a trust signal.
Clear casings, exposed circuits and floating components make a device legible. When you can see under the hood and trace the connections, the device is better understood. You can check there are no hidden antennae or mystery chips. You can read the code to ensure it’s not making backdoor network calls. You can verify the stackup matches what you expect. This is technology that is knowable, fixable, hackable. It’s not hiding anything.
This approach has roots in hacker culture and DIY electronics, where visible internals meant the device was yours to tinker with. Think of the iMac G3’s translucent shells or Framework laptop’s modular design. A device that shows you its internals is one that treats you as an owner, not just a user.
Play & Safety
A device that invites touch and rewards curiosity feels fundamentally different than one that communicates careful, sterile interaction. When a private intelligence device uses color as a language (soft greens for safe states, warm ambers for processing, gentle pulses for listening) it communicates status without requiring you to parse text or decipher icons. Tactility matters because touch is how we think. A satisfying click of a hardware switch, the reassuring weight of a metal enclosure, the subtle texture of a matte surface. These are sensory feedback loops that ground us in the physical world while we engage with abstract intelligence.
Trust in Architecture
As you use the device, it will store responses and conversations as fodder for future context windows. As such, you’ll want assurance that your data can’t be read if the device falls into the wrong hands. Here, we look to the same kinds of hardware that underpin the privacy features of a smartphone or crypto wallet.
Encryption at Rest
Encryption at rest ensures that if your device is stolen, the data remains locked. LUKS (Linux Unified Key Setup) is the standard for full-disk encryption on Linux, turning a disk partition into a vault.
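# Initialize the partition as a LUKS container (destroys any existing data on it)
sudo cryptsetup luksFormat /dev/mmcblk0p3
# Unlock it and map it to /dev/mapper/airgapt_vault
sudo cryptsetup luksOpen /dev/mmcblk0p3 airgapt_vault
# Create a filesystem inside the unlocked container, then mount it
sudo mkfs.ext4 /dev/mapper/airgapt_vault
sudo mkdir -p /mnt/airgapt_vault
sudo mount /dev/mapper/airgapt_vault /mnt/airgapt_vault
A speculative private intelligence device could leverage LUKS to encrypt its memory database and conversation logs, ensuring your digital twin remains inaccessible without your passphrase, even if the hardware falls into the wrong hands.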
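Before the application writes memories to disk, it may be worth confirming that the vault path really sits on the unlocked LUKS mapping rather than on an unencrypted directory of the same name. A minimal Python sketch, assuming the mapper name and mount point from the commands above; `is_on_encrypted_mapper` is a hypothetical helper that only inspects the text of /proc/mounts:

```python
def is_on_encrypted_mapper(mounts_text: str, mount_point: str, mapper_name: str) -> bool:
    """True if mount_point is currently mounted from /dev/mapper/<mapper_name>.

    mounts_text is the content of /proc/mounts: one mount per line,
    whitespace-separated fields starting with "<device> <mount point>".
    """
    expected_device = f"/dev/mapper/{mapper_name}"
    result = False  # stays False if the path is not mounted at all
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so the last match wins.
            result = fields[0] == expected_device
    return result


# Example with a captured snippet of /proc/mounts (on the device itself,
# read the real file with open("/proc/mounts").read() instead).
sample = (
    "/dev/mmcblk0p2 / ext4 rw,relatime 0 0\n"
    "/dev/mapper/airgapt_vault /mnt/airgapt_vault ext4 rw,relatime 0 0\n"
)
print(is_on_encrypted_mapper(sample, "/mnt/airgapt_vault", "airgapt_vault"))
```

A startup script could refuse to launch the model when this check fails, so that conversations never land on unencrypted storage by accident.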
TPM & Secure Elements (SE)
A Trusted Platform Module (TPM) or Secure Element (SE) is a small chip added to the board that can act as a hardware-backed keyring, binding LUKS encryption keys to a specific device. By storing the LUKS master key (or a key-encrypting-key) inside the TPM/SE, the encrypted filesystem can only be unlocked on the exact hardware that created it—even if the SD card is removed and inserted into another system.

In practice, this works by deriving or sealing the LUKS key using the TPM’s Platform Configuration Registers (PCRs), which measure the boot state and hardware identity. During boot, if the PCR values match the expected state, the TPM releases the key to unlock the LUKS partition:
# Enroll the TPM as an additional unlock method, bound to PCR 0 (firmware) and PCR 7 (Secure Boot state)
sudo systemd-cryptenroll /dev/mmcblk0p3 --tpm2-device=auto --tpm2-pcrs=0+7
sudo cryptsetup luksOpen /dev/mmcblk0p3 airgapt_vault
This binds the encrypted vault to both the device and its boot integrity, so physical theft of the storage media alone won’t grant access: an attacker would also need the original hardware in its expected boot state, or the end user’s passphrase.
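The reason a tampered boot chain fails to unlock the vault is that a PCR can only be extended, never set: each new value is a hash of the previous value concatenated with the new measurement. A minimal Python sketch of that extend rule for a SHA-256 PCR bank; the component names are made up for illustration, and a real TPM performs this in hardware:

```python
import hashlib
from typing import Sequence


def pcr_extend(pcr: bytes, measured_data: bytes) -> bytes:
    """TPM-style extend: new_pcr = SHA256(old_pcr || SHA256(measured_data))."""
    digest = hashlib.sha256(measured_data).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_boot(components: Sequence[bytes]) -> bytes:
    """Fold a sequence of boot components into a single PCR value."""
    pcr = bytes(32)  # a SHA-256 PCR typically starts as 32 zero bytes at reset
    for component in components:
        pcr = pcr_extend(pcr, component)
    return pcr


# Hypothetical boot chains: the last one has a modified bootloader.
enrolled = measure_boot([b"firmware v1", b"bootloader v1", b"kernel v1"])
same_boot = measure_boot([b"firmware v1", b"bootloader v1", b"kernel v1"])
tampered = measure_boot([b"firmware v1", b"bootloader EVIL", b"kernel v1"])

# The TPM only releases the sealed key when the current PCR equals the enrolled value.
print("unmodified boot matches:", same_boot == enrolled)
print("tampered boot matches:  ", tampered == enrolled)
```

Because every value depends on all earlier measurements, changing any one component changes the final PCR, and the sealed key stays locked.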
Hardware Kill Switches
A hardware microphone switch offers a physical, verifiable layer of control, allowing the user to mechanically sever the audio input circuit, ensuring the device cannot listen unless the switch is explicitly toggled on. Unlike software toggles, which can be overridden by malware or firmware exploits, a hardware switch operates at the electrical level, providing a tangible sense of sovereignty over when and how the device can perceive sound.
This can be implemented using an SPST (Single Pole Single Throw) switch that interrupts the power supply to the microphone module, or by physically disconnecting the audio signal line before it reaches the analog-to-digital converter (ADC). When the switch is in the “off” position, no audio signal can reach the processing pipeline.

Similar approaches are found in devices like the Purism Librem laptops or the PinePhone, where cameras and microphones can be physically disabled via hardware switches on the device.
Faraday Cage
A Faraday cage is a conductive enclosure that blocks electromagnetic fields, creating a signal-proof barrier around the device. By housing the hardware in a metal mesh or conductive fabric, all wireless signals (Wi-Fi, Bluetooth, cellular) are prevented from entering or leaving.
This could be as simple as a grounded metal case, or a flexible fabric pouch that opens when you want to connect and seals when you want isolation. When closed, the device becomes entirely unreachable: no remote access, no exfiltration, no background connections.

It’s a hardware-enforced air gap. Even if the software were compromised or a radio inadvertently enabled, the physical barrier ensures no signal escapes. Your private intelligence device is provably offline, invisible to the electromagnetic spectrum, a tangible guarantee that private thoughts remain private.
Hub & Spoke
While purist implementations will, at all layers, physically prevent a single packet from leaving the system, a private LLM doesn’t necessarily need to be entirely isolated to broadly achieve its privacy goals.
Rather, systems like Plex and Headscale demonstrate open, blended architectures and tools that allow users to store private data on premises, while allowing secure network tunnels for clients roaming in the outside world to safely burrow in.
An on-premises LLM that is privately accessible via a VPN gives the user both a heightened sense of privacy (knowing their conversational back and forth is not observed or leaked externally) and remote access on the go from a client app, opening the use case to the LLM as a home appliance, via a Hub & Spoke model.
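One way to keep the hub private is to have the LLM server refuse any client whose address falls outside the VPN’s address range. A minimal Python sketch, assuming the network hands out addresses from 100.64.0.0/10 (the carrier-grade NAT range that Tailscale-style networks use by default; a Headscale deployment may be configured with a different prefix) and that clients on the hub itself are also allowed:

```python
import ipaddress

# Assumed ranges: adjust to match the prefixes configured for your own network.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("100.64.0.0/10"),  # assumed VPN (CGNAT) range
    ipaddress.ip_network("127.0.0.0/8"),    # loopback, for clients on the hub itself
]


def is_allowed_client(client_ip: str) -> bool:
    """True if client_ip falls inside one of the allowed networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a parseable IP address: reject
    return any(address in network for network in ALLOWED_NETWORKS)


for ip in ["100.101.102.103", "127.0.0.1", "192.168.1.20", "8.8.8.8"]:
    print(ip, "->", "accept" if is_allowed_client(ip) else "reject")
```

This is a second line of defense rather than a replacement for the tunnel: the server should still listen only on interfaces that the VPN controls.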
As BYOM (Bring Your Own Model)
Today, there’s no IETF-style “standard” for an API interface into an LLM. In its absence, OpenAI’s Chat Completions API has become a de facto wire format for LLM requests and responses.
Implementing the OpenAI Chat Completions wire format allows for pretty much any client library that is designed for use with OpenAI’s API to be compatible with the system.
Coding assistants like Cursor, Windsurf and Aider, chat UIs like LibreChat, Chatpad and Open WebUI, and workflow platforms like LangChain and Flowise all maintain compatibility with OpenAI’s wire format.
Bind a simple endpoint to a port on localhost, open a tunnel via Headscale, and your local LLM is suddenly a drop-in backend for a wide variety of existing systems, further keeping your work and life private.
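On the server side, the endpoint mostly needs to accept the request shape and return the response shape that OpenAI-compatible clients expect. A minimal Python sketch using only the standard library; `generate_reply` is a hypothetical placeholder for the call into a local model, the port and path mirror the example request, and streaming, authentication, token usage counts and error handling are all omitted:

```python
import json
import time
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_reply(messages: list) -> str:
    """Hypothetical placeholder: a real system would run the local model here."""
    last_user = next(
        (m["content"] for m in reversed(messages) if m.get("role") == "user"), ""
    )
    return f"(local model reply to: {last_user})"


def chat_completion(request_body: dict) -> dict:
    """Build a non-streaming response in the Chat Completions shape."""
    reply = generate_reply(request_body.get("messages", []))
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": request_body.get("model", "AirGaPT"),
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": reply},
                "finish_reason": "stop",
            }
        ],
    }


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/openai/v1/chat/completions":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        payload = json.dumps(chat_completion(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 5555) -> None:
    """Bind to localhost on the port used in the example request."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` starts the listener. Swapping `generate_reply` for a call into a local inference runtime is the only model-specific piece.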
POST localhost:5555/openai/v1/chat/completions
Content-Type: application/json
Authorization: Bearer <token>

{
  "model": "AirGaPT",
  "messages": [{ "role": "user", "content": "Hello" }]
}

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.AIRGAPT_API_KEY,
  baseURL: "http://localhost:5555/openai/v1"
});
As a Decision Layer
Similar to ChatGPT 5’s router architecture, the private LLM could act as a decision layer, assessing queries for complexity, topicality and intent, gating and asking the user to approve external calls to access the internet on a case by case basis.
Rather than leaking the user’s entire query to the internet, such an architecture allows only a specific MCP web browser call to be exposed externally, while keeping the user’s intention hidden from the outside world, offering security via obscurity: a fragment of a thought, whispered into the ocean.
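A minimal Python sketch of such a gate. The keyword heuristic standing in for the model’s own judgment, the `approve` callback, and the function names are all assumptions for illustration; the point is only that the full query stays local and nothing leaves without explicit consent:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Crude stand-in for the model's judgment of "topical" queries (an assumption).
TOPICAL_HINTS = ("today", "latest", "news", "price", "weather", "score")


@dataclass
class Decision:
    route: str                     # "local" or "external"
    outbound_query: Optional[str]  # the only text allowed to leave the device


def needs_internet(query: str) -> bool:
    """Heuristic: does the query mention something an offline model can't know?"""
    lowered = query.lower()
    return any(hint in lowered for hint in TOPICAL_HINTS)


def decide(query: str, search_terms: str, approve: Callable[[str], bool]) -> Decision:
    """Route a query, asking the user before anything is sent externally.

    search_terms is the narrow fragment proposed for the web call; in a real
    system the local model would derive it from the full query.
    """
    if not needs_internet(query):
        return Decision(route="local", outbound_query=None)
    if approve(search_terms):
        return Decision(route="external", outbound_query=search_terms)
    # User declined: answer from local knowledge only.
    return Decision(route="local", outbound_query=None)


private_query = "I'm anxious about flying to Lisbon; what's the weather there today?"
decision = decide(private_query, "Lisbon weather", approve=lambda terms: True)
print(decision)
```

Here only the fragment "Lisbon weather" would ever reach the network; the anxiety, the travel plans and the phrasing stay on the device.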
As a Home Appliance
Further, a local LLM designed to serve as an appliance could integrate into home automation systems, like the open source home-assistant.io project on GitHub, finally allowing us to talk loosely to our living spaces without needing to ask Jeff Bezos to dim our lights.
Tired of checking if his laundry was done, Sanctuary Computer developer and hardware hacker Guy Dupont recently demonstrated these types of local appliance architectures by beaming audio to a small local computer to use basic ML for analyzing the sound in his laundry room, so he could tell what part of the wash/dry cycle his clothes were on.
Further, Cyril Engmann, founder of The Garage out of Brooklyn, runs LLMs, agents and other workflows locally via a tangle of cables above his fridge.

As a Memory Primitive
As LLM use cases become more and more prevalent, so too does the need for a universal memory format. Backed by vector databases (pgvector, Pinecone), our queries and conversations with the model are stored to be loaded into future context windows, making for more and more specific and relevant responses as the system learns about its user.
A defining feature of SQLite is that it houses the entire dataset in a single file. Projects like mem0, zep, or pg-agent-memory work with SQL, implementing a standard schema for the model’s known “facts”, “observations”, “memories” and more. By storing the user’s entire memory in a single, highly portable encrypted file, your local LLM now serves as a system for building your very own private memory vault.
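The single-file idea can be sketched with Python’s built-in sqlite3 module. This is a simplified illustration rather than the schema of mem0, zep or pg-agent-memory: embeddings are stored as JSON text, similarity is a brute-force cosine scan, and the three-dimensional vectors are toy values standing in for real model embeddings:

```python
import json
import math
import sqlite3
from typing import List, Tuple


def open_vault(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the memory file; ':memory:' keeps this demo off disk."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        " id INTEGER PRIMARY KEY, content TEXT NOT NULL, embedding TEXT NOT NULL)"
    )
    return db


def remember(db: sqlite3.Connection, content: str, embedding: List[float]) -> None:
    """Store one memory alongside its embedding vector."""
    db.execute(
        "INSERT INTO memories (content, embedding) VALUES (?, ?)",
        (content, json.dumps(embedding)),
    )
    db.commit()


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(db: sqlite3.Connection, query: List[float], k: int = 1) -> List[Tuple[float, str]]:
    """Return the k stored memories most similar to the query embedding."""
    rows = db.execute("SELECT content, embedding FROM memories").fetchall()
    scored = [(cosine(query, json.loads(e)), c) for c, e in rows]
    return sorted(scored, reverse=True)[:k]


db = open_vault()
remember(db, "Prefers tea over coffee", [0.9, 0.1, 0.0])
remember(db, "Lives in Los Angeles", [0.0, 0.2, 0.9])
print(recall(db, [1.0, 0.0, 0.0]))
```

SQLite does not encrypt the file on its own; in this sketch, confidentiality would come from pointing `open_vault` at a path on an encrypted volume.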
CREATE EXTENSION IF NOT EXISTS "vector";
CREATE TABLE memories (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID, -- optional: who this memory belongs to
  content TEXT NOT NULL, -- the actual textual memory
  memory_type VARCHAR(32) DEFAULT 'fact', -- e.g. 'fact', 'preference', 'observation'
  importance FLOAT DEFAULT 0.0, -- used for weighting retrieval
  created_at TIMESTAMP DEFAULT now(),
  updated_at TIMESTAMP DEFAULT now()
);
CREATE TABLE memory_embeddings (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  memory_id UUID REFERENCES memories(id) ON DELETE CASCADE,
  embedding vector(1536), -- or vector(4096) depending on model
  model VARCHAR(64) DEFAULT 'text-embedding-3-small'
);
CREATE INDEX ON memory_embeddings
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
As a Portable Context Key
In our recent article, Oh! To be known by my computer, we propose the concept of a Digital Twin →
Each of these archetypes are based on the concept of the Digital Twin, a dynamic, agentic AI partner who matures through its human’s lived experience. Our digital twin engages us through patterns and situations that mirror how we live, work, and connect with each other, integrating naturally into our lives as a nebulous mesh, rather than a bunch of scattered, siloed data points.
Ambient, adaptive, sometimes even invisible, you won’t have to ask your Digital Twin to do something; it will already be halfway there, anticipating your needs like a true collaborator. As designers, this paradigm shift means we need to stop optimizing for clicks and flows, and begin fostering trust, intuition, and co-creation between humans and machines.
The more we use our private LLM, the more it starts to truly know us. Less a memory palace or collection of pointers, and more a digital representation of our hopes, dreams, desires and fears.
The Open Context Layer (OCL) is a decentralized memory protocol that lets users carry their preferences, goals, history, and behavior across apps, agents, and chains. It sits at the infrastructure layer of the Agentic Web and decouples memory from any single model or interface. By design, it is permissioned, portable, and encrypted, turning ephemeral context into something structured and user-controlled.
— Plurality on Building an Open Context Layer for the Internet
On a case by case basis, we may decide to share parts of our digital twin with external AI systems, allowing them to operate, combine, and collaborate with the small blips of our largely private and hidden personhoods. By implementing protocols like OCL, we may freely move between external systems, and even allow multiple local LLMs to be physically chained and unlocked, gleaning insights about each other only possible in close proximity.
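Case-by-case sharing could start as something as plain as a per-recipient allowlist over memory types. A minimal Python sketch; this is not the OCL protocol, and the recipient names, memory records and `share_with` helper are hypothetical:

```python
from typing import Dict, List, Set

# Toy memory records, tagged with a memory type.
MEMORIES: List[Dict[str, str]] = [
    {"type": "preference", "content": "Prefers vegetarian restaurants"},
    {"type": "fact", "content": "Home address is ..."},
    {"type": "observation", "content": "Sleeps poorly before deadlines"},
]

# What each external system may see; a recipient not listed here gets nothing.
GRANTS: Dict[str, Set[str]] = {
    "restaurant-booking-agent": {"preference"},
}


def share_with(recipient: str, memories: List[Dict[str, str]]) -> List[str]:
    """Return only the memory contents the recipient has been granted."""
    allowed = GRANTS.get(recipient, set())
    return [m["content"] for m in memories if m["type"] in allowed]


print(share_with("restaurant-booking-agent", MEMORIES))
print(share_with("unknown-ad-network", MEMORIES))
```

The default is the important design choice: anything not explicitly granted stays inside the vault.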
Making the jump
A local air-gapped LLM, complete with hardware encryption, physical kill switches and a Faraday cage, might start to feel like a somewhat niche use case.
But despite billions spent on cybersecurity, data leaks remain a constant, with companies like Meta, Microsoft, Google, Amazon, and Apple suffering major breaches, exposures, or privacy violations nearly every year—ranging from hundreds of millions of scraped user profiles to nation-state-level intrusions into corporate email systems. The consequences are steep: record-setting fines (up to €1.2 billion), the mass erosion of trust, and sustained regulatory scrutiny. Identity theft is now a global epidemic, with millions of victims and billions lost each year, fueled by breaches, phishing, and dark-web data markets.
In the era of cloud-based LLMs, Big Tech no longer needs to guess who we are by interpreting clicks and swipes. Rather, we’re now offering fully-formed sentences into our microphones, whispering more about ourselves to the cloud than ever before. If the wave of vitriol for friend.com’s recent NYC subway campaign is any indicator, the idea of an always-on, anthropomorphized surveillance device sending a stream of your life’s audio to a private company’s cloud backend is definitively not a vibe.
At the same time, there’s no argument that this (still primitive) kaleidoscopic synthesizer of all human thought is a giant step forward in the potential of humankind. The existence of natural intelligence always presupposed that we’d create synthetic intelligence on a long enough timeline. There’s no argument that LLMs are inevitable, are here to stay and have tremendous capacity for abuse. It’s on us as individuals to decide on what terms we’ll interact with these incredible black boxes.
That’s why Yatú was upset, standing in the cavernous parking lot on a mild night in Los Angeles. The rave felt like a caricature of the future — a world where everything is artificially loud, maddeningly optimized, and not-so-subtly dehumanizing. The audio inside wasn’t music, it was the sonic equivalent of the cloud: intrusive, unavoidable, and indifferent to the people it engulfed.
We’ve all accepted this as inevitable. That we’re supposed to surrender our inner lives to systems that don’t care about us. Yatú refused, his note a reminder that nothing about the future is predetermined, and that a different trajectory is still available if we’re willing to build it ourselves.
Over the past month of collaboration, we’ve come to feel that private intelligence devices and architectures solve for many of the inherent dangers under the current LLM paradigm, spanning from the esoteric and paranoid, to the nurturing or commercial.
Over the next few months, we’ll be documenting and speculating on a theoretical device running a private intelligence toolchain, sharing our progress here on Substack and Instagram. Follow along for updates.
garden3d is (among other things) a future-facing, fully integrated product design & development team. We’re driven to create systems that understand people.
USB Club is a social network built on USBs by Teal Process. They’re interested in memory, networks and novel product designs.
If you’re working on something ambitious, strange, or quietly revolutionary, we’d love to help. We’re looking for partners who are ready to start designing the next generation of cognitive interfaces together. Just send us an email at hello@garden3d.net
Oh — and if you liked this post, you may enjoy Off Brain, or Oh, to Be Known By My Computer!













