Why I Rebuilt NavBot From Scratch: A Security Case Study

Most "always-on AI assistant" projects look great in a screen recording and fall over the moment a real adult tries to use one in production. I know, because the first version of NavBot was one of them.

This is the story of how I rebuilt it, properly, and why every operator running an AI second brain in 2026 should care about the same five things I now care about.

1. The promise

The pitch I sold myself eighteen months ago was simple: an AI agent that lives on a server, listens to me through Telegram, reads my email, watches my calendar, drafts replies, files receipts, and never sleeps. The kind of thing every operator imagines when they first see Claude do something clever.

Version one of NavBot did all of that. It was brilliant for about six weeks. Then I started looking at what I'd actually built, and I went cold.

2. The failures

What I found in the first NavBot, in the order I was embarrassed to discover them:

Open shell access. The agent could execute arbitrary commands on the host with no allow-list and no audit. If a clever message reached it, the message was the command.
API keys in plain text on disk. Keys for Anthropic, ElevenLabs, Supabase, and my own MCP servers all sat in a config file that any process on the box could read.
No rate limit on outbound model calls. A single mis-fired loop racked up £340 in eight minutes before I noticed. Cheap lesson, expensive bill.
No prompt boundary. Anything in an inbound message (email subject lines, calendar invites, PDFs) flowed straight into the model context. Classic indirect prompt injection waiting to happen.
No sandbox. When the agent wrote files, it wrote them anywhere it liked.
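
To make the first failure concrete, here is a minimal sketch of the kind of allow-list v1 was missing. It is an illustration under stated assumptions, not NavBot's actual code: the names ALLOWED_COMMANDS and parse_allowed, and the particular commands listed, are hypothetical.

```python
import shlex
from typing import Optional

# Hypothetical allow-list: program name -> permitted first arguments.
# None means "any arguments"; a set means "first argument must be one of these".
ALLOWED_COMMANDS = {
    "ls": None,
    "date": None,
    "git": {"status", "log", "diff"},
}

# Characters that would let one string smuggle in a second command
# if it were ever handed to a shell.
SHELL_METACHARACTERS = set(";&|`$><\n")


def parse_allowed(command: str) -> Optional[list]:
    """Return the parsed argv if the command is on the allow-list, else None."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return None
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return None
    if not argv:
        return None
    program, args = argv[0], argv[1:]
    if program not in ALLOWED_COMMANDS:
        return None
    permitted = ALLOWED_COMMANDS[program]
    if permitted is not None and (not args or args[0] not in permitted):
        return None
    return argv
```

Anything that comes back as a list would then be run as an argument vector with the shell disabled (for example subprocess.run(argv, shell=False)), and anything that comes back as None would be refused and logged. An allow-list like this only narrows what can run; it does not replace the sandbox or the audit trail.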

I'd built a thing that worked for me because I was the only one using it and I was being polite to it. The minute it touched the real world it would have been a liability.

3. The warnings that made me listen

Around the same time, two security folks I respect, independently, sent me variations of the same message:

"If you give a language model the ability to take actions on a network it didn't write, you've handed an attacker the same ability; they just have to learn to talk to it nicely."

That's the line I keep on a sticky note above my desk. Once you internalise it, you stop building agent systems and start building permissioned agent systems. There's a big difference.

4. The architectural flaws I couldn't patch

I tried, briefly, to bolt fixes onto NavBot v1. It didn't work. The deeper problems were structural:

The agent and the tool runtime lived in the same process. There was no boundary to defend.
Memory and identity were stored in the model context, not in an external store with its own access rules. The model was the database.
Every integration was hot-wired. There was no concept of "this tool can read calendars but can't send emails"; it was all-or-nothing.
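
The missing concept in that last point can be sketched as a per-tool scope check. This is a minimal illustration, assuming a simple in-memory registry; ToolRegistry, ToolGrant, PermissionDenied, and the scope names are all hypothetical, not NavBot's real interfaces.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Raised when a tool call asks for a scope it was never granted."""


@dataclass(frozen=True)
class ToolGrant:
    tool: str
    scopes: frozenset


class ToolRegistry:
    """Holds explicit, revocable grants. Anything not granted is denied."""

    def __init__(self) -> None:
        self._grants = {}

    def grant(self, tool: str, *scopes: str) -> None:
        self._grants[tool] = ToolGrant(tool, frozenset(scopes))

    def revoke(self, tool: str) -> None:
        self._grants.pop(tool, None)

    def check(self, tool: str, scope: str) -> None:
        grant = self._grants.get(tool)
        if grant is None or scope not in grant.scopes:
            raise PermissionDenied(f"{tool} lacks scope {scope!r}")


# Example: the calendar tool may read; the email tool may read but not send.
registry = ToolRegistry()
registry.grant("calendar", "read")
registry.grant("email", "read")
```

The design choice worth noting is deny-by-default: check() raises unless a grant names both the tool and the scope, so a new integration starts with no abilities rather than all of them. In a real system the check would sit in the tool runtime, outside the agent's process, so the model cannot talk its way around it.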

You can't patch a building with no walls. I deleted v1 and started again.

5. The rebuild

NavBot v2, the version that runs my actual day, and the version that's the headline perk of the top of the NavAIgate ladder, looks nothing like v1. The principles that drove the rewrite:

Local-first, by default. NavBot runs on infrastructure I control. Cloud is opt-in per integration, never the default.
Permission-gated tools. Every tool the agent can call is declared, scoped, and revocable. Read-only by default. Write access requires me in the loop until I've explicitly trusted that path.
Sandboxed execution. When the agent runs code, it runs it in an isolated environment with a known filesystem and no network unless I've granted it.
Secrets out of process. Keys live in a secrets manager, not in a config file. The agent gets short-lived tokens it can't exfiltrate.
Predictable cost. Hard daily and per-task budgets. The agent stops itself at the limit and asks. My monthly model spend is now steady and I can forecast it.
An audit trail I actually read. Every action NavBot takes is logged with the prompt that caused it and the tool that ran it. Forensics is a feature, not an afterthought.
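
As one example of how a principle from this list might look in code, here is a minimal sketch of the "predictable cost" rule: a hard daily cap plus a per-task cap, checked before any spend is recorded. The names BudgetGuard and BudgetExceeded, the task identifiers, and the limits are hypothetical, and the sketch assumes the caller can estimate the cost of a model call up front.

```python
from dataclasses import dataclass, field
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a charge would push spend past a hard limit."""


@dataclass
class BudgetGuard:
    daily_limit: Decimal
    per_task_limit: Decimal
    spent_today: Decimal = Decimal("0")
    task_spend: dict = field(default_factory=dict)

    def charge(self, task_id: str, cost: Decimal) -> None:
        """Record a cost, or raise without recording if a limit would be broken."""
        task_total = self.task_spend.get(task_id, Decimal("0")) + cost
        day_total = self.spent_today + cost
        if task_total > self.per_task_limit:
            raise BudgetExceeded(f"task {task_id!r} would reach {task_total}")
        if day_total > self.daily_limit:
            raise BudgetExceeded(f"daily spend would reach {day_total}")
        self.task_spend[task_id] = task_total
        self.spent_today = day_total


# Example limits in whatever currency the bill arrives in.
guard = BudgetGuard(daily_limit=Decimal("5.00"), per_task_limit=Decimal("2.00"))
guard.charge("draft-reply", Decimal("1.50"))
```

The agent loop would call charge() before each model call and, on BudgetExceeded, stop and ask a human rather than retry. Resetting spent_today at midnight and persisting the totals across restarts are left out of this sketch.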

The result is an AI second brain I'm willing to bet my consultancy on. It runs my pipeline, drafts my proposals, manages my receipts, watches my calendar, and most importantly, when something goes wrong, I can see why it went wrong, and I can stop it from going wrong again.

Why I'm telling you this

Because the thing I see most often when operators try to build their own NavBot equivalent is exactly what I built first: a clever demo with the security model of a wet paper bag. It will work for them, until it doesn't.

NavBot is currently in private build for the founders' cohort of the NavAIgate community. When it ships, it ships with the architecture above baked in, not as an afterthought. If you want to join the cohort, the route is the annual member tier. If you want me to deploy it with you personally, that's the white-glove tier above it. And if you want the full thing, in person, in a small room of operators going through it together, that's the AIOS Training Programme, top of the ladder.

Either way: don't ship version one of your own NavBot to anything that matters. Ship version two.

Daniel