Zero-trust overlay networks for AI agent isolation

The default network an AI agent runs in is too generous. On a stock Kubernetes cluster, every pod can dial every other pod by IP, every managed database and internal API on the same cloud VPC, and most internal control-plane services. That posture was fine when the workload was a web app you wrote. An AI workload is a different shape: somewhere — in an API parameter, in a document the LLM just summarised, in the output of a tool the agent just called — an attacker can influence the next outbound HTTP call the process makes. Whatever the trigger, the flat network is the exfiltration path.

This post is a short, plain-language tour of what a zero-trust overlay actually is, the exploit pattern it blocks, and how we wired one into Agyn.

What "flat cluster networking" actually means

Imagine an office where every internal door is unlocked. That's the posture a Kubernetes cluster ships with — networking is permissive on purpose. Pods get IPs from a shared CIDR range; any pod can open a TCP connection to any other pod's IP and port; the cluster's DNS server resolves service names cluster-wide. And because the cluster usually lives inside a cloud VPC, the rest of that VPC is reachable from a pod by default too — managed databases (RDS / Cloud SQL / Azure SQL on the same network), internal load balancers, monitoring stacks, admin endpoints on neighbouring services.

NetworkPolicy exists, but it's opt-in, namespace-scoped, and allow-list only: there are no explicit deny rules, a pod that no policy selects stays open to all traffic, and a pod becomes default-deny only once some policy selects it. It also only does anything if your CNI plugin actually enforces it. Most teams treat it as fire-and-forget: written once, if at all, and rarely revisited. The result: an agent compromised by prompt injection has, by default, line-of-sight to your in-cluster databases, every other workload in the same VPC, and any internal admin API on that network.
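For a sense of what opting in involves, here is a minimal sketch of the standard default-deny policy, built as a Python dict and printed as JSON (which kubectl accepts alongside YAML). The namespace name "agents" and the policy name are assumptions for illustration. The empty podSelector selects every pod in the namespace, and listing both policy types with no allow rules leaves those pods with no permitted ingress or egress, provided the CNI plugin enforces NetworkPolicy at all.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy manifest that selects every pod in one
    namespace and lists no allow rules for ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The policy name is arbitrary; "default-deny-all" is just a label.
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector = all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # "agents" is a hypothetical namespace name.
    print(json.dumps(default_deny_policy("agents"), indent=2))
```

Because NetworkPolicy is namespaced, a manifest like this has to exist in every namespace that should be locked down, and every legitimate flow then needs its own allow rule on top.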

What a zero-trust overlay does instead

Picture a private club whose building doesn't appear on any street: members find the door through an app, show ID at it, and only get into the rooms their membership names. That's the shape of a zero-trust overlay.

More formally, an overlay network is a virtual network that sits on top of the real one. Pods don't talk via IPs; they "dial" service names through an encrypted fabric. Zero-trust means the fabric refuses every connection by default and only allows a dial when all of the following hold:

The caller has a cryptographic identity the fabric recognises.
A policy explicitly says "this identity may dial this service."
The connection is mTLS-authenticated end-to-end.

Three properties fall out of this:

Services have no open listening ports on the real network. They register themselves with the overlay and accept dials from it. Port-scan an agent pod and you find nothing.
The fabric checks who is calling, not where the call came from. A pod's IP can change every restart, and "this packet came from the cluster network" was never a security claim anyway; the certificate-bound identity the caller presents is what the fabric verifies before letting a dial through.
Deny is the default. You don't write rules to block traffic; every connection that isn't named in a policy is dropped.
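The dial decision described above can be sketched as a toy model. This is an illustration under stated assumptions, not any real fabric's API: the Fabric class, the identity names, and the fingerprint strings are all hypothetical. The point to notice is that the caller's IP address is not an input to the decision at all.

```python
from dataclasses import dataclass, field


@dataclass
class Fabric:
    """Toy model of an overlay's dial decision (hypothetical, for illustration)."""
    # certificate fingerprint -> identity name, for identities the fabric enrolled
    enrolled: dict[str, str] = field(default_factory=dict)
    # (identity name, service name) pairs that a policy explicitly allows
    dial_policies: set[tuple[str, str]] = field(default_factory=set)

    def enroll(self, identity: str, cert_fingerprint: str) -> None:
        self.enrolled[cert_fingerprint] = identity

    def allow(self, identity: str, service: str) -> None:
        self.dial_policies.add((identity, service))

    def may_dial(self, cert_fingerprint: str, service: str, mtls_ok: bool) -> bool:
        # Condition 3: the connection must be mutually authenticated.
        if not mtls_ok:
            return False
        # Condition 1: the caller must present an identity the fabric recognises.
        identity = self.enrolled.get(cert_fingerprint)
        if identity is None:
            return False
        # Condition 2: a policy must name this identity/service pair.
        # Anything not listed falls through to a refusal (default-deny).
        return (identity, service) in self.dial_policies


fabric = Fabric()
fabric.enroll("agent-42", "sha256:aaaa")
fabric.allow("agent-42", "gateway")

print(fabric.may_dial("sha256:aaaa", "gateway", mtls_ok=True))    # True
print(fabric.may_dial("sha256:aaaa", "postgres", mtls_ok=True))   # False: no policy
print(fabric.may_dial("sha256:bbbb", "gateway", mtls_ok=True))    # False: unknown identity
print(fabric.may_dial("sha256:aaaa", "gateway", mtls_ok=False))   # False: no mTLS
```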

This is the model NIST and CISA converged on for zero-trust (NIST SP 800-207, CISA's Zero Trust Maturity Model). Anthropic outlined the same principles applied specifically to AI workloads in Zero Trust for AI Agents — cryptographically-rooted identities, task-scoped permissions, and the observation that AI-accelerated attacks compress patch windows from months to hours. The interesting question is how to actually run it.

The exploit pattern this would have stopped

The fastest-growing class of vulnerability in LLM-adjacent tooling is server-side request forgery (SSRF): a fetcher inside the inference service — an image loader, a URL summariser, a tool the agent calls — will dial whatever URL its caller hands it, with no hostname validation and no private-network blocklist. The agent itself, or an outside attacker who reaches the agent's HTTP surface, can point that fetcher at internal targets the deployment never meant to expose.
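To make the missing check concrete, here is a minimal sketch of the kind of validation such a fetcher lacks, using only the Python standard library. The function names are hypothetical, and the injectable resolver parameter exists only so the logic can be exercised without real DNS. It is not the fix any particular project shipped.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_internal(ip: str) -> bool:
    """True for addresses a public-facing fetcher should not dial."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local      # includes the 169.254.169.254 metadata address
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def resolve_all(host: str) -> list[str]:
    """Resolve a hostname to every address it maps to."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def check_fetch_url(url: str, resolver=resolve_all) -> list[str]:
    """Return the resolved addresses if the URL is safe to fetch, else raise."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no hostname")
    addresses = resolver(parts.hostname)
    if not addresses:
        raise ValueError("hostname did not resolve")
    # Reject if ANY resolved address is internal, not just the first one.
    blocked = [a for a in addresses if is_internal(a)]
    if blocked:
        raise ValueError(f"refusing internal address(es): {blocked}")
    return addresses


print(is_internal("169.254.169.254"))  # True
print(is_internal("10.0.0.5"))         # True
print(is_internal("8.8.8.8"))          # False
```

A check like this is only as good as its follow-through: the fetcher also has to connect to the addresses it validated (otherwise a second DNS lookup can return something different) and re-check every redirect. That fragility is part of why application-layer guards alone keep failing.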

A recent instance of this shape is CVE-2026-33626 in LMDeploy, a widely-used LLM inference toolkit. The vulnerability was a missing hostname check in the vision-language image loader — any URL the caller supplied, the loader fetched. After disclosure, the SSRF was observed being used in the wild to reach internal services from inside the inference pod: the cloud metadata endpoint, Redis and MySQL on their default ports, an internal admin interface, and LMDeploy's own distributed-inference control endpoint. No credentials were extracted in the observed sessions — but reaching those services at all from the outside is the foothold. The rest of any real attack starts there: from "what can I see from inside this pod?" to "what can I do with what I see?"

The pattern keeps showing up. SSRF advisories against the LLM-inference and MCP-server ecosystems have been a steady drip through 2025 and into 2026, and the window between disclosure and first internet-wide probing for each new advisory is now measured in hours, not weeks.

One honest note on the threat model. A directly-exploitable SSRF gives an outside attacker the same network reach the agent already has — they don't need to be the agent. But the same primitive is also what prompt injection becomes: the agent reads a malicious document, the LLM decides the next tool call is a fetch against an internal database or admin URL, and the dial leaves the pod with no resistance. Whatever sets the AI workload off — a crafted API input, a poisoned document, a compromised MCP tool, a buggy dependency — the destination-side defence is the same.

What breaks the chain on an overlay network: the inference pod has no route to any destination that isn't a named service it was granted. The Postgres in the next subnet, the internal admin on :8080, the Redis on :6379: none of them are registered services on the overlay, so the dials never leave the pod. The flat-network property that makes the exploit easy ("pod can reach any IP it knows") is the property an overlay removes.

How it plugs into Agyn

Each agent pod can reach a fixed set of named platform services and nothing else. No other IP is routable from inside the pod, no other hostname resolves, and the rest of the network has no listening port for the agent to scan.

We picked OpenZiti (open source, Apache 2.0) because it takes the "no open listening ports" property seriously: services dial out to the fabric instead of binding a TCP port and leaving a mesh to layer controls on top of it. Every agent pod gets a Ziti sidecar at startup. The sidecar holds the pod's OpenZiti identity (a per-pod certificate). The agent's container talks to the platform through hostnames ending in .ziti, currently three:

gateway.ziti — the platform's control-plane API.
llm-proxy.ziti — the proxy that attaches the real Anthropic / OpenAI key.
tracing.ziti — the OTLP endpoint for spans.

These names are not in any DNS server. They're resolved inside the sidecar and routed over the overlay. From outside the pod, none of these services have an open port — only OpenZiti-authenticated dials reach them.
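The resolution behaviour can be pictured with a toy model. This is a sketch of the idea only, not how the OpenZiti sidecar is implemented: the three hostnames come from the list above, while the function, the exception type, and the descriptions in the table are hypothetical.

```python
# Hostnames the sidecar knows about (from the list above); the values are
# just human-readable descriptions for this illustration.
OVERLAY_SERVICES = {
    "gateway.ziti": "platform control-plane API",
    "llm-proxy.ziti": "LLM proxy that attaches the provider key",
    "tracing.ziti": "OTLP endpoint for spans",
}


class NameNotOnOverlay(LookupError):
    """Raised when a hostname is not a registered overlay service."""


def resolve_in_sidecar(hostname: str) -> str:
    """Return the overlay service name to dial, or refuse.

    There is no fallback to cluster or public DNS: a name that is not
    registered simply does not resolve.
    """
    name = hostname.strip().rstrip(".").lower()
    if name not in OVERLAY_SERVICES:
        raise NameNotOnOverlay(f"{hostname!r} is not a registered service")
    return name


print(resolve_in_sidecar("gateway.ziti"))  # gateway.ziti

for target in ("redis.internal", "169.254.169.254"):
    try:
        resolve_in_sidecar(target)
    except NameNotOnOverlay as err:
        print("refused:", err)
```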

The same fabric carries private resources customers bring in — their own internal databases, APIs, or MCP servers. The customer registers a resource with the fabric (it never gets a public IP and isn't exposed to the cluster's flat network at all), and policy decides which agents can dial it.

The agent's container never holds an API key for any of this. The pod's network identity is the authorization. Whatever convinces the agent to dial somewhere it shouldn't — a prompt injection in a document it reads, an SSRF in a tool it calls, a compromised dependency it pulled in — hits the same wall: the destination isn't a registered service on the overlay, so the connection is refused.
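The "identity is the authorization" idea can be sketched from the proxy's side. This is a hypothetical simplification, not Agyn's llm-proxy code: the function name, the identity strings, and the single bearer-token header are assumptions. The caller's overlay identity decides whether the request goes upstream, and the real key is attached only at that point, so nothing the agent sends can carry or override it.

```python
# Credential headers the agent is never allowed to set itself.
STRIPPED_HEADERS = {"authorization", "x-api-key"}


def build_upstream_headers(
    caller_identity: str,
    agent_headers: dict[str, str],
    allowed_identities: set[str],
    provider_key: str,
) -> dict[str, str]:
    """Build the headers for the upstream LLM request (toy model).

    caller_identity is assumed to come from the overlay's authenticated
    connection, never from anything the agent wrote into the request.
    """
    if caller_identity not in allowed_identities:
        raise PermissionError(f"identity {caller_identity!r} may not use the proxy")
    # Drop any credential headers the agent tried to supply.
    headers = {
        k: v for k, v in agent_headers.items() if k.lower() not in STRIPPED_HEADERS
    }
    # Attach the real key; it exists only on the proxy side.
    headers["Authorization"] = f"Bearer {provider_key}"
    return headers


headers = build_upstream_headers(
    caller_identity="agent-42",
    agent_headers={"Content-Type": "application/json", "authorization": "Bearer fake"},
    allowed_identities={"agent-42"},
    provider_key="sk-real-key-held-by-proxy",
)
print(sorted(headers))  # ['Authorization', 'Content-Type']
```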

Full source at github.com/agynio/platform. For the broader picture of how this composes with filesystem isolation and credential brokering, see AI agent sandboxing: filesystem and network isolation patterns.

Takeaway

Flat cluster networking is a routing accident that AI agents turn into an exfiltration channel. A zero-trust overlay makes every service-to-service reach an identity decision: no identity, no policy, no connection, and no listening port on the real network to bypass it. The cost is one sidecar per pod and a small mental shift (dial service names, not IPs). The payoff is that the SSRF-to-internal-network pattern, the one LMDeploy and a long tail of LLM tooling keep getting probed for, simply has nowhere to go.
