Defence in depth for agentic AI — the eleven-layer envelope

The mental model is simple: no two adjacent layers in your stack should share a single point of failure for the same class of attack. If layer N is bypassed, layer N+1 catches what slipped through.

Why a single line of defence fails

Every application has bugs. If the only thing standing between user A and user B’s data is the WHERE user_id = $1 clause in a hand-written SQL query, then a single missed filter is a cross-tenant leak. An if !isAdmin { return 403 } check that someone copied wrong is privilege escalation. A sanitize(input) call that someone removed during a refactor is an injection.

Defence in depth assumes every layer has bugs. The job of the stack is to ensure that a bug in any one layer is contained by the next.

The eleven layers

┌─────────────────────────────────────────────────────────────────────┐
│ L1. TLS termination                                                  │
│     — provided by the ingress                                        │
├─────────────────────────────────────────────────────────────────────┤
│ L2. HTTP middleware: rate limit, request ID, OTel trace start        │
├─────────────────────────────────────────────────────────────────────┤
│ L3. JWT verify + claims extract                                      │
├─────────────────────────────────────────────────────────────────────┤
│ L4. Role gate (optional per route)                                   │
├─────────────────────────────────────────────────────────────────────┤
│ L5. Handler validation: schema, classification, size                 │
├─────────────────────────────────────────────────────────────────────┤
│ L6. Message enrichment: tenant_id, expected_tenant, user_roles       │
├─────────────────────────────────────────────────────────────────────┤
│ L7. Orchestrator policy stack (CompositePolicy):                     │
│     RBAC → Tenant → Tier → Classification → PromptInjection →        │
│     PII → Schema → Consent → Explainability →                        │
│     MaxContentLength → board-DSL rules                               │
├─────────────────────────────────────────────────────────────────────┤
│ L8. Agent HandleMessage (the actual work)                            │
├─────────────────────────────────────────────────────────────────────┤
│ L9. DB query under WithTenant — RLS enforces tenant column           │
├─────────────────────────────────────────────────────────────────────┤
│ L10. LLM call wrapper: deadline, circuit, budget, sovereignty        │
├─────────────────────────────────────────────────────────────────────┤
│ L11. Audit append + hash chain + Annexure VI incident if applicable  │
└─────────────────────────────────────────────────────────────────────┘
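The ordering in the diagram can be sketched as a chain of checks in which the first denial wins. The following is a minimal illustration that assumes a much-simplified request shape and covers only three of the eleven layers. The names `Request`, `Layer`, `Run`, and `Envelope` are hypothetical and not taken from the repository; `RequireRole` borrows the name used in this post, but its signature here is a guess.

```go
package main

import (
	"errors"
	"fmt"
)

// Request is a hypothetical, simplified view of an inbound message.
type Request struct {
	TokenValid     bool     // outcome of JWT verification (L3)
	Roles          []string // roles extracted from the claims
	TenantID       string   // tenant taken from the verified claims
	ExpectedTenant string   // tenant that owns the target resource
}

// Layer is one check in the envelope; a non-nil error denies the request.
type Layer struct {
	Name  string
	Check func(Request) error
}

// Run applies the layers in order and stops at the first denial,
// returning the name of the layer that refused the request.
func Run(layers []Layer, r Request) (string, error) {
	for _, l := range layers {
		if err := l.Check(r); err != nil {
			return l.Name, err
		}
	}
	return "", nil
}

// RequireRole builds an L4-style role gate for a single role (assumed signature).
func RequireRole(role string) Layer {
	return Layer{Name: "L4 role gate", Check: func(r Request) error {
		for _, have := range r.Roles {
			if have == role {
				return nil
			}
		}
		return errors.New("missing role " + role)
	}}
}

// Envelope is a three-layer subset of the stack, for illustration only.
var Envelope = []Layer{
	{Name: "L3 jwt verify", Check: func(r Request) error {
		if !r.TokenValid {
			return errors.New("invalid token")
		}
		return nil
	}},
	RequireRole("admin"),
	{Name: "L7 tenant policy", Check: func(r Request) error {
		// Fail closed: an empty tenant is treated as a mismatch.
		if r.TenantID == "" || r.TenantID != r.ExpectedTenant {
			return errors.New("tenant mismatch")
		}
		return nil
	}},
}

func main() {
	// A valid admin token aimed at another tenant's resource.
	r := Request{TokenValid: true, Roles: []string{"admin"}, TenantID: "tenant-a", ExpectedTenant: "tenant-b"}
	layer, err := Run(Envelope, r)
	fmt.Println(layer, err)
}
```

Returning the name of the denying layer makes it easy to assert in tests that a given attack is stopped where you expect, and to notice when a refactor quietly moves the denial to a later layer.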

The invariant a reviewer should hold us to

No two adjacent layers share a single point of failure for the same class of attack.

Worked examples:

A cross-tenant request that bypasses L7 (a bug in the bus tenant policy) still hits L9 (Postgres Row-Level Security refuses the read).
An unauthenticated caller who reaches a route with a forgotten RequireRole (a gap at L4) has already been stopped at L3: JWT verification fails before the request gets as far as the role gate.
A hallucinated response that slips past L7's classification check still gets a disclaimer added at L11 and is logged as an incident.

Threat model — fifteen threats, each mapped to the layer that catches it

#	Threat	Caught by
T1	Cross-tenant data leak via missing `WHERE user_id = $1`	RLS (L9) + bus tenant policy (L7)
T2	Privilege escalation by forging or replaying a JWT	Short TTL + HS256 signature + audience check; passkeys for MFA
T3	Confused-deputy: agent A’s token used to read agent B’s data	RFC 8693 audience-scoped exchanged tokens
T4	Prompt injection that pivots a tool call	`PromptInjectionPolicy`, output schema validation, tool allowlist
T5	Hallucinated payment / KYC verdict	Deterministic agent core; LLM is narration only
T6	Audit log tampering by an insider with DB write access	Hash chain + WORM external sink
T7	Untriaged AI-generated agent in production traffic	Tier promotion gate (L7) defaults to `TierPrototype`
T8	LLM provider data-residency violation	`sovereignty.ProviderRegistry.Allowed()`
T9	Runaway autonomy: agent loops until budget exhausted	Budget + circuit + deadline wrappers (L10)
T10	Single-vendor outage takes down the customer-facing path	Fallback agents + BCP drill
T11	KEK compromise	Per-row `kek_id`, KMS-pluggable resolver, rotation playbook
T12	Service-to-service identity spoofing inside the mesh	SPIFFE/SVID identity + mTLS
T13	Insider browsing all incidents via the admin UI	Admin-only routes + audit on every read
T14	Supply-chain attack on a third-party model provider	AIBOM with provenance + adversarial corpus on every release
T15	Adversarial input designed to evade safety scoring	Plugin chain with all-of mode; multi-vendor scoring
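To make one row of the table concrete, here is a minimal sketch of the hash-chain idea behind T6. It assumes SHA-256 and a simple string payload; the `Entry` layout and the `Append` and `Verify` helpers are hypothetical and do not describe the repository's audit schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is one audit record linked to its predecessor by hash.
type Entry struct {
	Payload  string
	PrevHash string
	Hash     string
}

// hashEntry binds a payload to the hash of the previous entry.
// prev is hex (or empty), so the newline separator is unambiguous.
func hashEntry(prev, payload string) string {
	sum := sha256.Sum256([]byte(prev + "\n" + payload))
	return hex.EncodeToString(sum[:])
}

// Append adds a record whose hash covers the previous entry's hash.
func Append(chain []Entry, payload string) []Entry {
	prev := ""
	if n := len(chain); n > 0 {
		prev = chain[n-1].Hash
	}
	return append(chain, Entry{Payload: payload, PrevHash: prev, Hash: hashEntry(prev, payload)})
}

// Verify recomputes every link and returns the index of the first
// entry that does not match, or -1 if the chain is intact.
func Verify(chain []Entry) int {
	prev := ""
	for i, e := range chain {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Payload) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	var chain []Entry
	for _, p := range []string{"login user=42", "read incident=7", "export report=3"} {
		chain = Append(chain, p)
	}
	fmt.Println("intact:", Verify(chain))

	// An insider edits a row in place without recomputing the hashes.
	chain[1].Payload = "read incident=8"
	fmt.Println("after tampering:", Verify(chain))
}
```

A chain alone only detects edits that leave the stored hashes untouched. An insider with write access could recompute every hash after the edited row, which is why the table pairs the chain with a WORM external sink that holds a copy they cannot rewrite.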

Fifteen defence-in-depth invariants the test suite enforces

If any of these break, the security posture has regressed:

Every customer-facing route requires authentication
Admin-only routes require the admin role
Every customer-facing message carries tenant_id
expected_tenant ≠ tenant_id denies the message
Cross-tenant DB read returns zero rows
A Sketch-tier agent cannot serve customer traffic
An undeclared-tier agent defaults to Prototype
A two-hop token exchange preserves user as Subject and stacks the actor chain
A fallback request carries the original tenant metadata
The audit chain detects tampering
RLS internals never leak into the public UI
The /ai-inventory fetch is admin-gated in the JS
The tier field is present and stable in the inventory JSON
An invalidated user has no cached exchanged tokens
An agent ID is unique across the agent tree

go test ./... -count=1

A failing invariant is a security regression. Treat it the same way you would treat a failing compile — revert first, debug second.
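A few of the invariants listed above can be written as small executable checks. The sketch below is a standalone program, not the repository's test suite; in a real suite each check would be a `testing` case run by `go test`. Apart from `TierPrototype`, the identifiers (`TierSketch`, `TierProduction`, `ResolveTier`, `CanServeCustomers`, `AllowTenant`) are hypothetical, and the rule that only a production tier may serve customers is an assumption made for the example.

```go
package main

import "fmt"

type Tier int

const (
	TierPrototype  Tier = iota // zero value: what an undeclared tier resolves to
	TierSketch                 // hypothetical name for the "Sketch" tier
	TierProduction             // hypothetical name for a customer-facing tier
)

// ResolveTier maps a declared tier string to a Tier.
// Anything missing or unrecognised falls back to TierPrototype.
func ResolveTier(declared string) Tier {
	switch declared {
	case "sketch":
		return TierSketch
	case "production":
		return TierProduction
	default:
		return TierPrototype
	}
}

// CanServeCustomers assumes only the production tier may take customer traffic.
func CanServeCustomers(t Tier) bool { return t == TierProduction }

// Message carries the two tenant fields added during enrichment (L6).
type Message struct{ TenantID, ExpectedTenant string }

// AllowTenant denies a missing tenant_id and any mismatch with expected_tenant.
func AllowTenant(m Message) bool {
	return m.TenantID != "" && m.TenantID == m.ExpectedTenant
}

// Invariant pairs a human-readable name with a check that must hold.
type Invariant struct {
	Name  string
	Holds func() bool
}

var Invariants = []Invariant{
	{"message without tenant_id is denied", func() bool {
		return !AllowTenant(Message{ExpectedTenant: "tenant-a"})
	}},
	{"expected_tenant != tenant_id denies the message", func() bool {
		return !AllowTenant(Message{TenantID: "tenant-a", ExpectedTenant: "tenant-b"})
	}},
	{"Sketch-tier agent cannot serve customer traffic", func() bool {
		return !CanServeCustomers(ResolveTier("sketch"))
	}},
	{"undeclared-tier agent defaults to Prototype", func() bool {
		return ResolveTier("") == TierPrototype
	}},
}

func main() {
	for _, inv := range Invariants {
		status := "ok"
		if !inv.Holds() {
			status = "REGRESSION"
		}
		fmt.Printf("%-50s %s\n", inv.Name, status)
	}
}
```

Making `TierPrototype` the zero value means an agent that never declares a tier cannot end up in a more privileged tier by accident, so the safe default comes from the type definition and not from a check someone has to remember to write.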

Why this matters in regulated finance

RBI FREE-AI Rec 16 (autonomous systems), Rec 19 (cybersecurity), and Rec 22 (tamper-evident audit) all expect this kind of layered defence by name. GCP PCSE §3.3 (“Implementing security and privacy controls for AI/ML systems to protect against unintentional exploitation of data or models”) expresses the same idea in a different vocabulary.

The point isn’t to claim compliance with one framework. It’s that when a reviewer from any framework asks “what catches X?”, the answer is two layers deep, not one.

Repo: github.com/c2siorg/genie
Canonical security reference: docs/ai-governance-security.md (the full document this post draws from)
PCSE map: docs/gcp-pcse-mapping.md
FREE-AI map: docs/free-ai-mapping.md

Learn more

Agentic security in production — operational playbook
AI governance — policy-as-code
Agentic architecture — MARA patterns