Security Architecture Multi-Agent AI FREE-AI

Defence in depth for agentic AI — the eleven-layer envelope

The mental model: no two adjacent layers in your stack should share a single point of failure for the same class of attack. If layer N is bypassed, layer N+1 catches what slipped through.

Why a single line of defence fails

Every application has bugs. If the only thing standing between user A and user B’s data is the WHERE user_id = $1 clause in a hand-written SQL query, then a single missed filter is a cross-tenant leak. An if !isAdmin { return 403 } check that someone copied incorrectly is a privilege escalation. A sanitize(input) call that someone removed during a refactor is an injection vulnerability.

Defence in depth assumes every layer has bugs. The job of the stack is to ensure that a bug in any one layer is contained by the next.

The eleven layers

┌─────────────────────────────────────────────────────────────────────┐
│ L1. TLS termination                                                  │
│     — provided by the ingress                                        │
├─────────────────────────────────────────────────────────────────────┤
│ L2. HTTP middleware: rate limit, request ID, OTel trace start        │
├─────────────────────────────────────────────────────────────────────┤
│ L3. JWT verify + claims extract                                      │
├─────────────────────────────────────────────────────────────────────┤
│ L4. Role gate (optional per route)                                   │
├─────────────────────────────────────────────────────────────────────┤
│ L5. Handler validation: schema, classification, size                 │
├─────────────────────────────────────────────────────────────────────┤
│ L6. Message enrichment: tenant_id, expected_tenant, user_roles       │
├─────────────────────────────────────────────────────────────────────┤
│ L7. Orchestrator policy stack (CompositePolicy):                     │
│     RBAC → Tenant → Tier → Classification → PromptInjection →        │
│     PII → Schema → Consent → Explainability →                        │
│     MaxContentLength → board-DSL rules                               │
├─────────────────────────────────────────────────────────────────────┤
│ L8. Agent HandleMessage (the actual work)                            │
├─────────────────────────────────────────────────────────────────────┤
│ L9. DB query under WithTenant — RLS enforces tenant column           │
├─────────────────────────────────────────────────────────────────────┤
│ L10. LLM call wrapper: deadline, circuit, budget, sovereignty        │
├─────────────────────────────────────────────────────────────────────┤
│ L11. Audit append + hash chain + Annexure VI incident if applicable  │
└─────────────────────────────────────────────────────────────────────┘
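
The policy stack in L7 can be pictured as an ordered list of checks where the first denial stops the message. The sketch below is a minimal illustration under assumptions, not the project's actual code: the Policy interface, the Message fields, the ErrDenied value, and the "production" tier name are all hypothetical. Only the CompositePolicy name, the Tier and MaxContentLength steps, and the Prototype default come from the diagram and the invariants in this post.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical envelope; the field names are assumptions.
type Message struct {
	Tier    string // "" means the agent declared no tier
	Content string
}

// Policy is an assumed interface: nil allows the message, an error denies it.
type Policy interface {
	Name() string
	Check(m Message) error
}

// ErrDenied is a hypothetical sentinel wrapped by every denial.
var ErrDenied = errors.New("denied")

// CompositePolicy runs its policies in order and stops at the first denial.
type CompositePolicy []Policy

func (c CompositePolicy) Check(m Message) error {
	for _, p := range c {
		if err := p.Check(m); err != nil {
			return fmt.Errorf("%s: %w", p.Name(), err)
		}
	}
	return nil
}

// TierPolicy treats an undeclared tier as prototype and only lets the
// (assumed) "production" tier through.
type TierPolicy struct{}

func (TierPolicy) Name() string { return "Tier" }

func (TierPolicy) Check(m Message) error {
	tier := m.Tier
	if tier == "" {
		tier = "prototype"
	}
	if tier != "production" {
		return fmt.Errorf("tier %q may not serve customer traffic: %w", tier, ErrDenied)
	}
	return nil
}

// MaxContentLength rejects content longer than Limit bytes.
type MaxContentLength struct{ Limit int }

func (MaxContentLength) Name() string { return "MaxContentLength" }

func (p MaxContentLength) Check(m Message) error {
	if len(m.Content) > p.Limit {
		return fmt.Errorf("content is %d bytes, limit is %d: %w", len(m.Content), p.Limit, ErrDenied)
	}
	return nil
}

func main() {
	stack := CompositePolicy{TierPolicy{}, MaxContentLength{Limit: 16}}
	msgs := []Message{
		{Tier: "production", Content: "hello"},
		{Content: "hello"}, // undeclared tier
		{Tier: "production", Content: "this message is far too long"},
	}
	for _, m := range msgs {
		if err := stack.Check(m); err != nil {
			fmt.Println("deny:", err)
			continue
		}
		fmt.Println("allow")
	}
}
```

Because each policy is a separate value, a bug in one check (say, a wrong length limit) does not disable the checks that run before or after it.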

The invariant a reviewer should hold us to

No two adjacent layers share a single point of failure for the same class of attack.

Worked examples:
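
One way to picture the invariant for the cross-tenant case is two independent checks: a tenant policy on the bus (L7) and a tenant-scoped read at the database (L9). The sketch below is an in-memory stand-in written for illustration. Store, TenantView, Query, and TenantPolicyAllows are hypothetical names, and the filtering inside Query only simulates what row-level security would enforce in a real database; WithTenant is the only name taken from the diagram.

```go
package main

import "fmt"

// Row is a hypothetical record carrying its tenant column.
type Row struct {
	TenantID string
	Data     string
}

// Store stands in for a database table.
type Store struct{ rows []Row }

// TenantView simulates row-level security: every read is filtered by
// tenant no matter what predicate the caller supplies.
type TenantView struct {
	store  *Store
	tenant string
}

// WithTenant scopes all subsequent reads to one tenant.
func (s *Store) WithTenant(tenant string) TenantView {
	return TenantView{store: s, tenant: tenant}
}

// Query applies the tenant filter first, then the caller's predicate.
// A nil predicate models a handler that forgot its WHERE clause.
func (v TenantView) Query(pred func(Row) bool) []Row {
	var out []Row
	for _, r := range v.store.rows {
		if r.TenantID != v.tenant {
			continue // the layer that contains the handler's bug
		}
		if pred == nil || pred(r) {
			out = append(out, r)
		}
	}
	return out
}

// TenantPolicyAllows is the assumed bus-level check: the message is denied
// unless tenant_id is set and equals expected_tenant.
func TenantPolicyAllows(tenantID, expectedTenant string) bool {
	return tenantID != "" && tenantID == expectedTenant
}

func main() {
	s := &Store{rows: []Row{
		{TenantID: "a", Data: "a1"},
		{TenantID: "b", Data: "b1"},
		{TenantID: "a", Data: "a2"},
	}}

	// Layer 7: a mismatched message never reaches the agent.
	fmt.Println("bus allows a/b:", TenantPolicyAllows("a", "b"))

	// Layer 9: even a query with no filter at all returns only tenant a's rows.
	for _, r := range s.WithTenant("a").Query(nil) {
		fmt.Println(r.TenantID, r.Data)
	}
}
```

The two checks fail for different reasons (a mis-enriched message versus a missing filter in a handler), which is what keeps them from sharing a single point of failure.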

Threat model — fifteen threats, each mapped to the layer that catches it

#   | Threat | Caught by
T1  | Cross-tenant data leak via missing WHERE user_id = $1 | RLS (L9) + bus tenant policy (L7)
T2  | Privilege escalation by forging or replaying a JWT | Short TTL + HS256 signature + audience check; passkeys for MFA
T3  | Confused-deputy: agent A’s token used to read agent B’s data | RFC 8693 audience-scoped exchanged tokens
T4  | Prompt injection that pivots a tool call | PromptInjectionPolicy, output schema validation, tool allowlist
T5  | Hallucinated payment / KYC verdict | Deterministic agent core; LLM is narration only
T6  | Audit log tampering by an insider with DB write access | Hash chain + WORM external sink
T7  | Untriaged AI-generated agent in production traffic | Tier promotion gate (L7) defaults to TierPrototype
T8  | LLM provider data-residency violation | sovereignty.ProviderRegistry.Allowed()
T9  | Runaway autonomy: agent loops until budget exhausted | Budget + circuit + deadline wrappers (L10)
T10 | Single-vendor outage takes down the customer-facing path | Fallback agents + BCP drill
T11 | KEK compromise | Per-row kek_id, KMS-pluggable resolver, rotation playbook
T12 | Service-to-service identity spoofing inside the mesh | SPIFFE/SVID identity + mTLS
T13 | Insider browsing all incidents via the admin UI | Admin-only routes + audit on every read
T14 | Supply-chain attack on a third-party model provider | AIBOM with provenance + adversarial corpus on every release
T15 | Adversarial input designed to evade safety scoring | Plugin chain with all-of mode; multi-vendor scoring
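
For T9, the budget and deadline parts of the L10 wrapper could look roughly like the sketch below. It is an illustration under assumptions: Budget, CallFunc, Guarded, ErrBudgetExhausted, and the two fake models are hypothetical names, the budget counts calls rather than tokens or cost, and the circuit breaker and sovereignty check are left out to keep it short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrBudgetExhausted is a hypothetical sentinel for a spent budget.
var ErrBudgetExhausted = errors.New("budget exhausted")

// Budget caps the number of calls. It is not goroutine-safe; a real
// wrapper would need a mutex or atomic counter.
type Budget struct{ Remaining int }

// CallFunc is an assumed signature for a model call.
type CallFunc func(ctx context.Context, prompt string) (string, error)

// Guarded wraps call so that each invocation spends one unit of budget
// and runs under its own deadline.
func Guarded(b *Budget, timeout time.Duration, call CallFunc) CallFunc {
	return func(ctx context.Context, prompt string) (string, error) {
		if b.Remaining <= 0 {
			return "", ErrBudgetExhausted
		}
		b.Remaining--
		ctx, cancel := context.WithTimeout(ctx, timeout)
		defer cancel()
		return call(ctx, prompt)
	}
}

// fastModel is a fake provider that answers immediately.
func fastModel(ctx context.Context, prompt string) (string, error) {
	return "ok: " + prompt, nil
}

// slowModel is a fake provider that takes two seconds unless cancelled.
func slowModel(ctx context.Context, prompt string) (string, error) {
	select {
	case <-time.After(2 * time.Second):
		return "late", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// runAgentLoop models a runaway agent: it calls the model until the
// wrapper refuses, then reports how many calls succeeded.
func runAgentLoop(budget int) (int, error) {
	call := Guarded(&Budget{Remaining: budget}, time.Second, fastModel)
	calls := 0
	for {
		if _, err := call(context.Background(), "next step"); err != nil {
			return calls, err
		}
		calls++
	}
}

// callSlowModel shows the deadline cutting off a slow provider.
func callSlowModel() error {
	call := Guarded(&Budget{Remaining: 1}, 20*time.Millisecond, slowModel)
	_, err := call(context.Background(), "summarise")
	return err
}

func main() {
	n, err := runAgentLoop(3)
	fmt.Println("calls before stop:", n, "reason:", err)
	fmt.Println("slow call:", callSlowModel())
}
```

The budget and the deadline bound different failure modes: the budget stops a loop of fast calls, and the deadline stops a single call that never returns.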

Fifteen defence-in-depth invariants the test suite enforces

If any of these break, the security posture has regressed:

  1. Every customer-facing route requires authentication
  2. Admin-only routes require the admin role
  3. Every customer-facing message carries tenant_id
  4. expected_tenant ≠ tenant_id denies the message
  5. Cross-tenant DB read returns zero rows
  6. A Sketch-tier agent cannot serve customer traffic
  7. An undeclared-tier agent defaults to Prototype
  8. A two-hop token exchange preserves user as Subject and stacks the actor chain
  9. A fallback request carries the original tenant metadata
  10. The audit chain detects tampering
  11. RLS internals never leak into the public UI
  12. The /ai-inventory fetch is admin-gated in the JS
  13. The tier field is present and stable in the inventory JSON
  14. An invalidated user has no cached exchanged tokens
  15. An agent ID is unique across the agent tree

Run the suite with:

    go test ./... -count=1
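
Invariant 10 (the audit chain detects tampering) rests on each entry committing to the hash of the one before it. The sketch below shows the general technique with SHA-256. Entry, Chain, Append, Verify, and the exact hashing layout are assumptions made for illustration, not the project's audit format.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is a hypothetical audit record linked to its predecessor.
type Entry struct {
	Payload  string
	PrevHash string
	Hash     string
}

// hashEntry binds a payload to the previous entry's hash. The previous
// hash is hex, so the newline separator cannot appear inside it.
func hashEntry(prev, payload string) string {
	sum := sha256.Sum256([]byte(prev + "\n" + payload))
	return hex.EncodeToString(sum[:])
}

// Chain is an append-only list of entries.
type Chain []Entry

// Append returns the chain extended with a new entry linked to the last one.
func (c Chain) Append(payload string) Chain {
	prev := ""
	if len(c) > 0 {
		prev = c[len(c)-1].Hash
	}
	return append(c, Entry{Payload: payload, PrevHash: prev, Hash: hashEntry(prev, payload)})
}

// Verify recomputes every link and returns the index of the first entry
// that does not match, or -1 if the chain is intact.
func (c Chain) Verify() int {
	prev := ""
	for i, e := range c {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Payload) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	var c Chain
	c = c.Append("login user=42").Append("read incident=7").Append("logout user=42")
	fmt.Println("intact chain, first bad index:", c.Verify())

	// Simulate an insider editing a row in place.
	c[1].Payload = "read incident=8"
	fmt.Println("after edit, first bad index:", c.Verify())
}
```

A chain like this only detects edits that leave the stored hashes alone. An insider who can rewrite every later entry can recompute the hashes too, which is why the threat table pairs the hash chain with a WORM external sink.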

A failing invariant is a security regression. Treat it the same way you would treat a failing compile — revert first, debug second.

Why this matters in regulated finance

RBI FREE-AI Rec 16 (autonomous systems), Rec 19 (cybersecurity), and Rec 22 (tamper-evident audit) all expect this kind of layered defence by name. GCP PCSE §3.3 (“Implementing security and privacy controls for AI/ML systems to protect against unintentional exploitation of data or models”) expresses the same idea in a different vocabulary.

The point isn’t to claim compliance with one framework. It’s that when a reviewer from any framework asks “what catches X?”, the answer is two layers deep, not one.
