March 30, 2026 · 2 min read

Ardan Ultimate AI #23 — Direct and indirect prompt injection, plus defenses

Prompt injection is the single biggest LLM security risk. The example walks through both forms (direct, from user input; indirect, via retrieved content) and the layered defenses: system prompt isolation, content classification, output validation, and structured tool schemas.

Ardan Labs · Go · Security · Prompt Injection
March 27, 2026 · 2 min read

Ardan Ultimate AI #20 — Embedding-based semantic cache

Exact-match caching misses paraphrases. "What is the refund policy?" and "How do refunds work?" should both hit the same cached answer. A semantic cache embeds each query and matches by embedding similarity rather than by exact text.
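
A minimal sketch of the lookup side, assuming cosine similarity over embedding vectors and a fixed hit threshold. The `SemanticCache` type, the 0.9 threshold, and the hand-written three-dimensional vectors are illustrative assumptions; a real cache would get its vectors from an embedding model and would likely use an index instead of a linear scan.

```go
package main

import (
	"fmt"
	"math"
)

// entry pairs a stored query embedding with its cached answer.
type entry struct {
	vec    []float64
	answer string
}

// SemanticCache is a hypothetical linear-scan cache keyed by embedding similarity.
type SemanticCache struct {
	threshold float64 // minimum cosine similarity that counts as a hit
	entries   []entry
}

// cosine returns the cosine similarity of a and b, or 0 for mismatched or zero vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Put stores an answer under the embedding of the query that produced it.
func (c *SemanticCache) Put(vec []float64, answer string) {
	c.entries = append(c.entries, entry{vec: vec, answer: answer})
}

// Get returns the answer of the most similar stored query at or above the threshold.
func (c *SemanticCache) Get(vec []float64) (string, bool) {
	best, bestSim := -1, c.threshold
	for i, e := range c.entries {
		if sim := cosine(vec, e.vec); sim >= bestSim {
			best, bestSim = i, sim
		}
	}
	if best < 0 {
		return "", false
	}
	return c.entries[best].answer, true
}

func main() {
	cache := &SemanticCache{threshold: 0.9}

	// Toy vectors standing in for real embeddings (assumption).
	cache.Put([]float64{0.9, 0.1, 0.0}, "cached refund answer")

	paraphrase := []float64{0.85, 0.15, 0.05} // close to the stored query
	unrelated := []float64{0.0, 0.1, 0.95}    // points elsewhere

	_, hit := cache.Get(paraphrase)
	_, miss := cache.Get(unrelated)
	fmt.Println(hit, miss)
}
```

The threshold is the main tuning knob: set it too low and unrelated questions share an answer, too high and the cache degrades to exact match.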

Ardan Labs · Go · Caching · Cost Optimisation
March 26, 2026 · 2 min read

Ardan Ultimate AI #19 — Speculative decoding with a draft model

Run a small draft model to predict several tokens ahead, then verify them in a single pass with the large model. Latency drops without quality dropping. It is a technique production LLM serving stacks use but most application engineers never see.
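
The draft-then-verify loop can be sketched as follows, assuming greedy decoding on both models. The `Model` function type and the toy models in `main` are hypothetical stand-ins, and the per-position calls to the target only simulate what a real server would score in one batched forward pass.

```go
package main

import "fmt"

// Model returns the greedy next token for a context (hypothetical stand-in for an LLM).
type Model func(ctx []int) int

// SpeculativeStep drafts k tokens, keeps the longest prefix the target agrees with,
// and always appends one token chosen by the target itself.
func SpeculativeStep(ctx []int, draft, target Model, k int) []int {
	// 1. The draft model proposes k tokens autoregressively.
	proposed := make([]int, 0, k)
	work := append([]int(nil), ctx...)
	for i := 0; i < k; i++ {
		t := draft(work)
		proposed = append(proposed, t)
		work = append(work, t)
	}

	// 2. The target checks every proposed position. This loop simulates the
	// single batched verification pass a real serving stack would run.
	accepted := make([]int, 0, k+1)
	work = append([]int(nil), ctx...)
	for _, p := range proposed {
		want := target(work)
		if want != p {
			// First disagreement: take the target's token and stop.
			return append(accepted, want)
		}
		accepted = append(accepted, p)
		work = append(work, p)
	}
	// Every draft token was accepted, so the target contributes a bonus token.
	return append(accepted, target(work))
}

// Generate produces n tokens after prompt using repeated speculative steps.
func Generate(prompt []int, draft, target Model, k, n int) []int {
	out := append([]int(nil), prompt...)
	for len(out)-len(prompt) < n {
		out = append(out, SpeculativeStep(out, draft, target, k)...)
	}
	return out[:len(prompt)+n]
}

func main() {
	// Toy target: counts upward modulo 10.
	target := func(ctx []int) int {
		if len(ctx) == 0 {
			return 0
		}
		return (ctx[len(ctx)-1] + 1) % 10
	}
	// Toy draft: agrees with the target except after token 4.
	draft := func(ctx []int) int {
		if len(ctx) > 0 && ctx[len(ctx)-1] == 4 {
			return 0
		}
		return target(ctx)
	}
	fmt.Println(Generate([]int{3}, draft, target, 4, 8))
}
```

Because every emitted token is either confirmed or chosen by the target, the greedy output matches what the target would produce alone; the draft only changes how many target passes are needed.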

Ardan Labs · Go · LLM Ops · Performance
March 25, 2026 · 2 min read

Ardan Ultimate AI #18 — Incremental message caching (IMC) for chat

A long chat reprocesses the entire history on every turn. Prefix caching lets the server reuse the KV cache it already computed for the previous turn's prefix and compute only the new suffix. The latency win grows with the length of the conversation.
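
The bookkeeping behind that reuse can be sketched as a longest-common-prefix check between the tokens already cached and the tokens of the new request. The `commonPrefixLen` helper and the toy token IDs are assumptions for illustration; the KV cache itself lives inside the inference server and is not modelled here.

```go
package main

import "fmt"

// commonPrefixLen returns how many leading tokens the cached sequence and the
// new request share. Only tokens past this point need fresh computation.
func commonPrefixLen(cached, next []int) int {
	n := 0
	for n < len(cached) && n < len(next) && cached[n] == next[n] {
		n++
	}
	return n
}

func main() {
	// Toy token IDs (assumption): the new turn extends the previous one.
	prevTurn := []int{11, 12, 13, 14, 15}
	thisTurn := []int{11, 12, 13, 14, 15, 16, 17}

	reuse := commonPrefixLen(prevTurn, thisTurn)
	fmt.Printf("reuse %d cached tokens, compute %d new ones\n", reuse, len(thisTurn)-reuse)
}
```

If an earlier message is edited, the shared prefix ends at the first changed token and everything after it has to be recomputed.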

Ardan Labs · Go · LLM Ops · Performance
March 24, 2026 · 2 min read

Ardan Ultimate AI #17 — Building an agent over an MCP server

Model Context Protocol standardises tool calling across LLMs. The example builds both sides: an MCP server exposing tools, and an agent that calls them. Works the same against any MCP-compatible LLM.

Ardan Labs · Go · MCP · Agents
March 22, 2026 · 2 min read

Ardan Ultimate AI #15 — A read-only NL→SQL tool

Give an LLM a SQL tool and watch it write DELETE statements. The read-only version parses the generated SQL, refuses anything that isn't a SELECT, validates it against an allow-listed schema, and runs it with a strict timeout.

Ardan Labs · Go · SQL · Agents · Security