April 9, 2026 · 2 min read
The course wrap-up: a Jupyter notebook driven by Go, using GoMLX for tensor ops and GoNB as the kernel. Showed me how to do exploratory Go AI work in the same shape data scientists already use.
Ardan Labs · Go · Jupyter · GoMLX · AI
April 8, 2026 · 2 min read
A complete chat application: Go backend with RAG, React frontend, single binary. Showed me how to ship a full-stack AI demo without a separate frontend deployment.
Ardan Labs · Go · React · RAG · AI
April 7, 2026 · 2 min read
Cursor / Claude Code in 600 lines of Go. The agent has read/write/search tools over a project directory and a loop that lets it iterate on its own work.
Ardan Labs · Go · Agents · Coding Agents
April 6, 2026 · 2 min read
PDFs are the format that breaks every RAG pipeline. Docling is the IBM Research extractor that handles layout, tables, and figures. The example wires Docling to an LLM to make PDFs usable.
Ardan Labs · Go · RAG · PDF · Docling
April 5, 2026 · 2 min read
Transcribe a video, chunk by timestamp, embed each chunk, then run RAG-style chat over the result. The shape that powers "ask questions about this meeting recording."
Ardan Labs · Go · RAG · Video · Whisper
April 4, 2026 · 2 min read
Generate a text description of an image with a vision LLM, embed the description, store in pgvector. Search becomes "find images that match this query" — works surprisingly well.
Ardan Labs · Go · RAG · Vision · pgvector
April 3, 2026 · 2 min read
An agent whose tools can themselves call tools can drift indefinitely. The escalation budget caps depth and cost; the audit trail records every step so you can replay what the agent did.
Ardan Labs · Go · Agents · Security · Audit
April 2, 2026 · 2 min read
An LLM that controls the output can embed malicious HTML, exfiltrate data via crafted links, or inject prompt-stealing payloads. Sanitisation is the defense; the example shows what to allow and what to strip.
Ardan Labs · Go · Security · AI
April 1, 2026 · 2 min read
A RAG pipeline that ingests user-supplied documents is a prompt-injection vector. An attacker uploads a document with hidden instructions; the LLM retrieves it and follows them. Defense: input filtering, content classification, output verification.
Ardan Labs · Go · Security · RAG
March 31, 2026 · 2 min read
Giving an LLM a `run_command` tool is convenient and terrifying. The hardened version: allow-listed binaries, argument scrubbing, RBAC per user, audit per invocation.
Ardan Labs · Go · Security · Agents
March 30, 2026 · 2 min read
The single biggest LLM security risk. The example walks through both forms (direct from user input, indirect via retrieved content) and the layered defenses: system prompt isolation, content classification, output validation, structured tool schemas.
Ardan Labs · Go · Security · Prompt Injection
March 29, 2026 · 2 min read
Most queries are simple. A cascading router tries a small/fast/cheap model first; if confidence is low or the task is hard, it escalates to a larger one. Costs collapse without hurting quality.
Ardan Labs · Go · LLM Ops · Cost Optimisation
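The cascade can be sketched in a few lines. This is an illustration under assumptions, not the course's code: `Model`, `Cascade`, and the toy `small` and `large` models are hypothetical names, and how a model reports confidence (log-probabilities, a grader, a heuristic) is left open.

```go
package main

import "fmt"

// Model is a hypothetical stand-in for an LLM call. Ask returns an answer
// plus a confidence score in [0, 1]; how that score is produced is assumed.
type Model struct {
	Name string
	Ask  func(query string) (answer string, confidence float64)
}

// Cascade tries models in order (cheapest first) and returns the first answer
// whose confidence meets minConfidence. The last model is always accepted, so
// the caller gets an answer as long as at least one model is configured.
func Cascade(models []Model, query string, minConfidence float64) (answer, usedModel string) {
	for i, m := range models {
		ans, conf := m.Ask(query)
		if conf >= minConfidence || i == len(models)-1 {
			return ans, m.Name
		}
	}
	return "", "" // no models configured
}

// Toy models: the small one is only confident on short queries.
var small = Model{Name: "small", Ask: func(q string) (string, float64) {
	if len(q) < 20 {
		return "small: " + q, 0.9
	}
	return "not sure", 0.3
}}

var large = Model{Name: "large", Ask: func(q string) (string, float64) {
	return "large: " + q, 0.95
}}

func main() {
	models := []Model{small, large}
	for _, q := range []string{"hi there", "summarise the quarterly report in detail"} {
		_, used := Cascade(models, q, 0.8)
		fmt.Printf("%q handled by %s\n", q, used)
	}
}
```

The threshold is the tuning knob: set it too low and hard queries get weak answers, too high and everything escalates.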
March 28, 2026 · 2 min read
Not every question needs retrieval. A classifier gates RAG: chat or general knowledge questions skip it; factual or document-grounded questions trigger it. Saves latency and tokens on the simple half of queries.
Ardan Labs · Go · RAG · Cost Optimisation
March 27, 2026 · 2 min read
Exact-match caching misses paraphrases. "What is the refund policy?" and "How do refunds work?" should both hit the same cached answer. Semantic cache embeds queries and matches by similarity.
Ardan Labs · Go · Caching · Cost Optimisation
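A minimal sketch of the lookup, assuming queries have already been turned into embedding vectors. `SemanticCache`, its methods, and the three-number toy vectors are hypothetical; a real cache would call an embedding model and probably use a vector index instead of a linear scan.

```go
package main

import (
	"fmt"
	"math"
)

type entry struct {
	vec    []float64
	answer string
}

// SemanticCache stores answers keyed by query embedding (hypothetical type).
type SemanticCache struct {
	threshold float64 // minimum cosine similarity that counts as a hit
	entries   []entry
}

func NewSemanticCache(threshold float64) *SemanticCache {
	return &SemanticCache{threshold: threshold}
}

// cosine returns the cosine similarity of a and b, or 0 if it is undefined.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Put records the answer for a query embedding.
func (c *SemanticCache) Put(vec []float64, answer string) {
	c.entries = append(c.entries, entry{vec: vec, answer: answer})
}

// Get returns the answer of the most similar stored query, provided its
// similarity reaches the threshold. It scans every entry.
func (c *SemanticCache) Get(vec []float64) (string, bool) {
	best, bestScore, found := "", c.threshold, false
	for _, e := range c.entries {
		if s := cosine(vec, e.vec); s >= bestScore {
			best, bestScore, found = e.answer, s, true
		}
	}
	return best, found
}

func main() {
	cache := NewSemanticCache(0.9)
	// Made-up embedding for "What is the refund policy?"
	cache.Put([]float64{1, 0.1, 0}, "Refunds are accepted within 30 days.")

	// Made-up embedding for the paraphrase "How do refunds work?"
	answer, hit := cache.Get([]float64{0.9, 0.2, 0})
	fmt.Println(hit, answer)
}
```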
March 26, 2026 · 2 min read
Run a small draft model to propose several tokens ahead; verify them in a single pass with the large model. Latency drops without quality dropping. The technique production LLM serving uses but most application engineers don't see.
Ardan Labs · Go · LLM Ops · Performance
March 25, 2026 · 2 min read
A long chat reprocesses the entire history on every turn. Prefix caching lets the LLM reuse the KV cache for the prefix it already processed on the previous turn and only compute the new suffix. Massive latency win on long conversations.
Ardan Labs · Go · LLM Ops · Performance
March 24, 2026 · 2 min read
Model Context Protocol standardises tool calling across LLMs. The example builds both sides: an MCP server exposing tools, and an agent that calls them. Works the same against any MCP-compatible LLM.
Ardan Labs · Go · MCP · Agents
March 23, 2026 · 2 min read
A panicking tool kills the agent loop. A slow tool blocks the loop forever. The example shows the boring-but-essential wrappers: recover, deadlines, structured errors.
Ardan Labs · Go · Agents · Reliability
March 22, 2026 · 2 min read
Give an LLM a SQL tool, watch it write `DELETE` statements. The read-only version: parse the generated SQL, refuse anything that isn't `SELECT`, validate against an allow-listed schema, run with a strict timeout.
Ardan Labs · Go · SQL · Agents · Security
March 21, 2026 · 2 min read
Stream the agent's reasoning and tool calls to the UI as they happen. The user sees "thinking about X, calling tool Y, got result Z, now answering..." — dramatically better UX than waiting for the final answer.
Ardan Labs · Go · Agents · Streaming · UX
March 20, 2026 · 2 min read
The smallest possible multi-tool agent. The loop is 30 lines of Go and shows exactly what an "agent" is — there's no magic, just a structured back-and-forth between the LLM and a set of tools until the model says stop.
Ardan Labs · Go · Agents
March 19, 2026 · 2 min read
The tool-calling dance: the LLM emits a structured tool call → application runs the tool → application appends the result → the LLM uses it in the next turn. Two phases. Everything else is detail.
Ardan Labs · Go · Tool Calling · LLM
March 18, 2026 · 2 min read
A simple RAG pipeline embeds documents one at a time. The performant version batches the embeddings, parallelises the chunks, and caches the responses. Throughput goes up 5-10×.
Ardan Labs · Go · RAG · Performance
March 17, 2026 · 2 min read
Tie all the RAG pieces together into one interactive REPL. Type a question, see the retrieval, see the answer, ask follow-ups. The shape of every "chat with your docs" demo.
Ardan Labs · Go · RAG · REPL
March 16, 2026 · 2 min read
When RAG gives wrong answers, the problem is usually retrieval, not the LLM. The example isolates the retrieval step so you can see exactly what chunks come back for a given query, with what scores, and tune K and the similarity threshold accordingly.
Ardan Labs · Go · RAG · Debugging
March 15, 2026 · 2 min read
Ingest → embed → store → retrieve → answer. The full pipeline applied to Bill Kennedy's Go notebook. The result: a system that answers "how do channels work?" with quotes from the source material.
Ardan Labs · Go · RAG · Pipeline
March 14, 2026 · 2 min read
The ingestion step that turns a corpus into a vector database. Chunk the source, embed each chunk, store with metadata. The pre-work without which RAG is impossible.
Ardan Labs · Go · RAG · Ingestion · pgvector
March 13, 2026 · 2 min read
pgvector adds vector similarity to Postgres. The example shows the schema, the indexes, the query, and what an ANN index buys you over a brute-force scan.
Ardan Labs · Go · pgvector · PostgreSQL · RAG
March 12, 2026 · 2 min read
Side-by-side comparison: ask the LLM a domain question with no context, then ask with retrieved context. The without-RAG answer is plausible nonsense. The with-RAG answer is correct. The example that motivates everything else in the course.
Ardan Labs · Go · RAG · Foundations
March 11, 2026 · 2 min read
Token-by-token streaming over Server-Sent Events. The Go HTTP handler is short; the UX win is huge. The pattern every chat app needs.
Ardan Labs · Go · Streaming · SSE · LLM
March 10, 2026 · 2 min read
Before RAG and tools, the original way to give an LLM extra information was to inject it into the prompt. The example shows the right way to format injected context and what the LLM does (and doesn't) pay attention to.
Ardan Labs · Go · Prompting · LLM
March 9, 2026 · 2 min read
Hand-crafting vectors stops scaling at about 10 dimensions. LLM-generated embeddings give you a 1024-dim vector that captures semantic meaning. The example shows how to generate them and what they're good for.
Ardan Labs · Go · Embeddings · Foundations
March 8, 2026 · 2 min read
The foundation. Build vectors by hand for a few words, compute cosine similarity, see why "cat" and "dog" come out closer than "cat" and "car." Demystifies everything that comes after.
Ardan Labs · Go · Vectors · Foundations