RAG — Blog — Pratik Dhanave

Apr 08, 2026 · Engineering

Ardan Ultimate AI #32 — Embedded React chat over RAG (Go backend + bundled UI)

A complete chat application: Go backend with RAG, React frontend, single binary. Showed me how to ship a full-stack AI demo without a separate frontend deployment.

Ardan Labs · Go · React · RAG · AI

Apr 06, 2026 · Engineering

Ardan Ultimate AI #30 — PDF extraction with Docling + LLM

PDFs are the format that breaks every RAG pipeline. Docling is the extractor from IBM Research that handles layout, tables, and figures. The example wires Docling and an LLM together to make PDFs usable.

Ardan Labs · Go · RAG · PDF · Docling

Apr 05, 2026 · Engineering

Ardan Ultimate AI #29 — Chat over transcribed video chunks

Transcribe a video, chunk by timestamp, embed each chunk, RAG-style chat over the result. The shape that powers "ask questions about this meeting recording."

Ardan Labs · Go · RAG · Video · Whisper

Apr 04, 2026 · Engineering

Ardan Ultimate AI #28 — Image search via a vision model + pgvector

Generate a text description of an image with a vision LLM, embed the description, store in pgvector. Search becomes "find images that match this query" — works surprisingly well.

Ardan Labs · Go · RAG · Vision · pgvector

Apr 01, 2026 · Engineering

Ardan Ultimate AI #25 — Poisoned-document attacks on RAG and defenses

A RAG pipeline that ingests user-supplied documents is a prompt-injection vector. An attacker uploads a document with hidden instructions; the pipeline retrieves it and the LLM follows them. Defenses: input filtering, content classification, output verification.

Ardan Labs · Go · Security · RAG

Mar 28, 2026 · Engineering

Ardan Ultimate AI #21 — Adaptive retrieval (decide whether to RAG at all)

Not every question needs retrieval. A classifier gates RAG: conversational or general-knowledge questions skip it; factual or document-grounded questions trigger it. That saves latency and tokens on the simple half of queries.

Ardan Labs · Go · RAG · Cost Optimisation

Mar 18, 2026 · Engineering

Ardan Ultimate AI #11 — RAG performance: parallel and batched embeddings, response cache

A simple RAG pipeline embeds documents one at a time. The performant version batches the embeddings, parallelises the chunks, and caches the responses. Throughput goes up 5-10×.

Ardan Labs · Go · RAG · Performance

Mar 17, 2026 · Engineering

Ardan Ultimate AI #10 — Interactive RAG REPL end-to-end

Tie all the RAG pieces together into one interactive REPL. Type a question, see the retrieval, see the answer, ask follow-ups. The shape of every "chat with your docs" demo.

Ardan Labs · Go · RAG · REPL

Mar 16, 2026 · Engineering

Ardan Ultimate AI #09 — Debugging retrieval in isolation (K and threshold)

When RAG gives wrong answers, the problem is usually retrieval, not the LLM. The example isolates the retrieval step so you can see exactly which chunks come back for a given query and with what scores, then tune K and the similarity threshold accordingly.

Ardan Labs · Go · RAG · Debugging

Mar 15, 2026 · Engineering

Ardan Ultimate AI #08 — End-to-end RAG pipeline over a Go notebook

Ingest → embed → store → retrieve → answer. The full pipeline applied to Bill Kennedy's Go notebook. The result: a system that answers "how do channels work?" with quotes from the source material.

Ardan Labs · Go · RAG · Pipeline

Mar 14, 2026 · Engineering

Ardan Ultimate AI #07 — Ingesting a Go notebook into pgvector

The ingestion step that turns a corpus into a vector database. Chunk the source, embed each chunk, store with metadata. The pre-work without which RAG is impossible.

Ardan Labs · Go · RAG · Ingestion · pgvector

Mar 13, 2026 · Engineering

Ardan Ultimate AI #06 — pgvector nearest-neighbour search

pgvector adds vector similarity to Postgres. The example shows the schema, the indexes, the query, and what an ANN index buys you over a brute-force scan.

Ardan Labs · Go · pgvector · PostgreSQL · RAG

Mar 12, 2026 · Engineering

Ardan Ultimate AI #05 — The same question with and without RAG

Side-by-side comparison: ask the LLM a domain question with no context, then ask with retrieved context. The without-RAG answer is plausible nonsense. The with-RAG answer is correct. The example that motivates everything else in the course.

Ardan Labs · Go · RAG · Foundations

Feb 25, 2026 · Engineering

GraphRAG — when a knowledge graph beats vector search

Vector search treats every chunk as independent. GraphRAG models the relationships between entities, communities, and concepts. For corpus-spanning questions ("what's the relationship between X and Y"), graph wins.

GraphRAG · RAG · Knowledge Graph

Feb 23, 2026 · Engineering

HyDE — generate a hypothetical answer to improve retrieval

Embedding a question and embedding an answer often produce different vectors. HyDE generates a hypothetical answer to the question, embeds *that*, and retrieves on it. Retrieval quality goes up disproportionately.

RAG · HyDE · Retrieval

Feb 22, 2026 · Engineering

Self-RAG and CRAG — when to retrieve, when to skip, when to correct

Naive RAG retrieves on every query. Self-RAG decides whether to retrieve. CRAG decides whether the retrieved content is good enough or needs corrective retrieval. Two papers; both worth implementing.

RAG · Self-RAG · CRAG · Retrieval

Feb 21, 2026 · Engineering

Multilingual RAG for India — Bhashini hooks and cross-lingual retrieval

An Indian banking deployment needs to handle Hindi, Marathi, Tamil, Bengali, and English in the same retrieval pipeline. Bhashini (the Indian government's language stack) plus cross-lingual embeddings make it tractable.

RAG · Multilingual · Bhashini · Indic Languages

#RAG