Ardan Ultimate AI #08 — End-to-end RAG pipeline over a Go notebook

Field notes from working through example 08 of Ardan Labs’ Ultimate AI course by Bill Kennedy and Florin Pățan (Apache 2.0). My fork: PratikDhanave/ai-training. Thank you Bill and Florin for teaching this material — the patterns in this post are derived from the course; the production reflections at the end are mine.

What the example teaches

The full RAG pipeline, stitched together against a real document (the Ardan Go notebook). All the pieces from previous examples now run end-to-end:

Chunk the notebook by section.
Embed each chunk.
Insert into pgvector with metadata (page number, section title).
At query time: embed the query, run a nearest-K search, build a prompt with citations, and have the LLM answer.
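To make the retrieval step concrete, here is a minimal in-memory sketch of nearest-K search by cosine similarity. It is a stand-in for what pgvector does inside the database, not the course's code: the `Chunk` type, its fields, and the `nearestK` and `cosine` helpers are names I made up for illustration.

```go
package main

import (
	"math"
	"sort"
)

// Chunk is a hypothetical record mirroring what the pipeline stores:
// the chunk text, its embedding, and the metadata used later for citations.
type Chunk struct {
	Section   string
	Page      int
	Text      string
	Embedding []float64
}

// cosine returns the cosine similarity of two vectors. It returns 0 when
// the lengths differ or either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearestK returns up to k chunks, ordered from most to least similar
// to the query embedding.
func nearestK(chunks []Chunk, query []float64, k int) []Chunk {
	type scored struct {
		chunk Chunk
		score float64
	}
	ranked := make([]scored, 0, len(chunks))
	for _, c := range chunks {
		ranked = append(ranked, scored{chunk: c, score: cosine(c.Embedding, query)})
	}
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].score > ranked[j].score
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	out := make([]Chunk, 0, k)
	for _, r := range ranked[:k] {
		out = append(out, r.chunk)
	}
	return out
}
```

A linear scan like this is only reasonable for a few thousand chunks. In the real pipeline the ordering and the limit happen in the database query, so the application never loads every embedding.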

What it looks like

The orchestration is straightforward once the pieces work:

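// answer runs the query-time half of the pipeline:
// embed the query, retrieve chunks, build the prompt, generate.
func answer(ctx context.Context, query string) (Answer, error) {
    queryEmb, err := embed.Generate(ctx, query)
    if err != nil {
        return Answer{}, err
    }

    // Retrieve the 5 nearest chunks. NearestK is assumed to return an
    // error alongside the hits, as a database call normally would.
    hits, err := db.NearestK(ctx, queryEmb, 5)
    if err != nil {
        return Answer{}, err
    }

    // Named promptContext so it does not shadow the context package.
    promptContext := buildContext(hits)

    prompt := fmt.Sprintf(answerPromptTemplate, query, promptContext)
    resp, err := llm.Generate(ctx, prompt)
    if err != nil {
        return Answer{}, err
    }

    return Answer{
        Text:      resp.Text,
        Citations: extractCitations(resp.Text, hits),
    }, nil
}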

What I learned

The plumbing is straightforward; the prompt is where the work is. A bad prompt template makes the LLM ignore the context. A good one makes it cite specifically. Iterate on the prompt with a small evaluation set before scaling.
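As an illustration of what a citation-friendly template can look like, here is a small sketch. The template wording, the `Hit` type, and the `buildPrompt` helper are my own assumptions, not the course's actual prompt. The idea is to number each retrieved chunk, tell the model to cite by that number, and give it an explicit way out when the sources do not cover the question.

```go
package main

import (
	"fmt"
	"strings"
)

// Hit is a hypothetical retrieved chunk with the metadata stored at insert time.
type Hit struct {
	Section string
	Page    int
	Text    string
}

// answerPromptTemplate is an illustrative template, not the one from the course.
// The first %s is the user question, the second is the numbered source list.
const answerPromptTemplate = `Answer the question using only the sources below.
Cite every claim with its source tag, for example [source 2].
If the sources do not contain the answer, say that you do not know.

Question: %s

Sources:
%s`

// buildContext renders the hits as numbered sources, starting at 1, so the
// tags the model emits can be mapped back to positions in the hits slice.
func buildContext(hits []Hit) string {
	var b strings.Builder
	for i, h := range hits {
		fmt.Fprintf(&b, "[source %d] (%s, page %d)\n%s\n\n", i+1, h.Section, h.Page, h.Text)
	}
	return strings.TrimSpace(b.String())
}

// buildPrompt fills the template with the question and the rendered sources.
func buildPrompt(query string, hits []Hit) string {
	return fmt.Sprintf(answerPromptTemplate, query, buildContext(hits))
}
```

Keeping the numbering in one place (`buildContext`) matters because the citation step depends on the same 1-based order.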

Citation extraction is its own problem. The LLM produces text that references “[source 2]”; the application has to map “[source 2]” back to the original chunk to render the citation as a link. The example shows a robust regex-based extractor.
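For a feel of the mapping step, here is a minimal regex-based sketch. It is not the extractor from the example: the `Hit` and `Citation` types are hypothetical, and it assumes the model emits tags in exactly the form `[source N]`, with N being the 1-based position of the chunk in the retrieved hits.

```go
package main

import (
	"regexp"
	"strconv"
)

// Hit is a hypothetical retrieved chunk with its stored metadata.
type Hit struct {
	Section string
	Page    int
	Text    string
}

// Citation links a source tag in the answer back to the chunk it refers to.
type Citation struct {
	Source  int // 1-based index into the hits used to build the prompt
	Section string
	Page    int
}

// sourceTag matches tags of the assumed form "[source N]".
var sourceTag = regexp.MustCompile(`\[source (\d+)\]`)

// extractCitations returns one Citation per distinct, in-range source tag,
// in order of first appearance. Tags that point outside hits are skipped,
// which guards against the model inventing a source number.
func extractCitations(text string, hits []Hit) []Citation {
	seen := make(map[int]bool)
	var out []Citation
	for _, m := range sourceTag.FindAllStringSubmatch(text, -1) {
		n, err := strconv.Atoi(m[1])
		if err != nil || n < 1 || n > len(hits) || seen[n] {
			continue
		}
		seen[n] = true
		h := hits[n-1]
		out = append(out, Citation{Source: n, Section: h.Section, Page: h.Page})
	}
	return out
}
```

A stricter version would also cope with variants such as "[Source 2]" or "[sources 1, 3]", which is part of why this is a problem in its own right.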

Production connection

This is the shape of Genie’s RAG for the document Q&A endpoint. Same chunking, same embedding model, same prompt template. Direct lift.

Credit & reference. This post is field notes on example 08 from Ardan Labs’ Ultimate AI by Bill Kennedy + Florin Pățan, licensed Apache 2.0. The original example: cmd/examples/example08-rag-pipeline/. My fork with notes: PratikDhanave/ai-training. Highly recommend the course for anyone building AI applications in Go — the material is rigorous and the Kronk + yzma + llama.cpp pipeline gives you hardware-accelerated local inference end-to-end. Thank you, Bill and Florin.