Ardan Labs Go RAG Vision pgvector

Field notes from working through example 28 of Ardan Labs’ Ultimate AI course by Bill Kennedy and Florin Pățan (Apache 2.0). My fork: PratikDhanave/ai-training. Thank you Bill and Florin for teaching this material — the patterns in this post are derived from the course; the production reflections at the end are mine.

What the example teaches

Image search without training a custom model. For each image: ask a vision LLM to describe it (5-10 sentences), embed the description, and store the description and its embedding in pgvector. At query time: embed the search text and run a nearest-neighbour search over the stored description embeddings.
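
To make the query step concrete, here is a minimal in-memory sketch of nearest-neighbour search by cosine similarity. It is not the course's code: the Record type, the cosine and nearest helpers, and the toy 3-dimensional vectors are all assumptions for illustration. In the real pipeline pgvector does this ranking inside Postgres.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Record is a hypothetical stored row: an image ID plus the embedding
// of its generated description.
type Record struct {
	ID        string
	Embedding []float64
}

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 if the lengths differ or either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest returns the IDs of the k records most similar to query,
// best match first.
func nearest(records []Record, query []float64, k int) []string {
	type scored struct {
		id    string
		score float64
	}
	all := make([]scored, 0, len(records))
	for _, r := range records {
		all = append(all, scored{r.ID, cosine(r.Embedding, query)})
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].score > all[j].score })
	if k < 0 {
		k = 0
	}
	if k > len(all) {
		k = len(all)
	}
	ids := make([]string, 0, k)
	for _, s := range all[:k] {
		ids = append(ids, s.id)
	}
	return ids
}

func main() {
	// Toy vectors standing in for real description embeddings.
	records := []Record{
		{"beach.jpg", []float64{0.9, 0.1, 0.0}},
		{"car.jpg", []float64{0.1, 0.9, 0.2}},
		{"forest.jpg", []float64{0.0, 0.2, 0.9}},
	}
	query := []float64{0.2, 0.8, 0.1} // pretend embedding of the search text
	fmt.Println(nearest(records, query, 2))
}
```

A linear scan like this is fine for a few thousand images; past that, an indexed store such as pgvector is the reason the example uses a database.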

What it looks like

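// Ingest: describe each image, embed the description, store both.
// Pseudocode: error handling omitted for brevity.
for _, img := range images {
    description := vision.Describe(ctx, img,
        "Describe this image in 5-10 sentences. Include objects, "+
            "colors, mood, and any text visible.")
    emb := embed.Generate(description)
    db.Insert(img.ID, img.URL, description, emb)
}

// Query: embed the search text, then fetch the 10 nearest descriptions.
hits := db.Nearest(embed.Generate("a red car parked at sunset"), 10)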
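
For the pgvector side, the nearest-neighbour call can be sketched as a plain SQL query plus a helper that renders an embedding in pgvector's text format. This is a sketch under assumptions, not the example's actual code: the images table, its column names, and the vectorLiteral helper are hypothetical. The `<=>` operator is pgvector's cosine distance, so ascending order returns the closest rows first.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// searchSQL assumes a hypothetical table with columns
// (id text, url text, description text, embedding vector),
// where the vector dimension matches the embedding model.
// <=> is pgvector's cosine-distance operator: smaller means closer.
const searchSQL = `
SELECT id, url, description
FROM images
ORDER BY embedding <=> $1::vector
LIMIT $2`

// vectorLiteral renders an embedding in pgvector's text form,
// for example [0.25,-0.5,1].
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

func main() {
	emb := []float32{0.25, -0.5, 1}
	fmt.Println(vectorLiteral(emb))
	fmt.Println(searchSQL)
	// With database/sql and a Postgres driver, the call would be roughly:
	//   rows, err := db.QueryContext(ctx, searchSQL, vectorLiteral(emb), 10)
}
```

Passing the vector as text and casting with `::vector` keeps the sketch driver-agnostic; a pgvector client library for Go would let you pass the slice directly instead.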

What I learned

Description quality is the whole game. A vision LLM that describes “a man in a hat” loses to one that describes “a Black man in his 30s wearing a charcoal fedora, mid-laugh, against a brick wall under late-afternoon light.” The prompt for the description step is more important than the embedding model.
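
One way to act on this is to make the description prompt explicit about what to cover. The sketch below is my own illustration, not the course's prompt: describePrompt and the aspect list are hypothetical, and the right wording will depend on the vision model and the archive.

```go
package main

import (
	"fmt"
	"strings"
)

// describePrompt builds the instruction sent to the vision model.
// The wording and aspect list are a hypothetical starting point.
func describePrompt(minSentences, maxSentences int, aspects []string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Describe this image in %d-%d sentences. ", minSentences, maxSentences)
	b.WriteString("Be specific rather than generic. Cover: ")
	b.WriteString(strings.Join(aspects, "; "))
	b.WriteString(". Do not guess at details you cannot see.")
	return b.String()
}

func main() {
	aspects := []string{
		"people and what they are doing",
		"objects and their colors",
		"setting and lighting",
		"mood",
		"any visible text",
	}
	fmt.Println(describePrompt(5, 10, aspects))
}
```

Because the description is what gets embedded, every detail the prompt draws out becomes something a search query can match on.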

This beats CLIP for many use cases. CLIP needs a dual encoder that maps images and text into one shared embedding space; this approach gets you roughly 80% of the value with an off-the-shelf vision model plus a text embedding model. The remaining 20% matters at scale; it doesn’t matter for a prototype.

Production connection

For a media archive search use case, I’d reach for this first. The Docling work (post #30) is the document analogue; this is the image one. Combine them and you have a “search across everything” pipeline that doesn’t need bespoke training.


Credit & reference. This post is field notes on example 28 from Ardan Labs’ Ultimate AI by Bill Kennedy + Florin Pățan, licensed Apache 2.0. The original example: cmd/examples/example28-image-vision-rag/. My fork with notes: PratikDhanave/ai-training. Highly recommend the course for anyone building AI applications in Go — the material is rigorous and the Kronk + yzma + llama.cpp pipeline gives you hardware-accelerated local inference end-to-end. Thank you, Bill and Florin.
