You built a RAG pipeline. Now you need to decide where to store embeddings. Three options dominate: pgvector (self-hosted on PostgreSQL), AlloyDB AI (Google Cloud managed), and Pinecone (fully managed SaaS). Each makes a different trade-off between cost, latency, and operational burden.
| See also: BigQuery analytics | Cloud Spanner design | Voice AI embeddings | Healthcare AI | RAG patterns |
The decision is nuanced, and it depends on whether you want to own the infrastructure or rent it.
| Aspect | pgvector | AlloyDB AI | Pinecone |
|---|---|---|---|
| Setup time | 1 week | 1 day | 1 hour |
| Query latency | 50-100ms | 20-50ms | 10-20ms |
| Cost @ 1M vectors | $800/mo | $2K/mo | $3K/mo |
| Operational overhead | High (you manage) | Medium (Google manages) | None (Pinecone manages) |
| Data residency control | Full | Full | Limited |
| Metadata filtering | Native SQL | Native SQL | Limited |
You run PostgreSQL yourself. pgvector is a Postgres extension that adds vector search.
-- Enable the extension
CREATE EXTENSION vector;
-- Create embeddings table
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT,
embedding vector(1536), -- OpenAI embeddings
metadata JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- Create index for fast search
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
Query a vector:
SELECT id, content,
1 - (embedding <=> '[0.1, 0.2, ...]') AS similarity
FROM documents
WHERE metadata->>'category' = 'finance'
ORDER BY embedding <=> '[0.1, 0.2, ...]'
LIMIT 10;
Strengths:
Weaknesses:
Google’s managed PostgreSQL offering with built-in vector search. Same pgvector under the hood, but Google handles ops.
from google.cloud.sql.connector import Connector
connector = Connector()
async def query_documents(embedding: list[float]) -> list[dict]:
async with connector.connect(
"projects/my-project/instances/alloydb-instance",
driver="asyncpg",
user="postgres",
db="vectors_db",
) as conn:
results = await conn.fetch("""
SELECT id, content,
1 - (embedding <=> $1) AS similarity
FROM documents
WHERE metadata->>'category' = 'finance'
ORDER BY embedding <=> $1
LIMIT 10
""", embedding)
return results
Strengths:
Weaknesses:
Pinecone is purpose-built for vectors. You don’t touch infrastructure at all.
from pinecone import Pinecone
pc = Pinecone(api_key="...")
index = pc.Index("documents")
# Upsert vectors
index.upsert(vectors=[
("doc_1", [0.1, 0.2, ...], {"category": "finance"}),
("doc_2", [0.3, 0.4, ...], {"category": "health"}),
])
# Query
results = index.query(
vector=[0.1, 0.2, ...],
top_k=10,
filter={"category": "finance"},
)
# Returns: [("doc_1", 0.95), ("doc_3", 0.87), ...]
Strengths:
Weaknesses:
Use pgvector if:
Use AlloyDB AI if:
Use Pinecone if:
Bancnet (Open Banking) needed:
Decision: pgvector on self-managed PostgreSQL in Azure.
Results:
The trade-off: We own the HA failover, backups, and scaling. But the 37% latency gain and cost savings justified it.
Tags: #VectorDatabases #RAG #Embeddings #Postgres #Performance
Published: June 2026
Author: Pratik Dhanave
Related Projects: Bancnet RAG, Bodh embeddings