Cost Optimisation — Blog

May 18, 2026 · Engineering

The MERGE pattern that cost ten times more than INSERT-then-UPDATE — a ₹100 Cr lesson

What looked like an idiomatic BigQuery MERGE was scanning the full target table on every batch. The fix was syntactic, not architectural — and it was the single biggest contributor to a 57% data-warehouse cost reduction across the Tata Group engagement.

BigQuery · FinOps · SQL · Cost Optimisation

May 17, 2026 · Engineering

The 57% number — how we cut the Tata Group BigQuery bill in half

₹100 Cr / ~$12M in proven savings across a year-plus engagement. The four levers that did the heavy lifting, the lever I expected to win that didn't, and the post-engagement playbook that became a Searce managed service.

BigQuery · FinOps · GCP · Tata Group · Cost Optimisation

Mar 29, 2026 · Engineering

Ardan Ultimate AI #22 — Cascading model router (cheap first, expensive on miss)

Most queries are simple. A cascading router tries a small/fast/cheap model first; if confidence is low or the task is hard, it escalates to a larger one. Costs collapse without hurting quality.

Ardan Labs · Go · LLM Ops · Cost Optimisation

Mar 28, 2026 · Engineering

Ardan Ultimate AI #21 — Adaptive retrieval (decide whether to RAG at all)

Not every question needs retrieval. A classifier gates RAG: chat or general knowledge questions skip it; factual or document-grounded questions trigger it. Saves latency and tokens on the simple half of queries.

Ardan Labs · Go · RAG · Cost Optimisation

Mar 27, 2026 · Engineering

Ardan Ultimate AI #20 — Embedding-based semantic cache

Exact-match caching misses paraphrases. "What is the refund policy?" and "How do refunds work?" should both hit the same cached answer. Semantic cache embeds queries and matches by similarity.

Ardan Labs · Go · Caching · Cost Optimisation

Feb 17, 2026 · Engineering

Cost-aware agent dispatch — when the cheap agent is enough

Not every query needs the production agent. A cost-aware dispatcher decides whether to route to the cheap-and-fast agent or the expensive-and-thorough one. Same UX, dramatically lower bill.

Agents · Cost Optimisation · LLM Ops

Feb 09, 2026 · Engineering

Egress costs — the gotcha that kills cloud-arbitrage plans

Cross-cloud data movement is billed by the GB. The bill is invisible until it isn't. A multi-region or multi-cloud architecture that doesn't model egress costs in design will discover them in production.

Multi-Cloud · Cost Optimisation · Networking
