Optimus — the anti-pattern detector

A small internal tool that became the most-used part of the Searce FinOps practice. Here is what it does, why it works, and what it deliberately does not do.

The problem it solves

A FinOps review usually starts with a CSV of the top 200 expensive queries. Sorting them by cost is easy. Knowing what’s wrong with each is the slow part. Most queries are not pathological in obvious ways. They join the right tables. They use partitions. They just touch a lot of data because the business asks for a lot of data.

The 20 minutes per query goes into reading the query, opening the schema, checking partitioning, looking at the execution plan, imagining a rewrite, mocking it up, comparing. For 200 queries that’s 65+ hours of senior-engineer time. We can do better.

The shape

Optimus is two pieces:

┌──────────────────────────┐      ┌────────────────────────────┐
│  collector (Go)          │      │  recommender (Python)      │
│                          │      │                            │
│ - reads INFORMATION_     │ ───► │ - groups by query template │
│   SCHEMA over the last   │      │ - asks Gemini to classify  │
│   30 days                │      │   each template against a  │
│ - keeps top 500 by       │      │   catalog of anti-patterns │
│   bytes_billed           │      │ - emits a markdown report  │
│ - normalises to a query  │      │   per query with the       │
│   template               │      │   recommended rewrite      │
└──────────────────────────┘      └────────────────────────────┘

The collector is Go because it talks to BigQuery’s INFORMATION_SCHEMA, dedupes templates, and writes a compact JSONL. The recommender is Python because it’s the side that orchestrates Gemini calls and renders the report.

The catalog of anti-patterns

Gemini classifies against an enumerated catalog. The catalog isn’t “things Gemini might find”; it’s the explicit list we curated. The prompt looks like:

You are a BigQuery cost reviewer. Classify the query below against this
list of anti-patterns. Return JSON with the matching pattern id,
confidence (0-1), and a one-line rewrite hint.

Patterns:
  P1: SELECT * on a wide table when only N columns used
  P2: MERGE without partition predicate on target side of ON
  P3: Cross join via SELECT FROM a, b instead of explicit JOIN
  P4: GROUP BY ROLLUP / CUBE on high-cardinality dimension
  P5: Window function without PARTITION BY on a partitioned column
  P6: Subquery in WHERE that scans the same table twice
  P7: TIMESTAMP_TRUNC in JOIN key (forces re-shuffle)
  P8: Wildcard table _TABLE_SUFFIX filter applied after join
  ... (20 in total)

Closed catalog beats open prompt for three reasons: rates are stable across runs, output is parseable JSON, and engineers can audit the catalog independently of the LLM.

What Gemini is good at, and what it isn’t

Good at: pattern-matching the catalog against unfamiliar query shapes, and writing the one-line rewrite hint in plain English. The hit rate is around 80% on the patterns we encoded; the misses are mostly edge cases the catalog should grow to cover.

Not good at: deciding whether the rewrite is worth doing. That needs the cost number, the read frequency, the downstream impact. Optimus shows the cost-per-month next to the rewrite suggestion; the engineer makes the call.

Where the 57% number comes from

For one large engagement (Tata Group), Optimus surfaced 73 high-confidence anti-patterns across 41 templates. The team prioritised by cost_per_month × confidence and worked through the top 20 in two sprints. Those 20 accounted for ~38% of the engagement’s eventual 57% reduction; the remaining 19% came from reservation transitions, storage tier changes, and dashboard pre-aggregation that Optimus didn’t touch.

Why we kept it small

Tempting features we deliberately did not build:

Auto-apply the rewrite. No. The cost of the wrong rewrite hitting production is higher than the engineering time saved.
Slack bot that pings you when your team’s bill spikes. Useful but separate; the FinOps team already had a Looker dashboard for this.
A web UI. Markdown reports rendered in GitHub are fine. The audience is engineers; the audience reads markdown.

The tool stayed at ~600 lines of Go and ~400 of Python because the problem stayed small. It runs nightly in a Cloud Run job, costs ~$3/day in BigQuery scans and Gemini tokens, and ships a fresh report PR every Monday morning.

What’s next

Two extensions the team has talked about:

Storage anti-patterns. Same pipeline, but classifying tables by access pattern (partition pruning rate, average rows scanned per query, age of last write) and recommending tier moves.
Materialised view candidates. Detecting query shapes that would benefit from a MV that doesn’t yet exist.

Both follow the same shape: surface the data, ask Gemini to classify, render a markdown report, let the engineer decide.

The lesson, if there is one: an LLM is the right tool for the classification step. It is not the right tool for the apply step, the prioritise step, or the audit step. Keep it small, keep it catalog-driven, and the bill it generates is dwarfed by the bill it eliminates.