Ardan Ultimate AI #31 — A coding agent with file tools
Cursor / Claude Code in 600 lines of Go. The agent has read/write/search tools over a project directory and a loop that lets it iterate on its own work.
Posts about agents.
An agent whose tools can themselves call tools can drift indefinitely. The escalation budget caps depth and cost; the audit trail records every step so you can replay what the agent did.
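A minimal sketch of the budget-plus-audit idea, assuming cost is tracked in cents and depth is passed in by the caller. The names `Budget`, `AuditEntry`, and `Authorize` are hypothetical, chosen for illustration rather than taken from the post's code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBudgetExceeded is returned when a call would break the depth or cost cap.
var ErrBudgetExceeded = errors.New("escalation budget exceeded")

// AuditEntry records one attempted tool call, whether it was allowed or not.
type AuditEntry struct {
	Depth     int
	Tool      string
	CostCents int
	Allowed   bool
}

// Budget caps nesting depth and total spend (hypothetical type).
// It is not safe for concurrent use; add a mutex if tools run in parallel.
type Budget struct {
	MaxDepth     int
	MaxCostCents int
	spentCents   int
	Trail        []AuditEntry
}

// Authorize logs the attempt first, then charges the budget if the call fits.
func (b *Budget) Authorize(depth int, tool string, costCents int) error {
	ok := depth <= b.MaxDepth && b.spentCents+costCents <= b.MaxCostCents
	b.Trail = append(b.Trail, AuditEntry{Depth: depth, Tool: tool, CostCents: costCents, Allowed: ok})
	if !ok {
		return ErrBudgetExceeded
	}
	b.spentCents += costCents
	return nil
}

func main() {
	b := &Budget{MaxDepth: 2, MaxCostCents: 10}
	fmt.Println(b.Authorize(1, "search", 4)) // within both caps
	fmt.Println(b.Authorize(3, "search", 1)) // deeper than MaxDepth
	for _, e := range b.Trail {
		fmt.Printf("%+v\n", e) // denied attempts are in the trail too
	}
}
```

Recording denied attempts as well as allowed ones keeps the trail useful for replay: you can see where the agent tried to go, not only where it got.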
Giving an LLM a `run_command` tool is convenient and terrifying. The hardened version: allow-listed binaries, argument scrubbing, RBAC per user, audit per invocation.
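As a rough sketch of that hardening, assuming a role-based policy table: the `permissions` map, the role names, and the function names below are all hypothetical. Because `exec.CommandContext` runs the binary directly without a shell, the argument check is defence in depth rather than the only barrier.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os/exec"
	"strings"
	"time"
)

// permissions maps a role to the binaries it may run (hypothetical policy).
var permissions = map[string]map[string]bool{
	"viewer": {"ls": true, "cat": true},
	"dev":    {"ls": true, "cat": true, "grep": true, "go": true},
}

// Validate enforces the per-role allow-list and rejects suspicious arguments.
func Validate(role, bin string, args []string) error {
	if !permissions[role][bin] {
		return fmt.Errorf("role %q may not run %q", role, bin)
	}
	for _, a := range args {
		if strings.ContainsAny(a, ";|&$`<>\n") || strings.Contains(a, "..") {
			return fmt.Errorf("argument %q rejected", a)
		}
	}
	return nil
}

// RunCommand validates, runs with a deadline, and writes one audit line per call.
func RunCommand(ctx context.Context, user, role, bin string, args []string) (string, error) {
	if err := Validate(role, bin, args); err != nil {
		log.Printf("audit user=%s bin=%s args=%q denied: %v", user, bin, args, err)
		return "", err
	}
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	// The binary is executed directly; no shell interprets the arguments.
	out, err := exec.CommandContext(ctx, bin, args...).CombinedOutput()
	log.Printf("audit user=%s bin=%s args=%q err=%v", user, bin, args, err)
	return string(out), err
}

func main() {
	fmt.Println(Validate("viewer", "cat", []string{"README.md"}))
	fmt.Println(Validate("viewer", "go", []string{"build"}))
	fmt.Println(Validate("dev", "cat", []string{"a.txt; rm -rf /"}))
}
```

Looking the binary up by bare name also means a request for `/tmp/ls` or `./cat` fails the allow-list check, since only the exact listed names match.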
Model Context Protocol standardises tool calling across LLMs. The example builds both sides: an MCP server exposing tools, and an agent that calls them. Works the same against any MCP-compatible LLM.
A panicking tool kills the agent loop. A slow tool blocks the loop forever. The example shows the boring-but-essential wrappers: recover, deadlines, structured errors.
Give an LLM a SQL tool and watch it write DELETE statements. The read-only version: parse the generated SQL, refuse anything that isn't SELECT, validate against an allow-listed schema, run with a strict timeout.
Stream the agent's reasoning and tool calls to the UI as they happen. The user sees "thinking about X, calling tool Y, got result Z, now answering..." — dramatically better UX than waiting for the final answer.
The smallest possible multi-tool agent. The loop is 30 lines of Go and shows exactly what an "agent" is — there's no magic, just a structured back-and-forth between the LLM and a set of tools until the model says stop.
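To make that back-and-forth concrete, here is a sketch of such a loop with the LLM hidden behind an interface. `Model`, `Reply`, `ToolCall`, and `RunAgent` are hypothetical names, and `stubModel` is a scripted stand-in for a real LLM client so the example runs offline.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type Message struct{ Role, Content string }

type ToolCall struct{ Name, Input string }

// Reply is either a tool request (Call != nil) or a final answer (Text).
type Reply struct {
	Text string
	Call *ToolCall
}

// Model abstracts the LLM; a real client would call an API here.
type Model interface {
	Next(history []Message) (Reply, error)
}

type Tool func(input string) (string, error)

// RunAgent loops until the model stops asking for tools or maxSteps is hit.
func RunAgent(m Model, tools map[string]Tool, prompt string, maxSteps int) (string, error) {
	history := []Message{{Role: "user", Content: prompt}}
	for step := 0; step < maxSteps; step++ {
		reply, err := m.Next(history)
		if err != nil {
			return "", err
		}
		if reply.Call == nil {
			return reply.Text, nil // the model said stop
		}
		result := "error: unknown tool " + reply.Call.Name
		if tool, ok := tools[reply.Call.Name]; ok {
			out, err := tool(reply.Call.Input)
			if err != nil {
				out = "error: " + err.Error() // feed failures back to the model
			}
			result = out
		}
		history = append(history,
			Message{Role: "assistant", Content: "call " + reply.Call.Name},
			Message{Role: "tool", Content: result})
	}
	return "", errors.New("agent did not finish within the step limit")
}

// stubModel calls "upper" once, then answers with the tool result.
type stubModel struct{}

func (stubModel) Next(h []Message) (Reply, error) {
	last := h[len(h)-1]
	if last.Role == "tool" {
		return Reply{Text: "The answer is " + last.Content}, nil
	}
	return Reply{Call: &ToolCall{Name: "upper", Input: last.Content}}, nil
}

var tools = map[string]Tool{
	"upper": func(in string) (string, error) { return strings.ToUpper(in), nil },
}

func main() {
	answer, err := RunAgent(stubModel{}, tools, "go", 5)
	fmt.Println(answer, err)
}
```

The step limit is the one piece that is easy to forget: without it, a model that never returns a final answer keeps the loop running indefinitely.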
Dual-identity tokens for the agent → MCP server → upstream API chain. Subject stays the user; Actor identifies the agent acting on the user's behalf. Walked through with a worked clinical example.
Google's A2A (Agent2Agent) spec standardises how agents talk to other agents (not just tools). The Go client is small; the conceptual shift is what matters.
Google publishes a 12-pattern taxonomy for agent design. Most of them have direct counterparts in production code; one or two are best ignored. The mapping I've used.
Not every query needs the production agent. A cost-aware dispatcher decides whether to route to the cheap-and-fast agent or the expensive-and-thorough one. Same UX, dramatically lower bill.
Two agents can do the same job. One takes 200ms; the other takes 5 seconds. Pick by user-facing SLO, not by which agent is "better." The dispatcher pattern.