Ardan Ultimate AI #04 — Streaming chat completions via SSE
Token-by-token streaming over Server-Sent Events. The Go HTTP handler is short; the UX win is huge. The pattern every chat app needs.
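As a rough illustration of how short such a handler can be, here is a minimal sketch. It assumes a hypothetical `fakeModel` token source standing in for a real model client, and it borrows the `[DONE]` sentinel that some chat APIs use to mark the end of a stream; neither comes from the post itself. The `main` function drives the handler through an in-memory recorder so the example terminates on its own.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sseFrame formats one payload as a Server-Sent Event.
// Each line of the payload gets its own "data:" field, and a blank
// line terminates the event.
func sseFrame(payload string) string {
	var b strings.Builder
	for _, line := range strings.Split(payload, "\n") {
		b.WriteString("data: ")
		b.WriteString(line)
		b.WriteString("\n")
	}
	b.WriteString("\n")
	return b.String()
}

// fakeModel is a hypothetical stand-in for a streaming model client.
// It stops early if the request context is cancelled.
func fakeModel(ctx context.Context) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, tok := range []string{"Hello", ",", " world"} {
			select {
			case ch <- tok:
			case <-ctx.Done():
				return
			}
		}
	}()
	return ch
}

// chatHandler streams each token as its own event and flushes right
// away so the client sees tokens as they arrive.
func chatHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	for tok := range fakeModel(r.Context()) {
		fmt.Fprint(w, sseFrame(tok))
		flusher.Flush()
	}
	if r.Context().Err() != nil {
		return // client went away; nothing more to send
	}
	// "[DONE]" is an assumed end-of-stream marker, not part of SSE itself.
	fmt.Fprint(w, sseFrame("[DONE]"))
	flusher.Flush()
}

func main() {
	// Drive the handler in memory instead of opening a port.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/chat", nil)
	chatHandler(rec, req)
	fmt.Print(rec.Body.String())
}
```

In a real server the handler would be registered with `http.HandleFunc("/chat", chatHandler)`, and the browser side would typically read the stream with `EventSource` or a streaming `fetch`.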
All 127 posts in date order, newest first.
Before RAG and tools, the original way to give an LLM extra information was to inject it into the prompt. The example shows the right way to format injected context and what the LLM does (and doesn't) pay attention to.
Hand-crafting vectors stops scaling at about 10 dimensions. LLM-generated embeddings give you a 1024-dim vector that captures semantic meaning. The example shows how to generate them and what they're good for.
The foundation. Build vectors by hand for a few words, compute cosine similarity, see why "cat" and "dog" come out closer than "cat" and "car." Demystifies everything that comes after.
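A minimal sketch of the idea, assuming four invented feature dimensions (living, pet, has wheels, relative size) and made-up values for each word. The numbers are illustrative only and are not taken from the post.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
// It returns 0 for mismatched lengths or zero-length vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Hand-built vectors: [living, pet, has wheels, relative size].
	cat := []float64{1, 1, 0, 0.2}
	dog := []float64{1, 1, 0, 0.4}
	car := []float64{0, 0, 1, 0.9}

	fmt.Printf("cat vs dog: %.3f\n", cosine(cat, dog))
	fmt.Printf("cat vs car: %.3f\n", cosine(cat, car))
}
```

With these hand-picked features, "cat" and "dog" share most of their non-zero dimensions, so their similarity comes out much higher than "cat" and "car".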
HS256 JWT issue + verify + audience check + expiry in pure stdlib. Why pulling a third-party JWT library is the wrong call for security-critical code.
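A stdlib-only version of those four steps might look roughly like the sketch below. The `claims` struct, the function names, and the single-string `aud` are simplifying assumptions for illustration (a JWT audience may also be an array), and the current time is passed in as Unix seconds so the expiry check is easy to exercise.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// claims is a deliberately small, hypothetical claim set.
type claims struct {
	Sub string `json:"sub"`
	Aud string `json:"aud"`
	Exp int64  `json:"exp"`
}

var b64 = base64.RawURLEncoding

// mac computes HMAC-SHA256 over the signing input.
func mac(secret []byte, input string) []byte {
	m := hmac.New(sha256.New, secret)
	m.Write([]byte(input))
	return m.Sum(nil)
}

// issueHS256 builds header.payload.signature.
func issueHS256(secret []byte, c claims) (string, error) {
	body, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	input := b64.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`)) +
		"." + b64.EncodeToString(body)
	return input + "." + b64.EncodeToString(mac(secret, input)), nil
}

// verifyHS256 checks signature, algorithm, audience, and expiry.
func verifyHS256(secret []byte, token, wantAud string, nowUnix int64) (claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return claims{}, errors.New("malformed token")
	}
	// Verify the signature in constant time before trusting any content.
	sig, err := b64.DecodeString(parts[2])
	if err != nil || !hmac.Equal(sig, mac(secret, parts[0]+"."+parts[1])) {
		return claims{}, errors.New("signature mismatch")
	}
	// Pin the algorithm rather than letting the token choose it.
	hdr, err := b64.DecodeString(parts[0])
	if err != nil {
		return claims{}, errors.New("bad header encoding")
	}
	var h struct {
		Alg string `json:"alg"`
	}
	if err := json.Unmarshal(hdr, &h); err != nil || h.Alg != "HS256" {
		return claims{}, errors.New("unexpected alg")
	}
	payload, err := b64.DecodeString(parts[1])
	if err != nil {
		return claims{}, errors.New("bad payload encoding")
	}
	var c claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return claims{}, err
	}
	if c.Aud != wantAud {
		return claims{}, errors.New("wrong audience")
	}
	if nowUnix >= c.Exp {
		return claims{}, errors.New("token expired")
	}
	return c, nil
}

func main() {
	secret := []byte("demo-secret-not-for-production")
	exp := time.Now().Add(time.Hour).Unix()
	tok, err := issueHS256(secret, claims{Sub: "user-1", Aud: "api", Exp: exp})
	if err != nil {
		panic(err)
	}
	c, err := verifyHS256(secret, tok, "api", time.Now().Unix())
	fmt.Println(c.Sub, err)
}
```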
Symmetric vs asymmetric JWT signing. The choice changes what fails when a key leaks, who can verify, and how rotation works.
PKCE is the load-bearing mitigation against authorization-code interception. The Go implementation is short; the parts every SPA gets wrong are documented here.
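The core of PKCE's S256 method fits in a few lines. The sketch below follows RFC 7636: the client generates a random verifier, sends the SHA-256 challenge with the authorization request, and reveals the verifier only at the token exchange. The function names are hypothetical, and the server-side check is included only to show both halves.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newVerifier returns a 43-character code_verifier built from
// 32 random bytes, base64url-encoded without padding.
func newVerifier() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// challengeS256 derives the code_challenge:
// BASE64URL(SHA256(ASCII(code_verifier))), no padding.
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// checkVerifier is what an authorization server would do at the
// token endpoint: recompute the challenge and compare in constant time.
func checkVerifier(verifier, storedChallenge string) bool {
	got := challengeS256(verifier)
	return subtle.ConstantTimeCompare([]byte(got), []byte(storedChallenge)) == 1
}

func main() {
	verifier, err := newVerifier()
	if err != nil {
		panic(err)
	}
	challenge := challengeS256(verifier)

	// The challenge goes in the authorization request with
	// code_challenge_method=S256; the verifier stays on the client
	// until the code is exchanged for tokens.
	fmt.Println("code_challenge:", challenge)
	fmt.Println("verifier accepted:", checkVerifier(verifier, challenge))
}
```

The interception defense comes from the ordering: an attacker who captures the authorization code has seen only the challenge, and cannot produce the verifier the token endpoint requires.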
The flow where the device has no browser. User authenticates on their phone; the device polls until they're done. Implementation patterns in Go from the Genie reference.
Passkeys are FIDO2 credentials; FIDO2 is the spec; Ed25519 is the signature algorithm used here. The full registration + assertion flow in 200 lines of stdlib Go.
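To give a feel for the assertion half, here is a heavily simplified sketch. In WebAuthn the authenticator signs the authenticator data concatenated with the SHA-256 hash of the client data JSON, and the relying party verifies that signature with the public key stored at registration. Everything else here is an assumption for illustration: the byte values are placeholders, the helper names are hypothetical, and a real flow also checks the challenge, origin, RP ID hash, flags, and signature counter.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// newCredential stands in for registration: the authenticator makes a
// key pair and the relying party stores the public key.
func newCredential() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// signedBytes builds the message covered by the assertion signature:
// authenticatorData || SHA-256(clientDataJSON).
func signedBytes(authData, clientDataJSON []byte) []byte {
	hash := sha256.Sum256(clientDataJSON)
	msg := make([]byte, 0, len(authData)+len(hash))
	msg = append(msg, authData...)
	return append(msg, hash[:]...)
}

// signAssertion is the authenticator side.
func signAssertion(priv ed25519.PrivateKey, authData, clientDataJSON []byte) []byte {
	return ed25519.Sign(priv, signedBytes(authData, clientDataJSON))
}

// verifyAssertion is the relying-party side. It covers only the
// signature check, not the other WebAuthn validation steps.
func verifyAssertion(pub ed25519.PublicKey, authData, clientDataJSON, sig []byte) bool {
	return ed25519.Verify(pub, signedBytes(authData, clientDataJSON), sig)
}

func main() {
	pub, priv, err := newCredential()
	if err != nil {
		panic(err)
	}

	// Placeholder values; real authenticator data is a structured
	// binary blob and real client data carries the server's challenge.
	authData := []byte("placeholder-authenticator-data")
	clientData := []byte(`{"type":"webauthn.get","challenge":"abc"}`)

	sig := signAssertion(priv, authData, clientData)
	fmt.Println("signature valid:", verifyAssertion(pub, authData, clientData, sig))
}
```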
Dual-identity tokens for the agent → MCP server → upstream API chain. Subject stays the user; Actor identifies the agent acting on the user's behalf. Walked through with a worked clinical example.
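One way to picture the claim shape is the `act` (actor) claim defined in RFC 8693, where `sub` remains the user and a nested `act.sub` names the party acting for them. Whether the post uses exactly this layout is an assumption; the struct, helper names, and identifiers below are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// actor identifies the party acting on the subject's behalf.
type actor struct {
	Sub string `json:"sub"`
}

// delegatedClaims keeps the user as the subject and records the
// agent separately, so downstream services can see both identities.
type delegatedClaims struct {
	Sub string `json:"sub"`
	Act actor  `json:"act"`
}

// encodeClaims renders the claim set as compact JSON.
func encodeClaims(c delegatedClaims) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

// decodeClaims parses a claim set and rejects one missing either identity.
func decodeClaims(s string) (delegatedClaims, error) {
	var c delegatedClaims
	if err := json.Unmarshal([]byte(s), &c); err != nil {
		return delegatedClaims{}, err
	}
	if c.Sub == "" || c.Act.Sub == "" {
		return delegatedClaims{}, fmt.Errorf("both sub and act.sub are required")
	}
	return c, nil
}

func main() {
	payload, err := encodeClaims(delegatedClaims{
		Sub: "user-123",            // the user the request is for
		Act: actor{Sub: "agent-7"}, // the agent making the call
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(payload)

	c, err := decodeClaims(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s acting for %s\n", c.Act.Sub, c.Sub)
}
```

Keeping the two identities in separate claims lets an upstream API authorize against the user while still auditing which agent made the call.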