May 10, 2026 · 5 min read
A bulk migration takes hours; the application can't be offline that long. Change data capture (CDC) keeps the source and destination in sync while the bulk load runs, and a quick cutover swaps traffic. The handoff between bulk load and CDC is where most migrations go wrong.
Spanner · Datastream · Pub/Sub · Dataflow · Migration
May 9, 2026 · 4 min read
Notes from contributing to Bloom, SC Ventures / Standard Chartered's policy-driven secure cloud provisioning platform. It gives bank engineering teams push-to-deploy self-service with the audit controls baked in.
Terraform · Banking · SOC 2 · ISO 27001 · AWS · Azure
May 8, 2026 · 4 min read
If you encode each SOC 2 control as a Terraform module, the audit becomes a check against module usage rather than a per-resource review. Notes from Bloom and adjacent projects.
SOC 2 · Terraform · Compliance · DevOps
May 7, 2026 · 4 min read
Notes from integrating OpenTelemetry into airshipit, an open-source bare-metal Kubernetes lifecycle project with contributions from Ericsson, AT&T, Microsoft, and others. The hard part wasn't OTel; it was making distributed traces useful across foreign code.
OpenTelemetry · Kubernetes · Open Source · Observability
May 6, 2026 · 4 min read
The azure-service-operator project lets you declare Azure resources as Kubernetes objects. Notes on how the multi-vendor collaboration worked: how decisions got made, what slowed us down, and what shipped despite it.
Azure · Kubernetes · Open Source · Operators
May 5, 2026 · 5 min read
The Picnic social platform served 1M+ users across a graph of Go microservices behind a GraphQL gateway. The latency win came from a counter-intuitive move: fewer services, tighter contracts.
Go · gRPC · GraphQL · Microservices · Performance
May 4, 2026 · 5 min read
Test coverage and observability are the boring infrastructure that makes the interesting changes safe. Notes on how the Picnic team built both, and the on-call experience they enabled.
Testing · Prometheus · Observability · Go · SRE
May 3, 2026 · 5 min read
The transaction engine had to absorb 30K+ TPS across partner integrations, never lose a transaction, and survive partial failures. The architecture: Go, Kafka, Pub/Sub, Redis, K8s, with idempotency at every layer.
Kubernetes · Kafka · Go · Redis · Payments · PCI
May 2, 2026 · 5 min read
A single layer of idempotency will eventually fail. Three independent layers give you a margin. Here is the pattern that worked across ingest, worker, and emit boundaries.
Idempotency · Distributed Systems · Payments · Go
May 1, 2026 · 4 min read
Status-code-based dispatch made every worker grow an ever-longer switch statement. Normalising every partner-specific error into an enumerated set let the orchestration logic stop changing as new partners landed.
Go · Distributed Systems · Architecture
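
As a rough illustration of the last entry's idea, normalising partner-specific errors into an enumerated set might look like the Go sketch below. Everything in it is an assumption made for illustration: the partner names, the status codes, the `ErrorClass` values, and the `Normalize` and `Dispatch` helpers are hypothetical and not taken from the post itself.

```go
package main

import "fmt"

// ErrorClass is the small, closed set of outcomes the orchestration
// logic acts on. The specific classes are hypothetical.
type ErrorClass int

const (
	ClassOK ErrorClass = iota
	ClassRetryable
	ClassRejected
	ClassUnknown
)

// Normalizer maps one partner's raw status code to an ErrorClass.
type Normalizer func(status int) ErrorClass

// normalizers is a hypothetical registry. Onboarding a new partner means
// adding one entry here; Dispatch below does not change.
var normalizers = map[string]Normalizer{
	// An invented partner that speaks HTTP-like status codes.
	"partner-a": func(status int) ErrorClass {
		switch {
		case status == 200:
			return ClassOK
		case status == 429 || status == 503:
			return ClassRetryable
		case status >= 400 && status < 500:
			return ClassRejected
		default:
			return ClassUnknown
		}
	},
	// An invented partner with its own proprietary codes.
	"partner-b": func(status int) ErrorClass {
		switch status {
		case 0:
			return ClassOK
		case 42:
			return ClassRetryable
		case 7:
			return ClassRejected
		default:
			return ClassUnknown
		}
	},
}

// Normalize converts a partner-specific status into the shared enumeration.
// Partners without a registered normalizer fall back to ClassUnknown.
func Normalize(partner string, status int) ErrorClass {
	n, ok := normalizers[partner]
	if !ok {
		return ClassUnknown
	}
	return n(status)
}

// Dispatch is the orchestration decision. It switches only on ErrorClass,
// so it stays the same size however many partners exist.
func Dispatch(c ErrorClass) string {
	switch c {
	case ClassOK:
		return "commit"
	case ClassRetryable:
		return "requeue"
	case ClassRejected:
		return "fail"
	default:
		return "escalate"
	}
}

func main() {
	fmt.Println(Dispatch(Normalize("partner-a", 503)))
	fmt.Println(Dispatch(Normalize("partner-b", 7)))
}
```

The design choice this sketch tries to show is that partner-specific knowledge lives only in the per-partner normalizer, while the worker's switch covers a fixed set of classes.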