#Cost Optimisation

Posts about cost optimisation.

· Engineering

The 57% number — how we cut the Tata Group BigQuery bill in half

₹100 Cr / ~$12M in proven savings across a year-plus engagement. The four levers that did the heavy lifting, the lever I expected to win that didn't, and the post-engagement playbook that became a Searce managed service.

· Engineering

Ardan Ultimate AI #20 — Embedding-based semantic cache

Exact-match caching misses paraphrases. "What is the refund policy?" and "How do refunds work?" should both hit the same cached answer. A semantic cache embeds each query and matches it against cached queries by similarity.
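A minimal sketch of the idea, not the implementation from the post: the cache stores (embedding, answer) pairs and returns the best match only when its cosine similarity clears a threshold. The names `SemanticCache`, `toy_embed`, and the 0.8 threshold are assumptions made for illustration. `toy_embed` is a bag-of-words stand-in so the example runs without dependencies; it will not match true paraphrases such as the two refund questions above, which is why a real embedding model would be plugged in here.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Toy stand-in for an embedding model: lowercase bag of words."""
    words = [w.strip("?.,!").lower() for w in text.split()]
    return {w: float(c) for w, c in Counter(w for w in words if w).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache that matches queries by embedding similarity."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.8):
        self.embed = embed
        self.threshold = threshold  # assumed value; tune per workload
        self.entries: List[Tuple[Vector, str]] = []

    def put(self, query: str, answer: str) -> None:
        self.entries.append((self.embed(query), answer))

    def get(self, query: str) -> Optional[str]:
        """Return the closest cached answer, or None below the threshold."""
        q = self.embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None
```

The linear scan is fine for a handful of entries; a larger cache would likely use a vector index instead. The threshold is the main trade-off: too low and unrelated questions share an answer, too high and the cache behaves like exact match.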

· Engineering

Cost-aware agent dispatch — when the cheap agent is enough

Not every query needs the production agent. A cost-aware dispatcher decides whether to route to the cheap-and-fast agent or the expensive-and-thorough one. Same UX, dramatically lower bill.
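One way such a dispatcher could look, sketched under stated assumptions: the `Agent` type, the `estimate_complexity` heuristic, its keyword list, and the 0.4 threshold are all hypothetical and are not taken from the post. A production dispatcher might use a small classifier or the cheap agent's own confidence instead of keyword rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Agent:
    name: str
    cost_per_call: float  # hypothetical relative cost units
    run: Callable[[str], str]


def estimate_complexity(query: str) -> float:
    """Hypothetical heuristic returning a score in [0, 1].

    Longer queries and reasoning-style keywords push the score up.
    """
    hard_markers = ("compare", "explain why", "step by step", "analyse", "analyze")
    length_score = min(len(query.split()) / 50.0, 1.0)
    marker_score = 0.5 if any(m in query.lower() for m in hard_markers) else 0.0
    return min(length_score + marker_score, 1.0)


def dispatch(query: str, cheap: Agent, thorough: Agent, threshold: float = 0.4) -> Agent:
    """Pick the thorough agent only when the query looks hard enough."""
    return thorough if estimate_complexity(query) >= threshold else cheap


# Stub agents so the sketch runs without any model behind it.
cheap_agent = Agent("cheap", 1.0, lambda q: f"[cheap] {q}")
thorough_agent = Agent("thorough", 20.0, lambda q: f"[thorough] {q}")
```

Because both agents share one `run` signature, the caller sees the same interface whichever agent is chosen; only the cost differs.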

· Engineering

Egress costs — the gotcha that kills cloud-arbitrage plans

Cross-cloud data movement is billed by the GB. The bill is invisible until it isn't. A multi-region or multi-cloud architecture that doesn't model egress costs in design will discover them in production.
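Modelling egress at design time can start as simple arithmetic: volume per day, times days, times a per-GB rate, summed over every cross-region or cross-cloud flow. The sketch below assumes a flat per-GB rate per flow; the `Flow` type and the figures in the usage note are placeholders, not real provider prices, and actual pricing is usually tiered and varies by destination.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Flow:
    name: str
    gb_per_day: float
    rate_per_gb: float  # placeholder rate; look up the provider's current pricing


def monthly_egress_cost(flows: Iterable[Flow], days: int = 30) -> float:
    """Sum of volume x days x rate over all flows (flat-rate assumption)."""
    total = 0.0
    for flow in flows:
        if flow.gb_per_day < 0 or flow.rate_per_gb < 0:
            raise ValueError(f"negative input for flow {flow.name!r}")
        total += flow.gb_per_day * days * flow.rate_per_gb
    return total
```

For instance, a single hypothetical flow of 500 GB per day at a placeholder rate of 0.09 per GB comes to roughly 1,350 per 30-day month, a line item that is easy to miss when only compute and storage are compared.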