B2B Engineering Insights & Architectural Teardowns

Latency-aware proxy vs DNS: how to balance S3 load

DNS round-robin stops distributing load evenly once clients cache responses. Agoda hit this problem at the object storage level and moved balancing into a separate layer. The problem surfaced as data workloads grew. S3-compatible endpoints used DNS round-robin to distribute traffic. In practice, clients cached DNS responses and continued to hit … Read more

Decomposing round-trip latency: how to separate database delays from network and middleware overhead

Request timeouts do not always indicate a problem in the database. Often, degradation is hidden in the path between the application and the DB. The problem manifests when database metrics appear stable, but clients experience timeouts. At the observation level, this looks like a contradiction: latency increases while database time remains the same. The reason … Read more

eBPF Profiling in Go: How Symbolization via gopclntab Transforms Addresses into Functions

A profiler in kernel space sees only addresses. Useful insights emerge only after symbolization, and in Go this stage is structured differently than in other languages. The problem arises when the profile has already been collected but cannot be interpreted. The eBPF profiler captures stack traces at the kernel level and obtains a set of … Read more

LLM Load Without Blind Spots: How to Bring Observability to the Routing Layer with OpenRouter and Grafe…

When an LLM becomes part of production infrastructure, traditional monitoring is no longer sufficient. The bottleneck is no longer the application code but the routing and model selection layer, and that is exactly where observability is needed. In LLM systems, degradation doesn’t start with HTTP endpoint failures, but with the accumulation of subtle effects: increased latency … Read more

AI Agent Observability: Tracing Non-Deterministic Workflows via OpenLIT and Grafana Cloud

AI agents complicate observability: the same request can lead to different chains of actions. Without tracing, the system becomes opaque. The problem manifests when generative systems transition from simple LLM calls to agents. An agent plans steps, invokes tools, and makes decisions dynamically. Behavior becomes non-deterministic: the same prompt can result in different call sequences … Read more
