B2B Engineering Insights & Architectural Teardowns

Kubescape 4.0: Transition to CEL Detection and Abandonment of Host-Level Agents

In Kubescape 4.0, the focus shifts from reactive to proactive security. The main changes are runtime detection, a redesigned agent model, and moving security data out of etcd. The problem manifests at scale. As the cluster grows, security begins to compete for resources with the control plane itself. Storing security metadata … Read more

Kubernetes fsGroup as a Hidden Bottleneck: Accelerating Restarts through fsGroupChangePolicy

A slow restart of a stateful service rarely looks like a security configuration issue. Yet this is how a safe Kubernetes default turned into 30 minutes of downtime on every restart. The problem manifested at scale. Atlantis, which manages Terraform through GitLab merge requests, runs as a singleton StatefulSet and stores state in a … Read more

ARC-AGI: How to Measure Intelligence Through Learning Ability Rather Than Accumulated Skills

Most AI benchmarks evaluate outcomes. ARC-AGI shifts the focus to the process: how effectively a system learns new things. The problem manifests at the metric level. Modern systems demonstrate a high level of automation, but this is often a result of scaling data and compute rather than an increase in generalization ability. A skill … Read more

Reducing Friction in Agentic AI: Local Validation and Isolated Environments in AWS

AI agents are limited not by models, but by architecture. If feedback is slow, autonomy does not work. The problem manifests when an AI agent tries to close the loop of “generated → validated → corrected.” In typical cloud systems, this loop is stretched: deployment takes minutes, tests depend on resource provisioning, and errors only … Read more

Scaling Architectural Control: A Declarative Approach Instead of Manual Review

GenAI has accelerated code production but has made consistency (alignment) a bottleneck. Manual processes can no longer keep pace, and the architecture begins to fragment. The problem does not surface until changes are generated faster than the organization can review them. Historically, control has relied on people: key experts in startups … Read more

eBPF Profiling in Go: How Symbolization via gopclntab Transforms Addresses into Functions

The profiler in kernel space only sees addresses. Useful insights emerge only after symbolization—and in Go, this stage is structured differently than in other languages. The problem arises when the profile has already been collected, but it cannot be interpreted. The eBPF profiler captures stack traces at the kernel level and obtains a set of … Read more

Automation of Design System Specifications: How Uber Eliminated Documentation Drift Using AI Agents

When component specifications lag behind implementation, the team starts building the system based on assumptions. At Uber, this turned into a systemic, large-scale problem—and was solved through agent-based automation. The problem does not arise at the moment of writing specifications, but later—when the system begins to evolve faster than the documentation. The Uber Base design … Read more

Unification of API and AI Traffic through a Unified Control Plane: An Analysis of the Higress Approach

Higress enters the CNCF Sandbox as an API gateway with the aim of consolidating multiple layers of traffic. The key question is whether this reduces complexity or merely shifts it elsewhere. Systems begin to degrade when the traffic management layer becomes fragmented. Ingress operates separately, the gateway for microservices operates separately, and solutions for AI … Read more

AI Accelerated Coding but Slowed Down Delivery: Shifting the Bottleneck to Specification

The increase in developer productivity has not led to a comparable acceleration of releases. The reason is that the bottleneck has moved higher up the stack: into the area of requirements formalization and result verification. With the advent of AI coding, teams expected a linear acceleration in delivery. In practice, only one stage sped up—the … Read more

Live Origin at Netflix: Segment Quality Control and Write Isolation Under Load

In live streaming, an error is not a degradation but an instant user-facing incident. Netflix addresses this by moving quality control and prioritization directly into the origin layer. The main limitation arises where VOD approaches stop working. In live, there is no time buffer: a segment must be encoded, delivered, and cached within seconds. Any … Read more
