Dynamic Resource Allocation (DRA) in Kubernetes broadens what the scheduler manages and changes how it places workloads. In version 1.36, DRA is no longer just about GPUs, but also about CPUs, memory, and placement predictability.

The problem DRA targets appears in heterogeneous clusters, where resources vary in type and state. Rigid requests for specific devices reduce throughput and cause fragmentation: the scheduler either fails to find a suitable resource or makes a suboptimal choice. The system also degrades during device failures when there is no clear model of device state and readiness. This is most noticeable in AI/ML workloads, where accelerator topology and availability are critical.

In Kubernetes 1.36, Dynamic Resource Allocation moves toward greater flexibility and manageability. A key change is the ability to specify a prioritized list of acceptable resources instead of a single rigid selection. This reduces the likelihood that a Pod stays unschedulable and increases utilization. In parallel, compatibility with the legacy extended-resources approach has been added, so migration to ResourceClaim can be gradual. This is a pragmatic compromise: operators can adopt DRA without disrupting existing workloads.
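The fallback behavior of a priority list can be illustrated with a minimal sketch. This is not the scheduler's code or the ResourceClaim schema: the `Device` and `Alternative` types, the device models, and the memory sizes are hypothetical, chosen only to show how "first satisfiable alternative wins" differs from a single rigid request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device record; not a Kubernetes API type."""
    name: str
    model: str
    memory_gib: int
    allocated: bool = False


@dataclass(frozen=True)
class Alternative:
    """One entry in a priority list: a device model and a minimum memory size."""
    model: str
    min_memory_gib: int


def pick_device(alternatives, devices):
    """Return (alternative, device) for the first alternative a free device satisfies."""
    for alt in alternatives:  # earlier entries are preferred
        for dev in devices:
            if dev.allocated:
                continue
            if dev.model == alt.model and dev.memory_gib >= alt.min_memory_gib:
                return alt, dev
    return None  # nothing fits: the request stays unsatisfied


devices = [
    Device("gpu-0", "large", 80, allocated=True),
    Device("gpu-1", "small", 24),
]
preferences = [Alternative("large", 80), Alternative("small", 16)]

choice = pick_device(preferences, devices)
rigid = pick_device([Alternative("large", 80)], devices)
print(choice)
print(rigid)
```

With the two-entry list the request falls back to `gpu-1`, while the rigid single-entry request finds nothing because the only matching device is already allocated.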

The implementation works on several levels. At the device level, partitionable devices allow physical hardware to be divided into logical parts. This matters for expensive accelerators, where placement density is important. Resource quality management is strengthened through device taints and tolerations: faulty or reserved devices are excluded from the general pool. Binding conditions, in turn, hold back Pod binding until external resources are ready, which reduces the number of startup failures.
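A small sketch can show how taints and binding conditions gate device use. It is a simplified model, not the actual API: the `Device` class, the taint keys under `example.com/`, and the condition name are all hypothetical, and the real scheduler delays binding rather than filtering a list.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device model; field names are assumptions for illustration."""
    name: str
    taints: set = field(default_factory=set)                # taint keys on the device
    binding_conditions: dict = field(default_factory=dict)  # condition name -> ready?


def tolerated(device, tolerations):
    """A device is usable only if every taint on it is tolerated by the request."""
    return device.taints <= set(tolerations)


def ready_to_bind(device):
    """All binding conditions must be true; a device with none is ready."""
    return all(device.binding_conditions.values())


def bindable_devices(devices, tolerations=()):
    """Names of devices that pass both the taint check and the readiness check."""
    return [d.name for d in devices if tolerated(d, tolerations) and ready_to_bind(d)]


devices = [
    Device("gpu-0", taints={"example.com/unhealthy"}),
    Device("gpu-1", binding_conditions={"example.com/fabric-attached": False}),
    Device("gpu-2", taints={"example.com/reserved"}),
    Device("gpu-3", binding_conditions={"example.com/fabric-attached": True}),
]

default_pool = bindable_devices(devices)
with_reserved = bindable_devices(devices, tolerations=["example.com/reserved"])
print(default_pool)
print(with_reserved)
```

In this model a request without tolerations sees only `gpu-3`, and a request that tolerates the reserved taint additionally sees `gpu-2`. The unhealthy device and the device whose external resource is not yet ready are left out in both cases.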

Observability has also become part of the model. With resource health status, device state is reflected directly in the Pod status. This removes the need to dig through driver logs and speeds up diagnostics. Resource pool status provides a snapshot of resource availability, which simplifies capacity planning. Device metadata standardizes how information is passed into containers via JSON files, so workloads do not need to call the Kubernetes API at runtime.
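From the workload's point of view, reading device metadata from a file might look like the sketch below. The file name, its location, and the JSON layout are assumptions made for illustration. The text above only states that metadata arrives as JSON files, so the example writes its own sample file to a temporary directory instead of relying on a real mount path.

```python
import json
import tempfile
from pathlib import Path


def load_device_metadata(path):
    """Parse a device metadata file; no Kubernetes API call is involved."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical content standing in for what a driver might expose.
sample = {
    "devices": [
        {"name": "gpu-1", "attributes": {"model": "small", "numaNode": 0}},
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    # Simulate the file that would be made available inside the container.
    metadata_file = Path(tmp) / "devices.json"
    metadata_file.write_text(json.dumps(sample), encoding="utf-8")
    metadata = load_device_metadata(metadata_file)

device_names = [d["name"] for d in metadata["devices"]]
print(device_names)
```

The point of the design is that the application depends only on a local file and a JSON parser, not on cluster credentials or a client library.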

A separate set of changes concerns the scheduler. A lexicographic order for resource evaluation has been introduced, allowing drivers to influence placement strategy. This improves predictability and may reduce scheduling latency. Constraint evaluation has also been updated: matchAttribute and distinctAttribute now handle sets of values better, and includes() makes expressions less fragile when attribute formats change.
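The following sketch illustrates one possible reading of these two ideas. It assumes that lexicographic evaluation means sorting candidates by pool name and then device name, that a "match" constraint over set-valued attributes is satisfied when all devices share at least one value, and that a "distinct" constraint is satisfied when no two devices share any value. These semantics, the function names, and the attribute names are assumptions for illustration, not the documented behavior.

```python
from itertools import combinations


def evaluation_order(devices):
    """Sort candidates lexicographically by (pool, name), so naming controls order."""
    return sorted(devices, key=lambda d: (d["pool"], d["name"]))


def match_attribute(devices, attr):
    """True if every device shares at least one common value for attr."""
    value_sets = [set(d["attributes"][attr]) for d in devices]
    return bool(value_sets) and bool(set.intersection(*value_sets))


def distinct_attribute(devices, attr):
    """True if no two devices have any value of attr in common."""
    value_sets = [set(d["attributes"][attr]) for d in devices]
    return all(a.isdisjoint(b) for a, b in combinations(value_sets, 2))


nic = {"pool": "pool-b", "name": "nic-0",
       "attributes": {"numa": [0], "pcieRoot": ["r0"]}}
gpu1 = {"pool": "pool-a", "name": "gpu-1",
        "attributes": {"numa": [0, 1], "pcieRoot": ["r1"]}}
gpu0 = {"pool": "pool-a", "name": "gpu-0",
        "attributes": {"numa": [1], "pcieRoot": ["r1"]}}

ordered_names = [d["name"] for d in evaluation_order([nic, gpu1, gpu0])]
print(ordered_names)
print(match_attribute([gpu1, nic], "numa"))         # both list NUMA node 0
print(distinct_attribute([gpu1, nic], "pcieRoot"))  # r1 vs r0
print(distinct_attribute([gpu1, nic], "numa"))      # both list NUMA node 0
```

Because the order depends only on names, a driver that wants certain pools considered first could, under these assumptions, achieve that through its naming scheme.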

Extending DRA to CPUs and memory is an important step. It brings advanced placement mechanisms, including NUMA-aware placement, to core resources. The trade-off is more complex configuration and a need to understand topology requirements: the approach gives more control but requires careful tuning.
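To make the NUMA-aware idea concrete, here is a minimal sketch that places a CPU and memory request on a single NUMA node. The node records, the field names, and the best-fit tie-breaking rule are assumptions of this example, not a description of how Kubernetes chooses a node.

```python
def pick_numa_node(nodes, cpus, memory_gib):
    """Return the id of a NUMA node that can hold the whole request, or None.

    Among the nodes that fit, prefer the one with the fewest CPUs left over
    (a best-fit choice made for this sketch), breaking ties by node id.
    """
    candidates = [
        n for n in nodes
        if n["free_cpus"] >= cpus and n["free_memory_gib"] >= memory_gib
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: (n["free_cpus"] - cpus, n["id"]))
    return best["id"]


numa_nodes = [
    {"id": 0, "free_cpus": 4, "free_memory_gib": 64},
    {"id": 1, "free_cpus": 16, "free_memory_gib": 32},
    {"id": 2, "free_cpus": 8, "free_memory_gib": 48},
]

fits = pick_numa_node(numa_nodes, cpus=8, memory_gib=32)
too_big = pick_numa_node(numa_nodes, cpus=8, memory_gib=64)
print(fits)
print(too_big)
```

The second request shows the cost of stricter topology rules: the machine as a whole has enough CPUs and memory, but no single NUMA node does, so the request is not placed.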

New capabilities for PodGroups with ResourceClaim remove scaling limitations. Previously, there were limits on resource sharing between Pods. Management is now more centralized and less dependent on external orchestrators, which is particularly important for large distributed workloads.

The result is an evolutionary enhancement of DRA as a universal resource management layer. Scheduling flexibility has improved, state transparency has increased, and integration with existing systems is simpler. However, no performance figures are given, so the effect can only be assessed in terms of architectural advantages.
