Kubernetes has long offered Pod Topology Spread Constraints to distribute pods evenly across failure domains. However, a notable gap persisted: the scheduler would consider all nodes, even those tainted and not tolerated by the pod, when calculating skew. This often led to unexpected Pending states, even when nodes the pod could actually run on were available.
With KEP-3094, Kubernetes v1.33 addresses this with two new optional fields:
nodeAffinityPolicy
nodeTaintsPolicy
These allow fine-grained control over which nodes are considered during pod distribution.
Without nodeTaintsPolicy: Honor, the skew calculation can expect a pod to land on a node it cannot tolerate, leaving it Pending. With it, tainted nodes are excluded from the skew calculation unless the pod tolerates them, making scheduling behavior intuitive and predictable.
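A minimal sketch of how the two fields sit inside a spread constraint. The pod name, app label, taint key, and image are hypothetical and only for illustration; the nodeAffinityPolicy and nodeTaintsPolicy fields are the ones introduced by KEP-3094:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: zone-spread-demo          # hypothetical name
  labels:
    app: zone-spread-demo
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: zone-spread-demo
      nodeAffinityPolicy: Honor   # only nodes matching the pod's nodeAffinity/nodeSelector count toward skew
      nodeTaintsPolicy: Honor     # tainted nodes count only if the pod tolerates their taints
  tolerations:
    - key: dedicated              # hypothetical taint key used by a GPU pool
      operator: Equal
      value: gpu
      effect: NoSchedule
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
```

With this spec, a node tainted with the hypothetical dedicated=gpu:NoSchedule still counts toward skew because the pod tolerates it, while nodes carrying any other NoSchedule taint are left out of the calculation entirely.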
This enhancement may seem minor, but it's a key step toward making Kubernetes scheduling more accurate, transparent, and user-driven. For platforms relying on strict node segregation (e.g., GPU pools, burstable zones), it eliminates a major source of scheduling surprises.
For cluster operators, setting nodeTaintsPolicy and nodeAffinityPolicy gives finer control with no risk to existing workloads; the fields are opt-in and bring measurable value.
FAQs
What issue does KEP-3094 solve in Kubernetes scheduling?
KEP-3094 addresses the problem where the scheduler previously considered all nodes, including tainted or affinity-mismatched ones, when calculating Pod Topology Spread skew. This often led to Pending pods even when suitable nodes were available.
What are nodeAffinityPolicy and nodeTaintsPolicy in topologySpreadConstraints?
These new optional fields in topologySpreadConstraints tell the scheduler to honor the pod's node affinity and taint toleration rules during skew calculations (a short sketch follows the list below).
nodeAffinityPolicy: Honor considers only nodes that match the pod's node affinity.
nodeTaintsPolicy: Honor excludes tainted nodes unless the pod tolerates them.
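A sketch of a pod spec excerpt showing how nodeAffinityPolicy: Honor interacts with the pod's own nodeAffinity. The node-pool label key and app label are hypothetical; under these assumptions, only nodes labeled node-pool=general are counted when zone skew is computed:

```yaml
# Excerpt of a pod spec (illustrative; label keys are hypothetical)
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: zone-spread-demo
      nodeAffinityPolicy: Honor   # restrict skew calculation to nodes matching the affinity below
      nodeTaintsPolicy: Honor
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-pool
                operator: In
                values: ["general"]
```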
How does this change affect scheduling behavior?
When these policies are set to Honor, the scheduler excludes nodes the pod cannot land on from skew calculations. This ensures pods aren't left Pending because of nodes they could never be scheduled on, resulting in more predictable and accurate scheduling.
Is this feature enabled by default in Kubernetes v1.33?
Yes. The feature gate NodeInclusionPolicyInPodTopologySpread is enabled by default in Kubernetes v1.33. If the new fields are unset, behavior remains unchanged for backward compatibility.
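Since the gate is already on by default in v1.33, no action is normally needed. If you do want to pin it explicitly (for example, on a cluster where it was disabled earlier or on an older minor), it is passed as a command-line flag to kube-scheduler. A sketch for a kubeadm-style static pod manifest; the file path and surrounding flags are assumptions shown only for illustration:

```yaml
# /etc/kubernetes/manifests/kube-scheduler.yaml (excerpt, illustrative)
spec:
  containers:
    - name: kube-scheduler
      command:
        - kube-scheduler
        - --config=/etc/kubernetes/scheduler-config.yaml
        - --feature-gates=NodeInclusionPolicyInPodTopologySpread=true
```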
Why is this important for production clusters?
It provides predictable distribution, especially in environments with node taints (e.g., GPU pools) or affinity rules. It ensures spread constraints reflect actual node eligibility, eliminating scheduling inconsistencies and reducing manual debugging.