Cluster Autoscaler on Rancher RKE2
A step-by-step guide to setting up the Cluster Autoscaler on RKE2 with Rancher, covering Helm deployment, scaling benefits, limitations, and troubleshooting.
Modern Kubernetes clusters need to scale on demand to handle varying workloads. Cluster Autoscaler (CA) is a Kubernetes component that automatically adjusts the number of nodes in your cluster by adding or removing worker nodes. It scales up when pods cannot be scheduled due to insufficient resources, and scales down when nodes are underutilized and their pods can be rescheduled elsewhere. In a Rancher RKE2 environment, the Cluster Autoscaler integrates with Rancher's provisioning system to manage node pools dynamically, ensuring your cluster is both elastic and efficient. This article will guide DevOps engineers, Kubernetes admins, and Rancher users through setting up the Cluster Autoscaler on RKE2 using Rancher’s UI, CLI, or Helm, and discuss its benefits, limitations, troubleshooting, and tuning.
How the Cluster Autoscaler Works in RKE2
The Cluster Autoscaler watches for pods in a Pending state that cannot be scheduled due to insufficient cluster capacity. It checks for unschedulable pods every 10 seconds by default (configurable via --scan-interval). If pending pods are detected, CA scales up the cluster by requesting a new node (within the limits you set for the node pool). Kubernetes then registers the new node and schedules the pending pods on it. Conversely, if a node has been underutilized for a while (no critical workloads, and its pods can fit on other nodes), CA may scale down (remove) that node, so you are not paying for idle resources. Importantly, CA bases its decisions on pod resource requests (not actual usage), so setting accurate resource requests and limits on your pods is crucial.
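As a small illustration (a generic Kubernetes manifest, not part of the autoscaler setup), the request below is what the autoscaler counts against node capacity, even if the container sits idle:

apiVersion: v1
kind: Pod
metadata:
  name: request-demo
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: "500m"     # the scheduler and CA reserve 0.5 CPU for this pod
          memory: 256Mi   # and 256Mi of memory, regardless of actual usage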
In Rancher RKE2 clusters, the autoscaler uses Rancher as its “cloud provider.” Rancher-managed clusters use node pools (via node drivers or Cloud Credentials) to provision nodes. The autoscaler will communicate with the Rancher server API to create or delete nodes in a node pool when scaling events occur. This means your cluster must be launched with Rancher’s node drivers (e.g. using an infrastructure provider like AWS, vSphere, etc. through Rancher). If you imported a custom cluster or manually provisioned nodes, the Rancher autoscaler provider won’t have an API to create new nodes. Assuming a Rancher-provisioned RKE2 cluster, the autoscaler will interact with Rancher to adjust the node pool sizes on demand.
By design, the Cluster Autoscaler runs on a control-plane (master) node for stability. Rancher RKE2 clusters taint control-plane nodes (so regular workloads don't run there), but we configure the autoscaler pod to tolerate the master node taints and use a node selector so it schedules on a control-plane node. This ensures the autoscaler isn't itself running on a worker node that it might scale down. Kubernetes best practices also recommend marking the autoscaler pod as a critical add-on (using priorityClassName: system-cluster-critical) so it won't be evicted under resource pressure. Additionally, by default CA will not scale down any node hosting certain system pods (non-mirrored pods in the kube-system namespace) to avoid disrupting core services. This behavior can be tuned (as we'll see later), but the default adds a layer of safety.
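For reference, the scheduling and eviction-protection settings look roughly like this as a pod-spec fragment (illustrative only; the Helm chart values shown later set the same fields):

# Pod-spec fragment: keep the autoscaler on control-plane nodes and protect it from eviction
priorityClassName: system-cluster-critical
nodeSelector:
  node-role.kubernetes.io/control-plane: "true"
tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule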
Preparing Rancher API Access for Autoscaler
Before deploying the autoscaler, you need to provide it with credentials and information to access the Rancher API. The autoscaler will use these details to invoke Rancher’s cluster provisioning endpoints to add or remove nodes. Specifically, you should prepare:
- Rancher API URL: The URL of your Rancher server (e.g. https://<rancher-server>).
- API Token: A Rancher API access token with permissions to manage clusters/nodes. It's easiest to generate this in the Rancher UI under Account & API Keys (choose Create API Key with no scope). Using an admin-level token is simplest (the examples in this article use an admin token), though you can also create a restricted token with specific roles for better security.
- Cluster Identification: The autoscaler needs to know which cluster to scale. For the Rancher provider, you typically provide the cluster name and the cluster namespace (the namespace of the cluster's provisioning object in Rancher). With newer, Cluster API-driven Rancher provisioning, use the cluster's name as shown in the Rancher UI and its namespace (usually fleet-default).
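Before wiring the token into the cluster, it is worth confirming it can reach the Rancher API; a quick sanity check along these lines works (URL and token are placeholders, and -k is only needed if Rancher uses a self-signed certificate):

# Should return JSON describing the clusters this token can see
curl -sk -H "Authorization: Bearer <your-api-token>" https://<rancher-server>/v3/clusters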
These details are passed to the autoscaler via a cloud-config file. You can create a Kubernetes Secret or ConfigMap containing the Rancher connection info. For example, a secret manifest might look like:
apiVersion: v1
kind: Secret
metadata:
name: cluster-autoscaler-cloud-config
namespace: kube-system
type: Opaque
stringData:
cloud-config: |-
url: https://<your-rancher-server>
token: <your-api-token>
clusterName: <your-cluster-name>
clusterNamespace: fleet-default
In this file, url is the Rancher server URL, token is the API token, and clusterName/clusterNamespace refer to the target RKE2 cluster managed by Rancher. Once this secret (or config map) is created in the cluster, the autoscaler will mount it and use it to authenticate to Rancher.
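Apply it and confirm it exists before installing the chart (assuming the manifest above is saved as cloud-config-secret.yaml, a filename chosen here for illustration):

kubectl apply -f cloud-config-secret.yaml
kubectl -n kube-system get secret cluster-autoscaler-cloud-config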
If your Rancher server uses a self-signed certificate (common in lab setups), you should also provide the Rancher server’s CA certificate to the autoscaler, so it can trust the TLS connection. This can be done by creating a ConfigMap with the CA cert and mounting it in the autoscaler pod (or by adding the CA to the container’s trusted store). For simplicity, using a trusted SSL cert for Rancher or adding the cert to the autoscaler container is recommended to avoid TLS errors when the autoscaler connects to Rancher.
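If you do stick with a self-signed certificate, one possible approach (a sketch, assuming the autoscaler image picks up additional certificates from /etc/ssl/certs) is to publish the CA in a ConfigMap and mount it via the chart's extraVolumes/extraVolumeMounts values:

# ConfigMap created beforehand, e.g.:
#   kubectl -n kube-system create configmap rancher-ca --from-file=rancher-ca.crt=<path-to-ca.pem>
extraVolumes:
  - name: rancher-ca
    configMap:
      name: rancher-ca
extraVolumeMounts:
  - name: rancher-ca
    mountPath: /etc/ssl/certs/rancher-ca.crt
    subPath: rancher-ca.crt
    readOnly: true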
With credentials and cluster info ready, we can proceed to deploy the Cluster Autoscaler.
Installing Cluster Autoscaler via Rancher UI
Rancher’s Cluster Explorer UI makes it straightforward to deploy the Cluster Autoscaler using Helm charts:
- Add the Autoscaler Helm repository: In the Rancher UI, go to your RKE2 cluster and navigate to Apps & Marketplace (in older versions, "Apps" or "Catalog"). Click Repositories and add the official Kubernetes Autoscaler Helm repo:
  - Name: autoscaler (or any name you prefer)
  - URL: https://kubernetes.github.io/autoscaler
  This repository hosts the cluster-autoscaler Helm chart.
- Install the Cluster Autoscaler chart: Still in Apps & Marketplace, click Charts (or Launch from the repo) and find cluster-autoscaler. Install it into the kube-system namespace (a common practice for cluster-wide add-ons). In Rancher, select the System project and kube-system as the namespace.
- Configure chart values: The Helm chart requires certain values to be set for our RKE2 use case. The UI shows the default values, which you can override; switch to Edit as YAML for easier editing. Provide the following key overrides (merging them into the existing values):
  - autoDiscovery.clusterName: set this to your cluster's name (e.g. production-apps). This labels the autoscaler to auto-discover nodes belonging to the cluster's node group.
  - cloudProvider: set to "rancher" so that the autoscaler uses the Rancher provider logic.
  - cloudConfigPath: set to the path where the Rancher config will be mounted (e.g. /config/cloud-config). This path is inside the autoscaler container.
  - Mount the Rancher config secret: Use the chart's options to mount the secret we created. The official chart supports mounting extra secrets via extraVolumeSecrets. For example, you can add:
extraVolumeSecrets:
cluster-autoscaler-cloud-config:
name: cluster-autoscaler-cloud-config
mountPath: /config
This will mount our secret at /config in the container, and the file will be available as /config/cloud-config (since our secret key is cloud-config). Ensure cloudConfigPath matches this path (i.e. /config/cloud-config).
  - extraArgs: include any custom flags for the autoscaler. At minimum, you'll want to set:
    - --v=4 (or higher) for verbose logging (useful for troubleshooting).
    - --stderrthreshold=info and --logtostderr=true to send logs to pod stdout.
    - --balance-similar-node-groups=true to balance scale-out across similar pools.
    - --skip-nodes-with-system-pods=false if you want to allow scaling down nodes even if they run non-critical kube-system pods (by default the autoscaler skips such nodes; setting this to false means it will consider removing nodes running system pods that are replaceable, like DNS).
    - --skip-nodes-with-local-storage=false to allow removing nodes even if they have pods using local storage (the default of true skips those nodes). Be cautious: turning this off can evict pods with local storage, and their data may be lost unless the pods handle it.
    - Other tuning flags as needed (discussed in Performance Tuning below). For example, --scale-down-utilization-threshold=0.6 (60% utilization) and --scale-down-unneeded-time=10m (how long a node must be underutilized before removal) can be set as in our example.
  - nodeSelector and tolerations: configure the autoscaler to run on control-plane nodes. For RKE2, you can add:
nodeSelector:
node-role.kubernetes.io/control-plane: "true"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
- key: "node-role.kubernetes.io/etcd"
operator: "Exists"
effect: "NoExecute"
- Launch the application: After setting the values, deploy the chart. Rancher will install the cluster-autoscaler Deployment in your cluster. Verify that the cluster-autoscaler pod is running in the kube-system namespace. It should be scheduled on a master (control-plane) node (check with kubectl get pods -n kube-system -o wide to see the node).
At this point, the autoscaler is running but not yet actively managing any node pool until we configure the node pool scaling ranges (next section). The UI method conveniently uses the Helm chart's templates, which include the necessary RBAC (ServiceAccount and ClusterRole) for the autoscaler, so you typically don't need to apply those manually. If something went wrong, ensure the ServiceAccount cluster-autoscaler exists in kube-system and has the proper ClusterRole/Binding (the chart usually creates these).
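A quick way to verify the RBAC objects exist (resource names vary with the Helm release name, so grep rather than assuming exact names):

kubectl -n kube-system get serviceaccounts | grep -i autoscaler
kubectl get clusterroles,clusterrolebindings | grep -i autoscaler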
Installing Cluster Autoscaler via Helm CLI or Manifest
If you prefer using the command line or an Infrastructure-as-Code approach, you can deploy the autoscaler without Rancher’s UI:
- Using Helm CLI: First, ensure you have kubectl access to the cluster (e.g. via the kubeconfig from Rancher). Add the autoscaler Helm repo and install the chart:
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
-n kube-system --create-namespace \
--set autoDiscovery.clusterName=<your-cluster-name> \
--set cloudProvider=rancher \
--set cloudConfigPath=/config/cloud-config \
--set extraVolumeSecrets.cluster-autoscaler-cloud-config.mountPath=/config \
--set extraVolumeSecrets.cluster-autoscaler-cloud-config.name=cluster-autoscaler-cloud-config \
--set extraArgs.v=4,extraArgs.stderrthreshold=info,extraArgs.logtostderr=true \
--set extraArgs.balance-similar-node-groups=true,extraArgs.skip-nodes-with-system-pods=false,extraArgs.skip-nodes-with-local-storage=false \
--set extraArgs.scale-down-utilization-threshold=0.6,extraArgs.scale-down-unneeded-time=10m \
  --set-string nodeSelector."node-role\.kubernetes\.io/control-plane"=true \
--set tolerations[0].key="node-role.kubernetes.io/control-plane",tolerations[0].operator="Exists",tolerations[0].effect="NoSchedule" \
--set tolerations[1].key="node-role.kubernetes.io/etcd",tolerations[1].operator="Exists",tolerations[1].effect="NoExecute"
This long command adds the necessary overrides (it's equivalent to what we did in the UI). It assumes you already created the secret cluster-autoscaler-cloud-config in kube-system as described earlier. The extraVolumeSecrets values tell Helm to mount that secret. We also explicitly set various extraArgs and scheduling constraints via --set. In practice, you may prefer a values file for cleanliness; a sketch follows.
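The same overrides expressed as a values file (a sketch assuming the Secret created earlier; pass it with helm install cluster-autoscaler autoscaler/cluster-autoscaler -n kube-system -f values.yaml):

autoDiscovery:
  clusterName: <your-cluster-name>
cloudProvider: rancher
cloudConfigPath: /config/cloud-config
extraVolumeSecrets:
  cluster-autoscaler-cloud-config:
    name: cluster-autoscaler-cloud-config
    mountPath: /config
extraArgs:
  v: 4
  stderrthreshold: info
  logtostderr: true
  balance-similar-node-groups: true
  skip-nodes-with-system-pods: false
  skip-nodes-with-local-storage: false
  scale-down-utilization-threshold: "0.6"
  scale-down-unneeded-time: 10m
nodeSelector:
  node-role.kubernetes.io/control-plane: "true"
tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
  - key: node-role.kubernetes.io/etcd
    operator: Exists
    effect: NoExecute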
- Using RKE2 HelmChart Manifest: RKE2 clusters come with a Helm Controller that can deploy charts based on custom resources. You can drop a HelmChart manifest into RKE2's manifest directory (/var/lib/rancher/rke2/server/manifests on a server node) or apply it with kubectl. Below is an example HelmChart custom resource for the Cluster Autoscaler:
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
name: cluster-autoscaler
namespace: kube-system
spec:
chart: cluster-autoscaler
repo: https://kubernetes.github.io/autoscaler
targetNamespace: kube-system
bootstrap: true
valuesContent: |-
autoDiscovery:
clusterName: production-apps
cloudProvider: rancher
extraArgs:
logtostderr: true
stderrthreshold: info
v: 4
cloud-config: /mnt/config.yaml
scale-down-utilization-threshold: "0.6"
scale-down-unneeded-time: "10m"
scale-down-unready-time: "20m"
balance-similar-node-groups: "true"
expander: "least-waste"
skip-nodes-with-local-storage: "false"
skip-nodes-with-system-pods: "false"
scale-down-non-empty-candidates-count: "60"
extraVolumeMounts:
- mountPath: /mnt/config.yaml
name: autoscaler-config
readOnly: true
subPath: config.yaml
extraVolumes:
- name: autoscaler-config
configMap:
name: autoscaler-config
nodeSelector:
node-role.kubernetes.io/control-plane: "true"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
- key: "node-role.kubernetes.io/etcd"
operator: "Exists"
effect: "NoExecute"
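Note that this manifest mounts a ConfigMap named autoscaler-config (with key config.yaml) rather than the Secret used earlier; if you follow this route, you need to create that ConfigMap yourself, along these lines:

apiVersion: v1
kind: ConfigMap
metadata:
  name: autoscaler-config
  namespace: kube-system
data:
  config.yaml: |-
    url: https://<your-rancher-server>
    token: <your-api-token>
    clusterName: production-apps
    clusterNamespace: fleet-default

Since this file holds an API token, a Secret mounted the same way is the safer choice in practice.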
Regardless of the method (UI, Helm CLI, or HelmChart manifest), after deployment you should have a running cluster-autoscaler pod. Confirm it is running, then proceed to enable autoscaling on your cluster's node pools.
Enabling Node Pool Autoscaling in Rancher
Deployment alone is not enough: we must tell the autoscaler which node pools it can scale and within what range. In Rancher, each RKE2 cluster has one or more machine pools (node pools). We enable autoscaling on a pool by annotating the cluster's machine pool configuration with a minimum and maximum size.
To do this via Rancher UI:
- Go to Cluster Management in Rancher and edit your RKE2 cluster (click ⋮ -> Edit Config, then switch to the YAML view).
- In the cluster YAML, locate the machinePools section. Identify the machine pool that you want the autoscaler to manage (e.g. your worker pool). Under that machine pool, add the following annotations:
machineDeploymentAnnotations:
cluster.provisioning.cattle.io/autoscaler-min-size: "1"
cluster.provisioning.cattle.io/autoscaler-max-size: "3"
Replace the values with the desired minimum and maximum node counts for that pool. For example, if you want the pool to scale down to no fewer than 1 node and up to 3 nodes, use "1" and "3" as above. Each pool can have different limits.
- Save the changes. Rancher will update the cluster’s configuration. The autoscaler will detect these annotations via the Rancher API and know it is allowed to scale that node group between the given bounds.
If you prefer kubectl, you can also patch the MachineDeployment or the Cluster custom resource with these annotations. The annotations actually live on the MachineDeployment object in the Rancher management (local) cluster, in the same namespace as the provisioning cluster (typically fleet-default); each MachineDeployment corresponds to a node pool. The Rancher autoscaler provider reads cluster.provisioning.cattle.io/autoscaler-min-size and cluster.provisioning.cattle.io/autoscaler-max-size to determine the limits. Any pool without these annotations will be ignored by the autoscaler (it won't scale it).
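As a rough sketch (run against the Rancher management/local cluster; the MachineDeployment name is a placeholder you look up first, and Rancher may reconcile direct edits away, so the cluster-YAML approach above is the more durable option):

# List the machine deployments backing your node pools
kubectl -n fleet-default get machinedeployments.cluster.x-k8s.io
# Annotate the worker pool's MachineDeployment with scaling bounds
kubectl -n fleet-default annotate machinedeployments.cluster.x-k8s.io <your-worker-md> \
  cluster.provisioning.cattle.io/autoscaler-min-size="1" \
  cluster.provisioning.cattle.io/autoscaler-max-size="3" --overwrite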
Only worker node pools should be scaled. Do not put these annotations on your control-plane pool. Master/etcd nodes are not meant to be auto-scaled by CA, and doing so could destabilize your cluster. Stick to worker pools (Rancher will typically name them or you can identify by the roles in the YAML).
At this stage, the Cluster Autoscaler is fully configured and ready to operate. It runs in the background, monitoring your cluster’s pods and node utilization.
Testing the Autoscaler
To verify that everything is working, you can simulate a workload that triggers scaling:
- Scale Up Test: Deploy a workload that requests more resources than currently available. For example, if you have 1 small worker node, run a deployment with a couple of pods each requesting significant CPU or memory (so that not all pods can schedule at once). For instance:
apiVersion: apps/v1
kind: Deployment
metadata:
name: stress-test
spec:
replicas: 2
selector:
matchLabels:
app: stress-test
template:
metadata:
labels:
app: stress-test
spec:
containers:
- name: cpu-hog
image: busybox
command: ["sh", "-c", "yes > /dev/null"]
resources:
requests:
cpu: "2" # request 2 CPUs
limits:
cpu: "2"
If your worker node cannot fit both 2-CPU pods, at least one pod will remain Pending. Within roughly 10–30 seconds, the autoscaler should detect the unschedulable pod and increment the node pool size. In the Rancher UI you'll see a new node provisioning, and in kubectl get nodes a new node will appear once it is ready. The pending pod will then schedule onto the new node.
- Scale Down Test: After the new node is added and pods are running, reduce the load. You can delete the deployment or scale it down to 0 replicas. The autoscaler will observe that a node is now underutilized (perhaps completely empty) for a period of time (by default, 10 minutes). After that time, it should remove the extra node. Watch the Rancher UI or kubectl get nodes for one node to go away. The autoscaler will respect the autoscaler-min-size annotation (so it won't go below 1 node in our example). Any pods on the removed node are rescheduled onto the remaining nodes automatically.
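To drive the scale-down test, a couple of commands along these lines work (the deployment name matches the stress-test example above):

# Remove the load; the extra node should become empty
kubectl scale deployment stress-test --replicas=0
# After the scale-down-unneeded-time window (10 minutes by default), watch for the node to disappear
kubectl get nodes -w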
During tests, you can observe the autoscaler’s decisions by checking its logs:
kubectl -n kube-system logs -f deployment/cluster-autoscaler
The logs will indicate why it’s scaling up or down, which pool it chose, or any blockers. For example, you might see messages like “Scaled up node group X to 2” or reasons for not scaling down a node. Monitoring the logs is a great way to troubleshoot issues or confirm behavior.
Benefits of Using Cluster Autoscaler in RKE2
Using the Cluster Autoscaler with Rancher RKE2 offers several benefits:
- Automatic Right-Sizing: The autoscaler ensures your cluster always has the “right” amount of resources. It adds nodes when demand increases (ensuring critical applications aren’t starved for capacity) and removes nodes when demand drops (so you don’t pay for idle VMs). This dynamic adjustment can greatly improve cluster efficiency and uptime.
- Cost Efficiency: Especially in cloud environments, automatically removing underutilized nodes saves costs during off-peak times. Conversely, adding nodes on-demand prevents over-provisioning resources for peak capacity that might rarely be used. The combination leads to an elastic, cost-optimized infrastructure.
- Improved Resilience and User Experience: By scaling out when needed, autoscaling can handle sudden spikes in workload without manual intervention. Your users experience fewer slowdowns or failures due to resource exhaustion. The cluster can react faster than human operators in many cases, adding capacity within minutes of detecting a need.
- Synergy with Pod Autoscalers: When used alongside Horizontal Pod Autoscalers (HPA) or other scaling mechanisms, the Cluster Autoscaler completes the picture for full-stack scalability. For example, HPA might increase the replicas of a deployment, and if those new pods can't fit on existing nodes, the Cluster Autoscaler will kick in to provide new nodes for them. This synergy allows truly hands-off scaling for your applications (a minimal HPA sketch follows this list).
- Rancher Integration: In an RKE2 cluster managed by Rancher, the autoscaler leverages Rancher’s robust provisioning system. This means you can use it across multiple infrastructure providers (any supported node driver) with a consistent experience. Rancher handles the low-level provisioning (creating VMs on AWS, vSphere, etc.), while CA simply requests more or fewer nodes. The integration abstracts away cloud-specific autoscaling (no need to manually manage AWS Auto Scaling Groups, for instance, when using Rancher’s provider).
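To make that HPA/CA hand-off concrete, here is a minimal HorizontalPodAutoscaler (autoscaling/v2 API) targeting the stress-test Deployment used in the testing section; the thresholds and replica bounds are arbitrary illustrations:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stress-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stress-test
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

If the HPA raises replicas beyond what existing nodes can hold, the new pods go Pending and the Cluster Autoscaler adds nodes to fit them.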
Overall, cluster autoscaling greatly enhances cluster health by adding nodes when they are needed, and it is straightforward to set up on RKE2 clusters. It reduces the need for manual capacity planning and can provide a more resilient, cost-effective environment.
Limitations and Considerations
Despite its advantages, it’s important to understand the limitations and nuances of Cluster Autoscaler in a Rancher RKE2 context:
- Not based on actual utilization: The autoscaler makes decisions based on Kubernetes scheduler state (pending pods and requests), not live CPU/memory usage metrics. For example, if you have a node running at 10% CPU but all pods on it have large resource requests reserved, the autoscaler may consider that node fully utilized (because from the scheduler’s perspective those resources are “claimed”). This can lead to less aggressive downscaling in environments where pods over-request resources.
- Scale-down is cautious: By default, CA won't remove a node if doing so would violate certain safety checks. It avoids scaling down nodes that have:
  - Pods that cannot be moved (e.g. pods using a PersistentVolume local to the node, or pods with local storage such as emptyDir, if skip-nodes-with-local-storage=true).
  - System pods in kube-system that are not managed by a DaemonSet (unless you set --skip-nodes-with-system-pods=false as we did).
  - Recently started pods or nodes (new nodes get a stabilization period), and it respects Pod Disruption Budgets.
  These safeguards mean scale-down might not happen even when a node looks underutilized, if any unmovable pods are present. In Rancher RKE2, core components (cattle agents, etc.) typically run on master nodes or as DaemonSets on workers, so the autoscaler can usually move what it needs to. But if you run custom system pods or use local storage volumes, be aware of this limitation.
- Delay in provisioning: When a scale-up is triggered, the autoscaler requests Rancher to add a node. The reaction time includes both the autoscaler’s scan interval (up to 10 seconds by default) and the time for Rancher and the cloud provider to actually create the VM, install RKE2, and join it to the cluster. This can take a few minutes depending on your cloud/infrastructure. During this time, pending pods remain unscheduled. So, autoscaler is not instantaneous – there will be a short period where workload demand outpaces supply until the new node is ready. Plan for this in latency-sensitive scenarios (you might over-provision a bit or use faster node provisioning if possible).
- Requires Rancher-provisioned nodes: As mentioned, the Rancher cloud provider integration only works if Rancher can create/delete nodes. If your RKE2 cluster was created with custom nodes (e.g. you manually installed RKE2 on some servers and imported to Rancher), the autoscaler cannot add more because there’s no node driver to call. Similarly, if using an unsupported node driver or cloud, it may not work. Check that your environment is supported (most major cloud providers and vSphere via Rancher node drivers are supported).
- No control-plane scaling: The Cluster Autoscaler (and Rancher) will not automatically scale your control-plane/etcd nodes. High-availability masters must be planned and added manually. Autoscaler focuses on worker nodes only. Ensure your control plane has enough capacity to handle increased load (API, controller, etc.) when workers scale up.
- Potential race conditions or busy-loop: In edge cases, misconfiguration can cause autoscaler to behave unexpectedly. For example, if min and max sizes are mis-set or if there’s a constant stream of small pods landing and leaving, the cluster could oscillate. The autoscaler does have built-in backoff timers to prevent constant churn, but it’s good to monitor behavior in real scenarios.
- Rancher API Access and Permissions: Using an admin token is straightforward, but in regulated environments you might want to restrict the token. Rancher has a concept of restricted roles for autoscaler (so you don’t have to use a full admin token). Implementing that is more complex and beyond our scope here, but be aware that the autoscaler having broad API access is a potential risk. If the token lacks some permissions, the autoscaler might fail to operate fully (for instance, missing rights to list or update certain Kubernetes API objects, as noted in some Rancher autoscaler docs).
By understanding these considerations, you can better plan your cluster capacity and know what to expect from the autoscaler’s behavior.
Troubleshooting Tips
Setting up the Cluster Autoscaler can involve multiple components (Rancher, cloud provider, Kubernetes). If things aren’t working as expected, here are some troubleshooting tips:
- Check the Autoscaler Pod Status: Make sure the cluster-autoscaler pod is running (kubectl -n kube-system get pods). If it's CrashLooping or in Error, describe the pod to see why. Common issues include missing volume mounts (e.g. if the secret or config map for cloud-config wasn't created or named correctly, the pod may fail to start because the file is missing) or lack of permissions.
- View Logs for Clues: The autoscaler's logs are very informative. Look for lines indicating it recognized your node group, and for any errors. For example, if misconfigured, you might see errors about failing to contact Rancher or missing credentials. If the autoscaler is running but not scaling up when you expect, the logs might say something like “No unschedulable pods” or that the max node limit was reached for a node group; this tells you whether it sees the pending pods and what decision it made. Continuously watching the logs is a good way to verify it's actively monitoring the cluster.
- Verify Rancher Config: If scaling isn't happening, double-check the content of your cloud-config. Is the URL correct (including the /v3 path if needed, depending on Rancher version)? Is the token valid and not expired? Do clusterName and clusterNamespace exactly match the cluster? (They are case-sensitive and must match the Rancher cluster's name and namespace as seen in Rancher's cluster list or the provisioning CR.) If these are wrong, the autoscaler may silently do nothing or log authentication errors.
- Ensure Annotations are in Place: A very common oversight is forgetting to put the autoscaler-min-size/max-size annotations on the machine pool. Without those, the autoscaler will not consider any node group for scaling (it treats every pool as fixed, with size == min == max). Confirm via the Rancher UI (cluster YAML) or via kubectl get clusters.provisioning.cattle.io <cluster-name> -n fleet-default -o yaml (against the Rancher management cluster) that your worker pool has those annotations set. Also ensure the values make sense (min < max, etc.). If you update the annotations, the change should be picked up on the next autoscaler loop.
- RBAC and Permissions: If you deployed the autoscaler via Helm, it should have created a ClusterRole and bound it to the autoscaler ServiceAccount. If you applied manifests manually, ensure you also applied the provided RBAC manifest (as in the official docs or examples). The autoscaler needs permissions to list nodes, pods, etc., and to create events. If it can't, it will log errors about missing permissions. Rancher's cloud provider may also require permissions on Rancher-related API groups (like provisioning.cattle.io and cluster.x-k8s.io). If you see errors referencing those, you might need to adjust RBAC (the Rancher docs or community guides provide the needed rules).
- Connectivity and CA Certs: If the autoscaler cannot connect to the Rancher server, it won't scale. In an on-prem setup, ensure the Rancher URL is reachable from the cluster (if Rancher is private, the cluster nodes need network access to it). Also, if using a self-signed Rancher cert, ensure the autoscaler has the CA trust configured as noted earlier. Look for any TLS handshake or certificate errors in the logs.
- Node Provisioning Issues: Sometimes the autoscaler does its job, but the cloud side fails. For example, autoscaler requests a new node from Rancher, Rancher tries to create a VM but hits a quota or configuration issue. In such cases, you’d see Rancher UI showing an error provisioning a node. This isn’t directly an autoscaler fault but will prevent scale-up. Monitor the Rancher Cluster Events or the Rancher UI for errors creating new nodes if nothing appears after autoscaler says it tried to scale.
- Tune Logging and Verbosity: If you need deeper insight, increase the -v flag (verbosity) to 5 or more for additional debug information. Just be cautious: very high verbosity can generate a lot of log output.
- Simulate Conditions: If scale-down isn't happening, try to identify what pod or resource might be blocking it. For instance, run kubectl drain --ignore-daemonsets <node> with --dry-run on a node to see if Kubernetes reports any pods that cannot be evicted. This can reveal whether a certain pod or volume is preventing node removal, which might hint at a flag to adjust (like skip-nodes-with-local-storage) or simply show that those pods need to run somewhere at all times.
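A concrete form of that check (the node name is a placeholder; --dry-run=client reports blockers without evicting anything):

kubectl drain <node-name> --ignore-daemonsets --dry-run=client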
By following these tips, you can usually pinpoint why the autoscaler isn’t behaving as expected. In many cases, it’s a configuration detail or a safety mechanism that can be adjusted once understood.
Performance Tuning and Advanced Settings
The default settings of the Cluster Autoscaler are conservative to fit general use cases, but you may want to tune them for your environment. Here are some key parameters (configured via extraArgs on the deployment) and how to use them:
- Scale-Down Delay and Utilization Threshold: --scale-down-unneeded-time and --scale-down-utilization-threshold control how aggressively nodes are removed. By setting scale-down-unneeded-time=10m, we specify that a node must be underutilized for 10 minutes before being considered for removal (10 minutes is also the default, which we kept in our example). A utilization threshold of 0.6 means a node whose usage is below 60% (based on requested resources) and whose pods can fit elsewhere is eligible for removal (the default is roughly 50%). Raising this threshold makes the autoscaler more eager to remove nodes (even nodes that are, say, 50% utilized), while lowering it makes it more cautious.
- Scale-Down Candidates Count: --scale-down-non-empty-candidates-count (we set "60" in our example) adjusts how many non-empty nodes can be considered simultaneously as scale-down candidates. In very large clusters, you might increase this so the autoscaler evaluates more nodes at once for potential removal. The default is smaller (around 30); we raised it to 60 for large-cluster scenarios.
- Expander Strategy: The autoscaler's expander setting decides which node group to scale when multiple groups could fit a pending pod. We used --expander=least-waste, which chooses the node group that leaves the least idle resources after scheduling the pending pods (often a good default for cost efficiency). Other options include most-pods (chooses the group that would accommodate the most pods) and random. If you have multiple node types (e.g. some pools with GPU nodes, some with high-memory nodes), consider how you want the autoscaler to pick between them. There is also price, which chooses the cheapest option if you supply node costs.
- Balancing Similar Node Groups: We enabled --balance-similar-node-groups=true. This helps if you have identically configured node pools (for example, two pools in different availability zones). The autoscaler will try to keep their sizes balanced when scaling up and down, which prevents one pool from growing large while another stays small; that matters for availability across AZs or simply for uniform usage.
- Skipping vs. Considering Nodes for Scale-Down: We deliberately set --skip-nodes-with-system-pods=false and --skip-nodes-with-local-storage=false in our configuration to allow maximum flexibility in scaling down. By default, CA would skip any node that has non-DaemonSet kube-system pods (like a DNS or metrics-server pod) or any pod with local storage. By turning these to false, we told CA to consider removing such nodes anyway. This can improve downscaling in clusters where every node always runs some system pod (which is often true). The risk is that those pods will be terminated; in most cases Deployments like CoreDNS simply reschedule on other nodes, so it's fine. But use caution: anything stateful with local storage could lose data. Tune these flags based on how your workloads use local storage and how your system pods are deployed. If unsure, leave them at true (skip) to be safe, at the cost of possibly keeping an extra node around.
- Scale-Up Behavior: Though not set explicitly above, there are also flags like --max-node-provision-time (how long to wait for the cloud to provision a node before giving up), and the autoscaler supports scaling a node group up from 0 nodes when the pool is properly annotated. When scaling from zero, make sure at least one pending pod explicitly requires that node group, so CA knows to bring it up from 0.
- Cluster Scale Limits: The autoscaler-min-size/max-size annotations on the node pool ultimately cap the scaling. If the autoscaler stops scaling further, check whether it hit the max. Conversely, if it scaled down to the min and you expected all nodes to be removed, remember it will not go below the configured minimum. Adjust these as your usage patterns change.
- Version Compatibility: Use an autoscaler version that matches your Kubernetes version. For example, if you run Kubernetes 1.27, use a Cluster Autoscaler image tagged 1.27.*. The Helm chart often picks an image tag based on your cluster version automatically, but verify it; mismatched versions can cause subtle issues.
In our example values, we included many of these tunings. You can see how they appear as extraArgs in the manifest, e.g. --cloud-provider=rancher, --cloud-config=/mnt/config.yaml, verbosity -v=4, and the various scale-down flags. Adjusting these tailors the autoscaler's behavior to your requirements. For instance, in a dev/test cluster you might want faster scale-down to save cost (a shorter unneeded-time), whereas in production you might keep it a bit longer to avoid thrashing nodes during short traffic dips.
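As an illustration, a dev/test cluster might tighten scale-down via the chart's extraArgs (the flag names are real Cluster Autoscaler options; the values are arbitrary examples, not recommendations):

extraArgs:
  scale-down-unneeded-time: 3m             # remove idle nodes sooner than the 10m default
  scale-down-delay-after-add: 5m           # shorten the post-scale-up cooldown as well
  scale-down-utilization-threshold: "0.7"  # treat nodes under 70% requested utilization as removable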
Always monitor the impact of any tuning in a non-prod environment if possible. The defaults are usually safe; only tune when you have a clear need and understanding of the parameter.
Conclusion
The Cluster Autoscaler is a powerful addition to any Kubernetes cluster that experiences variable workloads. In Rancher RKE2 clusters, it brings cloud-like elasticity by leveraging Rancher’s provisioning APIs. We covered how to deploy it via Rancher’s UI for an easy point-and-click setup, as well as via Helm or manifests for those who prefer automation and GitOps. We also walked through configuring the necessary Rancher API access, enabling autoscaling on node pools, and testing the behavior.
By automatically scaling nodes up when resources are tight and scaling down when there’s excess capacity, the autoscaler helps ensure your cluster is always right-sized. This leads to better resilience (pods get the capacity they need) and cost savings (unused nodes don’t stick around for long). As with any automation, it’s important to understand its parameters and limits – we discussed common gotchas like delayed provisioning and the need to properly annotate node pools.
With the Cluster Autoscaler on RKE2, your Kubernetes cluster can truly grow and shrink on demand, hands-free. This not only reduces the operational burden on DevOps teams but also optimizes infrastructure usage over time. Many users find it fast and easy to set up on RKE2 clusters, and it greatly enhances the cluster's ability to self-manage. By following the best practices and tips outlined above, you can confidently implement autoscaling in your Rancher-managed Kubernetes environment and reap the benefits of a dynamic, automated infrastructure.