Kubernetes Horizontal Pod Autoscaling

  • Cluster Autoscaler
  • Horizontal Pod Autoscaler
  • Vertical Pod Autoscaler

Horizontal Pod Autoscaler API Versions

$ kubectl api-versions | grep autoscaling
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2

Requirements

$ kubectl top pods
error: Metrics API not available

Once the Metrics Server is installed, the same command returns per-pod usage:

$ kubectl top pods
NAME                              CPU(cores)   MEMORY(bytes)
metrics-server-7d9f89855d-l4rrz   7m           17Mi

Installation of Metrics Server

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Alternatively, install the chart with Helm:

$ helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
$ helm upgrade --install metrics-server metrics-server/metrics-server
Release "metrics-server" does not exist. Installing it now.
NAME: metrics-server
LAST DEPLOYED: Wed Sep 22 16:16:55 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
***********************************************************************
* Metrics Server *
***********************************************************************
Chart version: 3.5.0
App version: 0.5.0
Image tag: k8s.gcr.io/metrics-server/metrics-server:v0.5.0

Verifying the Installation

$ kubectl top pods
NAME                              CPU(cores)   MEMORY(bytes)
metrics-server-7d9f89855d-l4rrz   7m           15Mi

$ kubectl top nodes
NAME             CPU(cores)   MEMORY(bytes)
docker-desktop   380m         2028Mi
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metadata": {
        "name": "docker-desktop",
        "creationTimestamp": "2021-10-04T12:33:01Z",
        "labels": {
          "beta.kubernetes.io/arch": "amd64",
          "beta.kubernetes.io/os": "linux",
          "kubernetes.io/arch": "amd64",
          "kubernetes.io/hostname": "docker-desktop",
          "kubernetes.io/os": "linux",
          "node-role.kubernetes.io/master": ""
        }
      },
      "timestamp": "2021-10-04T12:32:07Z",
      "window": "1m0s",
      "usage": {
        "cpu": "380139514n",
        "memory": "2077184Ki"
      }
    }
  ]
}
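The Metrics API reports raw quantities in Kubernetes resource notation: CPU in nanocores (`n` suffix) and memory in kibibytes (`Ki`), whereas kubectl top shows millicores and mebibytes. A small sketch of that conversion (the helper names are hypothetical, for illustration only, and handle just the two suffixes seen in the output above):

```python
# Convert Metrics API quantities into the units `kubectl top` displays.
# Only the "n" (nanocores) and "Ki" (kibibytes) suffixes from the
# node metrics above are handled; real clients support more suffixes.

def cpu_to_millicores(quantity: str) -> int:
    """'380139514n' (nanocores) -> millicores, truncated."""
    if quantity.endswith("n"):
        return int(quantity[:-1]) // 1_000_000
    raise ValueError(f"unsupported CPU quantity: {quantity}")

def memory_to_mebibytes(quantity: str) -> int:
    """'2077184Ki' (kibibytes) -> mebibytes, truncated."""
    if quantity.endswith("Ki"):
        return int(quantity[:-2]) // 1024
    raise ValueError(f"unsupported memory quantity: {quantity}")

print(cpu_to_millicores("380139514n"))   # 380  (i.e. 380m)
print(memory_to_mebibytes("2077184Ki"))  # 2028 (i.e. ~2028Mi)
```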
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/web-servers-65c7fc644d-5h6mb | jq
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "web-servers-65c7fc644d-5h6mb",
    "namespace": "default",
    "creationTimestamp": "2021-10-04T12:36:48Z",
    "labels": {
      "app": "web-servers",
      "pod-template-hash": "65c7fc644d"
    }
  },
  "timestamp": "2021-10-04T12:35:55Z",
  "window": "54s",
  "containers": [
    {
      "name": "nginx",
      "usage": {
        "cpu": "0",
        "memory": "6860Ki"
      }
    }
  ]
}
$ kubectl get hpa
NAME          REFERENCE                TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
web-servers   Deployment/web-servers   <unknown>/20%   1         10        1          8m6s

Configuring Horizontal Pod Autoscaling

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-servers
  labels:
    app: web-servers
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-servers
  template:
    metadata:
      labels:
        app: web-servers
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 100m
            requests:
              cpu: 50m
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: web-servers
  name: web-servers
  namespace: default
spec:
  ports:
    - name: web-servers-port
      port: 80
  selector:
    app: web-servers
  sessionAffinity: None
  type: NodePort

autoscaling/v1 API Version

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-servers-v1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-servers
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 20
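For utilization-based scaling, the HPA control loop computes the desired replica count with the formula from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), clamped to the configured min/max bounds. A minimal sketch of that calculation (illustrative only, not the actual controller code):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """The documented HPA formula:
    ceil(currentReplicas * currentMetricValue / desiredMetricValue),
    clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# One replica observing 48% CPU utilization against a 20% target:
print(desired_replicas(1, 48, 20))  # 3
```

The ceiling means the HPA always rounds up, so even a slight overshoot of the target adds a pod; the real controller also applies a tolerance band and a stabilization window before acting.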

autoscaling/v2beta2 API Version

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-servers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-servers
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 30Mi
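Note that autoscaling/v2beta2 graduated to stable as autoscaling/v2 in Kubernetes 1.23, and the beta versions have since been deprecated. On newer clusters the same spec works with only the apiVersion changed, for example for the CPU metric:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-servers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-servers
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20
```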
$ kubectl get hpa
NAME          REFERENCE                TARGETS                MINPODS   MAXPODS   REPLICAS   AGE
web-servers   Deployment/web-servers   6930432/30Mi, 0%/20%   1         10        1          10d
$ kubectl describe hpa web-servers
Name:                                                  web-servers
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 04 Oct 2021 15:39:00 +0300
Reference:                                             Deployment/web-servers
Metrics:                                               ( current / target )
  resource memory on pods:                             6930432 / 30Mi
  resource cpu on pods (as a percentage of request):   0% (0) / 20%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:                                                <none>

Operation of Horizontal Pod Autoscaling

$ kubectl port-forward svc/web-servers 8080:80
$ hey -n 10000 -c 5 http://localhost:8080/
$ kubectl get hpa web-servers
NAME          REFERENCE                TARGETS                  MINPODS   MAXPODS   REPLICAS   AGE
web-servers   Deployment/web-servers   20049920/30Mi, 48%/20%   1         10        1          14d
$ kubectl get hpa web-servers
NAME          REFERENCE                TARGETS                     MINPODS   MAXPODS   REPLICAS   AGE
web-servers   Deployment/web-servers   9233066666m/30Mi, 66%/20%   1         10        10         11d
$ kubectl describe hpa web-servers
Name:                                                  web-servers
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 04 Oct 2021 15:39:00 +0300
Reference:                                             Deployment/web-servers
Metrics:                                               ( current / target )
  resource memory on pods:                             9233066666m / 30Mi
  resource cpu on pods (as a percentage of request):   66% (33m) / 20%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       10 current / 10 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooManyReplicas      the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  4m1s  horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  3m1s  horizontal-pod-autoscaler  New size: 6; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  2m    horizontal-pod-autoscaler  New size: 10; reason: cpu resource utilization (percentage of request) above target
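The rescale events line up with the HPA's documented formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue): one replica observing 48% CPU against the 20% target calls for ceil(1 × 48 / 20) = 3 pods, matching the first event, and at 66% utilization with 10 replicas the raw recommendation far exceeds maxReplicas, which is why ScalingLimited reports TooManyReplicas. A quick check of that arithmetic (illustrative):

```python
import math

# First rescale: 1 replica at 48% CPU utilization vs. a 20% target.
first = math.ceil(1 * 48 / 20)
print(first)  # 3 -> "New size: 3"

# At 66% utilization with 10 replicas the raw recommendation is 33,
# which the HPA clamps to maxReplicas=10 (hence TooManyReplicas above).
raw = math.ceil(10 * 66 / 20)
clamped = min(10, raw)
print(raw, clamped)  # 33 10
```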
$ kubectl describe deployments web-servers
Name:                   web-servers
Namespace:              default
CreationTimestamp:      Mon, 04 Oct 2021 15:43:14 +0300
Labels:                 app=web-servers
Annotations:            deployment.kubernetes.io/revision: 3
Selector:               app=web-servers
Replicas:               10 desired | 10 updated | 10 total | 10 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=web-servers
  Containers:
   nginx:
    Image:        nginx
    Port:         80/TCP
    Host Port:    0/TCP
    Limits:
      cpu:  100m
    Requests:
      cpu:        50m
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type         Status  Reason
  ----         ------  ------
  Progressing  True    NewReplicaSetAvailable
  Available    True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   web-servers-77cbb55d6 (10/10 replicas created)
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  4m50s  deployment-controller  Scaled up replica set web-servers-77cbb55d6 to 3
  Normal  ScalingReplicaSet  3m50s  deployment-controller  Scaled up replica set web-servers-77cbb55d6 to 6
  Normal  ScalingReplicaSet  2m49s  deployment-controller  Scaled up replica set web-servers-77cbb55d6 to 10
$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-7d9f89855d-l4rrz   1/1     Running   13         23d
web-servers-77cbb55d6-2vrn5       1/1     Running   0          3m30s
web-servers-77cbb55d6-7ps7k       1/1     Running   0          5m31s
web-servers-77cbb55d6-8brrm       1/1     Running   0          4m31s
web-servers-77cbb55d6-gsrk8       1/1     Running   0          4m31s
web-servers-77cbb55d6-jwshp       1/1     Running   0          11d
web-servers-77cbb55d6-qg9fz       1/1     Running   0          3m30s
web-servers-77cbb55d6-ttjz2       1/1     Running   0          3m30s
web-servers-77cbb55d6-wzbwt       1/1     Running   0          5m31s
web-servers-77cbb55d6-xxf7q       1/1     Running   0          3m30s
web-servers-77cbb55d6-zxglt       1/1     Running   0          4m31s

Conclusion

Further Reading

Loft Labs