Create the Kubernetes service
For our next step, we have to create a service. The sample application will be exposed and reached through this service. Create a service configuration file with the following content:
$ cd /Users/bob/hpa/
$ cat service.yaml
apiVersion: v1
kind: Service
metadata:
  name: hpa-demo-deployment
  labels:
    run: hpa-demo-deployment
spec:
  ports:
  - port: 80
  selector:
    run: hpa-demo-deployment
This service will act as a front end to the deployment we created above, and we can access it via port 80.
Apply the changes:
$ kubectl apply -f service.yaml
service/hpa-demo-deployment created
We have created the service. Next, let’s list the service and see the status:
$ kubectl get svc
NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
hpa-demo-deployment   ClusterIP   10.100.124.139   <none>        80/TCP    7s
kubernetes            ClusterIP   10.100.0.1       <none>        443/TCP   172m
Here, we can see:
- hpa-demo-deployment = the name of the service
- 10.100.124.139 = the cluster IP of the service, which is open on port 80/TCP
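Because the service is of type ClusterIP, it is not reachable from outside the cluster. For a quick sanity check that the application answers before we wire up the autoscaler, a port-forward from your workstation is one option (the local port 8080 below is an arbitrary choice):
$ kubectl port-forward svc/hpa-demo-deployment 8080:80
Then, in a second terminal, request the page:
$ curl http://localhost:8080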
Install the Horizontal Pod Autoscaler
We now have the sample application running as part of our deployment, and the service is accessible on port 80. We will use the HPA to scale resources up when traffic increases and scale them back down when traffic decreases.
Let’s create the HPA configuration file as shown below:
$ cd /Users/bob/hpa/
$ cat hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo-deployment
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
Apply the changes:
$ kubectl apply -f hpa.yaml
horizontalpodautoscaler.autoscaling/hpa-demo-deployment created
Verify that the HPA was created:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo-deployment Deployment/hpa-demo-deployment 0%/50% 1 10 0 8s
The above output shows that the HPA maintains between 1 and 10 replicas of the pods controlled by the hpa-demo-deployment deployment. In the TARGETS column, 50% is the average CPU utilization the HPA aims to maintain, while 0% is the current usage.
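The manifest above uses the older autoscaling/v1 API, which only supports a CPU utilization target. On Kubernetes 1.23 and later, the same policy can be expressed with the autoscaling/v2 API; the following is a minimal equivalent sketch (not part of the original walkthrough), useful if you later want to add memory or custom metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50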
If we want to change the MIN and MAX values, we can also create or update the autoscaler imperatively with a single kubectl command.
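A likely form of that command is kubectl autoscale; the flag values below simply mirror the manifest we applied, so treat the exact invocation as an assumption rather than the original author's command:
$ kubectl autoscale deployment hpa-demo-deployment --cpu-percent=50 --min=1 --max=10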
📝Note: Since an HPA named hpa-demo-deployment already exists with the same MIN/MAX values, the command returns an error saying that it already exists.
Increase the load
So far, we have set up our EKS cluster, installed the Metrics Server, deployed a sample application, and created an associated Kubernetes service for the application. We also deployed HPA, which will monitor and adjust our resources.
To test HPA in real-time, let’s increase the load on the cluster and check how HPA responds in managing the resources.
First, let’s check the current status of the deployment:
$ kubectl get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
hpa-demo-deployment 1/1 1 1 23s
Next, we will start a container and send an infinite loop of queries to the 'hpa-demo-deployment' service, which is listening on port 80. Open a new terminal and execute the below command:
$ kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://hpa-demo-deployment; done"
📝Note: If the service name does not resolve through the cluster DNS, use the service's cluster IP in the URL instead.
To view the service details, including its cluster IP:
$ kubectl get svc
NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
hpa-demo-deployment   ClusterIP   10.100.95.188   <none>        80/TCP    10m
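If DNS resolution is the problem, the same loop can target the cluster IP shown above instead of the service name (the IP will differ in your cluster):
$ kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://10.100.95.188; done"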
Before we increase the load, the HPA status will look like this:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo-deployment Deployment/hpa-demo-deployment 0%/50% 1 10 1 12m
Once you have triggered the load test, use the below command, which watches the HPA and prints an updated line each time its status changes:
$ kubectl get hpa -w
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo-deployment Deployment/hpa-demo-deployment 0%/50% 1 10 1 15m
...
...
hpa-demo-deployment Deployment/hpa-demo-deployment 38%/50% 1 10 8 25m
Here, you can see that as our usage went up, the number of pods scaled from 1 to 7:
$ kubectl get deployment hpa-demo-deployment
NAME READY UP-TO-DATE AVAILABLE AGE
hpa-demo-deployment 7/7 7 7 21m
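Behind the scenes, the HPA only adjusts the desired replica count on the Deployment; the Deployment controller then realizes that count through its ReplicaSet. Listing the ReplicaSets should therefore show the same numbers:
$ kubectl get rs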
You can also see pod usage metrics. The load-generator pod generates the load for this example:
$ kubectl top pods --all-namespaces
NAMESPACE NAME CPU(cores) MEMORY(bytes)
default hpa-demo-deployment-6b988776b4-b2hkb 1m 10Mi
default load-generator 10m 1Mi
default hpa-demo-deployment-d4cf67d68-2x89h 97m 12Mi
default hpa-demo-deployment-d4cf67d68-5qxgm 86m 12Mi
default hpa-demo-deployment-d4cf67d68-ddm54 131m 12Mi
default hpa-demo-deployment-d4cf67d68-g6hhw 72m 12Mi
default hpa-demo-deployment-d4cf67d68-pg67w 123m 12Mi
default hpa-demo-deployment-d4cf67d68-rjp77 75m 12Mi
default hpa-demo-deployment-d4cf67d68-vnd8k 102m 12Mi
kube-system aws-node-982kv 4m 41Mi
kube-system aws-node-rqbg9 4m 40Mi
kube-system coredns-86d9946576-9k6gx 4m 9Mi
kube-system coredns-86d9946576-m67h6 4m 9Mi
kube-system kube-proxy-lcklc 1m 11Mi
kube-system kube-proxy-tk96q 1m 11Mi
kube-system metrics-server-9f459d97b-q5989 4m 17Mi
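Because the Metrics Server is already installed, node-level usage is available the same way, which is a quick check that the worker nodes themselves still have headroom for the extra pods:
$ kubectl top nodes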
Monitor HPA events
If you want to see what steps the HPA triggered while scaling, describe the deployment and check the Events section:
$ kubectl describe deploy hpa-demo-deployment
Name: hpa-demo-deployment
Namespace: default
CreationTimestamp: Mon, 30 Aug 2021 17:15:34 +0530
Labels:
Annotations: deployment.kubernetes.io/revision: 1
Selector: run=php-apache
Replicas: 7 desired | 7 updated | 7 total | 7 available | 0 unavailable
NewReplicaSet: hpa-demo-deployment-d4cf67d68 (7/7 replicas created)
...
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 12m deployment-controller Scaled up replica set hpa-demo-deployment-d4cf67d68 to 1
Normal ScalingReplicaSet 5m39s deployment-controller Scaled up replica set hpa-demo-deployment-d4cf67d68 to 4
Normal ScalingReplicaSet 5m24s deployment-controller Scaled up replica set hpa-demo-deployment-d4cf67d68 to 5
Normal ScalingReplicaSet 4m38s deployment-controller Scaled up replica set hpa-demo-deployment-d4cf67d68 to 7
We can see that the pods were scaled up from 1 to 4, then to 5, and finally to 7.
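To see the same decisions from the autoscaler's point of view, including the SuccessfulRescale events and the metrics that triggered them, you can also describe the HPA object itself:
$ kubectl describe hpa hpa-demo-deployment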
Decrease the load
Next, we’ll decrease the load. Navigate to the terminal where you executed the load test and stop the load generation by entering Ctrl + C.
Then, give the autoscaler a few minutes (by default, the HPA waits out a five-minute scale-down stabilization window before removing pods) and verify the status of your resource usage:
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hpa-demo-deployment Deployment/hpa-demo-deployment 0%/50% 1 10 1 25m
$ kubectl get deployment hpa-demo-deployment
NAME READY UP-TO-DATE AVAILABLE AGE
hpa-demo-deployment 1/1 1 1 25m
Another way to verify the status is:
$ kubectl get events
51m Normal SuccessfulCreate replicaset/hpa-demo-deployment-cf6477c46 Created pod: hpa-demo-deployment-cf6477c46-b56vr
52m Normal SuccessfulRescale horizontalpodautoscaler/hpa-demo-deployment New size: 4; reason: cpu resource utilization (percentage of request) above target
52m Normal ScalingReplicaSet deployment/hpa-demo-deployment Scaled up replica set hpa-demo-deployment-cf6477c46 to 4
52m Normal SuccessfulRescale horizontalpodautoscaler/hpa-demo-deployment New size: 6; reason: cpu resource utilization (percentage of request) above target
52m Normal ScalingReplicaSet deployment/hpa-demo-deployment Scaled up replica set hpa-demo-deployment-cf6477c46 to 6
51m Normal SuccessfulRescale horizontalpodautoscaler/hpa-demo-deployment New size: 7; reason: cpu resource utilization (percentage of request) above target
51m Normal ScalingReplicaSet deployment/hpa-demo-deployment Scaled up replica set hpa-demo-deployment-cf6477c46 to 7
53m Normal Scheduled pod/load-generator Successfully assigned default/load-generator to ip-192-168-74-193.us-west-2.compute.internal
53m Normal Pulling pod/load-generator Pulling image "busybox"
52m Normal Pulled pod/load-generator Successfully pulled image "busybox" in 1.223993555s
52m Normal Created pod/load-generator Created container load-generator
52m Normal Started pod/load-generator Started container load-generator
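If the full event stream is noisy, a field selector narrows it down to just the autoscaler's events (event field selectors such as involvedObject.kind are supported by kubectl):
$ kubectl get events --field-selector involvedObject.kind=HorizontalPodAutoscaler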
Destroy the cluster
Finally, we’ll destroy the demo EKS cluster with this command:
$ eksctl delete cluster --name my-hpa-demo-cluster --region us-west-2
2021-08-30 20:10:09 [ℹ] eksctl version 0.60.0
2021-08-30 20:10:09 [ℹ] using region us-west-2
2021-08-30 20:10:09 [ℹ] deleting EKS cluster "my-hpa-demo-cluster"
...
...
2021-08-30 20:12:40 [ℹ] waiting for CloudFormation stack "eksctl-my-hpa-demo-cluster-nodegroup-ng-1"
2021-08-30 20:12:41 [ℹ] will delete stack "eksctl-my-hpa-demo-cluster-cluster"
2021-08-30 20:12:42 [✔] all cluster resources were deleted
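To confirm that nothing was left behind (and to avoid unexpected charges), listing the clusters in the region should now come back empty:
$ eksctl get cluster --region us-west-2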