Horizontal Pod Autoscaler (HPA) is a powerful feature in Kubernetes that automatically adjusts the number of pod replicas based on resource utilization. In this guide, we’ll walk through setting up and experimenting with HPA in a Kubernetes environment.

Before we begin, make sure you have the following set up on your computer:

  • Docker
  • Kind (Kubernetes in Docker)

If you haven’t installed these tools yet, don’t worry! Check out the links in the video description for my tutorials on installing them on Windows, Mac, and Ubuntu.

  • Ubuntu – Kind & Docker
  • Mac – Kind & Docker
  • Windows – Kind & Docker

Understanding HPA

HPA is a control loop that continuously monitors pod resource usage and automatically adjusts the number of replicas to keep CPU and memory utilization at the target level you define.

For example, if we set a target CPU utilization of 50%, HPA will increase or decrease the number of pods to keep the average CPU usage across all pods close to 50% of their requested CPU.
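
As a quick illustration, the same behaviour can be requested imperatively with kubectl (the Deployment name hpa-demo here is just a placeholder):

kubectl autoscale deployment hpa-demo --cpu-percent=50 --min=1 --max=10

Keep in mind that utilization targets are calculated against the resource requests declared on the pods, so the workload must set requests for HPA to do anything useful.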

Hands-on Demo

Let’s go through the process of setting up and testing HPA in a Kubernetes cluster. We’ll use a Makefile to simplify our commands.

1. Create a Kubernetes Cluster

First, let’s create a Kind cluster:
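
The Makefile target itself isn't reproduced here; the underlying command is roughly the following (the config file name is an assumption):

kind create cluster --name hpa --config kind-config.yaml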

This command creates a cluster named “hpa” using a predefined configuration.

2. Enable Metrics Server

To use HPA, we need to enable the Metrics Server:
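
Kind clusters don't ship the Metrics Server by default. A common way to install it, including the --kubelet-insecure-tls argument Kind usually needs, looks like this (the Makefile likely wraps something equivalent):

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl patch deployment metrics-server -n kube-system --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'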

Verify that the Metrics Server is running:
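
For example:

kubectl -n kube-system get deployment metrics-server
kubectl top nodes

Once the Deployment reports 1/1 ready, kubectl top should start returning CPU and memory figures (it can take a minute after installation).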

3. Deploy a Resource-Intensive Application

Now, let’s deploy our application:
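
The application manifest from the video isn't reproduced here. Any Deployment with CPU and memory requests will work; as a stand-in, the sketch below uses the upstream HPA example image (registry.k8s.io/hpa-example, a small PHP/Apache app that burns CPU on every request), and all names are placeholders:

# Resource requests are required: HPA utilization targets are percentages of these requests.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hpa-demo
  template:
    metadata:
      labels:
        app: hpa-demo
    spec:
      containers:
      - name: hpa-demo
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-demo
spec:
  selector:
    app: hpa-demo
  ports:
  - port: 80
    targetPort: 80
EOF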

4. Create a Traffic Generator

We’ll need a way to generate load on our application:
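
A simple option is a long-running Alpine pod inside the cluster (the pod name and image are assumptions, not necessarily what the video uses):

kubectl run traffic-generator --image=alpine -- sleep 86400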

Add the wrk load testing tool to our traffic generator:
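
wrk is packaged in Alpine's community repository, so it should be installable directly inside the running pod:

kubectl exec -it traffic-generator -- apk add --no-cache wrk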

5. Apply HPA Configuration

Let’s apply our HPA configuration:
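
The actual manifest isn't shown here; a representative autoscaling/v2 HorizontalPodAutoscaler that scales the placeholder Deployment on both CPU and memory (the thresholds are assumptions) looks like this:

cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
EOF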

6. Monitor Resources

To see how our cluster is performing, we can monitor node and pod resources:
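
With the Metrics Server running, kubectl top covers both levels, and watch gives a live view:

kubectl top nodes
kubectl top pods
watch -n 2 kubectl top pods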

7. Generate Load

Now, let’s generate some load on our application:

For CPU-intensive load:
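
Assuming the traffic-generator pod and the hpa-demo Service from the sketches above, a sustained wrk run against the app keeps its CPU busy:

kubectl exec -it traffic-generator -- wrk -t4 -c50 -d300s http://hpa-demo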

For memory-intensive load:
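
How memory pressure is generated depends entirely on the application. If it exposes an endpoint that allocates memory per request, point wrk at that instead (the /memory path below is purely hypothetical):

kubectl exec -it traffic-generator -- wrk -t4 -c50 -d300s http://hpa-demo/memory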

8. Observe HPA in Action

While the load is being generated, observe how HPA adjusts the number of pods:
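
Two watches make the scaling visible: one on the HPA object (current vs. target utilization and replica count) and one on the pods themselves:

kubectl get hpa -w
kubectl get pods -w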

You should see the number of pods increasing as the load increases, and decreasing as it subsides.

Cleanup

Once you’re done experimenting, clean up your cluster:
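
Deleting the Kind cluster removes everything created above:

kind delete cluster --name hpa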

Conclusion

In this hands-on guide, we’ve explored how to set up and use Horizontal Pod Autoscaler in Kubernetes. HPA is a powerful tool for maintaining application performance and efficiency by automatically scaling your workloads based on resource utilization.

However, it’s important to note that HPA is just one of several scaling options available in Kubernetes and cloud environments:

  1. Vertical Pod Autoscaler (VPA): Unlike HPA, which scales the number of pods, VPA adjusts the CPU and memory resources of existing pods. This can be useful when you want to optimize resource allocation without increasing the pod count.
  2. Cluster Autoscaler: This solution scales the number of nodes in your cluster. It’s particularly useful in cloud environments where you can dynamically add or remove virtual machines.
  3. Custom Metrics Autoscaling: Kubernetes allows you to scale based on custom metrics, not just CPU and memory. This can be valuable for applications with specific performance indicators.
  4. Cloud-specific Solutions: Many cloud providers offer their own autoscaling solutions that integrate well with Kubernetes. For example:
  • AWS: Elastic Kubernetes Service (EKS) with Cluster Autoscaler
  • Google Cloud: GKE Autopilot
  • Azure: Azure Kubernetes Service (AKS) with Virtual Machine Scale Sets

You can find more Kubernetes tutorials and tips on my blog at https://thiagodsantos.com/blog/.
