This post mainly comes as a result of an interview I did at the beginning of 2023, and while I won't mention the company, I will say the technical interview was to build an app that worked with Kubernetes. Before looking at that app, we need to explain Kubernetes.
Kubernetes is a topic that could fill a thousand hours of content; my goal in this post is for the reader to understand what it is and what it can do, so if you want to dive deeper, you will have a basic foundation.
1 - What is Kubernetes?
If you are reading this, I assume you know what a container is and why containers are important.
Kubernetes is a tool that allows us to manage and automate the deployment, scaling, and operation of containers.
This matters because with Kubernetes we can configure applications to scale and decide which settings they should scale with.
For example, on a travel website in Spain, we hardly have any traffic at night because everyone is asleep, but during the day there is much more, so we need to increase server capacity. The same thing happens before the summer vacation period: everyone logs in to check prices, so again we have to scale up.
If we stop thinking of the server as a single machine with RAM and a processor, and instead see it as a complete application, the concept becomes much easier to understand.
Going back to the example: at night, when everyone is asleep, we only need the minimum resources for the app to work, so we have our frontend (ideally a static deployment) calling an API. The frontend does not call the API container directly, though. It calls the Kubernetes cluster, which routes the request (through a load balancer) to the right container. In Kubernetes, containers run inside units called pods.
Before continuing: the previous paragraph mentioned several terms. It is not mandatory to know exactly how each one works, but it's good to know what they are:
- Kubernetes cluster: a group of nodes working together to run containerized applications.
- Node: machines, either virtual or physical, that run the pods.
- Pods: the smallest unit Kubernetes runs, wrapping one or more containers, each started from an image. In the companies I've worked for, we always used one container per pod, but you can place more than one in a pod if necessary.
- Services: although I haven't mentioned them yet, services exist and allow pods to communicate with each other.
Returning to our use case, during the day we have much more traffic and need 3 instances of the pod to make sure everything runs smoothly.
And during certain periods, when there's even more traffic, we need 5 instances.
Therefore, Kubernetes is the platform that manages the infrastructure to make the necessary changes to allow this to happen. This kind of software is usually called an orchestrator.
Kubernetes (also referred to as k8s) doesn't just handle auto-scaling. If a container breaks for any reason, it replaces it with a new one, allowing us to achieve high availability in our applications.
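As a rough sketch of what "3 instances" looks like in practice, the manifest below (the file format is covered in section 2) asks Kubernetes to keep three copies of a pod running. The names travel-api and example/travel-api:1.0, and port 8080, are placeholders I made up for the travel website example, not a real image.

```yaml
# Hypothetical Deployment for the travel website API.
# "travel-api" and "example/travel-api:1.0" are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: travel-api
spec:
  replicas: 3                # desired number of pods (the daytime scenario)
  selector:
    matchLabels:
      app: travel-api        # must match the pod template labels below
  template:
    metadata:
      labels:
        app: travel-api
    spec:
      containers:
        - name: travel-api
          image: example/travel-api:1.0
          ports:
            - containerPort: 8080   # assumed port the API listens on
```

If one of the three pods crashes or its node goes away, the Deployment notices that only two are left and starts a replacement, which is the self-healing behaviour described above.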
And as developers, that's the main thing we need to know about k8s.
I'll mention a couple of features that I know about, but Kubernetes is a vast and complex world that requires a long course to understand fully.
1.1 - What is etcd in Kubernetes
etcd is the distributed key-value database that Kubernetes uses to store the state of the cluster: things like configuration, service locations, coordination data, and secrets.
1.2 - Portability of Kubernetes
The main advantage in my opinion, and the reason it's used by so many companies, is that Kubernetes can run on any cloud service provider or even on your own machine.
Furthermore, this portability comes from the way Kubernetes resources are defined: they are described in YAML files, as we'll see later, and the same files work wherever the cluster runs.
1.3 - Kubernetes on your local machine
If you are working locally, the easiest way to set up Kubernetes is to have Docker Desktop installed, because it comes with Kubernetes (although you need to enable it in the settings).
2 - Kubernetes Manifest
The Kubernetes manifest is a file that specifies how Kubernetes will create and manage a specific resource in a Kubernetes cluster.
This file can be written in either yaml or json and describes the state of Kubernetes elements, where you define deployments, replicas, services, etc.
This is the file that allows you to define the ideal or desired state, so it contains the configuration that will ensure your application scales when needed.
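To make the "scales when needed" part concrete, here is a minimal sketch of a HorizontalPodAutoscaler covering the travel website scenario from section 1 (one pod at night, up to five at peak). It assumes a Deployment called travel-api exists (a placeholder name), that its containers declare CPU requests, and that the cluster has a metrics source such as metrics-server installed. The 70% threshold is only an illustrative value.

```yaml
# Hypothetical autoscaler; "travel-api" is a placeholder Deployment name.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: travel-api
spec:
  scaleTargetRef:            # the object whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: travel-api
  minReplicas: 1             # night-time minimum
  maxReplicas: 5             # peak-season maximum
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU goes above 70%
```

The point is that you describe the limits and the trigger, and Kubernetes decides how many pods to run at any given moment.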
2.1 - Parts of a Kubernetes manifest
The file is based on 4 parts that you write, plus a fifth one that Kubernetes maintains for you:
- apiVersion: defines the API version to use.
- kind: defines the type of object you are defining, whether it's a pod, deployment, or service.
- metadata: includes labels, names and annotations.
- spec: defines the desired state of a Kubernetes object, as well as its properties and settings.
- status: describes the current state of an object, such as the number of replicas and the version. You don't write this part; Kubernetes fills it in and keeps it up to date.
Each kind of object has its own set of fields under spec.
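As a small illustration of those parts, this is roughly what a minimal manifest for a single pod could look like. The name hello-web and the choice of the nginx image are just placeholders for the example.

```yaml
apiVersion: v1          # which version of the Kubernetes API this object uses
kind: Pod               # the type of object being described
metadata:               # identifying information
  name: hello-web       # placeholder name
  labels:
    app: hello-web
spec:                   # the desired state
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
# There is no "status" section here: Kubernetes adds it to the stored object.
```

Once the object exists in the cluster, kubectl get pod hello-web -o yaml prints it back, this time including the status section.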
2.2 - Example of a k8s file
Now let's look at an example file, in this case with MySQL, since the database is part of most infrastructures out there.
The first part defines the PersistentVolume. Here we set up the storage on the node that will hold the MySQL data, in this case a directory on disk at the path defined in hostPath. If you use AWS, Azure, Google Cloud, etc., this is where you would define the provider's storage configuration instead.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data/mysql"
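In the same file we define the PersistentVolumeClaim, which requests a volume with the specified storage amount and access mode. The claim does not name the PersistentVolume directly; Kubernetes binds it to a volume that satisfies the request, such as the PersistentVolume we created (or one provisioned dynamically if the cluster has a default StorageClass).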
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data/mysql"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
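We move on to the definition of the service we want to use, in this case mysql. A Service defines how a pod is exposed and gives other workloads a stable way to communicate with it.
Much like port mapping in Docker, the Service's port (what clients connect to) does not have to match targetPort (what the container listens on), so if the default MySQL port is not the one you want to expose, you can pick another. Two caveats: a Service like this one is only reachable from inside the cluster, so to connect from your own machine you need something like kubectl port-forward or a NodePort/LoadBalancer Service; and because this example sets clusterIP: None (a headless Service), DNS resolves straight to the pod's IP, so the port remapping only takes effect if you remove that line.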
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data/mysql"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  ports:
    - port: 5306
      targetPort: 3306
      protocol: TCP
  selector:
    app: mysql
  clusterIP: None
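If you want to reach MySQL from outside the cluster without port-forwarding, one option is a second Service of type NodePort. This is only a sketch under my own assumptions: the name mysql-external and the port 30306 are made up for the example, and it is an alternative to the Service above, not part of the original file.

```yaml
# Hypothetical alternative Service that opens a port on every node.
apiVersion: v1
kind: Service
metadata:
  name: mysql-external     # placeholder name
spec:
  type: NodePort
  selector:
    app: mysql             # same label the mysql pod carries
  ports:
    - port: 3306           # port of the Service inside the cluster
      targetPort: 3306     # port the MySQL container listens on
      nodePort: 30306      # must fall in the default 30000-32767 range
      protocol: TCP
```

With this in place, a client outside the cluster would connect to the node's address on port 30306. For a real database you would usually think twice before exposing it this way.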
Finally, the deployment contains all the internal information for the container. In our case we specify which image and version (mysql version 5.7), environment variables, the port, and the volumes we'll use, referencing those we created at the beginning.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data/mysql"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  ports:
    - port: 5306
      targetPort: 3306
      protocol: TCP
  selector:
    app: mysql
  clusterIP: None
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - image: mysql:5.7 #arm64v8/mysql
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "your-root-password"
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-storage
          persistentVolumeClaim:
            claimName: mysql-pvc
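One thing worth pointing out in the deployment is that the root password sits in plain text in the manifest. A common alternative, sketched here, is to move it into a Secret and reference it from the container. The Secret name mysql-secret and the key root-password are names I chose for the example.

```yaml
# Hypothetical Secret holding the MySQL root password.
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret          # placeholder name
type: Opaque
stringData:                   # plain text here; Kubernetes stores it base64-encoded
  root-password: "your-root-password"
```

The env entry of the container would then read the value from the Secret instead of hard-coding it (this is a fragment of the Deployment, not a file to apply on its own):

```yaml
env:
  - name: MYSQL_ROOT_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysql-secret    # the Secret defined above
        key: root-password    # which key inside the Secret to read
```

This keeps the password out of the deployment definition, although the Secret manifest itself still needs to be kept out of version control or encrypted.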
We can have all the definitions in a single file (if we separate each one with "---") or have a file for each definition.
In my opinion, the choice depends on how large your system is. If you have 10 services, use multiple files; if you just have a database and an app, a single file is fine.
If you save this information in a file called deployment.yaml, you can apply it with the command below:
kubectl apply -f deployment.yaml
And you'll get a result like the following:
persistentvolume/mysql-pv created
persistentvolumeclaim/mysql-pvc created
service/mysql created
deployment.apps/mysql created
If you now run kubectl get pods, you should see the MySQL pod running, and you can connect to it without issues.
And if you're like me and use Docker Desktop, you can see the container there too.
3 - What is Helm?
Finally, we have Helm. If you go to its GitHub repo, you'll see that it's a package manager for Kubernetes, and its pre-configured packages are called charts.
A chart is nothing more than a combination of everything we saw in the previous section; in other words, all that an application requires to run correctly and be deployed as a unit.
So, instead of writing all the configuration shown above, you only need a single command such as helm install mysql stable/mysql. Note that Helm 3 expects a release name (here mysql) before the chart, and that the old stable repository has been deprecated, so in practice you would point to whichever chart repository hosts the chart you want.
That Helm package contains everything needed for the app to work.
Another of Helm's features is that it supports templates and variables. So if you use Kubernetes at your company, you typically have a helm chart for backend apps and another for front end apps, which contain variables that allow you to specify the image, name, or number of replicas. This saves a lot of configuration, leaving only a handful of lines in each repository. That configuration will be passed to the template, and all the apps will work the same way.
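To give an idea of what those "handful of lines" might look like, here is a sketch of a values file that an application repository could keep. The keys (replicaCount, image.repository, image.tag) follow a common convention, but every chart defines its own, so treat both the keys and the values as assumptions.

```yaml
# values.yaml in the application's repository (hypothetical values)
replicaCount: 3
image:
  repository: example/travel-api   # placeholder image name
  tag: "1.0"
```

And this is an excerpt of how a shared chart template could consume them. The {{ ... }} expressions are Helm's template syntax and are filled in when the chart is installed:

```yaml
# templates/deployment.yaml in the shared chart (excerpt, not a complete manifest)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: app
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

Each app only changes its values file, while the template, and therefore the way everything is deployed, stays the same for all of them.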
4 - Should I learn Kubernetes as a developer?
It's important to have a basic knowledge of Kubernetes if you work as a back-end or full-stack developer, mostly because most companies use containers, and anything you learn about working with them will be useful.
Now, going to the extreme of knowing everything by heart isn't really necessary. Knowing the basic commands to list pods or read logs, plus being able to follow the documentation, should be enough as a developer. If you go beyond that and learn k8s well (I'm not sure anyone in the world truly knows Kubernetes perfectly), you'll become a hybrid between developer and DevOps, which is really great and very well paid 😉.
If you run into any problem, you can leave a comment below or contact me through the website's contact form.