Introduction to Distributed Systems

📅 27 Dec 2021 ⏱️ 15 min

Many companies are now moving away from the monolith and adopting distributed systems.

But the move brings its own share of headaches, because the two architectures are very different.

The main difference is that in a monolith we tend to ignore the boundaries between the domains of the application, for example which parts of the database a given process may access. That has benefits, such as performance: calling the database directly is faster than calling an endpoint that then calls the database.

The problem appears when we go from hundreds of requests to thousands and need to scale the system. At that point we may find that the monolith cannot scale the way we need, or that microservices would make the system much more robust and reliable.

What I mean is that distributed systems are one viable solution, just as a monolith or a modular monolith is.

Each solution has its pros and cons. Using distributed systems for a small system like a blog is overkill; in that case, a monolith or modular monolith would be ideal.

1 - What is a distributed system?
- 1.1 - Advantages of distributed systems
- 1.2 - Disadvantages of distributed systems
2 - Distributed systems vocabulary
- 2.1 - Vendor abstractions
Conclusion

1 - What is a distributed system?

As its name suggests, a distributed system is a system whose components are located on different machines in a network. These machines can be physical (different servers) or virtual (virtual machines, containers, pods, etc.), but they are configured so that they can communicate with each other and so that, from the user's perspective, it is not apparent that several servers or machines are involved.

A very common example of a distributed system would be a microservices architecture, but microservices are just the tip of the iceberg. Throughout this course, we will see different techniques that will allow us to build fully distributed systems in a reliable way.

1.1 - Advantages of distributed systems

Reliability: The infrastructure does not live on a single machine, and we usually run replicas of each component. If one machine fails, for example because a hard drive dies, the application should keep working while we replace the damaged drive.
Scalability: One of the greatest benefits, and a big advantage over monoliths, is that we can scale parts of the application individually. If one part receives a heavier load at a certain hour of the day, we only need to scale that part, horizontally or vertically, to handle the extra demand.
Performance: Related to the previous point, we can distribute the workload among different machines so that heavy work in one part does not slow down the others.
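To make the reliability point a bit more concrete, here is a minimal sketch of failover between replicas. It is an illustration only: the replicas are plain Python functions standing in for real servers, and the names call_with_failover, broken_replica and healthy_replica are made up for this example.

```python
from typing import Callable, List, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # This replica is down: remember the error and try the next one.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed for {request!r}")


def broken_replica(request: str) -> str:
    # Hypothetical replica that simulates a machine with a failed hard drive.
    raise ConnectionError("replica-1 is unreachable")


def healthy_replica(request: str) -> str:
    # Hypothetical replica that is still up and answers normally.
    return f"replica-2 handled {request}"
```

Calling call_with_failover([broken_replica, healthy_replica], "GET /products") returns the answer from the second replica, so the caller never notices that the first one is down. In a real system this job usually belongs to a load balancer or the platform rather than to application code.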

1.2 - Disadvantages of distributed systems

Latency: Because everything is separated, the different systems have to communicate over the network, which introduces some delay (latency) compared to architectures like monoliths, where the same call happens in-process.
Observability: A distributed architecture has more applications, more independently running processes, more triggers, etc. To avoid losing track as the system grows, we must be rigorous about observability and use software and techniques that let us follow the flow of data. This is very useful when something goes wrong and also helps us understand the system better.
The mindset shift: With distributed systems we must change not only the way we design systems but also the way we program, using new techniques. This change takes time and practice. If you are new to the world of distributed systems, remember that nobody is born with this knowledge and every change requires dedication, so don’t get discouraged.
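One common technique for following the flow of data across services, mentioned under observability above, is to attach a correlation ID to every request and log it everywhere. The sketch below is only an illustration under some assumptions: the header name X-Correlation-Id is a widespread convention rather than something this post prescribes, the two services are plain functions, and the LOG list stands in for a centralized log store.

```python
import uuid
from typing import Dict, List

CORRELATION_HEADER = "X-Correlation-Id"  # assumed header name (common convention)

# Stand-in for a centralized log store shared by every service.
LOG: List[str] = []


def log(service: str, headers: Dict[str, str], message: str) -> None:
    """Write a log line prefixed with the correlation ID of the request."""
    LOG.append(f"[{headers[CORRELATION_HEADER]}] {service}: {message}")


def ensure_correlation_id(headers: Dict[str, str]) -> Dict[str, str]:
    """Reuse the caller's correlation ID, or create one at the edge of the system."""
    result = dict(headers)
    result.setdefault(CORRELATION_HEADER, str(uuid.uuid4()))
    return result


def inventory_service(headers: Dict[str, str]) -> None:
    # Hypothetical downstream service.
    headers = ensure_correlation_id(headers)
    log("inventory", headers, "stock reserved")


def order_service(headers: Dict[str, str]) -> str:
    # Hypothetical entry service; returns the correlation ID it used.
    headers = ensure_correlation_id(headers)
    log("orders", headers, "order received")
    # Forward the same headers so the downstream call shares the ID.
    inventory_service(headers)
    return headers[CORRELATION_HEADER]
```

After a call to order_service({}), every line in LOG for that request starts with the same ID, so filtering the logs by that value shows the whole path of the request across both services.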

2 - Distributed systems vocabulary

With this change in mindset come new technologies, and they can be confusing at first, especially because they often don’t make sense in isolation. Once you see the final result, it all fits together.

That’s why, throughout this course, we will explore these techniques or ways of working.

In this post, we’re just scratching the surface by mentioning the names and shedding a little light on their meanings.

This will be the initial architecture we will work on:

[Image: distributed architecture]

It’s important to note that there will be a dedicated post for each of these technologies, with its implementation, and the code will always be available in the Distribt project on GitHub.

Some of the elements we will see are:

API Gateway Pattern: A service that gives us a single entry point to our application.
Service discovery/registry: The service that lets us locate other services, whether they are internal or external.
Synchronous communication with HTTP or gRPC.
Asynchronous communication using queues or message buses with the producer-consumer pattern.
Eventual consistency: All microservices end up with the updated information, but eventually rather than at the same instant (the delay is usually milliseconds).
Sagas, CQRS, and much more!
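As a first taste of two of the items above, asynchronous communication and eventual consistency, here is a minimal sketch. It is not how the Distribt project implements them: an in-process queue.Queue stands in for a real message broker, and the Products and Orders services, the PriceChanged event and all function names are hypothetical.

```python
import queue
from typing import Dict

# In-process stand-in for a message broker (an assumption for this sketch).
bus: "queue.Queue[dict]" = queue.Queue()

# Each service owns its own data.
products_db: Dict[str, float] = {}         # owned by the hypothetical Products service
orders_price_cache: Dict[str, float] = {}  # read copy kept by the hypothetical Orders service


def update_price(product_id: str, price: float) -> None:
    """Producer: write locally, then publish an event instead of calling Orders."""
    products_db[product_id] = price
    bus.put({"type": "PriceChanged", "product_id": product_id, "price": price})


def consume_pending_events() -> int:
    """Consumer: apply every queued event to the Orders read copy."""
    handled = 0
    while True:
        try:
            event = bus.get_nowait()
        except queue.Empty:
            # Nothing left to process for now.
            return handled
        if event["type"] == "PriceChanged":
            orders_price_cache[event["product_id"]] = event["price"]
        handled += 1
```

Right after update_price("book", 12.5), the Products service already has the new price but the Orders copy does not. Only once consume_pending_events() has run do both agree. That window, usually very short in practice, is what eventual consistency refers to.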

2.1 - Vendor abstractions

A vendor is simply the application, software, or service you use to implement a feature. For example, for the database it could be MySQL, MSSQL, MariaDB, SQLite, MongoDB, DynamoDB, Oracle, etc.

Each one has a different API, which means that for every vendor we want to use, we must write a vendor-specific implementation.

For me, it is very important to put that logic in common code, used by multiple services as a library, which abstracts the vendor details and exposes a single entry point that is agnostic to the vendor we're going to use.

These libraries are usually called Common or Shared, and I personally recommend them: if one day you want to switch from one vendor to another, you only have to make the change once and then update the library in the services that use it. Remember that this step can be automated.
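A vendor abstraction of this kind could look roughly like the sketch below. It is an illustration, not the code of the Distribt project: DocumentStore, InMemoryStore, SqliteStore and register_product are names invented for this example, and SQLite is used only because Python ships with it.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DocumentStore(ABC):
    """Vendor-agnostic entry point that the services depend on (hypothetical name)."""

    @abstractmethod
    def save(self, key: str, value: str) -> None: ...

    @abstractmethod
    def load(self, key: str) -> Optional[str]: ...


class InMemoryStore(DocumentStore):
    """Implementation backed by a dictionary, handy for tests."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._items[key] = value

    def load(self, key: str) -> Optional[str]:
        return self._items.get(key)


class SqliteStore(DocumentStore):
    """Implementation backed by SQLite; all vendor-specific code lives here."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS documents "
            "(doc_key TEXT PRIMARY KEY, doc_value TEXT NOT NULL)"
        )

    def save(self, key: str, value: str) -> None:
        # Insert the row, or overwrite it if the key already exists.
        self._conn.execute(
            "INSERT OR REPLACE INTO documents (doc_key, doc_value) VALUES (?, ?)",
            (key, value),
        )
        self._conn.commit()

    def load(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT doc_value FROM documents WHERE doc_key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def register_product(store: DocumentStore, product_id: str, name: str) -> None:
    """Service code: it only knows the abstraction, never the vendor."""
    store.save(f"product:{product_id}", name)
```

register_product works the same whether it receives an InMemoryStore or a SqliteStore. Switching vendors means adding one new class to the shared library, and the services that call it stay untouched.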

Conclusion

In this post, we’ve seen an introduction to distributed systems.
We’ve looked at the advantages and disadvantages of distributed systems.
Finally, we’ve covered some of the vocabulary used in distributed systems.

If you want to learn more, don’t forget to follow the website via RSS, or the YouTube channel, where I’ll upload videos about each aspect.

This post was translated from Spanish. You can see the original one here.
If there is any problem, you can add a comment below or contact me through the website's contact form.
