Idempotency in Software Development

This post comes as a result of a bug that was found in a client's software.

 

To summarize the situation, there were two records of an item in the database, when in theory there should only be one. After some investigation and seeing that the bug was not from my application, I reported it and was told there was nothing they could do since the event had been processed twice.

 

Since it wasn't my company, I didn't argue, although I did send them a link to the idempotency page on Wikipedia.

 

 

 

1 - What is idempotency?

 

Idempotency, when we talk about software, is a property of an operation: we can perform it multiple times and always obtain the same result as if we had performed it once.

For example, if we process an event twice, the second time should not alter the outcome. In programming, we usually achieve this by attaching a unique identifier to the event or request, so that it is not executed more than once.

For this example, we’re going to see it within the Distribt project, since idempotency is very important in distributed systems, but later on, we’ll also provide an example of idempotency in REST.

 

A very common example is that we have a product with an inventory of 10 and create an order that contains two of these products; we must generate an event to reduce the inventory. This action follows the same logic we saw in the post about event sourcing.

[Figure: event sourcing]

 

So far so good, but what happens if the system fails and the event is processed twice?

Well, in the current situation, we would subtract the two units twice and end up with 6 products instead of 8. The inventory count would be wrong: the event was processed twice even though only one order was created.
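To make the arithmetic concrete, here is a minimal sketch of a handler with no duplicate check. The names StockReduced and NaiveInventory are hypothetical, made up for this illustration; they are not part of Distribt.

```csharp
using System;

// Hypothetical event: "remove this many units from the inventory".
public record StockReduced(string MessageIdentifier, int Quantity);

public class NaiveInventory
{
    public int Stock { get; private set; }

    public NaiveInventory(int initialStock) => Stock = initialStock;

    // No duplicate check: every delivery of the event changes the stock.
    public void Handle(StockReduced evt) => Stock -= evt.Quantity;
}
```

Starting from an inventory of 10, handling the same StockReduced event (quantity 2) twice leaves Stock at 6 rather than 8.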

 

This is where idempotency comes in.

 

 

2 - Implementing idempotency in a System

 

The first thing we should do is include a property in all our events that holds a unique identifier. It's very common to use “IdempotentId” or “MessageIdentifier” as the name of the property, but you can obviously use whatever name you want.

 

For this example, I’m using my Distribt library, and that property is in the IMessage interface, which means both integration and domain messages will contain it.

public interface IMessage
{
    /// <summary>
    /// Must be unique;
    /// </summary>
    public string MessageIdentifier { get; }

    /// <summary>
    /// Name for the message, useful in logs/databases, etc
    /// </summary>
    public string Name { get; }
}

And when creating the messages, a GUID is automatically assigned.

Personally, my favorite option is to generate the value automatically from something you know won't repeat. Technically you can use the entity's ID or a combination of fields, but anything you define manually runs the risk of repeating. However, if you can guarantee it's a unique key, you can use whatever you like.

private static IntegrationMessage<T> ToTypedIntegrationEvent<T>(T message, Metadata metadata)
{
    //The first parameter is the message identifier 👇
    return new IntegrationMessage<T>(Guid.NewGuid().ToString(), typeof(T).Name, message, metadata);
}

Now all events have the property assigned with a different value.

Remember, idempotency does not protect you if a bug makes the producer emit two different events for the same action: each one has its own identifier, so technically they are different events and both will be processed. What it prevents is the same event from being executed multiple times.

 

Finally, what you should do in the consumer is check that the event hasn’t been processed, and if it has, act accordingly.
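The consumer-side check could look roughly like the sketch below. It assumes an in-memory HashSet as a stand-in for wherever processed identifiers are really stored, and the StockReduced and IdempotentInventory types are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event carrying the unique identifier described above.
public record StockReduced(string MessageIdentifier, int Quantity);

public class IdempotentInventory
{
    // In-memory stand-in for a store of processed message identifiers.
    private readonly HashSet<string> _processed = new();

    public int Stock { get; private set; }

    public IdempotentInventory(int initialStock) => Stock = initialStock;

    // Returns true when the event was applied, false when it was a duplicate.
    public bool Handle(StockReduced evt)
    {
        // HashSet.Add returns false if the identifier is already present.
        if (!_processed.Add(evt.MessageIdentifier))
            return false;

        Stock -= evt.Quantity;
        return true;
    }
}
```

In a real consumer, recording the identifier and applying the change would normally need to happen atomically (for example, in the same database transaction); this sketch ignores that, as well as thread safety.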

 

Idempotency is very popular in distributed systems, but I also personally recommend using it in APIs in certain scenarios, to prevent the same HTTP call from being executed twice.

When using an HTTP call, we’ll usually send the IdempotentId as a header, although each system is different.
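On the client side, attaching the identifier as a header might look like this sketch. The header name "Idempotency-Key", the helper class, and the URL in the usage note are assumptions for illustration; use whatever name the API you are calling documents.

```csharp
using System;
using System.Net.Http;
using System.Text;

public static class IdempotentRequests
{
    // Assumed header name; each API defines its own.
    public const string HeaderName = "Idempotency-Key";

    public static HttpRequestMessage CreateOrderRequest(Uri uri, string jsonBody, string idempotencyKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, uri)
        {
            Content = new StringContent(jsonBody, Encoding.UTF8, "application/json")
        };

        // The same key must be reused when retrying the same logical call.
        request.Headers.Add(HeaderName, idempotencyKey);
        return request;
    }
}
```

The caller would generate the key once, for example with Guid.NewGuid().ToString(), and pass the same value on every retry of that call, such as CreateOrderRequest(new Uri("https://example.com/orders"), "{}", key).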

 

 

3 - How to approach creating an idempotent system

 

When building a system, there are several things to keep in mind.

 

You need to consider whether it is a synchronous system (HTTP calls) or an asynchronous system (distributed system) because depending on the system, there are certain things you can or cannot do.

 

The main thing is to know when a call or event has already passed through the system. There are various ways to do this; in fact, I have seen all kinds of approaches.

To do this, we can create a table whose primary key is the IdempotentId, plus a column that contains the entire event. From here, everything else is optional: a third column with the date and time, a fourth with the operation status, and a final one with the result of the process.

This type of table can be either in a cache like Redis, in a database, or even in memory, though that might not be the best option in production.
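As a rough sketch, one row of such a table and an in-memory version of it could be modelled as follows. The type and member names (IdempotencyRecord, ProcessingStatus, InMemoryIdempotencyTable) are hypothetical, and the status values are just one possible choice.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public enum ProcessingStatus { InProgress, Completed, Failed }

// One row of the table; everything after Payload is optional.
public record IdempotencyRecord(
    string IdempotentId,
    string Payload,
    DateTimeOffset ReceivedAt,
    ProcessingStatus Status,
    string? Result);

public class InMemoryIdempotencyTable
{
    private readonly Dictionary<string, IdempotencyRecord> _rows = new();

    // Mirrors a primary-key constraint: a second insert with the same id is rejected.
    public bool TryInsert(IdempotencyRecord row) => _rows.TryAdd(row.IdempotentId, row);

    // Returns the stored row, or null when the id has never been seen.
    public IdempotencyRecord? Find(string idempotentId) =>
        _rows.TryGetValue(idempotentId, out var row) ? row : null;
}
```

With a real database or cache the shape stays the same: the uniqueness of the key is what tells you the call or event has been seen before.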

 

If you’re in a distributed system, you could simply store the ID since you won’t be returning any information.

The status field is also optional, and keeping it updated requires some work.

 

Now comes the explanation for the date and time field. Some will tell you that keeping every record forever is good, others that it's bad. In practice, the timestamp is what lets you apply a TTL (time to live), and many companies and systems keep each record for 24 hours. This means that if the same event arrives again after, say, 48 hours, it will be processed again.

The decision to store IDs or even events indefinitely depends not only on the domain but also on the economics, as storing lots of messages in distributed systems can be expensive.

 

My recommendation is to store them for 24 hours or maybe a week, but not more than that.
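The effect of the TTL can be sketched like this. ExpiringIdStore is a hypothetical in-memory class, and the current time is passed in explicitly only to make the behavior easy to follow; a cache such as Redis can expire keys for you instead.

```csharp
using System;
using System.Collections.Generic;

public class ExpiringIdStore
{
    private readonly Dictionary<string, DateTimeOffset> _seenAt = new();
    private readonly TimeSpan _ttl;

    public ExpiringIdStore(TimeSpan ttl) => _ttl = ttl;

    // Returns true if the id is new (or its entry has expired) and records it;
    // returns false if the id was already seen within the TTL window.
    public bool TryRegister(string id, DateTimeOffset now)
    {
        if (_seenAt.TryGetValue(id, out var seenAt) && now - seenAt < _ttl)
            return false;

        _seenAt[id] = now;
        return true;
    }
}
```

With a TTL of 24 hours, a duplicate arriving one hour after the original is rejected, while the same id arriving 48 hours later is accepted and processed again, which is exactly the trade-off described above.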

 

 

And now comes the main difference between a synchronous and an asynchronous system.

If you have an asynchronous system where nobody is waiting for a response, you can simply check whether the ID is already in the database and, if it is, end the process there.

 

 

3.1 - Idempotency in synchronous systems (REST)

 

In the case of a synchronous system, things change, since whoever makes the call expects an immediate response.

 

At this point, we have two options.

The first is to fail the second call (and all subsequent ones):

[Figure: idempotency sequence diagram]

At first glance, this action seems to make the most sense because obviously, the second call is erroneous and shouldn’t happen.

 

But in the real world, that second call happens because whoever is sending the information has no idea what happened to the first one, whether due to a bug, a network failure, etc. Therefore, in this case, it’s best to respond to the second call with the result of the first.

 

This means we must store the result temporarily. But in doing so, we make the client’s life much easier as the system is much simpler and easier to manage. If we don’t return the result, we’d have to create another system for them to look up whatever it is they were looking for.

In the case of a purchase order, we simply return the ID.
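A minimal sketch of this second option, replaying the stored result, might look like the following. OrderService and its members are hypothetical, the dictionary stands in for the idempotency table, and the order creation itself is reduced to generating an ID.

```csharp
using System;
using System.Collections.Generic;

public class OrderService
{
    // Maps idempotency key -> id of the order created by the first call.
    private readonly Dictionary<string, Guid> _results = new();

    public int OrdersCreated { get; private set; }

    public Guid CreateOrder(string idempotencyKey)
    {
        // Repeat call: return the result of the first call, create nothing.
        if (_results.TryGetValue(idempotencyKey, out var existingOrderId))
            return existingOrderId;

        var orderId = Guid.NewGuid(); // stand-in for the real order creation
        OrdersCreated++;
        _results[idempotencyKey] = orderId;
        return orderId;
    }
}
```

Calling CreateOrder twice with the same key returns the same order ID and creates only one order. The sketch is not thread-safe and keeps results forever; a real API would likely combine it with the TTL discussed earlier.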

 

And in this way, we ensure the API behaves more consistently.

 

This post was translated from Spanish. You can see the original one here.
If there is any problem, you can add a comment below or contact me through the website's contact form.

© Copyright 2025 NetMentor | All rights reserved
