Design a Notification System

Tired of waking up in the middle of the night because you received a notification? In this post, we’ll look at designing a notification system that is considerate of our beloved users.

 

 

When designing notification systems, not only do we lack a single perfect solution, but we actually have multiple scenarios. The design of a WhatsApp notification notifying you of a new message is not the same as a notification for a new video uploaded by a YouTuber with millions of subscribers. Even though the idea is the same, implementing the second the same way as the first is not recommended due to its cost. Today, we will focus on one-to-one notifications.

 

I'm limiting the scope intentionally to keep this post easy to grasp.

 

 

1 - Notification System Requirements

 

A notification system has some simple requirements:

  • The system must know which notification we want to send.
  • Type of notification: this includes whether we want to send an email, an SMS, or a push notification to phones.
  • Scalable and available: as in every post in this series, the system should scale easily and remain available at all times. The load pattern is also predictable: traffic will be much higher during the day than at night, and depending on whether it’s an enterprise or leisure system, peaks will occur at different hours.

Additionally, we can mention user preferences. Here we might allow the user to choose which types of notifications they wish to receive, or set time ranges in which they do not wish to be notified.

 

If this were a real interview, you could ask what type of system you are building and where it will be used. A WhatsApp notification should arrive immediately, but for a forum notification you may not want alerts during the night, or you may prefer to group several together.

 

 

2 - Notification Contract Design

 

Let’s move on to define how message producers (internal applications) generate notifications. Note that we're defining a contract here, not an API as I did in other videos. The reason is simple: most notification systems follow the producer-consumer pattern, so the producer publishes an event rather than making an API call.

 

The contract consists of several parts:

 

  • Metadata: In this section, we specify the unique identifier of the message, the date and time it was generated, as well as which channels or types of notifications we want to send. In summary, information about the notification.
{
    "metadata": {  👈
        "uniqueIdentifier": string,
        "timestamp": datetime,
        "channels": [sms, email, push]
    }
}

 

  • Recipient: The second part of our message is the recipient. Normally, it will only contain the user’s ID, since the notification system has access to the entire internal system.
{
    "metadata": { 
        "uniqueIdentifier": string,
        "timestamp": datetime,
        "channels": [sms, email, push]
    },
    "recipient": { 👈
        "userId": string
    }
}

 

  • Notification Content: Here we have all the information the recipient is going to receive. The object to be sent is generally an object containing all properties or fields for every possible channel, so it tends to be a large object. If a specific type isn’t sent, those fields will be empty.
{
    "metadata": { 
        "uniqueIdentifier": string,
        "timestamp": datetime,
        "channels": [sms, email, push]
    },
    "recipient": { 
        "userId": string
    },
    "content": { 👈
        "email": {
            "subject": string,
            "body": string,
            "attachments": [ (url, filename, mimeType) ]
        },
        "sms": {
            "content": string
        },
        "push": {
            "title": string,
            "body": string,
            "action": string
        }
    }
}

As we can see, it contains all the information for the three channels we can use to send messages: SMS, push notification, and email.

If there is an attachment, in most cases the URL of the attachment is sent, not the file itself, as this avoids file size problems, but still, it depends on each specific scenario. If your system sends invoices, you may include the attachment in the email, but if you’re sending books, that might not be recommended.
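As a rough illustration of the contract, here is a minimal Python sketch of a producer building an event and serializing it to JSON. The class names, the `build_push_event` helper, and the sample values are assumptions made for this example; only the field names come from the contract above. For brevity, the unused channels are omitted rather than sent as empty fields.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class Metadata:
    uniqueIdentifier: str
    timestamp: str        # ISO 8601 string, since JSON has no datetime type
    channels: list        # any subset of "sms", "email", "push"


@dataclass
class Recipient:
    userId: str


@dataclass
class NotificationEvent:
    metadata: Metadata
    recipient: Recipient
    content: dict = field(default_factory=dict)


def build_push_event(user_id: str, title: str, body: str, action: str) -> NotificationEvent:
    """Build an event that only targets the push channel (hypothetical helper)."""
    return NotificationEvent(
        metadata=Metadata(
            uniqueIdentifier=str(uuid.uuid4()),
            timestamp=datetime.now(timezone.utc).isoformat(),
            channels=["push"],
        ),
        recipient=Recipient(userId=user_id),
        content={"push": {"title": title, "body": body, "action": action}},
    )


event = build_push_event("user-42", "New message", "Ana sent you a message", "open_chat")
payload = json.dumps(asdict(event))  # this string is what the producer would publish
```

The producer only needs to know the user's ID; resolving it to an email address, phone number, or device token is left to the notification system.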

 

 

3 - Notification System Architecture Design

 

As I like to do (and recommend in interviews), start simple and evolve. In this case, we start with something simple.

 

We have one or more systems that generate notifications. These notifications are published to a producer-consumer system, such as an event bus or a queue. A subscriber reads each message, fetches the user's contact information from the user API, and sends the message to the corresponding third-party service (for emails, push notifications, or SMS).
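This simplest version can be sketched with an in-memory queue. Everything here is a stand-in: the `USERS` dictionary plays the role of the user API, `send` plays the role of the third-party providers, and the `deviceToken` contact field is an assumption for this example.

```python
import queue

# Hypothetical stand-in for the user API that returns contact details.
USERS = {
    "user-42": {"email": "ana@example.com", "phone": "+34000000000", "deviceToken": "tok-1"},
}

# Which contact field each channel needs (assumed names).
CONTACT_FIELD = {"email": "email", "sms": "phone", "push": "deviceToken"}

sent = []  # records (channel, destination) instead of calling a real provider


def send(channel: str, destination: str, content: dict) -> None:
    """Pretend to call a third-party provider; here we only record the call."""
    sent.append((channel, destination))


def consume(events: queue.Queue) -> None:
    """Drain the queue, sending each event through every requested channel."""
    while not events.empty():
        event = events.get()
        contact = USERS[event["recipient"]["userId"]]
        for channel in event["metadata"]["channels"]:
            send(channel, contact[CONTACT_FIELD[channel]], event["content"][channel])


events = queue.Queue()
events.put({
    "metadata": {
        "uniqueIdentifier": "n-1",
        "timestamp": "2025-01-01T10:00:00Z",
        "channels": ["email", "sms"],
    },
    "recipient": {"userId": "user-42"},
    "content": {"email": {"subject": "Hi", "body": "Hello"}, "sms": {"content": "Hello"}},
})
consume(events)
```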

 

Figure: simple notification architecture

In the real world, depending on your system’s size, usage, and your company’s implementations, this solution may be enough, but in a design interview, we need to go further.

 

 

For our case, let’s continue with scaling. For this, the first thing we need is more instances of our consumer.

You might think that since we're behind a producer/consumer system we don't need more instances, because we can consume at whatever rate we want. That's true, but it's not always enough: if we receive too many notifications, a significant backlog can build up. For example, if we can process 10,000 messages per second but receive 12,000 messages per second, every second we’re leaving 2,000 unprocessed, which go into the backlog.
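To see how quickly that adds up, here is the same arithmetic in a few lines of Python. It assumes constant rates and that the 10,000 messages per second figure is the capacity of a single consumer instance, which the example above does not state.

```python
import math


def backlog_after(seconds: int, incoming_per_s: int, processed_per_s: int) -> int:
    """Messages left waiting after `seconds`, assuming constant rates."""
    return max(0, incoming_per_s - processed_per_s) * seconds


def consumers_needed(incoming_per_s: int, per_consumer_per_s: int) -> int:
    """Smallest number of identical consumers that keeps up with the inflow."""
    return math.ceil(incoming_per_s / per_consumer_per_s)


one_minute = backlog_after(60, 12_000, 10_000)    # 120,000 messages
one_hour = backlog_after(3600, 12_000, 10_000)    # 7,200,000 messages
instances = consumers_needed(12_000, 10_000)      # 2 instances
```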

 

So we need to scale up our consumer with more instances, and depending on what producer-consumer system you use, you’ll scale in one way or another. Kafka is a common choice for implementing Producer-Consumer. If you want to know more about its internals, I recommend buying my book Building Distributed Systems.

Figure: notification architecture v2

NOTE: If you’d chosen direct communication via API, you would also need more instances, but you’d need a load balancer in front of them.

 

It doesn’t end here. Instead of sending the messages directly from this application, we use this app to distribute the messages correctly. Each message goes to a dedicated queue for its notification type and, ideally, a cloud provider's native function (AWS Lambda, Azure Functions, etc.) then processes the message and sends it to the third-party system we want to use.
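A minimal sketch of that distribution step, using in-memory queues as stand-ins for the real per-channel queues. The `route` function and the shape of the per-channel message are assumptions for this example.

```python
import queue

# One queue per notification type; in production these would be real queues or topics.
CHANNEL_QUEUES = {"email": queue.Queue(), "sms": queue.Queue(), "push": queue.Queue()}


def route(event: dict) -> None:
    """Split one event into one message per requested channel."""
    for channel in event["metadata"]["channels"]:
        CHANNEL_QUEUES[channel].put({
            "uniqueIdentifier": event["metadata"]["uniqueIdentifier"],
            "userId": event["recipient"]["userId"],
            "content": event["content"][channel],  # only this channel's fields
        })


route({
    "metadata": {
        "uniqueIdentifier": "n-2",
        "timestamp": "2025-01-01T10:00:00Z",
        "channels": ["sms", "push"],
    },
    "recipient": {"userId": "user-42"},
    "content": {
        "sms": {"content": "Your order shipped"},
        "push": {"title": "Order shipped", "body": "Track it here", "action": "open_order"},
    },
})
```

Each downstream function then receives only the fields for its own channel, which keeps it focused on a single task.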

Figure: notification system v3

Additionally, if you’re using Azure or AWS, you can create connectors that allow these actions (sending email, notifications, etc.) to be executed directly from the queue, without needing to create a function in the middle.

 

In most interviews where you’re asked for a notification system, this is about as much as you’ll be asked. WhatsApp alerts or alerts for a monitoring system work like this.

 

 

 

3.1 - Including User Preferences in a Notification System

 

For our scenario, we will include user preferences, something, by the way, I should consider on my own blog…

 

Here, we have two key questions:

  • First, in which layer do we introduce user preferences? In the app layer that redistributes notifications.
  • Second, what do we do with notifications that shouldn’t happen yet? There are several answers to this question.

 

Before answering, we need to explain why we put that logic in the redistribution layer. The reason is simple: in many systems, it’s the only app you’ll need to write, since the rest of the infrastructure can be configured. The cloud function's job is to make sure every message it receives is sent, not to add extra functionality. These functions should accomplish only one task, and checking user preferences and acting on them, as we’ll see now, is additional functionality.

 

Now comes the question of what to do with notifications.

 

Suppose a user only wants to be contacted between 8am and 4pm, and not at any other time.

So, we have an API with user information that returns these details, and if we want to send a notification, we must check the user’s time preferences.

 

If our notification falls within the user's allowed window, we propagate it immediately. If not, we have two options. The first is to ignore it and drop it. It may sound crazy, but suppose you run a store and put items on flash sale for the next two hours, and the user doesn’t want notifications at that time: we won’t send it, and we won't save it either, since by the time the user gets it, it will no longer be valid. The same goes for YouTube live notifications: if you get the notice after the stream ends, it’s pointless.
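The decision can be sketched as a small function. Note that the contract in this post has no expiry field, so the `expires_at` parameter is an assumption added for this example, as are the function names. The sketch also assumes the window does not cross midnight and that all times are in the user's local time.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def next_window_start(now: datetime, window_start: time) -> datetime:
    """Next moment at which the user's allowed window opens."""
    candidate = datetime.combine(now.date(), window_start)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def decide(now: datetime, window_start: time, window_end: time,
           expires_at: Optional[datetime] = None) -> str:
    """Return "send", "drop", or "defer" for a notification arriving at `now`."""
    if window_start <= now.time() < window_end:
        return "send"
    # Outside the window: drop it if it will be stale before the window reopens.
    if expires_at is not None and expires_at <= next_window_start(now, window_start):
        return "drop"
    return "defer"


night = datetime(2025, 1, 1, 22, 0)
flash_sale = decide(night, time(8), time(16), expires_at=datetime(2025, 1, 2, 0, 0))
forum_reply = decide(night, time(8), time(16))
daytime = decide(datetime(2025, 1, 1, 10, 0), time(8), time(16))
```

Here the flash sale ends at midnight, before the window reopens at 8am, so it is dropped, while the forum reply has no expiry and is deferred.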

 

The case that really concerns us is those useful notifications that should be stored and propagated when the user is available. One solution is to have a queue for each hour users wish to receive notifications. For example, we have a queue (or stream) for notifications to be sent at 8am, another for 9am, and so on for each hour of the day.

 

Then, we have an app that reads that queue and propagates messages to the original producer-consumer system, which processes the messages as if they had just arrived.
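A minimal in-memory sketch of the hourly queues and the app that drains them. The names `defer` and `release` are made up for this example, and a `deque` stands in for each real queue or stream.

```python
from collections import defaultdict, deque

hourly_queues = defaultdict(deque)  # hour of day (0-23) -> events waiting for that hour
main_queue = deque()                # stand-in for the original producer-consumer system


def defer(event: dict, window_start_hour: int) -> None:
    """Park an event until the hour at which the user's window opens."""
    hourly_queues[window_start_hour].append(event)


def release(hour: int) -> int:
    """Run by a timer each hour: move that hour's events back to the main queue."""
    released = 0
    bucket = hourly_queues[hour]
    while bucket:
        main_queue.append(bucket.popleft())
        released += 1
    return released


defer({"uniqueIdentifier": "a"}, 8)
defer({"uniqueIdentifier": "b"}, 9)
moved = release(8)  # at 8am only event "a" goes back to the main queue
```

Because released events re-enter the main queue, they pass through the same preference check and routing as any freshly produced event.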

Figure: complete flow of a notification system

The flow would be as follows:

1 - An app in our system generates an event.

2 - Our consumer consumes the message.

3 - We check user preferences and read their info (email, phone number).

4 - If out of hours, we move the message to that hour's queue.

Note: if it’s within the allowed time, skip to step 9.

5 - A timer or cron job runs an app every hour (or with different parameters).

6 - We read the message from the corresponding queue.

7 - We propagate the message to the original notification queue/stream.

8 - We receive the message and read the user's info.

9 - We propagate the message to the appropriate notification queue.

10 - We read the message and send it to the third-party application that sends that type of notification.

 

If a message cannot be sent at any of these steps, we could move it to a dead-letter queue to review it later.
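One possible shape for that, sketched in Python. The retry count and the fields stored alongside the failed message are assumptions for this example; a managed queue service would normally provide its own dead-letter mechanism.

```python
def deliver_with_dlq(message: dict, send, dead_letters: list, max_attempts: int = 3) -> bool:
    """Try to send a message; after `max_attempts` failures, park it for review."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            send(message)
            return True
        except Exception as exc:  # broad on purpose: any failure counts as an attempt
            last_error = str(exc)
    dead_letters.append({"message": message, "error": last_error, "attempts": max_attempts})
    return False


def always_fails(message: dict) -> None:
    """Hypothetical provider call that is currently down."""
    raise ConnectionError("provider unavailable")


dead_letters = []
delivered = deliver_with_dlq({"uniqueIdentifier": "n-3"}, always_fails, dead_letters)
```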

 

 

3.2 - Priority Messages

 

Even though this is the whole architecture, we have one last design point. Now that we have scheduled notifications, there may be some notifications that should override this rule. For example, if an entire system goes down, no matter what user preferences are configured, we must be able to send that notification. Therefore, the event contract should include a priority field; if it’s the maximum, the system should skip user preference checks altogether.

{
    "metadata": { 
        "uniqueIdentifier": string,
        "timestamp": datetime,
        "channels": [sms, email, push],
        "priority": low | medium | high 👈
    },
    "recipient": { 
       ...
    },
    "content": { 
        ...
    }
}

To do this, simply add the value to the metadata; the consumer app can then bypass any validation and send the message directly to the appropriate queue. For the priority values you can use 1, 2, 3 or high, medium, low, whatever works for you.
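In the consumer app, the bypass could look like this minimal sketch. The `next_step` function and its return values are assumptions for this example, and it assumes the high/medium/low variant of the priority field.

```python
def next_step(event: dict, in_allowed_window: bool) -> str:
    """Decide what the consumer app does with an event."""
    if event["metadata"].get("priority") == "high":
        return "send"  # maximum priority skips the user preference check entirely
    return "send" if in_allowed_window else "defer"


outage = {"metadata": {"priority": "high"}}
newsletter = {"metadata": {"priority": "low"}}

outage_at_night = next_step(outage, in_allowed_window=False)
newsletter_at_night = next_step(newsletter, in_allowed_window=False)
```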

 

 

 

This post was translated from Spanish. You can see the original one here.
If there is any problem, you can add a comment below or contact me through the website's contact form.


© copyright 2025 NetMentor | All rights reserved
