What is Rate Limiting? Implementation in .NET

This is a post I’ve been wanting to do for quite a while, and I’ve waited until the release of .NET 7 to create it, because the new version of .NET brings built-in rate limiting functionality, which makes things a lot easier.

1 - What is Rate Limiting?
2- Why use rate limiting?
3 - Implementing rate limiting in .NET
4 - Types of Rate Limiting
5 - Custom Rate Limiting policies
6 - Rate limiting per endpoint
Conclusion

1 - What is Rate Limiting?

When we create an API, sometimes, especially when it’s publicly accessible, we want to limit its usage.

This is done to prevent overuse. A clear example is in paid applications: the paid version allows X number of calls, while the free version allows many fewer:

ejemplo rate limiting

A real world example is the YouTube API, which only lets you “query” 10,000 units (videos, comments, etc.) per day.

What is not rate limiting? It’s not a filter to restrict where a client can access or permissions, the proper solution for that is either API Keys or JWTs.

2- Why use rate limiting?

One reason is to directly put restrictions on clients, for example, only allowing X number of calls and offering a paid or premium version with a higher limit.

Another example I’ve also seen, even if it’s less common now, is when the app (or its database) can only handle X number of simultaneous calls, if you go beyond that number, the performance drops considerably or even crashes. Limiting the API calls ensures this doesn’t happen.

And of course, it protects you from DDoS attacks and similar cases.

These reasons, or a combination of them, are the most common, often, you have clients who can make all the calls they want, but there’s a little asterisk saying "10 per minute" or something similar.

3 - Implementing rate limiting in .NET

As I mentioned, in the new .NET version (version 7), rate limiting is built-in. That means if you create or update your project to net7 or higher, you’ll be able to use the middleware that lets you configure rate limiting.

For this post, I implemented rate limiting in the application for the Distribt distributed systems course, which has its code available on GitHub.

This is the full diagram, but for this example, you only need to focus on the green dot and a bit on the API key we discussed in the previous post.

arquitectura distribt

The library is quite complete, and we’ll explore the different options it provides. In any case, you’ll need to use the extension method to configure the rate limiting middleware with UseRateLimiter().

webappBuilder.Services.AddRateLimiter(options => {...})
app.UseRateLimiter();

By the way, add the UseRateLimiter middleware before your first endpoint because, as you may recall, middlewares execute in order.

Before moving on to the different types of rate limiting, I want to explain a bit how the library itself works, let’s look at a very simple example:

webappBuilder.Services.AddRateLimiter(options =>
    options.AddConcurrencyLimiter(policyName: "concurrencyPolicy", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
    }));

This example is the simplest: the API with this configuration will only be able to handle 10 simultaneous calls. But what about the rest?

3.1 - Queues in Rate Limiting

One option is to enable a queue, where you specify its size and the order (FIFO or LIFO)

webappBuilder.Services.AddRateLimiter(options =>
    options.AddConcurrencyLimiter(policyName: "concurrencyPolicy", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.QueueLimit = 100;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    }));

3.2 - Rejecting calls with Rate Limiting

But once the simultaneous call limit and the queue are both full, what happens to the new requests coming in?

Well, if everything’s “full,” those requests will be rejected, and you can specify the status code to return with RejectionStatusCode.

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429; // Too many request
    options.AddConcurrencyLimiter(policyName: "concurrencyPolicy", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.QueueLimit = 100;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});

In our case, we return 429 which is commonly used to indicate too many requests have been made.

3.3 - Rate Limiting response message

Returning an error code is fine, but it’s not very informative, ideally, you’d include an error message using the OnRejected delegate:

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = 429; // Too many request
    options.OnRejected = async (context, token) =>
    {
        await context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
    };
    options.AddConcurrencyLimiter(policyName: "concurrencyPolicy", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.QueueLimit = 100;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});

Ideally you’d want to standardize your responses to meet the normalization requirements of the rest of your APIs.

api response

3.4 - Identifying the caller in rate limiting

To identify who is making the call and apply different rate limiting filters, you use the PartitionKey, and in the next section, we’ll see how to configure it.

But basically, it’s as follows:

partitionKey: "the identifier here"

4 - Types of Rate Limiting

Within rate limiting there are several configurations you can apply and also several ways (algorithms) to build that configuration. Until now, we’ve used options.AddConcurrencyLimiter(...), which is fine, but you can also set up your configuration with options.GlobalLimiter, and we’ll see a couple of examples.

Note: GlobalLimiter limits the entire app, whereas .Add.. needs to be specified per endpoint/controller, etc.

4.1 - Concurrency limit

This is the clearest and simplest example: you allow X number of simultaneous calls. If you allow 10 simultaneous calls, the 11th will be denied

limite de concurrencia rate limiting And this would be the code:

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        await context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
    };
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetConcurrencyLimiter(
            partitionKey: "the identifier here",
            factory: _ => new ConcurrencyLimiterOptions()
            {
                PermitLimit = 10,
                QueueLimit = 0,
                QueueProcessingOrder =  QueueProcessingOrder.OldestFirst
            }));
});

4.2 - Token bucket limit

The name comes from its real world representation.

We have a limited number of calls (tokens), for example 100, and when a request arrives, it spends a token; once you’ve used them all, you can’t use the service anymore.

limite de deposito de tokens rate lmiting

Here you can see the code:

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        await context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
    };

    options.AddTokenBucketLimiter("policy-name", limiterOptions =>
    {
        limiterOptions.TokensPerPeriod = 2;
        limiterOptions.TokenLimit = 100;
        limiterOptions.ReplenishmentPeriod = TimeSpan.FromMinutes(5);
    });
});

In this example, we have a limit of 100 tokens, and every 5 minutes you get two tokens back.

4.3 - Fixed window limit

Similar to the previous case, the difference here is that there’s no token limit:

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        await context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
    };

    options.AddFixedWindowLimiter("policy-name", limiterOptions =>
    {
        limiterOptions.PermitLimit = 2;
        limiterOptions.QueueLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(5);
    });
});

4.4 - Sliding window limit

Similar to the previous case, but the difference is that each window has a set number of segments. In the following example:

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        await context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
    };

    
    options.AddSlidingWindowLimiter("policy-name", limiterOptions =>
    {
        limiterOptions.PermitLimit = 5;
        limiterOptions.QueueLimit = 10;
        limiterOptions.SegmentsPerWindow = 5;
        limiterOptions.Window = TimeSpan.FromMinutes(5);
        

    });
});

Here we have 5 calls per window, and the window lasts 5 minutes, but as well, it’s divided into 5 segments, so you can only make one call per minute.

5 - Custom Rate Limiting policies

We can create a policy completely customized for our system. To do that, all you have to do is create a class that inherits from IRateLimiterPolicy<T>, where T is the partition key you’re going to use, normally it’ll be a string.

When you create the class, you implement the members you need, and you’ll have something like this:

public class DistribtRateLimiterPolicy : IRateLimiterPolicy<string>
{
    public RateLimitPartition<string> GetPartition(HttpContext httpContext)
    {
        return RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Request.Headers["apiKey"].ToString(),
            partition => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromMinutes(60),
            });
    }

    public Func<OnRejectedContext, CancellationToken, ValueTask>? OnRejected { get; } =
        (context, _) =>
        {
            context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
            return new ValueTask();
        };
}

In this case, since it’s Distribt and we know we have an API Key, we simply read that key for the partition key, which represents a client. But for other public services, you could have a policy for when the API Key exists, for example 10 calls per hour, but if it doesn't exist, only one call per hour.

Note: this is just like any other class, it supports dependency injection, etc.

Now we need to change the configuration a bit: you no longer need to use services.AddRateLimiter but instead, when you call useRateLimiter, you specify the policy.

app.UseRateLimiter();
app.MapGet("/rate-limiting-test", () =>
{
    return "Hello World!";
}).RequireRateLimiting(new DistribtRateLimiterPolicy());

6 - Rate limiting per endpoint

Most rate limiting configurations are set up for the API as a whole (you have X calls, no matter where), but you can also limit per endpoint.

For this, instead of using app.UseRateLimiting() in your configuration, you create a policy and then use it with RequireRateLimiting.

webappBuilder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.OnRejected = async (context, token) =>
    {
        await context.HttpContext.Response.WriteAsync("too many requests, please try again later ");
    };

    options.AddConcurrencyLimiter("policy-name", limiterOptions =>
    {
        limiterOptions.PermitLimit = 2;
        limiterOptions.QueueLimit = 0;
    });
});

....

app.MapGet("/rate-limiting-test", () =>
{
    Thread.Sleep(2 * 60 * 1000);
    return "Hello World!";
}).RequireRateLimiting("policy-name");

A similar approach works if you want to disable rate limiting, just use the DisableRateLimiting filter or extension method.

app.MapGet("/", () => "Hello World!").DisableRateLimiting()

As I mentioned, this works both with minimal APIs and standard MVC controllers.

In MVC, you’ll use the filter attributes [EnableRateLimiting("policy-name")] and [DisableRateLimiting] at either the controller or endpoint level.

Conclusion

In this post, we’ve seen how to create Rate Limiting in .NET

We’ve covered the different types of Rate Limiting

This post was translated from Spanish. You can see the original one here.
If there is any problem you can add a comment bellow or contact me in the website's contact form

Comparte

Facebook

Twitter

What is Rate Limiting? Implementation in .NET

Table of Contents

1 - What is Rate Limiting?

2- Why use rate limiting?

3 - Implementing rate limiting in .NET

3.1 - Queues in Rate Limiting

3.2 - Rejecting calls with Rate Limiting

3.3 - Rate Limiting response message

3.4 - Identifying the caller in rate limiting

4 - Types of Rate Limiting

4.1 - Concurrency limit

4.2 - Token bucket limit

4.3 - Fixed window limit

4.4 - Sliding window limit

5 - Custom Rate Limiting policies

6 - Rate limiting per endpoint

Conclusion

Comparte

Recent Posts

What is Rate Limiting? Implementation in .NET

Table of Contents

1 - What is Rate Limiting?

2- Why use rate limiting?

3 - Implementing rate limiting in .NET

3.1 - Queues in Rate Limiting

3.2 - Rejecting calls with Rate Limiting

3.3 - Rate Limiting response message

3.4 - Identifying the caller in rate limiting

4 - Types of Rate Limiting

4.1 - Concurrency limit

4.2 - Token bucket limit

4.3 - Fixed window limit

4.4 - Sliding window limit

5 - Custom Rate Limiting policies

6 - Rate limiting per endpoint

Conclusion

Comparte

Recent Posts

Hazte miembro de la comunidad!

Apoya a la comunidad!

Guía completa de desarrollo full stack con .NET

Apoya a la comunidad!

Construyendo SISTEMAS DISTRIBUIDOS