A topic that hasn’t been discussed much lately is the use of Stored Procedures in SQL, mostly because people ignore their existence, which isn’t always the best idea as sometimes they can be the best solution.
Table of Contents
1 - What is a Stored Procedure?
A stored procedure (SP) is simply a set of SQL instructions that are executed in the database. It’s a way to group instructions that we’re going to execute from code, but they could also be triggered directly from the database or any accessible system.
What matters here is that inside a stored procedure, you can have one or more SQL statements.
And a stored procedure can both receive parameters and return a result.
2 - Are Stored Procedures Obsolete?
Something you might have noticed if you work in companies is that stored procedures are rarely used anymore. Most database interactions are done directly in code, either with ORMs (Entity Framework Core/Dapper in C#, ActiveRecord in Ruby, JPA/Hibernate in Java, etc.) or by writing SQL in code using your language’s connector.
This happens because nowadays databases run on servers that are more than capable of handling all we need and more. Not only that, but upgrading the server, if it gets strained during certain hours, is not expensive compared to the overall cost of running a company.
So companies, quite wisely in my opinion, choose the speed of development ORMs provide rather than paying €300/month for a server with 32GB RAM and 16 vCPUs. For most businesses, a server like that is more than enough.
Such a server can easily handle more than 150,000-200,000 reads per second (for reads by Id) and about 10,000-15,000 inserts. This would be an example for apps that mainly perform basic CRUDs.
With those numbers, the cost of the database shouldn’t be an issue, especially if you’re a B2B company whose clients pay per use or through subscriptions.
But of course, “most” is not “all”. There are companies where, due to the nature of the data, either because of its volume or structure, the database’s cost is much higher, and any performance improvement, however small, has a great impact in the long run.
NOTE: I once worked for a company where our database cost us €3,000 per day (yes, daily).
3 - When should you use a Stored Procedure?
As I just mentioned, using Stored Procedures is extremely important for those companies where there’s heavy usage and where tiny performance improvements will have a significant long-term impact.
There are several reasons why this is true. SQL is a real language, optimized to the extremes and basically, it’s as fast as a language can get, especially if you know when and where to use indexes. The SQL developer role exists for a reason.
SPs let you do exactly what you need and return only what you need, reducing network transfer, which is an important factor in high-usage scenarios over time.
Even though for me, that’s the main reason, there’s also the topic of security: you can assign permissions by Stored Procedure, which means every application can have a different user connecting to the database, each with permissions for certain SPs.
Of course, permissions can be set by table, but it’s much easier to manage permissions by Stored Procedures than by tables or even by columns.
Finally, what I’ve noticed is that when all your business logic is in SQL, it’s much easier to identify when something is happening. For example, imagine that you have a bug where a certain column has a value it shouldn’t.
At this point you need to find out where and why that change happened. The first step is to investigate where it could happen.
If you have everything in stored procedures, you’ll have one or more SPs updating this value, and you only need to look up where those SPs are used in your company’s repositories, which is simple since they have unique names, then you can analyze the code from there.
If you don’t use stored procedures, you need to do the same thing, BUT searching the codebase directly, which can take much longer since it’s harder to look for a needle in a haystack (all company repositories) than in a sewing shop (the SQL repo).
I want to clarify something here: I recently read a tweet saying ORMs have ruined optimization and that with an ORM you never know if there’s an optimization problem... Well, that couldn’t be further from the truth.
Whether you use an ORM or not, the database always keeps track of which query runs, when it runs, and all of its details. It’s easier to optimize if it’s in a stored procedure, sure, but it’s certainly not lost in limbo.
4 - Should we use Stored Procedures?
Although I’ve been pretty positive about stored procedures, it’s important not to forget that their use makes application development much slower, since it’s an extra layer to manage and maintain, and that always matters.
Ignoring security and the other points mentioned before (about being easier or more convenient in certain cases), using or not using stored procedures basically boils down to two main questions.
The number of queries per second your system receives (not just a single application, but as a whole) and second, how complex those queries are.
Let me explain: for a select * by ID
it doesn’t matter if you use a stored procedure or not, you won’t notice any performance difference whether it’s a single call or 100,000 per second.
Where you will notice the difference is when doing complex queries, or even implementing logic in those Stored Procedures, it’s not ideal, but depending on the situation, you might need logic inside SQL. It’s very common to see scenarios where an SQL query takes 4–5 seconds (which is a lot), but its code equivalent takes almost a minute, simply due to complexity and all that’s being done to the data.
Finally, while Entity Framework has improved in recent versions and is no longer as inefficient when it comes to joins, it’s still generated code, so in my opinion, whenever you have more than two joins, you should always do the query manually, either with a Stored Procedure or by sending the query manually through Entity Framework.
5 - Stored Procedures in Entity Framework Core
While it’s true that you don’t need Entity Framework Core to use stored procedures, it offers the possibility, which is very helpful when you have situations where 90% of your code is basic CRUDs, but the other 10% are very complex queries.
One thing to keep in mind about stored procedures is that it doesn’t matter whether you execute them inside a Unit of Work or not, they are atomic and are always executed individually.
To see how it works from Entity Framework, the first thing you must do is create a stored procedure directly in SQL:
CREATE OR REPLACE FUNCTION get_user_by_id(user_id INT)
RETURNS TABLE (
"Id" INT,
"UserName" TEXT,
"Email" VARCHAR,
"IsDeleted" BOOLEAN,
"DeletedTimeUtc" TIMESTAMPTZ,
"LastUpdateUtc" TIMESTAMPTZ
) AS $$
BEGIN
RETURN QUERY
SELECT u."Id", u."UserName", u."Email", u."IsDeleted", u."DeletedTimeUtc", u."LastUpdateUtc"
FROM "Users" u
WHERE u."Id" = user_id;
END;
$$ LANGUAGE plpgsql;
Note: Each database has its own format for stored procedures, so I recommend checking the details for your specific database. In our case, the example above is for PostgreSQL; here’s the MySQL equivalent:
DELIMITER //
CREATE PROCEDURE GetUserById(IN user_id INT)
BEGIN
SELECT Id, UserName, Email, IsDeleted, DeletedTimeUtc, LastUpdateUtc
FROM Users
WHERE Id = user_id;
END //
DELIMITER ;
As you can see, the difference is just syntax, but it means you can’t use procedures written for one DB in the other.
To call it from the database itself (in Postgres), we do a select on that function:
select * from get_user_by_id(1);
Now let’s move on to the section where we call this SP from the code.
Remember, in this project we’re using the repository pattern, and not only that, but we also have a generic repository that contains most of the CRUD code.
What we’re going to do is override the GetById
method to call the Stored Procedure, which we do using raw SQL as shown in an earlier post.
public override async Task<User?> GetById(int id)
=> await Entities
.FromSqlInterpolated($"SELECT * FROM get_user_by_id({id})")
.FirstOrDefaultAsync();
A quick note before we finish: if 98% of your queries are handled by Entity Framework, there’s no issue with invoking one or two stored procedures like this. But if 98% of your code consists of Stored Procedures, then you should use your DB’s driver directly rather than EF, something like the following:
using (var conn = new NpgsqlConnection(connectionString))
{
conn.Open();
using (var cmd = new NpgsqlCommand("SELECT * FROM get_user_by_id(@user_id)", conn))
{
cmd.Parameters.AddWithValue("user_id", userId);
using (var reader = cmd.ExecuteReader())
{
if (reader.Read())
{
int id = reader.GetInt32(reader.GetOrdinal("Id"));
string userName = reader.GetString(reader.GetOrdinal("UserName"));
string email = reader.GetString(reader.GetOrdinal("Email"));
bool isDeleted = reader.GetBoolean(reader.GetOrdinal("IsDeleted"));
DateTime? deletedTimeUtc = reader.IsDBNull(reader.GetOrdinal("DeletedTimeUtc")) ? (DateTime?)null : reader.GetDateTime(reader.GetOrdinal("DeletedTimeUtc"));
DateTime lastUpdateUtc = reader.GetDateTime(reader.GetOrdinal("LastUpdateUtc"));
return new user(id, username, email, isDeleted, deletedTimeUtc, lastUpdateUtc);
}
else
{
return null;
}
}
}
}
If there is any problem you can add a comment bellow or contact me in the website's contact form