Regular Expressions in C#

 

1 - What is a regular expression? 

A regular expression is a sequence of characters that defines a pattern to search for in text.

We call this sequence of characters the pattern.

 

.NET provides a regular expression engine in the System.Text.RegularExpressions namespace. When we use regular expressions in C#, we work with it through the Regex type.

 

1.1 - Defining regular expressions

Pattern in regular expressions 

When we define a regular expression, what we are really doing is describing a pattern and looking for matches to it inside some text, whether that text is in a string variable or in a file.

This pattern can be built from different categories of elements, such as characters, operators, and constructs. I have also put together a regular expression cheat sheet that lists all the possibilities.

For this post we will use the following sentence:

Sentence number 01 on the blog netmentor.es to explain regular expressions, repetition of characters and 01 numbers01

note: the sentence is a bit strange in order to have variety when checking patterns. 

Some of the most common patterns are the following. The last column shows the first piece of the sentence that each pattern matches:

Description | Pattern | First match in the sentence
Literal search | "number 01" | number 01
Start of sentence | "^Sentence" | Sentence
Conditional (0 or 1) | "number [01]" | number 0
Repetitions | "number [01]{2}" | number 01
Repetitions 2 (0 to N times) | "number [01]*" | number 01
Range | "[b-o]{4}" | blog
Whitespace | "\s" | the space after "Sentence"
One or more digits | "\d+" | 01
One or more word characters (letters, digits or _) | "\w+" | Sentence

 

Groups in regular expressions

When creating patterns we have the option to create groups. To do this, we enclose part of the pattern in parentheses, which is commonly used to extract specific values from the match.

In the example sentence of this post, we could create a pattern like “Sentence number (\d*)”, which would match the text "Sentence number 01", and the group would contain only 01.
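As a minimal sketch of the group example above, the following program reads the captured group through Match.Groups. It assumes C# 9 or later (top-level statements) and uses the English version of the example sentence; the variable names are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// The example sentence used throughout this post.
string frase = "Sentence number 01 on the blog netmentor.es to explain regular expressions, repetition of characters and 01 numbers01";

// The parentheses create group 1, which captures only the digits.
Match match = Regex.Match(frase, @"Sentence number (\d*)");

string wholeMatch = match.Value;            // Groups[0] is always the whole match
string groupValue = match.Groups[1].Value;  // only the part inside the parentheses

Console.WriteLine(wholeMatch); // Sentence number 01
Console.WriteLine(groupValue); // 01
```

Group 0 always holds the complete match, so the groups we define ourselves start at index 1.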

 

2 - Regular expressions in C#

In C#, we are going to use the Regex type to work with regular expressions. The first thing to keep in mind is that, for occasional checks, we should prefer the static methods over creating our own instances. A static method such as Regex.Match (the same applies to the others) builds the Regex internally, but it also keeps recently used patterns in a cache, so calling Regex.Match(sentence, pattern) twice with the same pattern is faster than creating two new Regex objects.

Don't forget that C# itself uses the \ character to escape characters inside strings, so if we want to pass a \ to the regular expression we must either put the @ character before the string literal (a verbatim string) or write two backslashes, \\.
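To illustrate the two ways of writing the backslash, here is a short sketch, assuming C# 9 or later (top-level statements). The variable names and the test strings are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string verbatim = @"netmentor\.es";   // verbatim string: the backslash is kept as written
string escaped = "netmentor\\.es";    // regular string: \\ produces a single backslash

// Both literals produce exactly the same pattern text.
bool samePattern = verbatim == escaped;

// \. means a literal dot, so any other character in that position does not match.
bool matchesDot = Regex.IsMatch("the blog netmentor.es", verbatim);
bool matchesOther = Regex.IsMatch("the blog netmentorXes", verbatim);

Console.WriteLine(samePattern);  // True
Console.WriteLine(matchesDot);   // True
Console.WriteLine(matchesOther); // False
```

The verbatim form is usually easier to read, because the pattern looks the same in the code as it does in any regular expression reference.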

 

2.1 - Regex class in C#

When we execute a check in Regex, depending on which method we are calling we can get different types of response. 

We have simple types such as string or bool, but we also have two types from the System.Text.RegularExpressions namespace:

Match, which contains the following important members:

  • Success: whether the pattern was found in the sentence.
  • Groups: returns the groups captured by the pattern.
  • Index: position in the text where the match starts.
  • NextMatch(): returns the next match, if there is one.

MatchCollection, which is a collection of Match objects, which means we can iterate over it with foreach or query it with LINQ.
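The members described above can be sketched as follows. This assumes C# 9 or later (top-level statements) and the English version of the example sentence; the loop walks through every match with NextMatch() and then compares the result with a MatchCollection.

```csharp
using System;
using System.Text.RegularExpressions;

string frase = "Sentence number 01 on the blog netmentor.es to explain regular expressions, repetition of characters and 01 numbers01";

// Walk through every run of digits using Success, Index and NextMatch().
Match match = Regex.Match(frase, @"\d+");
int firstIndex = match.Index;
int count = 0;
while (match.Success)
{
    Console.WriteLine($"'{match.Value}' found at index {match.Index}");
    count++;
    match = match.NextMatch(); // Success becomes false when there are no more matches
}

// The same matches, obtained all at once as a MatchCollection.
MatchCollection all = Regex.Matches(frase, @"\d+");
Console.WriteLine(all.Count == count); // True
```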

 

The most common methods we are going to use within the Regex class are the following: 

  • Regex.IsMatch(string sentence, string pattern) returns bool

This tells us whether the pattern occurs in the sentence. Be careful with this: it returns true no matter how many times the pattern appears, so it cannot tell us whether there is one match or several.

bool resultIsMatch = Regex.IsMatch(frase, @"netmentor\.es");
//resultIsMatch is true

 

  • Regex.Match(string sentence, string pattern) returns Match
Match resultMatch = Regex.Match(frase, @"netmentor\.es");

 

  • Regex.Matches(string sentence, string pattern) returns MatchCollection
MatchCollection resultMatches = Regex.Matches(frase, "\\d");
foreach (Match match in resultMatches)
{
    var posicion = match.Index;
}

 

  • Regex.Replace(string sentence, string pattern, string replacement) returns the original sentence updated with the replacement
string fraseCambiada = Regex.Replace(frase, "regular expressions", "regex");
//fraseCambiada contains "... to explain regex, repeti..."; the original frase is not modified

 

  • Regex.Split(string sentence, string pattern) returns a string[] array that contains the different parts of the text divided by the pattern.
string[] arrayFrases = Regex.Split(frase, ",");
//array that contains 2 strings
//arrayFrases[0] -> Sentence number 01 on the blog netmentor.es to explain regular expressions
//arrayFrases[1] ->  repetition of characters and 01 numbers01

The Regex class has more methods, but these 5 will cover the vast majority of the times we need to check something with regular expressions.
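Since the snippets above never declare the frase variable, here is a self-contained sketch that runs the five methods together. It assumes C# 9 or later (top-level statements) and the English version of the example sentence; the variable names other than frase are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string frase = "Sentence number 01 on the blog netmentor.es to explain regular expressions, repetition of characters and 01 numbers01";

// IsMatch: does the pattern appear at least once?
bool found = Regex.IsMatch(frase, @"netmentor\.es");

// Match: details about the first occurrence.
Match first = Regex.Match(frase, @"netmentor\.es");

// Matches: every occurrence, here each individual digit.
MatchCollection digits = Regex.Matches(frase, @"\d");

// Replace: returns a new string; frase itself is left untouched.
string replaced = Regex.Replace(frase, "regular expressions", "regex");

// Split: cuts the text wherever the pattern matches.
string[] parts = Regex.Split(frase, ",");

Console.WriteLine(found);        // True
Console.WriteLine(first.Index);  // 31
Console.WriteLine(digits.Count); // 6
Console.WriteLine(replaced);
Console.WriteLine(parts.Length); // 2
```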

 

2.2 - Options when invoking Regex in C#

When we run regular expressions in C#, we have method overloads that accept an additional parameter of type RegexOptions. With it, the regular expression runs with additional or different behavior compared to the defaults.

We have several options, but the most important ones are: 

  • RegexOptions.IgnoreCase

As we know, regular expressions are case-sensitive by default. If we enable the IgnoreCase option, uppercase and lowercase are treated as the same.

Match resultMatchCase = Regex.Match(frase, @"NETMENTOR\.es", RegexOptions.IgnoreCase);

 

  • RegexOptions.Compiled

It allows us to compile a regular expression. Instead of interpreting the pattern, the engine compiles it to IL at runtime, when the Regex is created. This makes creating it slower, but matching is much faster if we use it a lot. For the common case of an occasional check, it's not necessary.

Regex regexCompilada = new Regex(@"netmentor\.es", RegexOptions.Compiled );
Match resultMatchRight = regexCompilada.Match(frase);
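To see the effect of these options, here is a small sketch, assuming C# 9 or later (top-level statements) and the English version of the example sentence. Because RegexOptions is a flags enum, several options can be combined with the | operator; the variable names are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string frase = "Sentence number 01 on the blog netmentor.es to explain regular expressions, repetition of characters and 01 numbers01";

// Without options the uppercase pattern does not match the lowercase text.
bool caseSensitive = Regex.IsMatch(frase, @"NETMENTOR\.es");

// With IgnoreCase it does.
bool ignoreCase = Regex.IsMatch(frase, @"NETMENTOR\.es", RegexOptions.IgnoreCase);

// Options can be combined with the | operator.
Regex combinedRegex = new Regex(@"NETMENTOR\.es", RegexOptions.IgnoreCase | RegexOptions.Compiled);
bool combined = combinedRegex.IsMatch(frase);

Console.WriteLine(caseSensitive); // False
Console.WriteLine(ignoreCase);    // True
Console.WriteLine(combined);      // True
```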

 

Microsoft has an example that compares the same regular expression, compiled and interpreted, over a text file.

When checking a file with 10 lines, the interpreted version takes 0.0047491 ms, while the compiled one takes 0.0141872 ms, about 3 times longer. With a file of 13,000 lines, however, the interpreted version takes 1.192 ms and the compiled one 0.7 ms, which is a little more than half.
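A comparison of this kind can be sketched with Stopwatch. This is not Microsoft's example, only a rough illustration under some assumptions: C# 9 or later (top-level statements), the English version of the example sentence as input, and an arbitrary number of iterations. Each timing includes the creation of the Regex, and the actual figures will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

string frase = "Sentence number 01 on the blog netmentor.es to explain regular expressions, repetition of characters and 01 numbers01";
const int iterations = 10_000; // hypothetical workload size, chosen only for illustration

// Interpreted (default) regex: cheap to create, slower to match.
Stopwatch watch = Stopwatch.StartNew();
Regex interpreted = new Regex(@"\d+");
int interpretedCount = 0;
for (int i = 0; i < iterations; i++)
{
    interpretedCount += interpreted.Matches(frase).Count;
}
watch.Stop();
TimeSpan interpretedTime = watch.Elapsed;

// Compiled regex: more expensive to create, faster to match.
watch.Restart();
Regex compiled = new Regex(@"\d+", RegexOptions.Compiled);
int compiledCount = 0;
for (int i = 0; i < iterations; i++)
{
    compiledCount += compiled.Matches(frase).Count;
}
watch.Stop();
TimeSpan compiledTime = watch.Elapsed;

Console.WriteLine($"Interpreted: {interpretedTime.TotalMilliseconds} ms ({interpretedCount} matches)");
Console.WriteLine($"Compiled:    {compiledTime.TotalMilliseconds} ms ({compiledCount} matches)");
```

Lowering the iterations value to 1 is a simple way to see how the creation cost weighs more on small workloads.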

 

Conclusion

In the workplace, the main use we usually give to regular expressions is validating the fields a user fills in, for example checking that the user does not enter letters in fields that should only contain numbers, that an email address is well formed, and so on.
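As a minimal sketch of that kind of validation, the following checks that a field contains only digits. It assumes C# 9 or later (top-level statements); the IsOnlyDigits helper is a hypothetical name used for illustration, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true only when the value is made up of the digits 0 to 9.
// ^ and $ anchor the pattern so the whole value must match, not just a part of it.
// [0-9] is used instead of \d because in .NET \d also accepts digits from other scripts.
// Note that $ also accepts one trailing newline; use \z instead if that matters.
bool IsOnlyDigits(string value) => Regex.IsMatch(value, "^[0-9]+$");

Console.WriteLine(IsOnlyDigits("12345")); // True
Console.WriteLine(IsOnlyDigits("12a45")); // False
Console.WriteLine(IsOnlyDigits(""));      // False
```

Without the anchors, "12a45" would pass the check, because IsMatch only needs to find the pattern somewhere in the text.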

 

This post was translated from Spanish. You can see the original one here.
If there is any problem, you can add a comment below or contact me through the website's contact form.


© copyright 2025 NetMentor | All rights reserved
