Regular Expressions in C# – Quantifiers and Delegates
In this tutorial you will learn about Quantifiers, Grouping constructs, Backreferences, Backreference Constructs, Alternation Constructs, Miscellaneous Constructs, System.Text.RegularExpressions Namespace, Delegates in the namespace System.Text.RegularExpressions and Typical Examples of Regular Expressions.
Quantifiers
Quantifiers add optional quantity data to a regular expression. A quantifier expression applies to the character, group, or character class that immediately precedes it.
Quantifier | Description
* | Specifies zero or more matches.
+ | Specifies one or more matches.
? | Specifies zero or one matches.
{n} | Specifies exactly n matches.
{n,} | Specifies at least n matches.
{n,m} | Specifies at least n, but no more than m, matches.
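To make the quantifiers concrete, here is a minimal sketch. The class name QuantifierDemo and the helper FullMatch are hypothetical names chosen for this illustration; only Regex.IsMatch comes from the framework. The helper anchors the pattern so that the quantifier has to account for the whole input.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class QuantifierDemo
{
    // Returns true when the entire input matches the pattern.
    // The pattern is wrapped in a noncapturing group so the anchors
    // apply to the whole expression.
    public static bool FullMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }
}
```

With this helper, FullMatch("a", "ab*") and FullMatch("abbb", "ab*") are both true, FullMatch("a", "ab+") is false, FullMatch("color", "colou?r") is true, FullMatch("123", @"\d{4}") is false, FullMatch("12", @"\d{2,}") is true, and FullMatch("12345", @"\d{2,4}") is false.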
Grouping constructs:
These allow you to capture groups of subexpressions and to increase the efficiency of regular expressions with noncapturing lookahead and lookbehind modifiers.
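As a rough illustration of capturing and noncapturing groups, the sketch below uses a named group to pull the year out of a date and a noncapturing group to show that it does not add a capture. GroupingDemo, ExtractYear and CountGroups are hypothetical names, and the date format is an assumption made for the example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class GroupingDemo
{
    // Uses named groups to pick the year out of a yyyy-mm-dd date.
    // Returns an empty string when no date is found.
    public static string ExtractYear(string input)
    {
        Match m = Regex.Match(input, @"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})");
        return m.Success ? m.Groups["year"].Value : string.Empty;
    }

    // (?:ab) groups without capturing, so only group 0 (the whole match)
    // and group 1 (the "c") are reported.
    public static int CountGroups()
    {
        return new Regex(@"(?:ab)+(c)").GetGroupNumbers().Length;
    }
}
```

ExtractYear("Released 2024-05-17") returns "2024", and CountGroups() returns 2.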
Backreferences provide a convenient way to find repeating groups of characters. They can be thought of as a shorthand instruction to match the same string again.
Backreference Constructs
These are optional parameters that add backreference modifiers to a regular expression.
Backreference construct | Definition
\number | Backreference.
\k<name> | Named backreference. You can use single quotes instead of angle brackets, as in \k'name'.
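A short sketch of both backreference forms follows. BackreferenceDemo and its two methods are hypothetical names for this illustration; the patterns are examples, not taken from the text above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class BackreferenceDemo
{
    // Named backreference: \k<word> must repeat whatever (?<word>...) captured.
    // Returns the first word that appears twice in a row, or an empty string.
    public static string FirstRepeatedWord(string input)
    {
        Match m = Regex.Match(input, @"\b(?<word>\w+)\s+\k<word>\b");
        return m.Success ? m.Groups["word"].Value : string.Empty;
    }

    // Numbered backreference: \1 must repeat the character captured by group 1.
    // Returns the first doubled character pair, or an empty string.
    public static string FirstDoubledLetter(string input)
    {
        Match m = Regex.Match(input, @"(\w)\1");
        return m.Success ? m.Value : string.Empty;
    }
}
```

FirstRepeatedWord("This is is a test") returns "is", and FirstDoubledLetter("balloon") returns "ll".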
Alternation Constructs
These modify the regular expression to allow either/or matching.
Alternation construct | Definition
| | Matches any one of the terms separated by the | (vertical bar) character.
(?(expression)yes|no) | Matches the "yes" part if the expression matches at this point; otherwise, matches the "no" part. The "no" part can be omitted.
(?(name)yes|no) | Matches the "yes" part if the named capture string has a match; otherwise, matches the "no" part. The "no" part can be omitted.
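The three alternation forms can be sketched as follows. AlternationDemo and its method names are hypothetical, and the identifier and phone-number formats are assumptions chosen only to exercise each construct.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class AlternationDemo
{
    // Plain alternation: accepts either spelling.
    public static bool IsGrey(string input)
    {
        return Regex.IsMatch(input, @"^gr(a|e)y$");
    }

    // (?(expression)yes|no): if two digits and a hyphen come next,
    // require the nn-nnnnnnn form; otherwise require nnn-nn-nnnn.
    public static bool IsId(string input)
    {
        return Regex.IsMatch(input, @"^(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})$");
    }

    // (?(name)yes|no): if the group "open" captured "(", require ")";
    // otherwise require a hyphen separator.
    public static bool IsPhone(string input)
    {
        return Regex.IsMatch(input, @"^(?<open>\()?\d{3}(?(open)\)|-)\d{4}$");
    }
}
```

IsId accepts "01-9999999" and "777-88-9999" but rejects "12-345". IsPhone accepts "(555)1234" and "555-1234" but rejects "(555-1234" and "555)1234".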
Miscellaneous Constructs
These are sub-expressions that modify a regular expression.
Construct | Definition
(?imnsx-imnsx) | Turns options such as case insensitivity on or off in the middle of a pattern.
(?# ) | Inline comment inserted within a regular expression. The comment terminates at the first closing parenthesis character.
# [to end of line] | X-mode comment. The comment begins at an unescaped # and continues to the end of the line.
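A minimal sketch of each construct in use is given below. MiscConstructDemo and its methods are hypothetical names, and the nnn-nnnn input format is an assumption made for the example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class MiscConstructDemo
{
    // (?i) turns case insensitivity on for "hello"; (?-i) turns it off
    // again, so " world" must be lowercase.
    public static bool InlineOptions(string input)
    {
        return Regex.IsMatch(input, @"^(?i)hello(?-i) world$");
    }

    // (?# ... ) is ignored by the engine and ends at the first ")".
    public static bool InlineComment(string input)
    {
        return Regex.IsMatch(input, @"^\d{3}(?# first three digits)-\d{4}$");
    }

    // (?x) enables x-mode: pattern whitespace is ignored and an
    // unescaped # starts a comment that runs to the end of the line.
    public static bool XModeComment(string input)
    {
        string pattern = @"(?x)
            ^\d{3}   # first three digits
            -        # literal hyphen
            \d{4}$   # last four digits";
        return Regex.IsMatch(input, pattern);
    }
}
```

InlineOptions("HELLO world") is true while InlineOptions("HELLO WORLD") is false. Both InlineComment("555-1234") and XModeComment("555-1234") are true.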
System.Text.RegularExpressions Namespace:
The .NET Base Class Libraries include a namespace and a set of classes for utilizing the power of regular expressions. The regular expression classes are contained in the System.Text.RegularExpressions.dll assembly, and you will have to reference the assembly at compile time in order to build your application.
The namespace System.Text.RegularExpressions defines the following classes:
Capture: Contains the results of a single match
CaptureCollection: A sequence of Capture objects
Group: The result of a single group capture, inherits from Capture
Match: The result of a single expression match, inherits from Group
MatchCollection: A sequence of Match objects
MatchEvaluator: A delegate for use during replacement operations
Regex: An instance of a compiled regular expression
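To show how these classes relate, the sketch below walks from a Match to one of its Group objects and then to that group's CaptureCollection. ClassesDemo and its methods are hypothetical names; the pattern (\w)+ is chosen because a quantified group records one Capture per repetition.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class ClassesDemo
{
    // Returns every Capture recorded by group 1 of the first match.
    // Group.Value holds only the last capture; the Captures collection
    // keeps all of them.
    public static List<string> CaptureValues(string input)
    {
        var result = new List<string>();
        Match m = Regex.Match(input, @"(\w)+");
        foreach (Capture c in m.Groups[1].Captures)
        {
            result.Add(c.Value);
        }
        return result;
    }

    // Returns the value of group 1, which is its last capture.
    public static string LastCapture(string input)
    {
        return Regex.Match(input, @"(\w)+").Groups[1].Value;
    }
}
```

For the input "abc", CaptureValues returns the three values "a", "b" and "c", while LastCapture returns "c".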
The Regex class also contains several static methods:
Escape: Escapes regex metacharacters within a string
IsMatch: Methods return a Boolean result indicating whether the supplied regular expression matches within the string
Match: Methods return a Match instance
Matches: Methods return all matches as a MatchCollection
Replace: Methods that replace the matched regular expressions with replacement strings
Split: Methods return an array of strings determined by the expression
Unescape: Unescapes any escaped characters within a string
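A few of these static methods can be sketched together. StaticMethodsDemo and its method names are hypothetical; the calls to Regex.Escape, Regex.Matches and Regex.Split are the framework methods listed above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class StaticMethodsDemo
{
    // Escape makes a literal string safe to use as a pattern, so "."
    // matches only a period instead of any character.
    public static int CountLiteral(string text, string literal)
    {
        return Regex.Matches(text, Regex.Escape(literal)).Count;
    }

    // Split breaks the input wherever the pattern matches; here the
    // separator is a comma or semicolon with optional whitespace.
    public static string[] SplitList(string input)
    {
        return Regex.Split(input, @"\s*[,;]\s*");
    }
}
```

CountLiteral("a.b.c", ".") returns 2, SplitList("a, b;c") returns the three strings "a", "b" and "c", and Regex.Escape("1+1=2") produces the pattern text 1\+1=2, which Regex.Unescape turns back into 1+1=2.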
Delegates in the namespace System.Text.RegularExpressions
MatchEvaluator: The delegate that is called each time a regular expression match is found during a Replace operation.
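As a small sketch of the delegate in use, the code below passes a MatchEvaluator to Regex.Replace so that each match is replaced by a computed value rather than a fixed string. EvaluatorDemo, ToUpper and Capitalize are hypothetical names for this illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class EvaluatorDemo
{
    // Called once per match; the returned string replaces the match.
    private static string ToUpper(Match m)
    {
        return m.Value.ToUpperInvariant();
    }

    // Capitalizes the first lowercase ASCII letter of every word.
    public static string Capitalize(string input)
    {
        MatchEvaluator evaluator = new MatchEvaluator(ToUpper);
        return Regex.Replace(input, @"\b[a-z]", evaluator);
    }
}
```

Capitalize("hello big world") returns "Hello Big World".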
Enumerations in the namespace System.Text.RegularExpressions
RegexOptions: Provides enumerated values to use to set regular expression options.
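The sketch below combines two RegexOptions values with the | operator. OptionsDemo and its methods are hypothetical names, and the "error" log format is an assumption made for the example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, named only for this illustration.
public static class OptionsDemo
{
    // IgnoreCase matches "error" in any casing; Multiline lets ^ match
    // at the start of every line instead of only the start of the input.
    public static int CountErrorLines(string text)
    {
        Regex re = new Regex("^error", RegexOptions.IgnoreCase | RegexOptions.Multiline);
        return re.Matches(text).Count;
    }

    // Without Multiline, ^ matches only at the very start of the input.
    public static int CountWithoutMultiline(string text)
    {
        Regex re = new Regex("^error", RegexOptions.IgnoreCase);
        return re.Matches(text).Count;
    }
}
```

For the three-line input "Error: a\nok\nERROR: b", CountErrorLines returns 2 and CountWithoutMultiline returns 1.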
Typical Examples of Regular Expressions:
Example 1: Confirm a Valid format for the Data Entered
Here is a small function written in C#:
……………………………………………………………………
using System.Text.RegularExpressions;

bool IsValidDataFormat(string strDataEntered)
{
    // Return true if strDataEntered has a valid email address format.
    return Regex.IsMatch(strDataEntered,
        @"^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))" +
        @"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$");
}
……………………………………………………………………
Example 2: Clean up a string having invalid characters:
……………………………………………………………………
using System.Text.RegularExpressions;

string CleanInput(string strInput)
{
    // Replace invalid characters with empty strings.
    return Regex.Replace(strInput, @"[^\w\.@-]", "");
}
……………………………………………………………………
Example 3: Find a sub-string corresponding to a particular pattern
……………………………………………………………………
using System;
using System.Collections;
using System.Text.RegularExpressions;

int FindString(string strinput)
{
    string matchMyStyle = "<.";

    Regex RE1 = new Regex(matchMyStyle, RegexOptions.Multiline);
    MatchCollection ListMatched = RE1.Matches(strinput);
    int i = ListMatched.Count;
    return i;
}
……………………………………………………………………
Example 4: Find a specific occurrence of a Pattern Match
……………………………………………………………………
string strinput = "< Employee > < / Employee > ";
string strGetValue = "";
int FindOccurrence = 2; // to find the second occurrence

string matchMyStyle = @"< ….."; // match pattern

Regex RE1 = new Regex(matchMyStyle, RegexOptions.Multiline);
MatchCollection ListMatched = RE1.Matches(strinput);

int i = ListMatched.Count;

// If the total number of occurrences is less than expected, do nothing.
if (FindOccurrence <= i)
{
    // Get the value of the specified occurrence.
    strGetValue = ListMatched[FindOccurrence - 1].Value;
}
……………………………………………………………………
Example 5: Adding Comments to a Regular Expression
To add a comment within a regular expression, use the # sign and include the option RegexOptions.IgnorePatternWhitespace, as shown in the example below.
……………………………………………………………………
int FindString(string strinput)
{
    // Everything from the unescaped # to the end of the line is a comment.
    string matchMyStyle = @"<.# Need to Find this";

    Regex RE1 = new Regex(matchMyStyle,
        RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);
    MatchCollection ListMatched = RE1.Matches(strinput);
    int i = ListMatched.Count;
    return i;
}
……………………………………………………………………
Summary:
In the article above we discussed:
a) What are Regular Expressions?
b) .NET support and implementation for Regular Expressions.
c) Using Regular expressions with C# and .NET