SQL Server Regular Expression CLR UDF

Face it: data cleanup is a fact of life. While SQL Server has a handful of string manipulation functions, nothing even comes close to the power of RegEx. Fortunately, by leveraging the CLR functionality in SQL Server 2005 and SQL Server 2008, we can add a host of new features, including regular expressions.

Steps

  1. First, fire up Visual Studio (2005 or 2008 – it doesn’t matter).
  2. Create a new project – name it something clever, like “RegEx”
  3. After creating the project, you should be prompted to connect to a database where you’ll eventually want to deploy the project. This is completely optional and can be changed later.
  4. Right-click the project name (“RegEx”) and choose Add → User-Defined Function
    Add User Defined Function
    Add User Defined Function
  5. Name the file RegExMatch.
  6. Paste the following code into that file
  7. When done, simple build (Ctrl+Shift+B) and deploy (Right-click the project name (“RegEx”) → Deploy).
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlString RegExMatch(SqlString expression, SqlString pattern)
    {
        if (expression.IsNull || pattern.IsNull)
            return SqlString.Null;

        Match match = new Regex(pattern.ToString()).Match(expression.ToString());

        return match.Success ? new SqlString(match.Value) : SqlString.Null;

    }
};

Code Explanation

  • Line 1-6: Including necessary assemblies. The only item you need to add is line 6 – System.Text.RegularExpressions
  • Line 11: We indicate the function requires 2 parameters, the input string and the regular expression to apply.
  • Line 13-14: Check if either input string is NULL. If so, return NULL and do nothing else.
  • Line 16: There’s a lot packed on this line, but essentially, it creates an object named match based on the results of the regular expression match operation.
  • Line 18: Use the ternary operator to check if the match was a success. If so, return the matching string. Otherwise, return a NULL.

Examples

We’re going to use a simple regular expression to check for a valid US postal code (AKA Zipcode + 4):

^d{5}(-d{4})?$

This regular expressions checks for exactly 5 digits followed by an option group of hyphen and 4 more digits.

select
	  dbo.RegExMatch('90210','^d{5}(-d{4})?$')
	, dbo.RegExMatch('90210-1234','^d{5}(-d{4})?$')
	, dbo.RegExMatch('90210-','^d{5}(-d{4})?$')
	, dbo.RegExMatch('9021A','^d{5}(-d{4})?$')

And results:

90210	90210-1234	NULL	NULL

Other notes

If you need to change the target database, do so in the project’s properties:

  1. Right-click the proeject name (“RegEx”) → Properties
  2. Select the Database tab (2nd item from the bottom)
  3. Click “Browse…” to create and test the connection string.
    Database Properties
    Database Properties

Thanks to Jeff’s SQL Server Blog for the initial Regular Expression Replace code.

Leave a Reply

Your email address will not be published. Required fields are marked *

* Copy This Password *

* Type Or Paste Password Here *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>