Using a Regular Expression to Detect a Number–#SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers. Here are some hints to get started.

I had a customer recently that was looking to work with Data Masker for SQL Server and had questions about how to handle some situations. In this case, they needed to detect a number type in a field that was overloaded with multiple types of data. Here’s an example of what they had in their “string” (varchar) field. Look at the stringvalue column below:

2022-06-07 08_23_13-SQLQuery1.sql - ARISTOTLE.SimpleTalk (ARISTOTLE_Steve (58))_ - Microsoft SQL Ser

If the string was a “nnn nnn nnnn” number value, then they wanted to change it. If it had other values, then leave it alone. This is really a query problem and a WHERE clause to structure.

One would think this is where you use ISNUMERIC() and try that. If I run this, I get zero rows back.

SELECT d.stringvalue
FROM dbo.ddmdemo AS d
WHERE ISNUMERIC(d.stringvalue) = 1

This isn’t really a number, as the sequence has spaces. What if we try this:

SELECT d.stringvalue
FROM dbo.ddmdemo AS d
WHERE ISNUMERIC(REPLACE(' ', '', d.stringvalue)) = 1

It also returns no values.

Really, this appears to really be a regular expression type of query, so I could do this, using LIKE.

SELECT *
FROM dbo.ddmdemo AS d
WHERE d.stringvalue LIKE '[0-9]%'

That, however, gives me two rows in this set of data. I see these results:

2022-06-07 08_31_14-SQLQuery1.sql - ARISTOTLE.SimpleTalk (ARISTOTLE_Steve (58))_ - Microsoft SQL Ser

The reason is that I am matching the first character only. The argument is a pattern and using square brackets implies a single character in a range. Since there are a lot of different patterns, and the “234223 Test” matches that, I ought to be more specific.

This particular pattern from the customer is 3 numbers, space, 3 numbers, space, 4 numbers. Anything else is non matching. Since there could be trailing spaces, I’d really want this:

SELECT d.stringvalue
FROM dbo.ddmdemo AS d
WHERE d.stringvalue LIKE '[0-9][0-9][0-9] [0-9][0-9][0-9] [0-9][0-9][0-9][0-9]'

This returns my single row. It would match any row that is of the pattern “nnn nnn nnnn” where n is a numerical value from 0-9.

There are other considerations here, and certainly this is likely to be a complex set of masking rules, but this shows a relatively simple way to detect a numerical pattern in a string.

SQL New Blogger

This was an interesting case. I initially thought  LIKE and an expression, but thought maybe there was a quicker way with isnumeric(). I didn’t find one, so I explained that and then the way that did work for me.

To me, this gives someone who glances at my blog a bit of insight into how I think and what I considered. This might be how they think, or someone on their team thinks. This might get me an interview.

Write about the problems you solve and how/why you do it.

About way0utwest

Editor, SQLServerCentral
This entry was posted in Blog and tagged , , . Bookmark the permalink.