Better Test Data Domains with SQL Data Generator

This is a series on SQL Data Generator, covering some interesting scenarios I’ve run into. If you’ve never tried it, SQL Data Generator is a part of the SQL Toolbelt. Give it a try today with an evaluation today.

I’ve been working more with SQL Data Generator (SDG) because it solves some problems that many software developers have with test data. Often each developer needs to create their own set of test data, with these being the common actions:

  • Use a backup of production, perhaps old.
  • Randomly insert a few values each developer comes up with
  • Use random data from SDG, a new project each time.
  • Load a known data set from production or test systems

While I think small amounts of random data work well, I think the data should reflect (somewhat) the types of data in production. Totally random strings don’t work, but similar words, structures, etc. make sense.

However having a consistent set of data for each developer is a great idea. SDG can consistently generate data sets, but to make them meaningful, you might want to have control over the types of data inserted.

Using Real Words

I talked about the difference between random words and real words. However, what if you want to include specific types of items?

Let’s evolve some data. If I have a varchar(500) column, SDG defaults to this:

[A-Z0-9]*

Which gives me this:

2015-08-28 10_03_55-New notification

That’s not great if I wanted to examine specific rows and determine if they were being returned by a query. This is just too random and hard to verify.

However I have options. For example, I could use the “Insert File List” item. This gives me a list of files that could be helpful. In this case, let’s choose “Color”.

2015-08-28 10_05_33-SQL Data Generator - Fun_Article_Titles_Words.sqlgen

I see this in the RegEx box.

($”Color.txt”)

Now my test data looks like this:

2015-08-28 10_06_14-

What’s in “color.txt”? Let’s see. The file is under the Data Generator 3 folder, in a Config location. I see lots of XML and text files.

2015-08-28 10_08_09-Config

If I open Color.txt, I see what I expect.

2015-08-28 10_08_17-Get Started

Now, let’s experiment. Let’s create a SQL Server file. I’ll put values in like this:

2015-08-28 10_08_17-Get Started

I need to change permissions on the config folder to allow saving, but I put it there.  I also had to close and re-open SDG to pick up the new file.

Now I’ll add that to the RegEx box.

2015-08-28 10_15_44-SQL Data Generator - Fun_Article_Titles_Files.sqlgen _

That’s cool. What if I made a file with a number of random words in it. Like a dictionary of sorts. I could do this. I create dictionary_small.txt.

2015-08-28 10_22_10-Get Started

Now I include that a number of times.

2015-08-28 10_23_17-New notification

Those are some great descriptions. I’m sure I could have fun with this in other ways as well. Let’s create some good and bad data for a cleaning operation.

2015-08-28 10_28_38-New notification

The ability to include data from flat files is a great option in SDG for putting together a data set that developers can actually use, understand, and count on for loading up new databases or tables, especially when creating quick code branches to test something out.

About way0utwest

Editor, SQLServerCentral
This entry was posted in Blog and tagged , , , , . Bookmark the permalink.
%d bloggers like this: