This is a series on SQL Data Generator, covering some interesting scenarios I’ve run into. If you’ve never tried it, SQL Data Generator is a part of the SQL Toolbelt. Give it a try today with an evaluation today.
I needed to generate some data for some development work on the SSC database. No, I’m not allowed to change code directly, but I was looking to send some changes to the development team, already done, and then hopefully just have them test and deploy it.
In my case, to avoid exposing any real data in case of issues, I downloaded the schema only to my laptop. The I created a database with all the objects. One of my first areas of work was on the points and scoring systems, but to do that, I needed points.
I fired up my copy of Data Generator, let it detect the objects, and pre-populate the fields and clicked “Generate Data”. That worked well, and I had a bunch of data in my system.
My first area of work was to rewrite some procedures that perform calculation. I did that, ran a simple SUM, and got this:
Msg 220, Level 16, State 2, Line 3
Arithmetic overflow error for data type int
Not what I expected. I just generated some data and ran a sum. What could be the issue?
It turns out that the default settings for integer columns are shown here:
That’s great if you want a random distribution, but it’s not so good in this case. The points values I want to store for each row should be fro 1 to 7, and randomly distributed. I’d actually like them to be weighted towards 1 and 2, but for this project, it doesn’t matter.
I decided to fix things by first deleting all the points data. Once this was done, I could then select the table on the left, and select the column.
This changes the right panel to the specific settings for this column. I changed the values, as you can see here, to be more in line with my needs. Only values from 0 to 7 are included.
I could actually use different settings for different columns. For example, for the PointsCategory column, I used these settings, from 1 to 1,000.
With these new settings, I generated new data for this table, and then my aggregate calculations worked.
Data generation is a very handy thing to have, especially in development environments where you don’t want live data. In my case, while I think my systems are fairly safe, I’d hate to lose my laptop, with a copy of the SQLServerCentral database and a million emails that people might not want shared.