Watch Your DataTypes in Aggregates–#SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I’ve got a database of NBA statistics with data like this for players. I downloaded a CSV and loaded it into SQL Server.

2017-03-22 10_32_00-SQLQuery1.sql - (local)_SQL2016.NBA (PLATO_Steve (102))_ - Microsoft SQL Server

I decided to play with the data a bit and at one point wanted to see who scored the most points for a team and year. So I ran this query:

SELECT
    year,
    team,
    MAX(pts)
FROM dbo.player_regular_season
WHERE
    year = ‘1972’
    AND team = ‘LAL’
GROUP BY
    year,
    team;

The result was 705. That’s a decent number of points, and if I weren’t careful, this might seem fine. 1972 was a long time ago, and they didn’t score as many points as they do today in games.

In fact, if I were putting this in a summary report with lots of data, it might be the case that someone glancing at this would make a poor decision based on the data.

Why?

Let’s look at the data.

2017-03-22 10_46_16-SQLQuery1.sql - (local)_SQL2016.NBA (PLATO_Steve (102))_ - Microsoft SQL Server

Even a quick glance would let me know this seems funny. There are values of 1575 and 1084 in there, but the MAX() I returned was 705. If I look deeper at the import, I can see why.

2017-03-22 10_47_25-SQLQuery1.sql - (local)_SQL2016.NBA (PLATO_Steve (102))_ - Microsoft SQL Server

Anything stand out there? If you look, pts is a varchar, not a numerical value. In the character world, 705 beats 1575. I really need this query:

2017-03-22 10_48_30-SQLQuery1.sql - (local)_SQL2016.NBA (PLATO_Steve (102))_ - Microsoft SQL Server

Always be aware of the datatypes you work with and manipulate. Knowing a little bit about the meaning and use of the data can help you spot anomalies like this. As much as I like random test data, I’d also be sure you have some real data cases when you have users check your work. It’s easy for them to miss problems like this without good reference cases.

Or use good test data that you’ve setup and unit tests.

About way0utwest

Editor, SQLServerCentral
This entry was posted in Blog and tagged , , . Bookmark the permalink.