Masking Data

This editorial was originally published on Mar 7, 2008. It is being re-run as Steve is out of town.

I saw an interesting thread a while back where one of our very talented community members was asking about how to go about altering data in an application for a demo. It’s a valid scenario and one that I’m sure many people have run into at some point in their careers. You want to show data that’s somewhat real so that it showcases the application and what it can do, but you don’t want to show real names, amounts, or any identifying information.

It’s a bit of a quandary, and it seems that many people solve it in one of three ways. They just use test data, which is usually a very small set and doesn’t show as well. Or they may alter everyone’s name to something like “Steve Jones” and all phone numbers to 555-555-5555, which looks funny.

Or you just show the production data and wink and say “I don’t usually do this, but since you’re such a valued client…”

So for the Friday poll: Do You Alter Production Data When It’s Copied?

Meaning, when the data gets moved to a non-production system (demo, test, development, etc.), do you alter and obfuscate it to remove any identifying information? Do you make it “safe” data that can’t be used to somehow compromise your production system?

It’s a good practice, and one that I used to follow at a couple of companies. I didn’t have any tools, but I did write scripts, load a few base tables of names, and then run those scripts as part of the restore job. They would randomly reassign new names to people, companies, addresses, etc. We would also redo phone numbers in sequential order (555-555-0001, 555-555-0002, etc.), and even randomly add products to sales or amounts to financial figures. It wasn’t perfect, and if you worked on the production system a lot you could guess which people were which, but it worked well for testing and client demos.
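
To give a rough idea of what those scripts did, here is a minimal sketch in Python. It is not the original code, which ran against the database as part of the restore job. The FAKE_NAMES list stands in for the base tables of names, and the mask_customers function and the row layout (a dict with "name" and "phone" keys) are made up for illustration.

```python
import random

# Hypothetical replacement pool; stands in for the "base tables of names".
FAKE_NAMES = ["Alex Morgan", "Dana Reyes", "Sam Patel", "Chris Novak", "Jamie Olsen"]


def mask_customers(rows, seed=None):
    """Return masked copies of rows; the input rows are left untouched.

    Each row is assumed to be a dict with 'name' and 'phone' keys.
    Other keys (ids, amounts, and so on) are copied through unchanged.
    """
    rng = random.Random(seed)  # pass a seed for repeatable demo data
    masked = []
    for i, row in enumerate(rows, start=1):
        new_row = dict(row)
        # Randomly reassign a name from the replacement pool.
        new_row["name"] = rng.choice(FAKE_NAMES)
        # Sequential phone numbers: 555-555-0001, 555-555-0002, ...
        # (the four-digit suffix only holds up to 9,999 rows).
        new_row["phone"] = f"555-555-{i:04d}"
        masked.append(new_row)
    return masked


if __name__ == "__main__":
    # Made-up rows for illustration only.
    production = [
        {"id": 1, "name": "Real Person", "phone": "303-123-4567"},
        {"id": 2, "name": "Other Person", "phone": "720-987-6543"},
    ]
    for row in mask_customers(production, seed=42):
        print(row)
```

Because names are drawn at random from a small pool, duplicates will show up, and nothing here touches amounts or addresses. A real script would need a larger pool and would have to cover every identifying column.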

I actually ran into a product recently (Camouflage) that does this and it’s a great idea. It’s something that quite a few companies should be implementing to ensure that their non-production systems are that much more secure.

Steve Jones

Editor, SQLServerCentral