I upgraded SSMS to 17.5 recently and found an interesting addition. This version has incorporated the ability to classify data. With the GDPR coming for many of us, this is a welcome addition.
This is a quick look at this feature.
Classify a Database
If I select a database and right click it in SSMS, I get a few new items in the Tasks menu (as shown).
I’ll select “Classify Data”, and I get a new tab opened. I see there are some recommendations and also a list of classifications of data.
Below this, I see a list of the recommendations. This has grabbed tables that appear to contain data that might be sensitive and require classification. One of the tenets of the GDPR is that you know your data. You aren’t allowed to figure this out later, but rather you must proactively know what data you are collecting and processing.
Here we can see a few drop downs to the right. I’ll scroll and look at these. First is the Information Type. This is listed as a name, but I have other options I can set. The list covers the types of data that might be sensitive information about a data subject (a human or entity) that I need to classify.
Beside this is the sensitivity label. My choices here are shown below. These range from public information, which removes some of my responsibility, to highly confidential data that falls under the GDPR.
If I’m happy with these recommendations, I can select them all (or a subset) on the left. I can click the “Accept” button to add them to the classifications I have for this database.
This doesn’t save them, but adds them to the list. At the top of this tab I can see the need to “Save” my changes.
Once I’ve done this, I could add more, or view a report. The report shows me this:
My guess was that these are implemented as extended properties, which makes sense, since that’s how many tools attach metadata to SQL Server objects. I was right. If I examine the extended properties for the firstname column in one table, I see this:
This was the column I changed to public information. The lastname column in the same table is marked as confidential.
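If you’d rather check this outside the SSMS dialogs, here is a minimal sketch of how the lookup might go. The T-SQL in the string reads sys.extended_properties for column-level properties. The property name prefixes (sys_information_type and sys_sensitivity_label) are my assumption about how SSMS names them, so verify them on your own instance. The table name Customers and the sample rows are hypothetical stand-ins that mirror the firstname and lastname columns above, and the helper pivot_classifications is just an illustration, not part of any tool.

```python
from collections import defaultdict

# T-SQL that lists classification-style extended properties on columns.
# Assumption: SSMS stores them under names starting with
# "sys_information_type" and "sys_sensitivity_label".
CLASSIFICATION_QUERY = """
SELECT s.name AS schema_name,
       t.name AS table_name,
       c.name AS column_name,
       ep.name AS property_name,
       CAST(ep.value AS nvarchar(256)) AS property_value
FROM sys.extended_properties AS ep
JOIN sys.tables  AS t ON ep.major_id = t.object_id
JOIN sys.schemas AS s ON t.schema_id = s.schema_id
JOIN sys.columns AS c ON ep.major_id = c.object_id
                     AND ep.minor_id = c.column_id
WHERE ep.class = 1  -- object or column level properties
  AND (ep.name LIKE 'sys[_]information[_]type%'
       OR ep.name LIKE 'sys[_]sensitivity[_]label%');
"""


def pivot_classifications(rows):
    """Group (schema, table, column, property, value) rows by column.

    Returns a dict keyed by "schema.table.column", where each value is a
    dict of property name to property value.
    """
    result = defaultdict(dict)
    for schema, table, column, prop, value in rows:
        result[f"{schema}.{table}.{column}"][prop] = value
    return dict(result)


# Hypothetical rows, shaped like the output of the query above.
sample_rows = [
    ("dbo", "Customers", "firstname", "sys_information_type_name", "Name"),
    ("dbo", "Customers", "firstname", "sys_sensitivity_label_name", "Public"),
    ("dbo", "Customers", "lastname", "sys_information_type_name", "Name"),
    ("dbo", "Customers", "lastname", "sys_sensitivity_label_name", "Confidential"),
]

if __name__ == "__main__":
    for column, props in pivot_classifications(sample_rows).items():
        print(column, props)
```

To run this against a real database, you would execute CLASSIFICATION_QUERY with whatever driver you use and pass the fetched rows to pivot_classifications. I kept the database call out of the sketch so it runs anywhere.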
Ultimately, is this useful? Yes. I can see other products taking advantage of this, such as the new Data Masker from Redgate, which could let you know which columns are sensitive and not masked. I’d also expect this metadata to matter for ETL and other operations, which could carry it over to new columns that hold transformed or moved copies of this data.