One of the things that I’ve seen blogged about and noted in the news over the last year is the amount of data being collected by various software systems. In particular, I’ve seen numerous complaints and concerns over what data Microsoft collects with its platforms. Perhaps Microsoft gets the attention because it is the largest vendor, since plenty of other software also collects usage information, primarily to determine whether features are being used and working correctly. I think much of this is overblown, but I can understand being sensitive about our computer usage, especially on home operating systems.
Microsoft conducted an extensive review and published data about what is being collected and why. Windows 10 and other products have undergone review and must comply with the published policies. There’s even a Microsoft Privacy site to address concerns and explain things. It’s an easy-to-read policy that explains what Microsoft is collecting if you’re connected to the Internet (if you’re not, don’t worry, there are no bloating files). That’s a huge step forward in an area that is evolving, and something I wouldn’t have expected to see in the past. I am glad Microsoft is making strides here, even if I may not agree with specific items in their policies. I do think that most of the companies collecting this data are doing so to improve their products, not to spy on customers. I’m sure some do spy, but those are likely smaller organizations with some sort of criminal intent.
As data becomes more important, software telemetry becomes a potential leakage vector for private, personal, or customer information. Certainly as more speech and other customized services are used in businesses, I worry about what data could be accidentally disclosed. After all, in many cases it’s not that super-powerful smartphone that is actually converting audio to text; it’s a computer somewhere in the vendor’s cloud.
With databases, this has also been a concern for some people. I’ve seen the Customer Experience Improvement Program for years and usually opted in. I’m rarely doing something sensitive, and I hope that with more data, Microsoft improves the platform. That’s the stated goal, and I’ve seen them talk about this a few times. The SQL Server team has moved forward and published an explicit policy that spells out what data is collected and when. It was actually just updated recently, and all new versions of the platform must provide this information (if anything is different) and adhere to what they disclose. There is a chance that user data could leak into a crash dump, though users have the opportunity to review data before it is sent to Microsoft. I’m not sure how many will, but they have the chance.
I would like to be sure that anything sent is secured, and perhaps have an easy way to audit the data sent in a session, but I know this entire process is evolving. One important item to note is that customers can opt out of data collection for any paid-for versions of SQL Server. That’s entirely fair, but since the opt-out applies only to paid-for versions, if you have regulatory concerns, you should be sure that you don’t restore production data to development machines. You shouldn’t anyway, but just an FYI.
Usage data is going to be a part of the future of software, especially as more “services” integrate into what we think of as software. Those services aren’t always going to be under our control, and certainly part of the reason many of these are inexpensive is that additional data is captured about the people using the software. I hope all companies publish and adhere to some sort of privacy statement, and maybe that’s a good place to start: rather than regulating the specific privacy protections that must exist, start forcing companies to publish and stick to whatever policy they choose.