Steve Ballmer retired from Microsoft almost three years ago. It seems like a small eternity, and since his purchase of the NBA Los Angeles Clippers, I haven’t heard much about any ventures with which he’s been involved. Apparently, he’s spent some of his time, and quite a bit of money to build a site that discloses data about revenue and expenditures for the US Government. There’s a piece in the NYT about this project, as well as a shorter Engadget summary of the site, which provide a short look at the project.
USAFacts.org is the site, and it’s a treasure of data sets. I look at this as a really interesting way to examine data sets that might be more difficult to gather than you’d expect. The Data Act deadline takes effect in May of 2017, which should also provide another way for anyone to look at public data and perform an analysis. I am disappointed the downloads aren’t working yet, but I hope that this will come soon, along with some sourcing information about where the data comes from and how it was gathered.
Having data sets to analyze is important for any organization. Certainly within our organizations we spend a lot of time producing reports and queries that help various people analyze data. In fact, finding, collating, cleaning, and organizing information can be a taxing proposition in any size organization. Our data sets and sources are so diverse and often inconsistently producing data that it’s amazing at times that our organizations run well. It seems on a regular basis someone wants to rebuild the methodology used to gather and organize information. I am not surprised that I constantly find incorrect calculations in software because the basis we use changes too often.
The big issue for me is that so many of us are amateurs when it comes to analyzing information. There aren’t many organized classes or a good structure for most data professionals to learn how to analyze data. We learn on the job, we make guesses and assumptions, and overall do a good job. However, data analysis is highly inconsistent from person to person. I’d like to see that change, and as I see more and more people blogging and talking about how they look at a particular data set, I hope more people are thinking about how to analyze information and how the choices we make for calculations, visualizations, and even ordering can affect how the results are interpreted.
I’m glad Mr. Ballmer has started this project, and I look forward to seeing how people might use this data and other data sets to provide some analysis of the world.