Tim O’Reilly has long been an advocate of open data access and standards, especially from governments. He’s pushed for more interoperability and certainly more accessibility from all sorts of groups. He gave an interview earlier this year to LinuxVoice where he talked about a variety of things, but data was foremost on his mind.
There are some good thoughts in the interview, and I was especially pleased to see him asking for more software to adapt to the data it works with, rather than asking data to match the application. An interesting thought he had was in the area of control systems. Does every device or sensor need a separate application and way of interacting, or should we have some guiding design principles that let similar applications work in similar ways with different data? That almost sounds like good data modeling and normalization principles in action, backing a data-driven application.
I also liked his acknowledgment that so much of our data isn’t very portable. Between social networks and proprietary storage, it becomes hard to move data around. The pattern of downloading data, perhaps editing it, perhaps not, and then uploading it elsewhere works great with ETL tools, but it’s cumbersome for many users and applications to deal with. Interacting with disparate data, querying remote sources, and sometimes transforming and copying data all need to be easier to implement and integrate inside software.
In some ways, I think the 3.0 model of our Internet interaction will take place around data. I think SSIS will continue to be one of the most valuable tools in SQL Server, with plenty of demand for people who can work with it, but it still needs improvement and enhancement to catch up to other ETL tools. I really hope Microsoft believes this and continues to invest in the tool.
I also think that the data professionals who really stand out in the next decade will be those who learn when to use R, JSON, XML, Hadoop, or whatever non-RDBMS tool meets a need, and also when not to use these tools. The better data professionals will make good decisions about when to query data and when to move it to another system.
It’s an exciting time to work with data, as the opportunities and rewards continue to grow. I look forward to what the future will bring us.