At one point in my career I was working on a new application, and we were debating whether to store certain metrics related to application usage and customers' behavior. My boss asked for my opinion, and I said that more data was better than less and we could always delete the data if we found it was not being used.
However, we rarely delete data. We seem loath to remove old data, even when we find it's slowing down our systems. Worse still, there aren't good methods for stripping out and preserving older data, other than custom work in each system. I ran across a piece that talks about the tremendous amount of data we are constantly acquiring, a deluge that overwhelms us in so many scientific areas. Some endeavors collect so much data that they must resort to storing it in networked systems, making the data sets available only through software that can combine the information from various databases.
In the corporate world we usually don't deal with such large amounts of data, but we are often dealing with hardware that constrains our ability to work effectively with the data we have. Our disparate systems are spread across so many places that it can become hard to aggregate the bits and extract information.
I do think we will start to feel the stresses of dealing with so much data and of finding the meaningful information within it. The people who learn to do this well, and filter their bits effectively, will become very valuable to their companies in the future.
The Voice of the DBA Podcasts
We publish three versions of the podcast each day for you to enjoy.