We continue to deal with larger and larger data sets all the time. In fact, it seems that most people find themselves outgrowing the capabilities of some of their OLTP databases, often RDBMS stores, and need to upgrade hardware or re-architect software. Whether you have a 100GB database on a few cores or 10TB on dozens of cores, there is often a need to upgrade to meet workload demands.
In addition to the transactional needs, there is a growing demand for reporting and analysis workloads. Some people use a separate warehouse, and some want to just query data where it is. Certainly ETL processes and platforms have grown tremendously over the last few decades for those who want to take the former approach, but there is plenty of demand for the latter. In fact, I’m amazed how many customers have asked whether Redgate’s SQL Clone product will enable them to do this and spread their workload to other systems (it’s not designed for this).
I’ve been thinking that with SQL Server 2019 we will start to access data where it lives, rather than moving it to wherever we want it. To me, this is more of what future data orchestration might involve. I ran across an article that takes a slightly different approach, suggesting that AI and other products will help us move data around more effectively. Perhaps that’s true, but I do think that more and more we want to query data where it lives, and use larger, distributed compute platforms to do this.
The scale-out capabilities of SQL Server 2019, with the separation of compute and storage in Big Data Clusters, are a huge change that I think will be the future for many of us who look to meet reporting needs. The ability to grow hardware to match workload needs is significant. This alone is a good reason to think about doing this in a hybrid or public cloud scenario.
Of course, this doesn’t come cheap, easy, or quick. There is work to be done to evolve systems, but it is an area I think is worth experimenting in during the coming year. I bet many companies would be interested in some PoC work here to determine how to better meet the reporting requirements of larger data sets. Perhaps this is something you could suggest to someone in your organization.