I’ve been working with databases for a long time. They’ve always been relational databases to me, unless they were a key-value, document, or NoSQL class of store. A few years ago at the Pass Summit, David Dewitt gave a keynote on changes to storage that Microsoft was implementing, talking about columnar storage.
At first the structure felt confusing, but as he proceeded, it started to make sense. We don’t like SELECT * for many reasons, but one is that lots of unnecessary data gets moved off disk, into memory, and across a network. This is the nature of a row based store, which is what we usually have in relational databases.
The columnar store puts all the columns together. The row values from different columns are separate, but if you are aggregating values in columns, the columnar store works very well.
So well, that we have columnstore indexes in SQL Server, which copy your data into a column-oriented format. While this might seem wasteful, you decide what gets copied, and you get the benefits of this format, which dramatically speeds up some types of queries.
The reason we have both stores is that we have a need for both to fulfill different query needs. I wouldn’t keep both stores for every table, but for some, it’s the best way to ensure your clients don’t spend a lot of time waiting for results.
I see more clients using columnstore indexes, and I was still seeing some sessions, but not as many as a few years back when the technology was new. If you’ve never tried building a columnstore index, this might be something you experiment with in development systems and understand how this can change your query performance and storage needs. We have a great Stairway Series to get you started, so take some time this year and read through it and practice the examples.