This year at the PASS Summit, Dr. Rimma Nehme (b|t) will give another keynote address. She has spoken at a number of previous Summits, and is a very popular speaker. This time her topic is CosmosDB, which is where she’s been working at Microsoft for some time. I was lucky enough to see Dr. Nehme talk about CosmosDB at Build 2017, and I’m sure she’s got more to say now.
Actually, I’ve been a bit surprised by all of the writing about CosmosDB that I see taking place. Perhaps it’s that the CosmosDB Twitter account is very active, but it seems that so many .NET developers and other non-SQL Server professionals are talking about how they are building projects and platforms on CosmosDB. I follow plenty of SQL people, but it seems there is a tremendous amount of activity on CosmosDB (and R/Data Science), more than I see with plain old SQL Server (POSQL?).
When I first saw DocumentDB, the forerunner of CosmosDB, I thought it was a good platform for some workloads. Certainly the heavy write-a-document, read-the-same-thing workload of the XBOX profiles and some of their games seems like a good use of the technology. When they added MongoDB support, I was slightly more interested, but still not a lot. I really haven’t seen many workloads that seem to exceed what SQL Server can do in a relational platform. At least, not if you write efficient code and buy enough hardware. The change to CosmosDB, however, was really interesting. Allowing graph APIs, key values, and documents in one service is something I am sure is attractive to many developers. Especially the heavy use of JSON, which seems to be a format many developers prefer over datasets.
I know Microsoft would like more people to use CosmosDB, as it’s a completely PaaS, pay-as-you-go service, and that could easily increase their revenue if they can get many people using the product. They have invested quite a bit into the platform, with latency guarantees, multiple consistency models, and plenty of scalability. Digging more into how CosmosDB works, and where it might be applicable to solving data problems, is certainly on my list of topics.
I think there are applications where a CosmosDB datastore will work well, but I wonder if this is more of a front-end, data-gathering service that will require additional databases behind it, where data can be combined and aggregated for other reporting. I don’t know that querying your graph tables alongside key-values and documents will be as easy as it might be in a relational system, but hopefully we’ll see some people (perhaps me) experimenting and reporting back on the complexities and advantages of such a datastore.