CosmosDB and Consistency

Update: Added a few things for studying for DP-900

I was doing a little work with CosmosDB recently, and there were a few things that surprised me about the platform. I’m also not 100% sure I completely understand these, but I’m slowly expanding my knowledge.

I had known there were various consistency levels, but hadn’t ever really looked at them in detail. There is a page on MS Docs that has some animated GIFs that help explain the levels. I think the images are useful, and I’m going to try to understand what each level means from the perspective of a relational person.

Strong Consistency

This means the replicas remain in sync, and clients always see the most recent committed data, without running into uncommitted or partial writes. This is similar to what SQL Server provides by default within a single database.

This level also rules out multi-region writes: it isn’t available on an account with more than one write region. Essentially, this is a synchronous commit across regions.

Bounded Staleness

In this case replicas might be slightly behind the one where writes have occurred. In essence, a delay. I think of this as snapshot replication, where we might set a time between updates.

CosmosDB isn’t doing snapshot replication, though. It’s better about keeping things up to date: you set an upper bound on how stale the data can be, either as a number of versions or as a time interval between updates. I wish we could do this with replication in SQL Server.

Essentially, this is consistency, but with a delay across regions.
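
To make the "upper bound" idea concrete for myself, here is a minimal sketch in plain Python. It is not the CosmosDB API. The names StalenessBound and within_bound are made up for illustration, the numbers are arbitrary, and I’m assuming that a replica counts as too stale as soon as either limit (versions or time) is reached.

```python
from dataclasses import dataclass


@dataclass
class StalenessBound:
    """Hypothetical holder for the two limits; not a CosmosDB type."""
    max_versions: int   # how many writes a replica may lag behind
    max_seconds: float  # how long a replica may lag behind


def within_bound(bound: StalenessBound, lag_versions: int, lag_seconds: float) -> bool:
    """Return True while the lag is inside both limits.

    Assumption: hitting either limit makes the replica too stale.
    """
    return lag_versions < bound.max_versions and lag_seconds < bound.max_seconds


bound = StalenessBound(max_versions=100, max_seconds=5.0)  # illustrative values only

print(within_bound(bound, lag_versions=10, lag_seconds=1.2))   # small lag on both axes
print(within_bound(bound, lag_versions=150, lag_seconds=1.2))  # too many versions behind
print(within_bound(bound, lag_versions=10, lag_seconds=9.0))   # too much time behind
```

The point of the toy is only that staleness is measured on two axes, and the replica has to stay inside both.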

Session Consistency

This one seems slightly tricky. I’m sure if I read it and watch the image 100 more times it will be easy, but this really means that within a single client session, I get consistency: I always see my own writes, and in order. It also looks like this is for reads and writes in the region. This is the default level.

I don’t quite know what this means for SQL Server. It’s likely something like async AGs? Not sure what I think here.
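
Here is how I picture the session guarantee, as a toy model in plain Python. This is a sketch of the idea, not how CosmosDB implements it: the Replica and Session classes, the integer version numbers, and the "token" attribute are all hypothetical stand-ins I invented for illustration.

```python
class Replica:
    """A toy replica that has applied writes up to some version number."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, version, key, value):
        self.data[key] = value
        self.version = version


class Session:
    """Hypothetical client session that remembers the newest version it has seen."""

    def __init__(self):
        self.token = 0  # highest version this session has written or read

    def write(self, primary, key, value):
        new_version = primary.version + 1
        primary.apply(new_version, key, value)
        self.token = new_version

    def read(self, replica, key):
        # Refuse replicas that are behind what this session has already seen.
        if replica.version < self.token:
            raise RuntimeError("replica is behind this session; retry elsewhere")
        self.token = replica.version
        return replica.data.get(key)


primary, secondary = Replica(), Replica()
me, someone_else = Session(), Session()

me.write(primary, "status", "shipped")  # the secondary has not caught up yet

print(me.read(primary, "status"))              # my own write is visible to me
print(someone_else.read(secondary, "status"))  # another session may not see it yet

try:
    me.read(secondary, "status")               # lagging replica cannot serve my session
except RuntimeError as err:
    print(err)
```

In this model the guarantee belongs to the session, not to the data: a different session reading from the lagging replica gets an older view, and that is allowed.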

Consistent Prefix

This level allows some stale reads, so it gets into that area that makes me, as a relational person, uncomfortable. As far as I can tell these aren’t dirty reads in the SQL Server sense, since the data has been committed; a replica may just be behind. At least we never see out of order writes. From the visual, this looks like there could be various delays between regions. Feels like normal replication to me.
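
The "never out of order" promise is easy to state as a check. This is a small sketch in plain Python, with a made-up is_prefix helper and a made-up list of writes: a replica’s view is acceptable if it matches the start of the committed write order, however far behind it is.

```python
def is_prefix(view, log):
    """True if `view` equals the first len(view) entries of `log`."""
    return list(view) == list(log[:len(view)])


writes = ["A", "B", "C", "D"]  # order in which the writes were committed

print(is_prefix(["A", "B"], writes))  # behind, but in order
print(is_prefix([], writes))          # has seen nothing yet, still a valid prefix
print(is_prefix(["A", "C"], writes))  # skipped B, so there is a gap
print(is_prefix(["B", "A"], writes))  # out of order
```

A replica can be any distance behind and still pass; what it can’t do is show a later write without the earlier ones. A weaker level makes no such promise.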

Eventual

This is likely what many people think about with non-relational platforms, and it worries them. There’s no guarantee of write order, or when you will see the updates in any particular region.

For some problem domains, this is really bad. For many that I’ve worked in, it’s fine. As long as the delays aren’t too long, this is tolerable for many applications.

It still makes me slightly nervous.

Summary

Dealing with the nitty-gritty of consistency across many copies and many clients is weird. It’s complex, and hard for a human to think about. However, I am glad there are some choices, allowing applications to pick what suits them.
