Over the last few weeks, I’ve noticed a few complaints from different friends and customers about issues with Microsoft’s Fabric service. I had assumed these were isolated incidents in just a few places, and customers were being refunded according to an SLA. Then I saw Joey D’Antoni’s post this week about Fabric going down. It lists quite a few of the incidents in June, including a few global ones.
However, the most surprising thing in the post was this link, noting Fabric doesn’t have a dedicated SLA. Instead, it’s under the general Microsoft support agreement, meaning your organization needs to have a support plan to get help. I think that makes some sense, but I’d really expect that a data service in the cloud, including an analytic service that touts itself as real-time, would have a high SLA. Some agreement on the order of four 9s at least, if not five.
For those of you who think you can do better on premises, I’ll remind you that four 9s means you get about 52 minutes of downtime a year, or roughly 1 minute a week. Five 9s is about 5 minutes a year, at which point the weekly calculation doesn’t matter. I’ve had systems run for a year without downtime, but not a lot of them, and not a lot of databases, especially with patching 6 times a year. This year, with AI (my guess) finding lots of holes and GDRs being released quite often, I would guess that three 9s is out the window for most SQL Server systems that aren’t HA clustered in some way.
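If you want to check that math yourself, here is a minimal Python sketch. It assumes a 365-day year and 52 weeks, and the function name is just something I made up for illustration.

```python
# Rough downtime budgets for a given availability percentage.
# Assumptions: a 365-day year (no leap day) and 52 weeks per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes
WEEKS_PER_YEAR = 52


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Return the minutes of downtime allowed per year at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    for label, pct in [("three 9s", 99.9), ("four 9s", 99.99), ("five 9s", 99.999)]:
        per_year = downtime_minutes_per_year(pct)
        per_week = per_year / WEEKS_PER_YEAR
        print(f"{label} ({pct}%): {per_year:.1f} min/year, {per_week:.2f} min/week")
```

Three 9s works out to a bit under 9 hours a year, which is not much room once you count patching windows.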
As a point of reference, Denny notes that Azure SQL Database, Business Critical, has a four-and-a-half 9s level of reliability and financial refunds for costs if that isn’t met. You will need some sort of business insurance if you worry about revenue issues, but my guess is that most of us live with the downtime and work around it (and hopefully, plan for it).
I’ve been skeptical of Fabric (outside of Power BI). It feels cobbled together, so many issues are reported, and I feel like it’s immature. If I were working on a new analytics project, and we didn’t have a solution, I might PoC it, but I’d be more likely to consider Databricks, Snowflake, or Redshift rather than Fabric.
Perhaps you have a different view, or you have had success with Fabric. I know some people who have, but it seems onesie-twosie and not commonplace. There seem to be so many workarounds and issues that I’m skeptical Fabric is really ready for primetime if Microsoft won’t stand behind it. Databricks only gives credit up to three 9s, but if they fall below 95%, they issue a 100% credit. Snowflake offers four 9s.
I feel that if Microsoft were confident in Fabric’s reliability, they’d offer refunds when the service didn’t perform.
Steve Jones


