GitHub Downtime

I didn’t notice any issues with GitHub, but others did. Most of my interaction is through the git protocol, so things tend to work fast, and I don’t have any database access. I rarely use Issues and the other parts of GitHub that were affected when GitHub had a MySQL cluster fail over. There’s a good write-up of the post-incident analysis that’s worth reading from a database perspective.

I’m not a big MySQL guy, only running an instance to power T-SQL Tuesday. The structure of a write primary and many read replicas that GitHub describes makes sense. It’s similar to what I’ve done in SQL Server, and certainly the idea of some quorum management, handled at GitHub with the Orchestrator software, is something that needs to be configured properly. Allan Hirt has talked about the complexities of quorum in large installations, and it’s not a simple thing to configure.
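To make the quorum idea concrete, here is a minimal sketch of a strict-majority check. It isn’t how Orchestrator or SQL Server decides anything; the function name `has_quorum` and its parameters are my own assumptions for illustration.

```python
def has_quorum(votes_for: int, total_voters: int) -> bool:
    """Return True when strictly more than half of the voters agree.

    Hypothetical helper for illustration only; real cluster managers
    add timeouts, witness nodes, and weighting on top of this rule.
    """
    if total_voters <= 0:
        raise ValueError("total_voters must be positive")
    # Integer division: 3 voters need 2 votes, 4 voters need 3 votes.
    return votes_for > total_voters // 2


# Two of three voters agree, so a failover decision can proceed.
print(has_quorum(2, 3))
# An even split of four voters is not a majority.
print(has_quorum(2, 4))
```

The even-split case is why people usually recommend an odd number of voters, or a witness, when they configure quorum.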

In reading about this, a couple of things strike me. First, the analysis talks about a degradation of service because East Coast applications had to send writes to West Coast database servers. There were some problems with the way the database servers were working, but it seems to me that some sort of application failover should be possible. If an application and its database can’t fail over separately without customer impact, then there should be some way to fail the applications over as well. Perhaps that isn’t practical here, but if you’re responsible for designing HA for the database, make sure you talk to the application people and test for issues.

The second thing for me is that somehow there was a period of time when writes were occurring on the East Coast system that weren’t sent to the West Coast. My ignorance of how this HA stuff works in MySQL prevents me from making a big deal of this, but this isn’t something that should happen. If the quorum moves the primary role to another node, it must stop writes to the first node. This could happen in SQL Server, but for me, this is the level of data loss I’d need to accept in my RPO.
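The “stop writes to the first node” rule is often called fencing. Below is a toy, in-process sketch of the ordering I have in mind: fence the old primary first, then promote the new one. The `Node` class and `fail_over` function are hypothetical names, and this says nothing about how MySQL or Orchestrator actually implement it.

```python
class Node:
    """A toy database node that only accepts writes while it is writable."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.writable = False
        self.rows: list[str] = []

    def write(self, row: str) -> None:
        if not self.writable:
            raise RuntimeError(f"{self.name} is fenced; write rejected")
        self.rows.append(row)


def fail_over(old_primary: Node, new_primary: Node) -> Node:
    """Move the primary role, fencing the old node before promoting the new one."""
    # Fence first, so no write can land on the old primary after this point.
    old_primary.writable = False
    new_primary.writable = True
    return new_primary


east = Node("east")
west = Node("west")
east.writable = True
east.write("before failover")

primary = fail_over(east, west)
primary.write("after failover")

try:
    east.write("stray write")
except RuntimeError as err:
    print(err)
```

In a real system the hard part is that the old primary may be unreachable when you try to fence it, which is where the window for unreplicated writes comes from.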

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.4MB) podcast or subscribe to the feed at iTunes and Libsyn.

This entry was posted in Editorial and tagged high availability.

Copyright Steve Jones 2018
