Tag: Consensus

#70 2017-12-27 16 min
Building a Distributed Log from Scratch, Part 2: Data Replication
In part one of this series we introduced the idea of a message log, touched on why it’s useful, and discussed the storage mechanics behind it. In part two, we discuss data replication. We have our log. We know how to write data to it and read it back as well as how data is persisted. The caveat to this is, although we have a durable log, it’s a single point of failure (SPOF). If the machine where the log data is stored dies, we’re SOL. Recall that one of our three priorities with this system is high availability, so the question is how do we achieve high availability and fault tolerance?
#50 2016-01-01 21 min
From the Ground Up: Reasoning About Distributed Systems in the Real World
The rabbit hole is deep. Down and down it goes. Where it ends, nobody knows. But as we traverse it, patterns appear. They give us hope, they quell the fear. Distributed systems literature is abundant, but as a practitioner, I often find it difficult to know where to start or how to synthesize this knowledge without a more formal background. This is a non-academic’s attempt to provide a line of thought for rationalizing design decisions. This piece doesn’t necessarily contribute any new ideas but rather tries to provide a holistic framework by studying some influential existing ones. It includes references which provide a good starting point for thinking about distributed systems. Specifically, we look at a few formal results and slightly less formal design principles to provide a basis from which we can argue about system design.
#36 2015-03-25 7 min
You Cannot Have Exactly-Once Delivery
I’m often surprised that people continually have fundamental misconceptions about how distributed systems behave. I myself shared many of these misconceptions, so I try not to demean or dismiss but rather educate and enlighten, hopefully while sounding less preachy than that just did. I continue to learn only by following in the footsteps of others. In retrospect, it shouldn’t be surprising that folks buy into these fallacies as I once did, but it can be frustrating when trying to communicate certain design decisions and constraints.
#27 2014-11-01 1 min
From Mainframe to Microservice: An Introduction to Distributed Systems
I gave a talk at Iowa Code Camp this weekend on distributed systems. It was primarily an introduction to them, so it explored some core concepts at a high level. We looked at why distributed systems are difficult to build (right), the CAP theorem, consensus, scaling shared data and CRDTs. There was some interest in making the slides available online. I’m not sure how useful they are without narration, but here they are anyway for posterity.
#25 2014-09-24 6 min
Understanding Consensus
A classical problem presented within the field of distributed systems is the Two Generals Problem, a simpler relative of the Byzantine Generals Problem. In it, we observe two allied armies positioned on either side of a valley. Within the valley is a fortified city. Each army has a general with one acting as commander. Both armies must attack at the same time or face defeat by the city’s defenders. In order to come to an agreement on when to attack, messengers must be sent through the valley, risking capture by the city’s patrols. Consider the diagram below illustrating this problem.
