Saturday, July 19, 2014

Distributed Systems Reading Notes

Distributed System Model: A set of assumptions about the environment and in which a distributed system is implemented.
  1. Failure of Nodes
    -- Byzantine Fault Tolerance (nodes misbehaving in arbitrary fashion)
  1. Communication links reliability and when they may fail
    -- Network links are reliable and messages are never lost or delayed indefinitely.
  2. Assumptions about timing and ordering
    -- Synchronous System Model: Processes execute in lock step, there is an upper     bound on the message transmission delay and each process has an accurate clock.
    -- Asynchronous System Model: Processes execute at independent rate, no bound on message transmission delay and useful clocks do not exist.


Consensus Problem:
  1. Every correct process must agree of same value
  2. Every correct process must decide at most 1 value
  3. All processes eventually reach a decision
  4. If all correct processes propose the same value, then all correct processes agree on the same value.
Leader election, atomic broadcast, mutual exclusion
CAP Theorem:


CAP.png
FLP Theorem
If at most one node can fail by crashing, messages cannot be lost and there is no bound on message delay then you cannot achieve consensus.


Consistency Models:
  1. Strong
    -- Linearizable
    -- Sequential
  2. Weak
    -- Client centric
    -- Eventual


Replication:
  1. Synchronous
  2. Asynchronous

  1. Replication methods that prevent divergence (single copy)
    -- 1n messages (async primary/backup)
    -- 2n messages (sync primary/backup)
    -- 4n messages (2PC, multi paxos)
    -- 6n messages (3PC, Paxos with repeated leader election)
    -- Raft https://www.youtube.com/watch?v=06cTPhi-3_8
    -- Zk Atomic Broadcast (Atomic Broadcast)
  2. Replication methods that risk divergence -- R+W > N does not guarantee strong consistency. -- Read Repair, Merkle Trees, Gossip, Eventual Consistency -- eg. Cassandra, Riak, Voldermort