Lab 4: Replicated disk

This lab will introduce you to reasoning about a limited form of concurrency (in the form of one of the disks failing in the background), as well as to reasoning about crash recovery.

In this lab assignment, you will implement a disk replication system. The API you'll be using provides access to two disks. It allows a single disk to fail at any time (this failure is fail-stop, so you don't need to handle restoring a disk). You'll implement a single-disk API by replicating over the two disks, which should hide failure of either disk.

There are two main challenges in this lab:

  1. Concurrency. A disk can fail at any time, which is a concurrent behavior you need to address.
  2. Recovery. In this lab you'll need a recovery procedure to fix up the disk after a crash. Your implementation (particularly for write) will need this recovery procedure to have the atomic crash behavior expected for the single-disk API.

The source code provided for this lab is as follows:

The other files in src/Lab4 implement Haskell extraction, if you want to run your replicated disk implementation with the help of make bin/replicate-nbd.

Submission for this lab is split into four parts:

Note the exercises in the assignment! We'll be going back and forth in the file, unlike in previous labs. Make sure to look for the labeled exercises for each handin and complete all of them. Specifications from earlier weeks will be important in later ones, so you should read the entire assignment and keep it in mind while working.

We expect that in later weeks you'll need to go back and fix your specifications as you start proving, so keep this in mind when assessing how much work each handin of the lab is.

Submit your work to the submission web site as in previous labs.

Questions or comments regarding 6.826? Contact us on Piazza or send e-mail to the 6.826 staff at 6826-staff@lists.csail.mit.edu.

Creative Commons License