Recap: simulation proofs for crash safety

  abstraction relation connects code state to spec state
  init establishes absR to begin
  other operations preserve it
  on crash, prove absR to either the old or the new state (ops are atomic)

Lab 3

  append is atomic on crash
  how: append_helper updates new disk blocks but doesn't change the abstract
    state; append then _atomically_ incorporates the changes by updating the
    header
  two proof strategies:
  (1) prove the helper goes from a pre of log_abstraction(state, log) to a
      recovered of log_abstraction(state', log) ("the abstraction relation
      has not changed"; more formally, the code state is related to the same
      abstract state)
  (2) prove the helper only modifies new disk blocks, and in append_ok use
      this to conclude the abstraction relation to the old log

Lab 4: replicated disk

  base API: two disks with read/write/size
  twist: either disk can fail at any time (think: yank out a hard drive)
  goal: implement a single disk
    failures should not be visible
    if the computer crashes, writes are atomic (think: power failure)

  two challenges:
    concurrency from disk failures
    need recovery from crashes

how do we handle concurrency?

  concurrency means some things are changing in the background
  illustrate the problem with a read specification:

    read(i, a)
      pre: true
      post(r, state, state'):
        disk_failure(state, state') /\
        match disk_i state' with
        | Some d -> r = Working (diskGet d a)
        | None -> r = Failed

  all proofs have to carry information from state to state', especially
  about the other disk

  what can we rely on over disk_failure?
    if a disk doesn't fail, it stays the same
    if a disk has failed, it stays failed

  we address preserving facts about the other disk with maybe_holds and a
  "frame" F (a Coq sketch of maybe_holds and two_disks_are appears at the
  end of these notes)

  framed spec:

    read(d0, a)
      pre: two_disks_are state (eq d) F
      post(r, state, state'):
        match r with
        | Working b -> diskGet d a =?= b /\ two_disks_are state' (eq d) F
        | Failed -> two_disks_are state' missing F

how do we implement this plan in Coq?

  need ghost variables to express the d and F parts of this spec
  ghost variables are parameters to a spec which aren't part of the code
  the proof supplies them, to chain from one function to the next

also in this lab: recovery

what does replicated disk recovery do?

  disks can become out-of-sync on a crash during write
  why is this a problem?
    disk starts out all zero
    write 1: the write reaches the first disk, then we crash before it
      reaches the second
    now read returns 1 (served from the first disk)
    first disk fails
    read now returns 0 (served from the second disk)
      -> not possible with a single disk
  solution: on crash, copy from the first disk to the second
    only need to copy one sector: maintain the invariant, even on crash,
      that at most one sector is out-of-sync
    recovery fixes addresses one by one (see the fixup sketch at the end)

recovery as part of the refinement diagram

  every function + crash + recovery is another transition of the system
  spec: pre + recovered of a top-level method
  code: the implementation with crash + recovery
  same diagram, but what is the code transition?
    a crash during the code, followed by running recovery
    note that in the middle, the abstraction relation does not need to hold

what does running recovery do?

  want to include crashes during recovery:
    repeatedly run recovery with crashes
    finally recovery runs to completion
  intuition: recovery should be safe to re-run on crash
    recovery with some crashes should be the same as just running it once
    (idempotence)

  theorem: if recovery satisfies a specification, and the specification is
  idempotent, then recovery satisfies the same specification even when
  re-run after every crash (see the idempotence sketch below)

proving Hoare logic specs with recovery

  for write:
    write's crash condition -> recovery pre -> recovery post -> write's
    recovered condition (atomic)
  this chain works because recovery's own crash condition implies recovery's
  pre (idempotence)
  internally, recovery has a loop
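
sketch: maybe_holds and two_disks_are in Coq

  an illustrative, self-contained version of the frame machinery above; the
  lab's actual definitions live in its framework and differ in detail, so
  treat this as a sketch, not the real API:

    (* a predicate on an optional disk: if the disk is there it must
       satisfy p; if it has failed there is nothing to prove *)
    Definition maybe_holds {T} (p : T -> Prop) (mt : option T) : Prop :=
      match mt with
      | Some t => p t
      | None => True
      end.

    (* satisfied by no disk, so it forces the disk to be missing *)
    Definition missing {T} : T -> Prop := fun _ => False.

    (* hypothetical two-disk state: either disk may have failed *)
    Record TwoDiskState {disk : Type} := mkTDS
      { disk0 : option disk;
        disk1 : option disk }.

    Definition two_disks_are {disk : Type} (state : @TwoDiskState disk)
               (p0 p1 : disk -> Prop) : Prop :=
      maybe_holds p0 (disk0 state) /\ maybe_holds p1 (disk1 state).

    (* the framed read pre from above: disk 0 is exactly d, and whatever
       frame F we know about disk 1 is carried through *)
    Definition read_pre {disk : Type} (state : @TwoDiskState disk)
               (d : disk) (F : disk -> Prop) : Prop :=
      two_disks_are state (eq d) F.

  note how (eq d) pins down one disk exactly while F only constrains the
  other if it is still there; on failure, maybe_holds makes the obligation
  vacuous, which is "if a disk has failed, it stays failed" in
  proof-friendly form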
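
sketch: recovery's fixup as a pure model

  a pure Coq model of "recovery fixes addresses one by one", assuming disks
  are lists of blocks and relying on the invariant that at most one address
  is out-of-sync; this is hypothetical model code, not the lab's fixup:

    Require Import List.
    Import ListNotations.

    Definition block := nat.

    (* walk both disks in lockstep; at the first mismatch, copy disk 0's
       block over disk 1's and stop -- the invariant says there is at most
       one such address *)
    Fixpoint fixup (d0 d1 : list block) : list block :=
      match d0, d1 with
      | [], _ => d1
      | _, [] => []
      | b0 :: d0', b1 :: d1' =>
          if Nat.eqb b0 b1
          then b1 :: fixup d0' d1'
          else b0 :: d1'
      end.

    (* re-running fixup after it completes changes nothing: the model-level
       intuition behind idempotence *)
    Example fixup_rerun :
      fixup [1; 2; 3] (fixup [1; 2; 3] [1; 0; 3]) = [1; 2; 3].
    Proof. reflexivity. Qed.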
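
sketch: idempotence as a Coq statement

  a simplified statement of the idempotence condition from the theorem
  above; the lab's specification type also carries ghost variables and
  return values, so treat this as an approximation:

    (* a recovery spec over an abstract State: a precondition, the states a
       crash can leave behind, and the states recovery may end in *)
    Record RecSpec {State : Type} := mkRecSpec
      { pre : State -> Prop;
        crashed : State -> State -> Prop;
        recovered : State -> State -> Prop }.

    (* idempotent: a crash mid-recovery re-establishes the precondition,
       and recovering from the crashed state still counts as recovering
       from the original state *)
    Definition idempotent {State : Type} (s : @RecSpec State) : Prop :=
      forall st st',
        pre s st ->
        crashed s st st' ->
        pre s st' /\
        (forall st'', recovered s st' st'' -> recovered s st st'').

  this is what lets the proof chain "recovery crash -> recovery pre" go
  through: after any crash, recovery can simply start over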