CSPEC ===== Why this paper? Coq proofs of correctness for concurrent code. Hard concurrency: not just locked regions. Lock-free operations on shared file system state. Implementation of a lock on an x86-TSO memory model. This paper came out of the last time we taught 6.826. CSPEC infrastructure initially built from 6.826 labs. Ideas came out of discussions of concurrency lectures. Mail server example. Directory structure. tmpdir user mailbox directory one email per file Figure 1. Figure 2. Careful lock-free reasoning: deliver() writes to temporary file before placing email in mailbox. deliver() calls to link() might race. pickup() opening and reading files after they were delivered. pickup()'s readdir() snapshots the contents of mailbox at some point. deletion also tricky (not shown in these figures) Overall approach. Low-level semantics of underlying platform: file system, x86-TSO, .. Correctness: trace inclusion. E.g., external events are receiving an email via SMTP, delivering via POP3. Structure: many layers. Each layer has its own operations and semantics for those operations. One operation from a given layer maps code (one or more ops) in lower layer. Correctness proofs between layers. Layers compose for a top-to-bottom proof. Full mail server written in terms of the top-level layer. Executable descripton of mail server operation. Talks about interacting with SMTP/POP3, as well as handling stored messages. [ Ref: https://github.com/mit-pdos/cspec/blob/master/src/Mail/MailServer.v ] Handling of stored messages is atomic: Deliver, Pickup, Delete. Compiled down to low-level file system ops, translating through all layers. SMTP/POP3 operations get replaced with unverified Haskell code. Their semantics say that SMTP/POP3 operations "produce" external traces. Captures server's interaction with the world. Top-to-bottom proof: resulting code produces subset of external traces. I.e., same calls to SMTP/POP3 as if stored message ops were atomic. Concurrency model at each layer: atomic ops. Threads can invoke only operations from a layer (plus local computation). Complete description of what each thread might do. Why multiple layers? Breaking up a large proof into multiple parts. But also important for helping reason about concurrency. Layer semantics. Defines what the operations are required to do. Defines what other threads are allowed to do. Changing semantics allows re-ordering operations. E.g., change semantics to limit what other threads do. E.g., change semantics to relax what our thread expects from op. Will look at examples later on. CSPEC patterns. Mover types. Retry loops. State abstraction. Protocol (invariant). Partitioning (sharding). Mover plan. History of execution at some layer: atomic ops. Want to argue that next-layer's ops are also atomic. Individual low-level op can be a left-mover, right-mover or non-mover. Can move right or left in its history and still have a valid history. Must be a mover with respect to any other operaton from other threads. Figure 10: definition of mover. Example of movers: acquire and release for a lock. Acquire is a right-mover. Release is a left-mover. Operations that access locked memory are both-movers (left and right). One-way implication: if interleaved execution happened, then can think of it as another execution with movers applied. Movers avoid the need for complex abstraction relation. Hard to define abstraction relation. AR needs to talk about PC values, pending threads, etc. Movers argue about trace inclusion rather than abstraction / simulation. Mover pattern. Right-movers, then zero or one non-movers, then left-movers. Zips together entire range into an atomic unit. State- and argument-dependent movers. pickup() reading a file after it was returned by readdir(). Left-movers that might never happen. Lock release() that never arrives in some execution. Nothing to move left; how can the trace be equivalent? Left-movers must be always-enabled and must produce no external events. Retry loop pattern. Loop body is either a noop or exits the loop (non-mover). Pattern proves that loop is equivalent to just running the loop once. Effectively, guess the one right time to invoke the loop body. Examples of movers from the mail server. Disjoint variables. tmp filenames based on PID. Operation by different users (disjoint mailboxes). Locks. Locking the mailbox for pickup vs delete. Delivery vs pickup. Once file is linked into the mailbox, cannot be modified by delivery. Pickup holding a lock ensures no concurrent delete. Mover patterns: successful link is a non-mover, and readdir is a non-mover. Changing semantics to enable movers. PID-based mailbox filenames: how to make readdir a mover? Other delivery processes might create files with different PIDs. Other delete processes might delete files with our PID. Need to re-define a restricted readdir to make it a mover. Restricted readdir returns a superset of files present with caller's PID. Results. Proved mail server, x86-TSO locking implementation. Mail server seems to scale pretty well. Haskell extracted version on par with Go. No crash safety for mail server. That will be the next paper. Proof effort. Hard to separate CMAIL from CSPEC. Both took 6 months to develop, after some initial rejected plans. Incremental effort: A day to implement deletion. A week to implement multiple users. x86-TSO might be a better measure: took Tej a few weeks? Much of the proof effort is in proving the framework's correctness. Key theorem is Figure 11. Other systems used similar ideas without proving it correct. I.e., no machine-checked proof that mover pattern implies trace inclusion. Subtle corner cases might be missing (left-mover issues). Summary. Formal reasoning about hard concurrency. Movers help with proof effort. Takes some creative massaging of semantics to apply movers.