Vale
====

why this paper?
- exposure to new approaches and ideas in verification
- hard problems in verification
- integration of existing and new tools
- key technical idea: verifying a VC generator

meta about this paper
- this is a hard paper: it is written for a programming-language audience
  with more background than you probably have
- it's ok if you don't understand everything
- you'll get better at reading papers with practice

goals for this lecture
- understand F*, an ecosystem for building verified systems based on
  different ideas
- core contribution: efficient verification of a DSL
- extract some other lessons from the paper

case study/motivation for this work: verification of AES-GCM
- widely-used cryptography underlying web security
- HTTPS: the "S" is due to TLS
- TLS has handshake (setup) and record (bulk transfer) layers
- AES-GCM provides AEAD, "authenticated encryption with associated data":
  encrypt data + authenticate metadata
- spec is high-level math; code is tricky, optimized assembly
- performance matters: every byte transferred over HTTPS uses AES-GCM
- correctness matters: incorrect results or leaking secrets could
  compromise web security

why is the AES-GCM proof hard?
- AES-GCM mixes AES and GCM
- large gap between registers and spec (field operations)
- must use assembly for sufficient performance
- note the performance gap between verified C (0.27 MB/s), verified asm
  (991 MB/s), and unverified asm (6414 MB/s)
- asm implementations use instructions not available from C and must
  carefully move data between registers
- important to prove side-channel resistance

intro to F*
- ML-like functional language
- full dependent types (same as Coq)
- refinement types checked by Z3
- effects distinguish ghost code (only for proof) from code that
  extracts to C
- "serious language": F* is implemented in F*
- some syntax:
    val add64 : x:int64 -> y:int64 -> Pure int64
      (requires x + y < pow2_64)
      (ensures (fun z -> z = x + y))
    let add64 x y = x + y
- combines powerful types as in Coq with Z3 for refinement types
- when you write
    val add_pos : x:int{x > 0} -> y:int{y > 0} -> z:int{z > 0}
    let add_pos x y = x + y
  F* emits a _verification condition_ to check that
    x > 0 /\ y > 0 => x + y > 0
- as in Coq, you can also create inductive types
- also has an effect system; three interesting effects:
  - Pure for standard functional programming procedures
  - ST models C code; has operations for allocating and using pointers
  - Ghost (special case: Lemma) for procedures that will only be used
    for proofs
    -> when we extract code from ST, Ghost procedures will not run;
       F* makes sure that this doesn't affect behavior

embedding assembly into F* with Vale
- create a type `code` to represent assembly
- write down a `state` type for the state of the machine
- create a function `eval_code` to represent the semantics
- prove theorems
- at this point, we have the original Vale paper

verified VC generator
- goal: send simpler F* VCs to Z3
- plan:
  - use weakest preconditions as VCs
  - generate weakest preconditions in F* as an ordinary function
  - do a once-and-for-all "soundness" proof that the VC implies the code
    meets its spec
  - then use soundness to prove that user code meets the user's spec
- why does this help?
  - the VC generator produces nice formulas
  - it expands state and then simplifies within F*, rather than within a
    complicated Z3 model

in class: walk through a Coq re-implementation of this verified generator
https://github.com/mit-pdos/6.826-2019-labs/blob/vale/src/Lab0/VerifiedVC.v

other lessons from the paper
- use of untrusted tools is important to the overall story
- integrating Low* and Vale
  - want to call from Low* (C) into Vale (assembly)
  - challenges:
    - different memory models
    - calling conventions
    - proving side-channel resistance
  - idea: do a simulation proof that the Vale code manipulates the
    Low* heap
- layers of abstraction; we will see this many times in this class

as a reference, here are some PL terms from the paper explained in lecture:
- ghost
- weakest precondition
- normalization (reduction)
others:
- state, continuation monads
- effect
- higher-order abstract syntax
- shallow embedding
- well-founded
- fuel
- type-level computation
- typeclass
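The verified-VC-generator plan above (weakest preconditions computed by an ordinary recursive function over a deep embedding of the code, with soundness checked against `eval_code`) can be sketched in miniature. The following is a hypothetical Python model, not the paper's F* development: the paper proves soundness once and for all, while here we can only spot-check it on an example. Names like `Mov`, `Add`, and the two-register machine are illustrative.

```python
# Toy model of a verified VC generator:
# - Mov/Add/Seq deep-embed a tiny two-register assembly (type `code`)
# - a machine state is a dict from register names to ints
# - eval_code gives the operational semantics
# - wp computes a weakest precondition as an ordinary recursive function,
#   so the formula is already expanded/simplified before any solver sees
#   it (in F*, this expansion happens during normalization, before Z3)
from dataclasses import dataclass

@dataclass
class Mov:          # dst := constant
    dst: str
    val: int

@dataclass
class Add:          # dst := dst + src
    dst: str
    src: str

@dataclass
class Seq:          # c1 ; c2
    c1: object
    c2: object

def eval_code(c, s):
    """Run code c on state s; return the final state."""
    if isinstance(c, Mov):
        return {**s, c.dst: c.val}
    if isinstance(c, Add):
        return {**s, c.dst: s[c.dst] + s[c.src]}
    return eval_code(c.c2, eval_code(c.c1, s))

def wp(c, post):
    """Weakest precondition of c w.r.t. a predicate `post` on final
    states, computed by recursion over the code."""
    if isinstance(c, Mov):
        return lambda s: post({**s, c.dst: c.val})
    if isinstance(c, Add):
        return lambda s: post({**s, c.dst: s[c.dst] + s[c.src]})
    return wp(c.c1, wp(c.c2, post))

# Example program: r0 := 1; r1 := 2; r0 := r0 + r1,
# with the spec "r0 = 3 in the final state".
prog = Seq(Mov("r0", 1), Seq(Mov("r1", 2), Add("r0", "r1")))
post = lambda s: s["r0"] == 3
s0 = {"r0": 0, "r1": 0}

# Soundness says: if the VC holds in the initial state, the code meets
# its spec. In F* this is proved once and for all; here we spot-check it.
assert wp(prog, post)(s0)
assert post(eval_code(prog, s0))
print("ok")
```

In the paper, `wp` is an F* function whose application is unfolded by the normalizer, so Z3 receives the already-simplified formula rather than having to reason through the machine model itself.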