Vale
====

why this paper?
- exposure to new approaches and ideas in verification
- hard problems in verification
- integration of existing and new tools
- key technical idea: verifying a VC generator

meta about this paper
- this is a hard paper: it is written for a programming-language audience
  with more background than you probably have
- it's ok if you don't understand everything
- you'll get better at reading papers with practice

goals for this lecture
- understand F*, an ecosystem for building verified systems based on
  different ideas
- core contribution: efficient verification of a DSL
- extract some other lessons from the paper

case study/motivation for this work: verification of AES-GCM
- widely-used cryptography underlying web security
- HTTPS: the "S" is due to TLS
- TLS has handshake (setup) and record (bulk transfer) layers
- AES-GCM provides AEAD, "authenticated encryption with associated data":
  encrypt data + authenticate metadata
- spec is high-level math; code is tricky, optimized assembly
- performance matters: every byte transferred over HTTPS uses AES-GCM
- correctness matters: incorrect results or leaking secrets could
  compromise web security

why is the AES-GCM proof hard?
- AES-GCM mixes AES and GCM
- large gap between registers and spec (field operations)
- must use assembly for sufficient performance
- note the performance gap between verified C (0.27 MB/s), verified asm
  (991 MB/s), and unverified asm (6414 MB/s)
- asm implementations use instructions not available from C and must
  carefully move data between registers
- important to prove side-channel resistance

intro to F*
- ML-like functional language
- full dependent types (same as Coq)
- refinement types checked by Z3
- effects distinguish ghost code (only for proof) from code that
  extracts to C
- "serious language": F* is implemented in F*
- some syntax:
    val add64 : x:int64 -> y:int64 -> Pure int64
      (requires x + y < pow2_64)
      (ensures (fun z -> z = x + y))
    let add64 x y = x + y
- combines powerful types as in Coq with Z3 for refinement types
- when you write
    val add_pos : x:int{x > 0} -> y:int{y > 0} -> z:int{z > 0}
    let add_pos x y = x + y
  F* emits a _verification condition_ to check that
    x > 0 /\ y > 0 => x + y > 0
- as in Coq, you can also create inductive types
- also has an effect system; three interesting effects:
  - Pure for standard functional programming procedures
  - ST models C code; has operations for allocating and using pointers
  - Ghost (special case: Lemma) for procedures that will only be used
    for proofs
    -> when we extract code from ST, Ghost procedures will not run;
       F* makes sure that this doesn't affect behavior

embedding assembly into F* with Vale
- create a type `code` to represent assembly
- write down a `state` type for the state of the machine
- create a function `eval_code` to represent the semantics
- prove theorems
- at this point, we have the original Vale paper

verified VC generator
- goal: send simpler F* VCs to Z3
- plan:
  - use weakest preconditions as VCs
  - generate weakest preconditions in F* as an ordinary function
  - do a once-and-for-all "soundness" proof that the VC implies the code
    meets its spec
  - then use soundness to prove that user code meets the user's spec
- why does this help?
  - the VC generator produces nice formulas
  - it expands state and then simplifies within F*, rather than within a
    complicated Z3 model

in class: walk through a Coq re-implementation of this verified generator
https://github.com/mit-pdos/6.826-2019-labs/blob/vale/src/Lab0/VerifiedVC.v

other lessons from the paper
- use of untrusted tools is important to the overall story
- integrating Low* and Vale
  - want to call from Low* (C) into Vale (assembly)
  - challenges:
    - different memory models
    - calling conventions
    - proving side-channel resistance
  - idea: do a simulation proof that the Vale code manipulates the
    Low* heap
- layers of abstraction; we will see this many times in this class

as a reference, here are some PL terms from the paper explained in lecture:
- ghost
- weakest precondition
- normalization (reduction)
others:
- state, continuation monads
- effect
- higher-order abstract syntax
- shallow embedding
- well-founded
- fuel
- type-level computation
- typeclass
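The verified-VC-generator plan above (weakest preconditions computed by an ordinary recursive function over a deep embedding of the code, with soundness checked against `eval_code`) can be sketched in miniature. The following is a hypothetical Python model, not the paper's F* development: the paper proves soundness once and for all, while here we can only spot-check it on an example. Names like `Mov`, `Add`, and the two-register machine are illustrative.

```python
# Toy model of a verified VC generator:
# - Mov/Add/Seq deep-embed a tiny two-register assembly (type `code`)
# - a machine state is a dict from register names to ints
# - eval_code gives the operational semantics
# - wp computes a weakest precondition as an ordinary recursive function,
#   so the formula is already expanded/simplified before any solver sees
#   it (in F*, this expansion happens during normalization, before Z3)
from dataclasses import dataclass

@dataclass
class Mov:          # dst := constant
    dst: str
    val: int

@dataclass
class Add:          # dst := dst + src
    dst: str
    src: str

@dataclass
class Seq:          # c1 ; c2
    c1: object
    c2: object

def eval_code(c, s):
    """Run code c on state s; return the final state."""
    if isinstance(c, Mov):
        return {**s, c.dst: c.val}
    if isinstance(c, Add):
        return {**s, c.dst: s[c.dst] + s[c.src]}
    return eval_code(c.c2, eval_code(c.c1, s))

def wp(c, post):
    """Weakest precondition of c w.r.t. a predicate `post` on final
    states, computed by recursion over the code."""
    if isinstance(c, Mov):
        return lambda s: post({**s, c.dst: c.val})
    if isinstance(c, Add):
        return lambda s: post({**s, c.dst: s[c.dst] + s[c.src]})
    return wp(c.c1, wp(c.c2, post))

# Example program: r0 := 1; r1 := 2; r0 := r0 + r1,
# with the spec "r0 = 3 in the final state".
prog = Seq(Mov("r0", 1), Seq(Mov("r1", 2), Add("r0", "r1")))
post = lambda s: s["r0"] == 3
s0 = {"r0": 0, "r1": 0}

# Soundness says: if the VC holds in the initial state, the code meets
# its spec. In F* this is proved once and for all; here we spot-check it.
assert wp(prog, post)(s0)
assert post(eval_code(prog, s0))
print("ok")
```

In the paper, `wp` is an F* function whose application is unfolded by the normalizer, so Z3 receives the already-simplified formula rather than having to reason through the machine model itself.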