Vale
====
why this paper?
exposure to new approaches and ideas in verification
hard problems in verification
integration of existing and new tools
key technical idea
verifying a VC generator
meta about this paper
this is a hard paper: it is written for a programming language
audience with more background than you probably have
it's ok if you don't understand everything
you'll get better at reading papers with practice
goals for this lecture
- understand F*, an ecosystem for building verified systems based on
different ideas
- core contribution: efficient verification of a DSL
- extract some other lessons from the paper
case study/motivation for this work: verification of AES-GCM
widely-used cryptography underlying web security
HTTPS -> "S" due to TLS
TLS has handshake (setup) and record (bulk transfer) layers
AES-GCM provides AEAD "authenticated encryption with associated data"
encrypt data + authenticate metadata
spec is high-level math; code is tricky, optimized assembly
performance matters: every byte transferred over HTTPS uses AES-GCM
correctness matters: incorrect results or leaking secrets could
compromise web security
why is the AES-GCM proof hard?
AES-GCM mixes AES and GCM
large gap between registers and spec (field operations)
must use assembly for sufficient performance
note the performance gap between verified C (0.27 MB/s),
verified asm (991 MB/s), and unverified asm (6414 MB/s)
asm implementations use instructions not available from C and
must carefully move data between registers
important to prove side-channel resistance
intro to F*
ML-like functional language
full dependent types (same as Coq)
refinement types checked by Z3
effects distinguish ghost code (used only in proofs) from code that extracts to C
"serious language" - F* is implemented in F*
some syntax:
val add64 : x:int64 -> y:int64 -> Pure int64
(requires x + y < pow2_64)
(ensures (λ z. z = x + y))
let add64 x y = x + y
combines powerful types as in Coq with Z3 for refinement types
when you write
val add_pos : x:int{x > 0} -> y:int{y > 0} -> z:int{z > 0}
let add_pos x y = x + y
F* emits a _verification condition_ to check that
x > 0 /\ y > 0 => x + y > 0
as in Coq, you can also create inductive types
also has an effect system; three interesting effects:
- Pure for standard functional programming procedures
- ST models C code; has operations for allocating and using pointers
- Ghost (special case: Lemma) for procedures that will only be used
for proofs
-> when we extract code from ST, Ghost procedures will not run; F*
makes sure that this doesn't affect behavior
embedding assembly into F* with Vale
- create a type `code` to represent assembly
- write down `state` for state of machine
- create a function `eval_code` to represent semantics
- prove theorems
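the recipe above (a `code` datatype, a machine `state`, and an
`eval_code` interpreter giving the semantics) can be sketched as a toy
model in Python; the paper does this in F*, and the two-register
machine and instruction names here are made up for illustration:

```python
from dataclasses import dataclass

# the `code` type: a tiny inductive datatype of "assembly" programs
@dataclass
class Mov:
    dst: str
    src: str

@dataclass
class Add:
    dst: str
    src: str

@dataclass
class Seq:
    first: object
    second: object

# `state` is a map from register names to integer values

def eval_code(c, s):
    """the semantics: run a piece of code on a machine state"""
    if isinstance(c, Mov):
        return {**s, c.dst: s[c.src]}
    if isinstance(c, Add):
        return {**s, c.dst: s[c.dst] + s[c.src]}
    if isinstance(c, Seq):
        return eval_code(c.second, eval_code(c.first, s))
    raise ValueError("unknown instruction")

# example: rax := rbx; rax := rax + rbx  -- doubles rbx into rax
prog = Seq(Mov("rax", "rbx"), Add("rax", "rbx"))
print(eval_code(prog, {"rax": 0, "rbx": 21}))  # {'rax': 42, 'rbx': 21}
```

with this in hand, "prove theorems" means proving facts about
eval_code applied to particular `code` values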
at this point, we have the original Vale paper
verified VC generator
goal: have F* send simpler VCs to Z3
plan:
- use weakest preconditions as VCs
- generate weakest preconditions in F* as an ordinary function
- prove once and for all ("soundness") that the VC implies the code
  meets its spec
- use soundness to prove that user code meets the user's spec
why does this help?
the VC generator produces formulas that are easy for Z3
expands state and then simplifies within F*, rather than within a
complicated Z3 model
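given a deep embedding of this shape (a `code` datatype plus an
`eval_code` interpreter), the WP generator is just an ordinary
recursive function from code and a postcondition to a precondition;
a self-contained toy Python sketch follows (in the paper, wp builds a
logical formula that Z3 checks, and soundness is a theorem; here, for
illustration only, predicates are Python functions on states and
soundness is merely spot-checked):

```python
from dataclasses import dataclass

# toy `code` type and machine state (register names -> ints)
@dataclass
class Mov:
    dst: str
    src: str

@dataclass
class Add:
    dst: str
    src: str

@dataclass
class Seq:
    first: object
    second: object

def eval_code(c, s):
    if isinstance(c, Mov):
        return {**s, c.dst: s[c.src]}
    if isinstance(c, Add):
        return {**s, c.dst: s[c.dst] + s[c.src]}
    if isinstance(c, Seq):
        return eval_code(c.second, eval_code(c.first, s))

# wp(c, post) computes the weakest precondition: the predicate on the
# initial state that guarantees `post` holds on the final state
def wp(c, post):
    if isinstance(c, Mov):
        # substitute the effect of the move into the postcondition
        return lambda s: post({**s, c.dst: s[c.src]})
    if isinstance(c, Add):
        return lambda s: post({**s, c.dst: s[c.dst] + s[c.src]})
    if isinstance(c, Seq):
        # wp of a sequence composes the wp's of the pieces
        return wp(c.first, wp(c.second, post))

# soundness (a once-and-for-all theorem in the paper; spot-checked here):
# wp(c, post)(s) implies post(eval_code(c, s))
prog = Seq(Mov("rax", "rbx"), Add("rax", "rbx"))
post = lambda s: s["rax"] == 2 * s["rbx"]
s0 = {"rax": 0, "rbx": 21}
assert wp(prog, post)(s0)           # the VC holds on this initial state
assert post(eval_code(prog, s0))    # and the code indeed meets the spec
```

note that wp works by substitution, case by case on the syntax of the
code, which is why the resulting conditions stay small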
in class: walk through Coq re-implementation of this verified generator
https://github.com/mit-pdos/6.826-2019-labs/blob/vale/src/Lab0/VerifiedVC.v
other lessons from the paper
use of untrusted tools
important to overall story: integrating Low* and Vale
want to call from Low* (C) into Vale (assembly)
challenges:
- different memory models
- calling conventions
- prove side-channel resistance
idea: do a simulation proof that Vale code manipulates Low* heap
layers of abstractions
we will see this many times in this class
as a reference, here are some PL terms from the paper
explained in lecture
ghost
weakest precondition
normalization (reduction)
others
state, continuation monads
effect
higher-order abstract syntax
shallow embedding
well-founded
fuel
type-level computation
typeclass