Protocol invariants
===================

Paper: "I4: Incremental Inference of Inductive Invariants for Verification
  of Distributed Protocols"

Why this paper?
  Verifying protocols is an important problem
  Hard even separately from verification of the software implementing the protocol
  This paper tackles the problem of coming up with an inductive invariant

Why is protocol reasoning difficult?
  Asynchronous network
    Messages can be delayed
    Messages can be dropped
    Messages can be duplicated
    Messages can be corrupted (this is easy to fix with known participants)
    Consequence: nodes see different sets of messages, in different order, ..
  Node failures
    Nodes crash, restart
    No indication of crashes to other participants
    When node restarts, re-joining the system can be tricky: node has old state
  Membership for open systems
    Who can participate in the system?
    Nodes misbehave in arbitrary ways: running malicious software, etc.
    "Byzantine"
  We won't talk about the open-system challenges here
    Easier problem: membership is known (configured set of good servers,
      maybe operator changes the configuration over time to add/remove servers)

Protocol reasoning: Ivy
  Ivy is an existing tool for checking protocols
  Similar in spirit to the TLA-style reasoning about the protocol in IronFleet
    IronFleet lecture was focused on how to connect code to protocol level
    I4 considers just the protocol level
    I4 assumes code correctly implements protocol description
    I4 focuses on coming up with inductive invariant
    .. should help prove things about the protocol (safety, refinement, etc)
       but that's outside of I4 or this paper
  Protocol description assumed trusted

What does an Ivy protocol look like?
  State-machine description, similar to previous state-machine approaches seen
  Figure 2, left side
    Description of what's in a state
      "Relations"
    Initial states
    Transitions between states
      Atomic actions in our protocol
      Should reflect operations done by a single node

Lock server example
  Server S
  Client C
  Server's lock state represented by semaphore(S)
  Client C holding a lock from server S represented by link(C, S)

What does correctness look like, for I4?
  Some safety predicate (invariant).
  Example for the lock server:
    forall C1, C2: client,
    forall S: server,
      link(C1, S) and link(C2, S) -> C1 = C2
  Ivy supports other kinds of specs too (like refinement).
  In principle, I4 should be helpful for any kind of spec.
  But paper's focus is just invariants; no state abstraction, refinement, etc.

Inductive invariant
  Nice way of thinking of trace inclusion we've been considering all along
  Universe of all possible states
  Code behavior: set of reachable states, defined by initial states + code execution
  Spec requirement: set of safe states, defined by spec predicate
  Goal: show code behavior is a subset of spec requirement

  Challenge: hard to reason about precise reachable states
    No concise predicate: only defined in terms of closure under transitions
    Reachable states are, by definition, inductive
    This paper has a combination of two ways of looking at reachable states

  Safe states are not "inductive": safe state can transition to unsafe state
    Not enough constraints on execution state

Proof strategy: find inductive invariant
  Stated as a concise state predicate
  Implies spec predicate (safe states)
  Contains all possible starting states
  Inductive (closed under the "step" relation)
  (By implication, this means it's a superset of reachable states.)

What does a proof look like in Ivy?
  Same basic story as we've seen many times:
  Invariant holds in starting states
  From any state satisfying invariant, every transition preserves invariant

Ivy demo
  ivy_check lec/ivy/lock_server.ivy

  strengthen:
    invariant [2] link(C, S) -> semaphore(S) = false

  ivy_check lec/ivy/lock_server.ivy

  add another action:
    action exchange(x:client,y:client,z:server) = {
      var a:bool;
      var b:bool;
      a := link(x,z);
      b := link(y,z);
      link(x,z) := b;
      link(y,z) := a;
    }
    export exchange

  ivy_check lec/ivy/lock_server.ivy

Main strength of Ivy: decidable logic
  States, transitions, invariants translated into Z3
  Restricted logic that can always be decided by Z3
    Avoid things like existential quantifiers, higher-order predicates, etc.
    See Ivy paper (PLDI 2016) for more details: EPR logic.
  Feed the formula to Z3, get an answer back.
    Either invariant is satisfied,
    or counter-example state that does not satisfy invariant.
  Modeling unbounded system: any number of servers, clients.
    Z3 is checking symbolic expressions, not enumerating all possible states.
    This is why, for Ivy, it's crucial to have an invariant.
    Otherwise, impossible for Ivy to efficiently determine reachable states.

Alternative plan: model checking
  Instead of specifying an invariant, explicitly explore all reachable states
  Requires bounding the state, so there's a finite number of states to explore
  E.g., one server, S, and two clients, C0 and C1.
  Bounded-state safety spec:
    link(C0,S) and link(C0,S) -> C0 = C0
    link(C0,S) and link(C1,S) -> C0 = C1
    link(C1,S) and link(C0,S) -> C1 = C0
    link(C1,S) and link(C1,S) -> C1 = C1
  Exploring reachable states:
    sem(S)  link(C0,S)  link(C1,S)
    true    false       false
    false   true        false
    false   false       true
  Safety spec holds for all reachable states of this bound

Clever trick: connect model-checking and invariant-based reachable states
  Prior work: AVR model-checking tool
    Generates inductive invariant but on finite state sizes
    In that prior work, inductive invariant made model checking more efficient
  I4: algorithm for guessing what an unbounded version of that invariant
    might look like
  Roughly, heuristics for guessing what underlying general case might
    have led to a finite inductive invariant produced by AVR
  Need sufficiently interesting finite inductive invariant to guess right

I4 workflow:
  Developer writes Ivy protocol definition and safety spec (invariant)
  Try to run Ivy, see if invariant is good enough
  When it's not good enough (Ivy fails), I4 cooks up a finite version
    I4 gives finite version to model checker (AVR), to get finite inductive inv.
    I4 generalizes inductive invariant to unbounded size
    I4 adds this new inductive invariant, sees if Ivy can make unbounded proof work
    If not, increase finite size and try again, or ask for developer help

Reading / breakout Q: would this work for concurrent software, like Armada?

Distributed protocols vs concurrent software
  Protocol description avoids "internal" details
    Assumes a state abstraction over the code-level memory contents
    Assumes code-level implementation correctly implements protocol
    Result: simpler state, transitions in protocol description
  Regular structure of nodes in a distributed system
    No dynamically created threads