Protocol invariants
===================
Paper: "I4: Incremental Inference of Inductive Invariants for Verification
of Distributed Protocols"
Why this paper?
Verifying protocols is an important problem
Hard even separately from verification of the software implementing the protocol
This paper tackles the problem of coming up with an inductive invariant
Why is protocol reasoning difficult?
Asynchronous network
Messages can be delayed
Messages can be dropped
Messages can be duplicated
Messages can be corrupted (this is easy to fix with known participants)
Consequence: nodes see different sets of messages, in different order, ..
Node failures
Nodes crash, restart
No indication of crashes to other participants
When node restarts, re-joining the system can be tricky: node has old state
Membership for open systems
Who can participate in the system?
Nodes misbehave in arbitrary ways: running malicious software, etc.
"Byzantine"
We won't talk about the open-system challenges here
Easier problem: membership is known (configured set of good servers,
maybe operator changes the configuration over time to add/remove servers)
Protocol reasoning: Ivy
Ivy is an existing tool for checking protocols
Similar in spirit to the TLA-style reasoning about the protocol in IronFleet
IronFleet lecture was focused on how to connect code to protocol level
I4 considers just the protocol level
I4 assumes code correctly implements protocol description
I4 focuses on coming up with inductive invariant
.. should help prove things about the protocol (safety, refinement, etc)
but that's outside of I4 or this paper
Protocol description assumed trusted
What does an Ivy protocol look like?
State-machine description, similar to previous state-machine approaches seen
Figure 2, left side
Description of what's in a state
"Relations"
Initial states
Transitions between states
Atomic actions in our protocol
Should reflect operations done by a single node
Lock server example
Server S
Client C
Server's lock state represented by semaphore(S)
Client C holding a lock from server S represented by link(C, S)
What does correctness look like, for I4?
Some safety predicate (invariant).
Example for the lock server:
forall C1, C2: client,
forall S: server,
link(C1, S) and link(C2, S) -> C1 = C2
Ivy supports other kinds of specs too (like refinement).
In principle, I4 should be helpful for any kind of spec.
But paper's focus is just invariants; no state abstraction, refinement, etc.
Inductive invariant
Nice way of thinking of trace inclusion we've been considering all along
Universe of all possible states
Code behavior: set of reachable states, defined by initial states + code execution
Spec requirement: set of safe states, defined by spec predicate
Goal: show code behavior is a subset of spec requirement
Challenge: hard to reason about precise reachable states
No concise predicate: only defined in terms of closure under transitions
Reachable states are, by definition, inductive
This paper has a combination of two ways of looking at reachable states
Safe states are not "inductive": safe state can transition to unsafe state
Not enough constraints on execution state
Proof strategy: find inductive invariant
Stated as a concise state predicate
Implies spec predicate (safe states)
Contains all possible starting states
Inductive (closed under the "step" relation)
(By implication, this means it's a superset of reachable states.)
What does a proof look like in Ivy?
Same basic story as we've seen many times:
Invariant holds in starting states
From any state satisfying invariant, every transition preserves invariant
Ivy demo
ivy_check lec/ivy/lock_server.ivy
strengthen:
invariant [2] link(C, S) -> semaphore(S) = false
ivy_check lec/ivy/lock_server.ivy
add another action:
action exchange(x:client,y:client,z:server) = {
var a:bool;
var b:bool;
a := link(x,z);
b := link(y,z);
link(x,z) := b;
link(y,z) := a;
}
export exchange
ivy_check lec/ivy/lock_server.ivy
Main strength of Ivy: decidable logic
States, transitions, invariants translated into Z3
Restricted logic that can always be decided by Z3
Avoid things like existential quantifiers, higher-order predicates, etc.
See Ivy paper (PLDI 2016) for more details: EPR logic.
Feed the formula to Z3, get an answer back.
Either invariant is satisfied,
or counter-example state that does not satisfy invariant.
Modeling unbounded system: any number of servers, clients.
Z3 is checking symbolic expressions, not enumerating all possible states.
This is why, for Ivy, it's crucial to have an invariant.
Otherwise, impossible for Ivy to efficiently determine reachable states.
Alternative plan: model checking
Instead of specifying an invariant, explicitly explore all reachable states
Requires bounding the state, so there's a finite number of states to explore
E.g., one server, S, and two clients, C0 and C1.
Bounded-state safety spec:
link(C0,S) and link(C0,S) -> C0 = C0
link(C0,S) and link(C1,S) -> C0 = C1
link(C1,S) and link(C0,S) -> C1 = C0
link(C1,S) and link(C1,S) -> C1 = C1
Exploring reachable states:
sem(S) link(C0,S) link(C1,S)
true false false
false true false
false false true
Safety spec holds for all reachable states of this bound
Clever trick: connect model-checking and invariant-based reachable states
Prior work: AVR model-checking tool
Generates inductive invariant but on finite state sizes
In that prior work, inductive invariant made model checking more efficient
I4: algorithm for guessing what an unbounded version of that invariant
might look like
Roughly, heuristics for guessing what underlying general case might
have led to a finite inductive invariant produced by AVR
Need sufficiently interesting finite inductive invariant to guess right
I4 workflow:
Developer writes Ivy protocol definition and safety spec (invariant)
Try to run Ivy, see if invariant is good enough
When it's not good enough (Ivy fails), I4 cooks up a finite version
I4 gives finite version to model checker (AVR), to get finite inductive inv.
I4 generalizes inductive invariant to unbounded size
I4 adds this new inductive invariant, sees if Ivy can make unbounded proof work
If not, increase finite size and try again, or ask for developer help
Reading / breakout Q: would this work for concurrent software, like Armada?
Distributed protocols vs concurrent software
Protocol description avoids "internal" details
Assumes a state abstraction over the code-level memory contents
Assumes code-level implementation correctly implements protocol
Result: simpler state, transitions in protocol description
Regular structure of nodes in a distributed system
No dynamically created threads