Armada ====== Administrivia Lab 4 is out You have multiple options If you choose a non-default option, let us know soon! Why this paper? Formal verification of low-level concurrent programs. "Hard concurrency": not just holding locks. Lock-free access to variables. x86-TSO shared memory. State-machine-level reasoning. In contrast to language-level approaches (e.g., next lecture on Iris). Proof strategies for reasoning about concurrent code. Basics: invariant and state abstraction. Reduction (or "movers"). x86-TSO elimination. Weakening. Meta-strategy: break up proof into multiple levels. Big focus on automation. Armada overview. Armada language: their own language, like Dafny and C. Used for implementation (translated to C for compilation). Used for top-level specification and intermediate levels. Language vs state machine. Recall from last lecture: Reasoning about language is typically done with Hoare logic / WPs. Reasoning about state machine is typically done with invariants. State machine translator. Armada language -> state machine. Need to encode state of the program as the state of the state machine. Need to encode the code as transitions in the state machine. Similar to Butler's lecture: transition depends on PC, ... Correspondence between levels. Multiple possible proof strategies. Executions of code are a subset of executions of higher level (e.g., spec). Automated prover. Checks the necessary conditions for correctness. State machine translator outputs state encoding, transition encoding for prover. Prover in their case is Dafny, which ultimately builds on top of Z3. Diagram. Armada code -> [Code generator] -> C code Armada code -> [State machine translator] -> State machine definition. Proof strategy -> [Proof generator] -> Proof of state machine refinement. Why state machines? Tricky to automate WP / Hoare logic style reasoning. State machine reasoning is better for automation / brute force. Straightforward to figure out what needs to be checked: all possible interleavings. Example diagram: multiple threads, many possible state transition traces. "Just" a matter of making this efficient. Example: figure 9 from the paper. What does the high-level spec mean? Large atomic chunks. How to prove that these code sections are atomic? Approach: reduction / movers. "Right-mover" transition can move right in state transition trace. "Left-mover" transition can move left in state transition trace. What does it mean to move? Alternative trace (with some transition re-ordered) is a valid execution. Why does this help us reason about concurrency? Recall that the correctness goal is that the code is a subset of the spec. Movers show that all code interleavings are represented by a smaller set. Pattern: R R .. R N L L .. L Using movers in the figure 9 example. lock(&mutex) is a right-mover. condition, do_something are likely both-movers (holding lock). unlock(&mutex) is a left-mover. Proving atomicity using reduction / movers. Developer specifies the right- and left-mover regions. Armada considers all transitions in those regions. Tries to prove that each of those transitions is the appropriate kind of mover. What does it mean to prove a transition to be a mover? Suppose we want to prove transition T is a right-mover. Similar story for left-movers, with some technical complications. Consider state machine trace: T X s_1 ---> s_2 ---> s_3 For every X, we have to prove that this also works: X T s_1 ---> s_2' ---> s_3 We get to use the invariant to help us with this proof. E.g., invariant says if one thread holds lock, others cannot be holding. And threads can only execute steps between lock and unlock when holding. So, if one thread is executing locked step, other threads cannot do that. As long as all conflicting transitions require lock, this "works". Armada encodes this as: if NextState(NextState(s_1, step_i), step_j) = s_3, then NextState(NextState(s_1, step_j), step_i) = s_3 Encapsulated non-determinism: all non-deterministic choices go into step_i. Breakout/reading Q: what are all the kinds of non-determinism in this step? What goes into the step object? Thread ID So, step object identifies which thread will take a step Randomness in the program Uninitialized values Havoc data Randomness from the runtime New thread ID, if creating a thread New pointer, if allocating memory What if we have two threads being created at the same time? Wouldn't we swap thread IDs if the operations get re-ordered? Or, same for allocations: wouldn't the pointers be different? Crucial detail of Armada's execution model: thread IDs, pointers never reused. No need for existential quanitifier for intermediate state s_2'. Works great for automation: step is purely a function! With careful execution model, it's sensible to re-order step objects. Weakening strategy. TSP example: racy check before acquiring lock. Weakened code: arbitrary if statement. Might be good enough for some specs (which don't care about specific answer). Allows subsequent levels to ignore the racy memory read. What would it mean to prove the if-statement weakening strategy? Every transition possible in code is still possible in spec. Actually more sophisticated because there might be omitted transitions, etc. What if we wanted to prove something more about the TSP racing if statement? Figure 10. First prove that racy read is no less than the lowest value so far. Probably requires stating an invariant about pending writes to best_len. Then weaken by using "somehow" to pick some value >= ghost_best. No more racy read, just an over-approximation of the lowest value. TSO elimination strategy. Can pretend like the program has sequentially consistent memory, with locks. Requires developer to specify what lock protects the memory locations. Proof: invariant must ensure threads hold lock while accessing this memory, and that threads flush x86-TSO buffer by the time they release the lock. Multiple levels. Break up large proof into smaller parts. One strategy per level. Transform code/spec so that different strategies might apply. E.g., may need weakening before it's possible to apply reduction/movers. Extensibility. Proof strategies are not "trusted" to be correct. Must produce proofs that convince Dafny of correctness (trace inclusion). Correctness boils down to reasoning about all possible state transitions. So, state-machine model of Armada code is trusted (state machine transformer). But strategies for reasoning about possible state-machine transitions are not. Dafny checks proof that strategy is applied soundly. Does this work? Relatively simple "programs": 57, 29, 64, 70 lines of code. However, proving them requires sophisticated reasoning. Movers and other strategies seem sufficient to prove them. Automation seems to work OK: need to write 100's of lines of proof, but not 1000's of lines of proof. Active area of research. Armada pushes heavily on automation, and ends up with state machines. Next lecture on Iris is more on the language side. Both require developer cleverness to simplify reasoning about concurrency. Helpful code references: State machine states: LPlusState https://github.com/microsoft/Armada/blob/master/Source/Armada/AbstractProofGenerator.cs#L1567 Step object: Armada_TraceEntry https://github.com/microsoft/Armada/blob/master/Source/Armada/StateMachineTranslator.cs#L1135 Step object contains TID: https://github.com/microsoft/Armada/blob/master/Source/Armada/NextRoutine.cs#L243 Step object also contains other nondet based on source language: New thread ID, for a thread create statement: https://github.com/microsoft/Armada/blob/master/Source/Armada/ArmadaParser.cs#L1038 New pointer, for memory allocation: https://github.com/microsoft/Armada/blob/master/Source/Armada/ArmadaParser.cs#L1140 Garbage for havoc: https://github.com/microsoft/Armada/blob/master/Source/Armada/ArmadaParser.cs#L1431