L14: SibylFS Paper: "SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems" by Ridge et al. (SOSP'15) Source code: https://sibylfs.github.io/ Why this paper? Example of what you can do if you have a spec Testing many implementations Black box No proofs, but formal spec for POSIX There is no single reference implementation of POSIX Handles non-determinism Finds corner cases Influenced POSIX standard What is POSIX? A standard specifying informally the Unix interface Informally, one set of man pages for all Unix implementations Why is there a spec? There were many UNIX implementations, slightly different in the details Quite a few with closed source Pain in the neck for app developers How is it decided with is OK and what is incorrect? Experts working on different Unixes get together and try to reach agreement Result: POSIX specifies the agreed-on core Leaves room for implementations to be different Important standard How do people test file systems today? Run test cases and see what file systems does No standard test set for all file systems File systems behave differently Sometimes multiple outcomes are possible Several error codes are ok Sometimes file systems behave differently for same input Trailing / on pathnames What is right? What the POSIX standard says? Often too informal What popular applications expect? Overview of SibylFS test generator + hand-written tests script test scripts some setup file system commands sequence of file system commands to test behavior test executor runs scripts and produces traces traces are invocations (name, arguments, etc.) and return (name, results) Sibylfs spec abstract state plus operations on abstract model Sibylfs test run test script on model of file system match model against trace non-determinism: one state may transitions to several states output: checked traces if error, display error e.g., returned wrong error code What is SibylFS? Spec written in Lem abstract state impl of FS calls on that state An executable tester Returns if a trace is ok or not ok Challenge: non-determinism Spec shouldn't be deterministic Good spec doesn't nail down all implementation details E.g., spec shouldn't say what inode number stat should return Source of non-determinism in POSIX Several possible errors are ok Depends on order of checks in implementation which one is returned The number of bytes read It is allowed to be less than the number of bytes requested The order of dir entries returned by readdir The interleaving of systems calls by different processes Solution: next-state enumeration SibylFS computes all possible next states from a given state and API call for example, one for each possible error code special operator in model Avoid state explosion traces tells order don't need to try all possible interleavings Challenge: non-determinism in directory listing Several sources of non-determinism: POSIX doesn't specific order of directory listing Concurrent modification to directory Listing isn't atomic Another process may delete an entry while listing What is spec under concurrency modifications? if an entry is not modified it will be returned if an entry is deleted, it may be returned or not an entry may never be returned twice subsequent calls cannot return it again => SibylFS must track all changes to a directory since an opendir a set of "must" entries a set of "may" entries Specification is more complex than implementation! Challenge: non-determinism from concurrency Several processes may interact with the file system concurrently Interleaving at the granularity of calls No support for concurrency inside file system Supported by explicitly maintaining all possible file system states This could be avoided by asking the file system how it interleaved calls But that breaks black-box approach The SybilFS model labeled transition system transitions are API calls (invocation or return) separate to allow for interleaving of calls from different processes labels are call + arguments or call result + return values list of commands the state fid table group table pid table ... transition is a function from (state, label) to finite set of states some states are special correspond to POSIX undefined, unspecified, or implementation defined operations on the abstract state resolve pathnames perform operation (e.g., stat) trace = sequence of labels Test generation special-purpose tool ad-hoc algorithms to avoid generating too many tests manual equivalance partitioning based on the properties of the file system state special cases: read/write need to test sequences of calls manual 21,000 tests measure test coverage of model (98%) Results Good news: Ignoring errors, permissions, behavior is similar File systems get the common behavior right But many corner cases with problems/errors Problems in POSIX specification error clauses are out of sync e.g., link, mkdir, and open Defects OS X: pwrite(fd, -10, ...) -10 becomes big integer, oops sends SIGXFSZ to process, typically killing it posixovl/VFAT failure to decrement hard link count on rename openZFS on OS X figure 8 sequence calling process consumes 100% of CPU and ignores signals Platform conventions Linux ignores off in pwrite(fd, off, ..) when fd is opened with O_APPEND Nasty for applications that platforms have their own conventions Invariant violations a call which returns an error should not change FS state FreeBSD: open(d, O_DIR|OC_CREAT, O_EXCL) -> ENOTDIR + newly created file Error codes Linux: unlink of directory should return EPERM But LSB says EISDIR OSX: rename("/", ..) -> EISDIR should return EBUSY or EINVAL Pathname resolution Linux: link /dir/ /f.txt/ -> EEXIST, but better to return ENOTDIR OSX: symlink(s1, d) symlink(s2, s1) readlink s2/ -> content of s1, but better to return EINVAL Limitations No concurrency inside file system No crashes Summary Specification for black-box testing of file systems Handles different types of non-determinism Finds problems in corner cases