Automated Theorem Proving in Austin TX: 2007

Tuesday, December 18, 2007

MaceMC: Model Checking for the Systems Guys

A couple of months ago, Prof. Qiu directed me to a very cool paper about a model checker that won "Best Paper" in NSDI 2007. The NSDI '08 program committee met here in Austin yesterday to go through papers, and were kind enough to stick around and give an all-day workshop on their own research today. Amin Vahdat was a member of the program committee as well as being an author on the paper, so he gave a little presentation on it. (He also got into a bit of a tiff w/ Mothy Roscoe from ETH Zurich over whether his approach w/ Mace was right or not, but that's neither here nor there...)

The cool thing about the paper is that it introduces a model checker for a relevant class of liveness properties, rather than the normal safety properties that any old model checker off the street handles. Safety properties are essentially those properties that say something bad never happens. For example, you may have as a safety property that a password is never stored in plaintext. Liveness properties on the other hand state that something good will eventually happen. An example of a liveness property is that my waiter will eventually bring me my food, though it may take unbearably long (a safety property is that he never spits in it!).

Model checking, in a nutshell, is a way of systematically exploring the state space of a program and checking whether it meets a certain specification (this is a very rough nutshell!). In theory, for finite state machines, you can exhaustively test the state space and verify that it does in fact have the specific safety properties you desire by showing that the properties hold in every reachable state. In practice, this is intractable because of the "State Explosion" problem which results in a few more states than you would like (think combinatorially many more!). The nice thing about model checking safety properties (in an FSM) is that it is decidable; if you run the model checker long enough, it will either prove that the properties are met, or it will give you a counter example (a state that violates the property). On the other hand, liveness properties rarely have counter examples. In general, a counter example for a liveness property is an infinitely long trace where the liveness requirement is never met. Kind of hard to generate those buggers... In certain cases, for example loops in a deterministic system, counter examples can be shown, but this doesn't hold in general.

The guys in the paper (@ UCSD) have a special language (Mace) that they use for developing distributed systems, and this is the language that their model checker verifies. So, unlike many model checkers which are used to verify the correctness of little "toy" models written in some logic, this works directly upon the implementation itself and can find real-world bugs.

The Paper

Saturday, September 29, 2007

Formal Methods at Microsoft

The other night (coincidentally right before an initial interview with Microsoft) I ran across the Static Driver Verification project in the Windows Group at Microsoft. SDV is a tool that integrates a home grown model checker (SLAM) for drivers with a whole suite of specifications for Windows HW drivers. This is an amazing example of the potential benefits of formal verification in software when an appropriate focus is put on utility as well as theoretical interest.

The Microsoft Research group that produced SDV also wrote a tech report on their success in a technology transfer from a research project to an actual software tool usable by real developers both inside and outside of MS.

It makes me wonder whether or not a similar project could be undertaken for Linux. It seems that there would be a significant barrier from the severe lack of stable driver APIs, requiring a constant update of the formal specifications. Whatever spec language used would have to be easily maintainable by regular kernel developers as they update the interfaces in each release.

Friday, September 21, 2007

Slow to start

It seems this blog has been abandoned. There weren't any posts this summer as I was working at IBM in the Linux Technology Center on a project totally unrelated to ATP (we used Xen + libvirt; awesome stuff!).

But, now that the year has started up again, I've found myself employed by a large Austin company to integrate the ACL2 system into their existing product. Yeah! I'm not sure how much of the actual project I can disclose, but I'll try to put what I learn about ACL2 here as I go along.

Sunday, April 8, 2007

Counterexamples in ACL2

One of the cool features missing in ACL2 is the ability to create counterexamples to a claim upon a failed proof attempt. This isn't always possible of course¹, but it would still be a useful feature, since the existence of a counterexample would let users know that the attempt failed because the proposition wasn't a theorem, not because of a lack of capability in ACL2.

A failed search for counterexamples might also help ACL2 recognize why a proposition is in fact probably true, and find a new approach to proving it. This is pretty unlikely, but it could be a useful step in helping the ATP get over the problem of emulating human proof techniques. When we're stuck finding a proof for a theorem, and we suspect that the conjecture is false, we might try to find a counterexample. As we search for a counterexample, we may discover that a certain tact we pick is fruitless because a subtle condition of the conjecture that we missed covers this case. This newfound insight can help us find a new proof by observing some previously overlooked structure in the problem.

Some interesting work in this realm has been done in the form of QuickCheck, and its Lisp child ClickCheck. There is also a technical report on this topic by Myung Won Kim that I haven't made my way through yet.

1 Consider if a program were always able to construct a counterexample to every false claim. Then theorem proving would be decidable, since we could tell a whether a conjecture was a theorem by whether or not we can construct a counterexample to it. But it is in fact undecidable, so we have a contradiction.

Thesis Idea

The immediate point of my foray into ATP is for my Bachelors Thesis. My current working idea is to employ ACL2 to prove theorems about Haskell programs. Proving theorems about any Haskell program would be far too difficult a project for something to be completed within 2 years (I graduate in Spring '09), so at the moment I am evaluating what subset of Haskell I wish to work on.

I'd like to have a subset that captures at least SOME of the interesting properties of Haskell. Some of the features I'm considering include:

Infinite objects
Lazy evaluation (specifically call-by-need)
Static typing
Type inference
Currying
Pattern Matching

Modeling any of these features in ACL2 presents a challenge in and of itself.

#1 is tricky since ACL2 itself only has finite objects, so reasoning about the properties of infinite objects without creating an instance of such a creature won't be simple. Fortunately, Matt Kaufmann did some work in this area back in the 80's for adapting Nqthm (predecessor to ACL2) into a theorem prover about the SASL language. He just put up a soft copy of a tech report he wrote on the subject (exactly 1 year and 1 day before I was born...) here.

#2 Shouldn't be too tricky. Since the subset of Haskell I'll be using will be completely side-effect free, I can model call-by-need with call-by-name and maintain correctness.

#3 The hard part about including this would be handling types in the untyped ACL2. I'm not sure how to go about this, but it seems like someone has probably worked on this before, so I can base my work on previous works. It's a pretty essential part of Haskell, so it is certainly going in.

#4 Would be quite difficult. I probably wouldn't be able to include this. Instead I will require explicit type annotations on functions.

#5 Definitely want to include this. Should be pretty simple.

#6 Not sure about this. Will definitely require a fair bit of research on my part. It's an important part of Haskell, but I honestly have no idea how it works. It seems really cool, so I'd really like to include it.

Since my formal background in programming languages is about nil, I've slotted myself a fair bit of reading. So far today I've humped through the first 11 chapters of Types and Programming Languages. I'm not the biggest fan of the way the book goes about things, but it's not bad.

Saturday, April 7, 2007

Welcome

Howdy!

Welcome to my blog. My name is Mark Reitblatt, and I'm an undergraduate in the University of Texas (Austin) Department of Computer Sciences. The purpose of this blog is to document what I learn as I journey into the broad and deep world of Automated Theorem Proving (ATP for short). Hopefully, this will be a useful resource to other students interested in the domain of formal methods, but unsure about where to start.

For the time being, my primary focus will be on the ACL2 theorem prover written by J Moore and Matt Kaufmann. Written in its own logic (based upon Common Lisp), ACL2 is the latest incarnation of the Boyer-Moore theorem prover. In 2006, the Boyer-Moore theorem prover won an ACM Software System Award.

ACL2 has seen extensive use throughout industry, being used to prove the correctness of the floating point division algorithm in the AMD K5 processor, the correctness of the first silicon implementation of the JVM and the correctness of the pipeline hazard detection algorithm of the Motorola CAP DSP, among many other projects. One current project is constructing a model of the Java Virtual Machine in order to prove theorems about programs written in JVM Bytecode.

In my next post I'll talk about my current goals with ACL2, namely my ideas for my Bachelors Honors Thesis.

Automated Theorem Proving in Austin TX