Trip report from SOSP 2007

I spent the last few days at SOSP 2007, the latest instantiation of a biennial conference on operating systems research. This was held in Stevenson, WA, along the Columbia River Gorge about 45 miles east of Portland.

General observations:

  • This area is absolutely gorgeous in the fall. Living in California, I tend to forget about the whole trees-changing-colors thing. We had a couple of days of nice weather at the beginning of the conference, and the views of the river and surrounding hills were just spectacular. Of course, by the end of the conference we had cold drizzling rain and I remembered why I don't live there.
  • For anyone looking for good brewpubs in the area, I highly recommend checking out Walking Man Brewery in Stevenson. The place doesn't look like much, but they had a very nice barleywine and imperial IPA, as well as a good standard IPA and stout. The smoked salmon pizza was also excellent. Double Mountain Brewery in Hood River, OR is also worth a visit.
  • The fact that SOSP included three papers on OS support for JavaScript is a sad reflection of the state of computing. Nothing against the authors of the papers (who were trying to find solutions to real problems), but ~50 years after the introduction of FORTRAN, LISP, and Algol 60, is this the best we can come up with?
  • A lot more people care about Byzantine fault tolerance (at least in the research community) than I would have thought possible.

Comments on specific talks:

  • TxLinux: this looks at how to map the synchronization primitives in Linux (mostly spinlocks) onto transactional memory primitives. Nothing earth-shattering, but some decent practical work, including looking at issues like dealing with I/O interrupts in the middle of a transaction, scheduling changes, and priority inversion. One of the problems with moving to new hardware paradigms (such as different consistency models or transactional memory) is figuring out what to do with all of that legacy code, most prominently operating systems, programming language runtimes, and large application platforms like databases.
  • Triage: the goal here is to automatically debug problems onsite after a fault is detected (e.g., program crash). The main idea is to use checkpoints to back up to (hopefully) prior to the fault, and then replay with different variations to help pin down the problem. For example, if the problem comes and goes depending on scheduling order, it's probably some sort of race condition. The implementation was focused on user-level programs using OS checkpoint/replay support, but I don't see any reason it wouldn't work for OS kernels (or more complex multi-process applications) with VM-based checkpoint/replay.
  • iComment: the basic idea here was to apply natural language processing (plus some manually constructed "filters") to infer program assumptions from comments. Aside from the fact that the testbed was the Linux kernel, this didn't really have anything to do with operating systems - the techniques would apply equally well (or equally poorly) to any sufficiently complex application. The assumption of this work is that developers are sufficiently mechanical in the way they write comments (e.g., "assumes lock is held" in the comment that prefaces a function) that a program can infer meaning and automatically test for correctness. But if we're so mechanical, why not use more formal annotations to express intent? The most prominent example of this is ASSERTs, which serve as both verifiable statements of intent and documentation of assumptions. I'd be more interested in research on what other types of annotations would be useful, and perhaps some analysis of why annotations haven't caught on, than in "fuzzy" natural language techniques.
  • SecVisor: this project looked at using a small hypervisor (and hardware virtualization support a la SVM or VT) to prevent illicit kernel code execution. The hypervisor used shadow page tables to control the memory permissions, preventing modification of kernel code and execution of text or user memory while in kernel mode. This relies on being able to detect when the processor switches between user and kernel mode (by tracing the entry/exit points), so that pages can be remapped (e.g., restoring user-level execute permissions when exiting the kernel). Of course this assumes a static kernel text - supporting loadable kernel modules gets more complicated (the hypervisor has to get involved in loading the text and doing appropriate validation to make sure the module is OK).
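
To make the iComment point about comments versus assertions concrete, here is a minimal sketch in Python. The Counter class and its method names are made up for illustration; they come from neither the paper nor the Linux kernel. The first method states its locking assumption only in a comment, while the second states the same assumption as an assertion the program can check.

```python
import threading


class Counter:
    """Toy shared counter; all names here are hypothetical."""

    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0

    def bump_commented(self):
        # Assumes self.lock is held by the caller.
        # Nothing checks this; a tool would have to parse the comment.
        self.value += 1

    def bump_asserted(self):
        # Same assumption, stated in a form the program can verify.
        # Lock.locked() only reports that *someone* holds the lock, not
        # that the caller does, so this is weaker than an ownership check.
        # Note that assert statements are skipped when Python runs with -O.
        assert self.lock.locked(), "caller must hold self.lock"
        self.value += 1


def demo():
    c = Counter()
    c.bump_commented()        # violates the documented assumption, silently
    try:
        c.bump_asserted()     # the same violation is caught here
        caught = False
    except AssertionError:
        caught = True
    with c.lock:
        c.bump_asserted()     # assumption satisfied
    return caught, c.value
```

The assertion doubles as documentation: anyone reading bump_asserted sees the locking requirement, and anyone who violates it finds out at run time rather than from a comment-mining tool.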

Making code reviews less painful

One constant in large software development projects is code reviews - letting your peers look at what you've done to make sure you haven't missed anything.  It's been called the last defense against brokenness since it represents the last check before your code goes into the common repository and affects the rest of the developers (and sooner or later, your users).  And, at least in my experience, the tools available for code review are barely adequate.  In the 13+ years I've been doing code reviews (both as reviewer and reviewee), I've used a wide variety of tools, from simple text-based ones like diff and patch to more complex web-based tools that generate dynamic HTML pages with pretty colors indicating what's changed.  Although these have gotten better and better at highlighting the changes and allowing you to look at other context in the modified files, they generally don't help at all with the other side of the review process - capturing comments and the discussion around them.  That's usually left in email, often one-on-one between the reviewer and the developer - meaning other reviewers miss the context of previous review comments.  Or everybody sees the comments, even about code they're not interested in.  Also, as a reviewer I really get tired of typing in file names and function names or line numbers before every comment so people know what I'm referring to - I want to focus on the substance of the comments, not how to describe the location of the code I'm talking about.

I've seen a few attempts to address these issues, but the one I've been using lately that seems to have a lot of promise is ReviewBoard.  This was developed by VMware's Christian Hammond and David Trowbridge (with help from others), and a number of groups within VMware are now using it (so I've had a chance to use it "in anger").  It's a browser-based online review system, and can be used to look at file diffs, expand out unmodified sections of the file, etc.  In addition, comments are added within the tool itself, and can easily be associated with a given source line (clicking on the line brings up a box to enter the comment).  When the reviewer is done with comments, they're "published", and the developer sees them.  The developer can then respond, the reviewer can respond to the response, etc. - and the conversation is all captured within the tool.  (It's tied to email as well, so you don't need to keep refreshing your browser to see what's changed.)  Reviewers can also see each other's comments and add their own.  The developer can also refresh the changes after updating to address the review comments.  The end result is to capture the entire review process in a single place - in a way that's transparent to all participants.
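
For readers who want to picture the workflow, here is a rough sketch of the kind of data model it implies: comments anchored to a file and line, held as drafts until published, with replies kept in the same thread.  This is not ReviewBoard's actual schema or API; every class and method name below is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str
    text: str
    published: bool = False
    replies: List["Comment"] = field(default_factory=list)


@dataclass
class LineThread:
    """A conversation anchored to one line of one file in the diff."""
    path: str
    line: int
    root: Comment


class Review:
    """Hypothetical container for all comment threads on one change."""

    def __init__(self):
        self.threads: List[LineThread] = []

    def comment(self, author, path, line, text):
        # The location is captured once, so nobody has to type it out.
        thread = LineThread(path, line, Comment(author, text))
        self.threads.append(thread)
        return thread

    def publish(self, author):
        # Make all of one reviewer's draft comments visible to everyone.
        for thread in self.threads:
            if thread.root.author == author:
                thread.root.published = True

    def reply(self, thread, author, text):
        # Replies stay attached to the original comment and location.
        thread.root.replies.append(Comment(author, text, published=True))

    def visible(self):
        # What the developer and the other reviewers can see.
        return [t for t in self.threads if t.root.published]
```

The point of the sketch is that the file name and line number live in the thread itself, so the discussion never has to restate where in the code it applies.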

ReviewBoard is built on Python and Django, and has backends that support CVS, Subversion, Perforce, and Mercurial.  It's under an MIT license, so you can look at the code and change it however you want.  If you're a developer who works on a project where you do code reviews, I'd suggest checking it out.

A new source browser is born

Back when I was at Sun working on OpenSolaris, I decided that we needed a good web-based source browser to help show off the "product" (which was, essentially, the source code itself).  At the time, we had a pilot site using CVSweb, which was easy to set up but (IMO) incredibly painful to use.  In particular, it didn't have any support for cross-reference links (the ability to jump to the definition of a function or variable by clicking on the name in the source code), and the search capabilities were pretty minimal.  I wanted something more usable for studying the source and general development activities; i.e., something that I (a diehard cscope user) would actually find useful in my own development work.
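
As a toy illustration of what "cross-reference links" means, the sketch below builds an index from names to the places they are defined, then finds uses of those names that a browser could turn into links.  It uses Python's standard ast module on Python source purely for convenience; it is not a description of how LXR, cscope, or any other source browser is implemented, and the function names are made up.

```python
import ast
from collections import defaultdict


def build_definition_index(sources):
    """Map each function/class name to the (file, line) pairs defining it.

    sources: dict mapping file name -> Python source text.
    """
    index = defaultdict(list)
    for path, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
                index[node.name].append((path, node.lineno))
    return index


def find_references(sources, index):
    """List uses of indexed names, each paired with its definition sites."""
    refs = []
    for path, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Name) and node.id in index:
                refs.append((path, node.lineno, node.id, index[node.id]))
    return refs


sources = {
    "util.py": "def helper():\n    return 1\n",
    "main.py": "x = helper()\n",
}
index = build_definition_index(sources)
refs = find_references(sources, index)
```

Each entry in refs says "this name, used on this line of this file, is defined over there", which is exactly the information a clickable link in the rendered source needs.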

At the time, the state of the art was LXR, the Linux Cross-Reference system.  LXR provides cross-reference support, as well as freetext searches (via Glimpse), but the setup was somewhat cumbersome, support for revision history seemed awkward, and there were licensing issues with Glimpse.  Fortunately, it turned out that a Sun engineer, Chandan, was already playing around with web-based source browsers.  We talked with him about our requirements, and he agreed to work on a new source browser.  The result was OpenGrok.

OpenGrok has actually been usable for a while as the source browser for OpenSolaris, but the source code was just released a couple of days ago.  This means people can now download it and use it for whatever source tree they're interested in (including proprietary code).  I did this recently with some internal VMware code, and (aside from a few minor issues that Chandan's working on fixing) it was remarkably easy.  OpenGrok is all Java-based and runs on top of Tomcat, so once you have a recent version of Java (1.5) and Tomcat set up, installation is pretty simple (even for a Tomcat novice like me).

The result is a source browser that not only has the cross-reference links and freetext search capabilities of LXR, but supports syntax highlighting and has a nice clean look and feel.  It's also completely free (in both senses) - released under the CDDL license.  (I suppose some people may have problems with the Java dependency, but at least that's free-as-in-beer for everyone.)  And, somewhat surprisingly given the fact that it's all Java, it's fast.  I admit I was skeptical when I first heard that Chandan was planning to use a Java search engine (Apache Lucene), but the results are impressive.  I'd suggest that anyone who spends much time looking at complex source trees check this out.