A colleague pointed me to a blog entry by the VP of VMware. It describes a new “replay” feature in the next version of VMware: you can record the execution of your VM to a file, and then replay that file later. The replay reproduces the VM’s execution exactly as it happened the first time. This is an unbelievably cool feature for debugging things like memory stomps and race conditions.
When trying to find a bug, the easiest approach is to run the program in the debugger, put breakpoints near where you think the problem is happening, and hope that it fails the same way it did before. If it doesn’t, well, you try again, or tweak some part of the test case to make the failure more reproducible. If it’s a race condition, it’s quite possible that attaching the debugger or tweaking the test changes the timing enough that the problem never shows up. With replay, once you’ve reproduced a problem, you can replay it (in the debugger!) and step through the code, knowing that the problem will reproduce. You can step “back in time” to just before the problem occurred, and then step through the problem as it happens. According to the web site, the recording process has no effect on the speed of the VM. This is not possible according to the Heisenberg Uncertainty Principle, which, strangely, is specifically referenced in the article. Ah, what did he know, anyway?
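To make the record-then-replay idea concrete, here is a toy sketch in Python. It is emphatically not how VMware does it; I have no idea what their implementation looks like. It only shows the general principle: log every nondeterministic decision during the first run, then feed the same decisions back on later runs. The names `NondeterminismSource` and `run_workload` are made up for illustration, and a random number stands in for whatever the scheduler would have decided.

```python
import json
import random


class NondeterminismSource:
    """Hands out 'nondeterministic' values: live (and logged) or from a saved log."""

    def __init__(self, log=None):
        # If a log is supplied we are replaying; otherwise we are recording.
        self.replaying = log is not None
        self.log = list(log) if self.replaying else []
        self._cursor = 0

    def next_value(self, upper):
        if self.replaying:
            # Replay: return exactly what was observed during recording.
            value = self.log[self._cursor]
            self._cursor += 1
            return value
        # Record: take a fresh value and remember it.
        value = random.randrange(upper)
        self.log.append(value)
        return value


def run_workload(source, steps=20):
    """A toy 'program' whose result depends on which worker runs at each step."""
    counter = 0
    trace = []
    for _ in range(steps):
        worker = source.next_value(2)  # stand-in for a scheduling decision
        if worker == 0:
            counter += 1
        else:
            counter *= 2
        trace.append((worker, counter))
    return counter, trace


# Record one run and save the log (here to a JSON string instead of a file).
recorder = NondeterminismSource()
recorded_result, recorded_trace = run_workload(recorder)
saved = json.dumps(recorder.log)

# Replay from the saved log: same decisions, so the same execution.
replayer = NondeterminismSource(log=json.loads(saved))
replayed_result, replayed_trace = run_workload(replayer)
```

Run it as often as you like: the recorded result changes from run to run, but the replayed result always matches the recording it came from. That is the property that makes a bug you have caught once stay caught.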
Apparently it doesn’t have SMP support, so if your problem is SMP-related, this won’t help you. But it looks like a very nice piece of software engineering, so kudos to the VMware people!