Virtual machines are playing a role in cyber security research. (Image: Flickr, Sandia Labs)

Research thrives through sharing: ideas, methods, results, experience, tools, anything. However, sharing alone is often not enough. “You can download our [computer] code from the URL supplied. Good luck downloading the only [post-doctoral researcher] who can get it to run, though” tweeted computational biologist Prof. Ian Holmes last year, tongue firmly in cheek. In recent years, scientists have started relying on a new technology to share their software tools, in a scenario similar to the 'brain in a vat' philosophical thought experiment.

Imagine your brain is placed in a vat filled with nutrients, and wired to a computer that simulates your physical environment. The computer sends signals to your brain that determine what you see, hear and otherwise experience. The computer also receives signals from your brain, such as an instruction to move your limbs, and updates the simulated environment accordingly. As far as you are concerned, the computer’s simulation of the universe is perfect. Would you be able to deduce that your brain is suspended in a vat, rather than physically located in your skull?

Now imagine a slightly different scenario. You have a computer program that can simulate a computer. That is, when you run this program, it simulates the internals of a computer, which gives you access to two computers: the 'real' computer on which the program is running, and the simulated computer maintained by the program. As before, the simulation is complete: you can even run other programs on the simulated computer.
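To make the idea concrete, here is a minimal sketch in Python of a program that simulates a very small, made-up computer. The instruction names (set, add, dec, jnz, print) and the function run are invented for this illustration and do not correspond to any real machine or product; real simulators are vastly more elaborate.

```python
def run(program, registers=None):
    """Simulate a tiny, hypothetical computer with named registers.

    `program` is a list of instructions; each instruction is a tuple
    whose first element is the operation name.
    """
    regs = dict(registers or {})
    pc = 0          # program counter: index of the next instruction
    output = []     # everything the simulated program "prints"
    while pc < len(program):
        op, *args = program[pc]
        if op == "set":        # set <register> <value>
            regs[args[0]] = args[1]
        elif op == "add":      # add <target register> <source register>
            regs[args[0]] += regs[args[1]]
        elif op == "dec":      # subtract one from <register>
            regs[args[0]] -= 1
        elif op == "jnz":      # jump to <index> if <register> is not zero
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        elif op == "print":    # record the value of <register>
            output.append(regs[args[0]])
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return output


# A program for the simulated computer: multiply 6 by 7 by repeated addition.
program = [
    ("set", "total", 0),
    ("set", "count", 7),
    ("set", "step", 6),
    ("add", "total", "step"),   # instruction 3: start of the loop
    ("dec", "count"),
    ("jnz", "count", 3),        # go back to instruction 3 until count is 0
    ("print", "total"),
]

print(run(program))  # [42]
```

The list called program only ever deals with the simulated registers. It has no way of telling which 'real' computer the Python code happens to be running on.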

Does it make any difference to programs running on the simulated computer that their computer isn’t real? For most programs, it doesn’t. In fact, the ‘real’ computer itself might be running as a simulation inside another computer. This illusion makes it easy to share programs, because you can move a simulated computer between real computers, and none would be the wiser. This method is used, for example, to run Windows programs on a Mac.

A program that simulates a computer is called a virtual machine (VM). Scientists have started using VMs to share their research software. Software is pretty pervasive in science, but research software is notoriously difficult to install and use. Using VMs makes it easy to run other people’s code: you simply run the VM that contains the code. This will then simulate the computer environment where that code was installed by its creators. Although VMs are easy to run, they are not always easy to analyse – it’s a bit like trying to sift through a real computer. Dr Titus Brown, who has contributed much to the debate on openness in research software, criticised the VM approach for not being sufficiently open. He blogged that, “providing a gigantic black box of custom installed code that was installed, set up, and executed by experts just isn’t very useful to many people.”

Prof. Ian Gent runs the Recomputation project, which seeks to use VMs to make software shareable across researchers and across time. He refined Brown’s critique to argue that it’s preferable to share the assembly instructions of a VM, and not just the VM itself. This is a bit like sharing a program’s source code rather than the program alone – not only can the program be generated from the code, but the code is much easier to analyse.
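The difference between sharing a finished VM and sharing its assembly instructions can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the recipe format, the package names and versions, and the functions build and fingerprint are all hypothetical, and are not taken from the Recomputation project or any real VM tool.

```python
import hashlib
import json


def build(recipe):
    """Assemble a toy 'machine image' by following a recipe step by step."""
    image = {"packages": {}, "files": {}}
    for action, name, value in recipe:
        if action == "install":    # install <package> at <version>
            image["packages"][name] = value
        elif action == "write":    # write <file name> with <contents>
            image["files"][name] = value
        else:
            raise ValueError(f"unknown recipe step: {action}")
    return image


def fingerprint(image):
    """Summarise an image as a hash, to check that two builds are identical."""
    blob = json.dumps(image, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# The recipe is short and readable: anyone can see what goes into the machine.
# (Package names and versions here are illustrative only.)
recipe = [
    ("install", "python", "3.11"),
    ("install", "numpy", "1.26"),
    ("write", "analysis.py", "print('hello')"),
]

image_a = build(recipe)
image_b = build(recipe)

# Two people following the same recipe end up with the same machine.
print(fingerprint(image_a) == fingerprint(image_b))  # True
```

In this toy model, the recipe plays the role of source code: it can be read, compared and modified line by line, whereas the built image is just the end product.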

Software tools are widely used in science, but they are often hard to share and reuse. VMs are emerging as a valuable ‘meta’ tool for sharing, reusing and extending software tools more easily. Perhaps one day ‘VM’ will become as recognisable a term as ‘telescope’ or ‘flask’.