This is what the average computer looked like in the 1950s. Image: Clapagaré

There is often a bewildering gulf between the things we use and what we know about them. Relatively few people have a detailed understanding of how a modern aeroplane works, yet many of us are happy to trust that a plane will get us safely to our destination. Phones, medical equipment, and cars are all coordinated internally by software. Usually this software is designed to do good things, but sometimes it is designed to do devious things (such as misreport a car’s emissions). Often we witness gaps in the design, and these show up as incorrect behaviour. This can be especially damaging in science, since a software malfunction could lead to incorrect scientific results. How can we trust the software used in science?

The first motto of the Royal Society was “on the word of no one”; we should not have to trust knowledge based on its provenance, but rather recreate it for ourselves. Yet scientific research often depends on the software of not one, but very many, people whom we have never met.

Professor Carole Goble has led a raft of initiatives to support scientific research. Her projects have built and contributed to tools used by scientists, and helped develop communities of users for scientific software. Last summer I interviewed her about her work.

Just as in so many other areas, software plays a vital role in modern science. “Software has two significant jobs when it comes to science and computational science. The first is an instrument, just like a telescope or a microscope, and in fact it’s probably one of the most prevalent scientific instruments, not that people think of it as an instrument, but it is, and that is why it has to be well-engineered, and maintained, and tested… And the second thing is as a record of what you did. The algorithm involved, for example. The full set of resources that you actually used, as a record. That could be reproduced, in an alternative setting, or if the computational framework is no longer available, repaired in some way, or recovered in some way, but as a record of what you had, and for that you need open, clear description.”

She underlines how knowing about software is one thing, but developing robust and usable software is a different thing altogether: “I run a team at Manchester, and currently have around 15 post-docs working for me, which is quite a big team, and 12 of them are software engineers — they’re not computer science researchers. I kind of evolved into somebody who delivers platforms. We have rules about the computer science researchers — their code’s not getting into production platforms. I always tell my students: the definition of computer science is: it doesn’t work. And for software engineering: it does. They’re quite different things.”

To improve the quality of software in science, she helped corral a training initiative for scientists: “The quality of software engineering is pretty poor across the board, but you can teach skills, and that’s what ‘software carpentry’ does. We’re very heavily involved in this, and we’re launching its sister, ‘data carpentry’.” These are volunteer-based initiatives that run over a hundred training workshops a year, and contribute to training materials which are made openly available online.

The development of software can take several years, in a complex web of contributions: “I’m a career academic, and so I’m happy for other people to build businesses based on my software. I can put it on my impact statement, and might be able to get some more money next time.”

Formats and computers change, researchers come and go, yet the software should continue to work. Five years ago, Professor Goble co-founded the Software Sustainability Institute to help improve the quality and longevity of research software. This involves computer science, but also extends into community development: “Not all software shall survive, but software that turns out to be useful needs to have sustainability options. Open source is a great resource for sustainability — the more projects that use a software, the more likely it is to be able to attract funds. But we also need people to understand that, if they’re using somebody else’s software — which is what the dream of most funding agencies is, that they only have to pay for it once — then the community must play the game: being good citizens and citing it, giving credit to the people who have produced it, and maybe even contributing to it with their own grants, either in kind with development, or in cash to keep the development going. And the funding agencies need to also find mechanisms to be able to fund infrastructure that isn’t going to be peer reviewed against novel research, because they’re not the same thing at all.”

Openness seems to be a vital part of Professor Goble’s initiatives. People write software, but people occasionally make mistakes. It is important to make critical software open for inspection by others, who might be able to spot and fix mistakes: “This is when you want to know the transparency part of it — what does it really do, and under what circumstances should you not use it. This is often badly described and documented. How far can you diverge from the steps or services in the algorithms, how far can you push the range of parameters before it isn’t valid anymore?”

The pursuit of science is changing to benefit from new tools. This is feeding on, and enabling, more openness in science. Science also seems to need perpetual renewal, human as much as technological. “Was it Max Planck who said that progress in science proceeds through a series of funerals? And he’s right in the sense that, you’re kind of trained in the image of your professors — but computational science is nothing like [what] people in their 50s were trained in.”