Problems with Software 3: Creating Crises Where There Aren't Any

[This is number 3 in an occasional series ‘Problems with Software’; read the previous article here.]

As an undergraduate in the late 1990s, the part of my course I liked the least was Software Engineering, which started with obviously silly 1970s practices, and went downhill from there. As someone who was already familiar with both programming and, to some extent, the software industry, I pooh-poohed the course; my arrogance was duly rewarded with the lowest mark of any part of my degree, which taught me a useful life lesson.

One previously unfamiliar concept which this dreadful course introduced to me was the ‘software crisis’. First promulgated in the late 1960s and early 1970s, it captured the notion that software was nearly always over-budget, behind schedule, and of low quality. Those with an appreciation of history will know that most of the software technologies (programming languages, operating systems, and so on) we now take for granted took form around this time. As with any new field, this was an exciting time, with no technical precedent: none of its participants would have been able to predict the path forward. Rather, experimentation with new ideas and tools led to successes and failures; the latter received greater attention, leading to the idea of a ‘software crisis’.

To the best of my knowledge, the ‘software crisis’ was the first widespread moment of self-doubt in the history of computing. It reflected the idea that the problems of the day were huge and, quite possibly, perpetual. It is this latter point that, with the benefit of hindsight, looks hopelessly naive: by the early 1990s (and, arguably, many years before), software was demonstrably and continually improving in quality and scope, and has continued doing so ever since. While the massive improvements in hardware certainly helped (by the mid 1990s, hardware imposed constraints only on the most demanding of software), it was surely inevitable that, as a community, we would become better at creating software, given experience.

Mercifully, I have not found myself in a discussion about the ‘software crisis’ for 5 or more years; it seems finally to have been put to rest, albeit many, many years after it had any possible relevance. However, it seems that, in software, we abhor a crisis vacuum: as soon as one has disappeared, others arise in its place. I’ve seen a few over the last few years, including a ‘skills crisis’ and an ‘outsourcing crisis’, all of them minor problems at best.

However, one ‘crisis’ has gained particularly widespread attention in the last few years: the ‘multi-core crisis’. A quick historical detour is instructive. Although – in accordance with Gordon Moore’s famous observation – the number of transistors on a CPU has doubled roughly every 2 years for over 40 years, by the middle of the last decade it was proving increasingly difficult to make use of extra transistors to improve the performance of single-core (i.e. traditional) processors. In short, the easy performance increases that the software industry had indirectly relied upon for many decades largely disappeared. Instead, hardware manufacturers used the extra transistors to produce multi-core processors. The thesis of the ‘multi-core crisis’ is that software has to make use of the extra cores, yet we currently have no general techniques or languages for writing good parallel programs.

Assuming, for the time being, that we accept that this might be a crisis, there are two things which, I think, we can all agree upon. First, virtually all software ever written (including virtually all software written today) has been created with the assumption of sequential execution. Second, although we feel intuitively that many programs should be parallelisable, we do not currently know how to make them so. In a sense, both these statements are closely related: most software is sequential precisely because we don’t know how to parallelise it.

One question, which those behind the ‘multi-core crisis’ do not seem to have asked themselves, is whether this will change. For it to do so, we would need to develop languages or techniques which would allow us to easily write parallel software. Broadly speaking, only two approaches have so far seen widespread use: processes and threads. The processes approach breaks software up into chunks, each of which executes in its own largely sealed environment, communicating with other chunks via well-defined channels. It has the advantage that it is relatively easy to understand each process, and that processes cannot accidentally cause each other to fail. On the downside, processes must be large: the approach is not suited to fine levels of granularity, ruling out most potential uses. The threads approach is based on parallel execution within a single process, using atomic machine instructions to ensure synchronisation between multiple threads. It has the advantage of allowing fine-grained parallelisation. On the downside, it is notoriously difficult to comprehend multi-threaded programs, and they are typically highly unreliable because of incorrect assumptions surrounding state shared between threads.
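
To make that last point concrete, here is a minimal sketch – in Python, chosen purely for illustration, and not taken from any particular system – of the shared-state problem. Several threads increment a single counter: the unsynchronised version can silently lose updates (whether it does on any given run depends on how the interpreter happens to interleave the threads), while the locked version is correct but serialises the very work we hoped to parallelise.

    import threading

    counter = 0
    lock = threading.Lock()

    def unsafe_increment(n):
        # "counter += 1" is a read, an add and a write: another thread can
        # interleave between them, so some increments can be lost.
        global counter
        for _ in range(n):
            counter += 1

    def safe_increment(n):
        # Taking the lock makes the read-modify-write atomic with respect to
        # the other threads, at the cost of serialising the updates.
        global counter
        for _ in range(n):
            with lock:
                counter += 1

    def run(worker, num_threads=4, n=100000):
        global counter
        counter = 0
        threads = [threading.Thread(target=worker, args=(n,))
                   for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return counter

    print("unsafe:", run(unsafe_increment))  # may be less than 400000
    print("safe:  ", run(safe_increment))    # always 400000

Neither half of that sketch is satisfying: the unsafe version is simply wrong, and the safe version spends much of its time queueing for the lock – which is, in miniature, the dilemma the next paragraph describes.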

In short, processes and threads are not without their uses, but they will never get us to parallel nirvana. A number of other approaches have been proposed but have yet to prove themselves on a wide scale. Transactional memory, for example, has serious efficiency problems and is conceptually troubling with respect to common operations such as input/output. There also seems to be an interesting tradeoff between parallel and sequential suitability: languages such as Erlang, which are, relatively speaking, good at parallelism, often tend to be rather lacking when it comes to the mundane sequential processing which is still a large part of any ‘parallel’ system.

However, all this is incidental detail. The point about a crisis is not whether we have solutions to it, but whether it is a real, pressing problem. The simple answer is, for the vast majority of the population, “no”. Most people’s computing requirements extend little beyond e-mail, web, and perhaps word processing or similar. For such tasks – be they on a desktop computer or running on a remote server – computers have been more than powerful enough since the turn of the century. If we were able to increase sequential execution speed by an order of magnitude, most people wouldn’t notice very much: they are already happy with the performance of their processors (indeed, most users will see far more benefit from an SSD than a faster processor). In those areas where better parallel performance would be a big win – scientific computing, for example, or computer games – people are prepared to pay the considerable premium currently needed to create parallel software. Ultimately, we may get better at creating parallel software, but most users’ lives will tick on merrily enough either way.

The ‘software crisis’ was bound to solve itself given time, and the ‘multi-core crisis’ is no such thing: just because we’ve been given an extra tool doesn’t mean that we’re in crisis if we don’t use it. Eventually (and this may take many years), I suspect people will realise that. But, I have little doubt, it will be replaced by another ‘crisis’, because that seems to be the way of our subject: no issue is too small to be declared a crisis. The irony of our fixation on artificially created crises is that deep, long-term problems – which arguably should be considered crises – are ignored. The most important of these surely relates to security: too much of our software is susceptible to attack, and too few users understand how to use software in a way that minimises security risks. I have yet to come across any major organisation whose computing systems give me the impression that they are truly secure (indeed, I have occasionally been involved in uncovering trivial flaws in real systems). That, to me, is a real crisis – and it’s one that’s being exploited every minute of every day. But it’s not sexy, it’s not new, and it’s hard: none of which makes it worthy, to most people, of being thought of as a crisis.

2011-06-28 08:00