“Research” is a much broader term than most of us who consider ourselves to be researchers realise. My occasional interactions with people in very different disciplines (e.g. biology) have tended to involve a series of culture shocks to both parties. Even within computing, there are significant differences between sub-disciplines. These seem to mostly go unnoticed or, at least, uncommented on, which is a pity: at the very least it’s worth knowing that the way we do things isn’t the only possible way. One of the most surprising differences, at least to me, is where people expect to find the problems that they then work on.
What is a research problem?
Before I go further, it’s worth defining what I mean by “research problem”. At the most basic, I mean “the problem I’m trying to solve” or “the thing I’m trying to make better”. Research problems tend to come at multiple levels. For example, at the highest level, I want to make software better, but that’s such a vague aim that it’s difficult to convert into meaningful action. What I have to do is find lower-level, more concrete, problems where I can better define what direction I want to head in and what success might look like. For example, I might say “programs written in programming language X are so slow that programmers twiddle their thumbs waiting for programs to run; if I can make programs in language X run twice as fast as they currently do, programmer productivity will increase”.
I’ve actually done two separate things in that small example. Not only have I defined a problem (programs in programming language X run too slowly) but I’ve also motivated why that problem is worth trying to solve (it’s degrading programmer productivity). The requirement to have such a motivation is a common expectation in the parts of computing I work in, but not everywhere. For example, in some areas of mathematics, it’s considered worthwhile to solve a problem irrespective of any possible real-world use. In A Mathematician’s Apology, G. H. Hardy said [1]:
I have never done anything ‘useful’. No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world. I have helped to train other mathematicians, but mathematicians of the same kind as myself, and their work has been, so far at any rate as I have helped them to it, as useless as my own. Judged by all practical standards, the value of my mathematical life is nil; and outside mathematics it is trivial anyhow.
Personally, I like the idea that the research problems I’m working on might lead to practical use, though I’m realistic that this won’t always work out — after all, I don’t have perfect insight into the future.
How I generally find research problems
My most common approach to finding research problems derives accidentally from another habit of mine. I spend a considerable portion of my time in what I think of as “scouting mode” where, for the research problems I’m currently working on, I’m trying to understand what might come next. It’s easiest to explain this as involving two steps.
First, I explicitly try to better understand the problem I’m currently tackling, because as I work on it, my impression of exactly what the problem is, or should be, evolves. Mostly I try to think of where the boundaries of the problem are and whether there might plausibly be path(s) to get to those boundaries. This helps me identify both challenges (i.e. “this is going to take longer than I thought”) and opportunities (i.e. “this also accidentally solves another problem”) that I’m likely to encounter.
Since that first step frequently identifies more challenges than I hoped for (I am an optimist after all), my second step is to look for inspiration, either as tools or techniques, that might help solve those challenges. I take all sorts of things into consideration: blog posts, research papers, conversations, source code, README files (yes, really!), my own experiments, and so on. I guess I probably average 30-60 minutes a day looking for such inspiration and trying to think through the consequences of what I’ve found.
Mostly my hunt for inspiration is reactive: I have just stumbled across a problem that I can’t fix with a 30-second web search and I need to find a solution for it now. Most of these cases tend to involve relatively minor software details and are often resolved by simply spending a bit more time searching (which might, for example, uncover a new-to-me library that does what I need). Sometimes I realise I’ve hit something deeper and then I might, for example, spend a couple of hours reading through relevant research literature hoping to find a solution.
However, I deliberately spend some time proactively looking for inspiration without a concrete problem in mind: I’m trying to build a cache of tools and techniques that will help me with what I might encounter in the future. My main aim here is to fill a short-term cache for the problems I think I will encounter in the coming weeks and months. This is deliberately unstructured: after all, I don’t really know what problems I might encounter even tomorrow! As a pleasant side-bonus this also inevitably populates a long-term cache: it broadens my outlook and gives me a wider sense of what’s possible.
Whether I’m reactively or proactively looking for inspiration, occasionally, and seemingly out of nowhere, something will jump out at me as particularly interesting. Sometimes, I realise that an existing set of solutions to a problem is incomplete. For example, when I looked into parsing error recovery, I quickly realised that most existing techniques had been designed with the severe limitations of 1980s hardware in mind. It didn’t take much extrapolation to realise that modern hardware might change what was possible. Sometimes, I realise that a new-to-me technique or tool could solve a problem I hadn’t even thought of before. Within a few hours of stumbling across meta-tracing, for example, I realised that fairly fast programming languages no longer had to be the preserve of wealthy organisations [2]. A little while later (prompted by an off-hand comment from someone else) I realised that meta-tracing could be the basis of a solution to migrating software between programming languages, a problem I had never even thought of before! Similarly, CHERI opened my eyes to all sorts of possibilities for pragmatically improving the security of existing software, the consequences of which I’m still trying to think through.
Many readers will probably have come across Richard Feynman’s and Richard Hamming’s descriptions of problem solving. In Hamming’s words [3]:
Most great scientists know many important problems. They have something between 10 and 20 important problems for which they are looking for an attack. And when they see a new idea come up, one hears them say “Well that bears on this problem.” They drop all the other things and get after it… Now of course lots of times it doesn’t work out, but you don’t have to hit many of them to do some great science. It’s kind of easy. One of the chief tricks is to live a long time!
How does this relate to my description? Well, when I’m in proactive mode, I don’t really have a long-standing list of concrete problems in my head: I’m less trying to match problems and solutions than I am trying to better understand possible problem areas [4]. When I come across a new tool or technique that seems plausible [5], I try to guess what its knock-on effects will be: mostly I’m unable to do so; sometimes I can guess a bit but nothing seems particularly interesting; but, once in a while, whole areas seem to unveil themselves to me. When the latter occurs, I know I’ve found a research problem I want to work on [6]! Crucially, the problem areas that occur to me are nearly always the result of the proactive scouting I’ve performed in the past: for me, at least, it’s a virtuous cycle of solving current problems and identifying future ones to tackle!
Some other ways of finding research problems
You can probably imagine my astonishment when I realised that in some communities there is a widely accepted list of “the next problems to work upon”. For example, in some (many? all? I don’t know!) branches of telecoms, where people are often working towards things like the next mobile network communication standards (4G, 5G, etc.), there seems to be a shared understanding based on past experience of what might work and what might not. Propose, for example, the use of a conductor material that didn’t work in the previous generation, and you’re going to get very short shrift from nearly everyone. Rather, the community seems to have a sense of “there are ten plausible materials, three are currently being trialled, so pick one of the other seven to work on.” I am, of course, simplifying and caricaturing what goes on, so please don’t see this as a criticism: that community has been highly successful in rolling out a sequence of impressive real-world advances.
A very different way of obtaining research problems is when a field goes through alternating periods of “big ideas” followed by “incremental gains”. In computing, the most recent big idea is machine learning. There are now tens of thousands of researchers (at a low-end estimate) taking the basic idea of machine learning and either applying small tweaks to it, or finding new problem domains to apply it to. It can seem like there’s an almost infinite set of small problems to work on in machine learning, which means that no-one needs to do too much thinking before choosing one and charging forwards. It’s easy to dismiss the individual value of most of this work as very low — yet, collectively, it’s clearly pushing this field forwards. This is very different from, say, programming languages where it’s close to an expectation that each and every researcher will have a distinct niche that they’re working on.
The final approach I’ve encountered is where research problems come from an external body. For example, if you work in medical research, the priorities of funders will often point you towards working on certain health problems over others. One advantage of this is that the external body can not only drive a critical mass of researchers towards solving a common goal, but also help ensure that different researchers don’t unnecessarily duplicate work. A disadvantage is that if the external body fixates on a pointless or unsolvable problem, it can cause an awful lot of wasted research.
Fundamentally, I don’t think that any of the approaches I’ve outlined above is inherently better or worse: they’re just different. My intuition is that it’s a good thing that there are different approaches to finding research problems. And, at the risk of stating the obvious, while finding a good research problem to work on is vital, how one goes about addressing it is an entirely different matter!
Acknowledgements: thanks to Edd Barrett, Lukas Diekmann, and Dan Luu for comments.
Footnotes
[1] Time has shown that Hardy was wrong: since his death in 1947, some of his work has turned out to be useful!
[2] For example, at a very conservative estimate, HotSpot – the “main” Java virtual machine – has had somewhere between 1000 and 4000 person years of effort poured into it. The personnel cost implied by that is staggering — not to mention that the people involved are often an organisation’s most talented, who could conceivably be earning the organisation copious income if deployed elsewhere.
[3] As far as I know, we only have a second-hand quote of Feynman’s advice via Gian-Carlo Rota, though the final sentence does sound like vintage Feynman:
You have to keep a dozen of your favorite problems constantly present in your mind, although by and large they will lay in a dormant state. Every time you hear or read a new trick or a new result, test it against each of your twelve problems to see whether it helps. Every once in a while there will be a hit, and people will say: ‘How did he do it? He must be a genius!’
[4] Whether this is because I’m too stupid to identify concrete problems in advance, or because areas such as software are inevitably woollier than the physical sciences, I’m unsure.
[5] There’s no point thinking about the implications of something that I don’t think can work.
[6] Conversely, I often fail to get excited by solutions that seem to excite other people, because I cannot see how they project backwards onto research problems. For example, an astonishing number of people I’ve come across tout WebAssembly as a solution to their problems (and a wide variety of problems at that!). I see it as a neat solution to a niche problem (running C/C++-ish code in a browser) but I am unable to see it as a general solution to other problems. I hope, however, that other people are right and I’m wrong!