Stick or Twist?

All of us have points in our lives where we have to decide whether we should continue down our current path or change to another. As a researcher, I often face a constrained version of this problem: should I continue on my current research path or change to another? For a long time I wasn’t aware that I was being faced with such decisions; and, when I did become aware, I wasn’t sure how best to make a decision. Over time I’ve realised that a simple “stick or twist?” heuristic mostly works well for me. I don’t claim that anything in this article is novel, nor do I think I’m describing an approach that’s applicable to every situation — but it might provide some useful food for thought.

Let’s start by dismissing a common heuristic: wait and see. When faced with a hard decision, most of us have a tendency to hope that the underlying problem magically disappears: sometimes we consciously delay making a choice, but often we try and pretend the choice doesn’t exist at all. Although it sounds like a cliché, I’m a firm believer that, in general, not making a decision is equivalent to making a decision. If I dither over whether to buy a shop’s last cake or not, someone else will buy it: I’m then, involuntarily, in exactly the same situation as if I’d decided not to buy the cake. If used too often, “wait and see” turns us into corks on the ocean, entirely at the mercy of events. Except for a small number of exceptionally poor decisions, I’ve found that I personally regret decisions I didn’t take more than decisions I did take.

Next, let’s dismiss the two extreme heuristics for making decisions: never change (which is similar, though with a very different motivation, to “wait and see”) or always change [1]. Because we live in a continually changing world, it is inevitable that even a once-optimal path will become suboptimal over time: not changing guarantees irrelevance in the long run. At the other extreme, change for its own sake means that even when we stumble onto an optimal path, we’ll change off it for no good reason.

Although I didn’t realise it at the time, my first professional encounter with the need to make a decision about my research direction happened during my PhD. I was part of a research group doing what is now called “software modelling” but which back then was mostly just called UML. As part of that, I started attending international UML standards meetings. I soon realised that most people at these meetings shared a common vision: that UML should take over from programming languages. In other words, rather than programming using text, they wished to program using UML’s block-and-line diagrams. UML had originally been intended to represent high-level program structure (the software equivalent of architectural blueprints), not low-level detail. I wasn’t sure that this was desirable, nor was I sure that it was possible. However, since I was often the youngest person in the room by 15 or more years, I assumed that I must be missing out on a profound insight.

Those doubts grew stronger when, in around 2003, I found myself in a lift with a senior member of the UML standards committee. Somehow the conversation turned to the evil programming languages people, who, he believed, were conspiring to prevent software from being written in UML. I have never had much truck with conspiracy theories, so I asked him outright how UML could be used for operating system kernels, which have hundreds of thousands of lines of code. He replied, in a tone which suggested he did not expect disagreement, “in 5 years, Linux will be written in UML”. Naive as I was, it was obvious to me that this was a ludicrous timeline and it made me wonder what other unpleasant details the community might have been ignoring.

The confirmation point came a few months later at a more academic meeting. A senior Professor got up to demonstrate the tool his research group had slaved over for years, which allowed one to translate textual programs to UML and back again. He brought up a simple program consisting of 5 lines of text, pressed a button, and transformed it into equivalent UML. The diagrammatic representation was so verbose that it required scrolling over two screens. I stayed silent but someone else politely, and I’m fairly sure innocently, asked “isn’t the diagrammatic version a bit harder to understand?” The Professor looked absolutely stunned by the question: “but… it’s diagrammatic!” was all he could muster. A couple of other people agreed that it looked a bit harder to understand. The Professor was crushed: rarely before or since have I seen a man shift from a confident to a haunted look so quickly.

By this point I had a clear explanation in my head for why UML could not replace programming languages: diagrams are excellent at displaying a small number of elements but they do not scale past what you can squeeze onto a single screen [2]. Even beginners write programs which are too big by this metric!

I then made what I now realise is a foolish mistake: because I could articulate what I thought was a compelling reason why UML would not scale, I assumed everyone else would come to the same conclusion and the field would collapse. 15 years after I thought this, the (renamed) UML research field is, I believe, about the same size as when I left it.

Why was I so wrong? I suspect that most people in the UML community would agree with me that UML has problems scaling. However, I suspect they think the problem will gradually disappear over time, whereas I cannot see a realistic way in which it will be solved. Neither side’s belief can be validated at the moment, and it is unlikely that my belief can ever be completely validated. In retrospect, I should have been more humble about my judgment on UML. It would have been enough for me to say that, in my opinion, the probability of UML’s scaling difficulties being solved was lower than the probability of them never being solved.

However, whatever my motivation was, or should have been, I did change direction, into the field of programming languages. I created Converge, whose main aim was to allow people to extend the language by embedding DSLs (Domain Specific Languages) within it. From a technical perspective this worked well enough (see e.g. an assembler DSL or a statemachine DSL) but it required users to differentiate the DSLs with special brackets ($<<...>>). To my surprise, those special brackets turned out to undermine the whole idea: they are so visually intrusive that if a screen of code contains more than a couple of them, it becomes much harder to read than normal. Unfortunately, the brackets are necessary, because without them it is impossible to parse files [3]. Slowly — and painfully because I had spent several years of my life on this — I was forced to conclude that this approach would never be viable. After several years of feeling stuck (and trying, and failing, to move on to other research problems), I realised that it might be possible to solve the problem of language embedding in an entirely different fashion (which ended up in our language composition work).

However, before I had got that far, I made another mistake. Underlying DSLs in Converge is a macro system (or, more formally, “compile-time meta-programming”) derived from Template Haskell. Since DSLs in Converge had failed, I generalised further, concluding that macros aren’t very useful in modern programming languages [4]. When I later gave a talk titled something like “macros aren’t useful”, other macro researchers were, with one exception [5], bemused by my explanation, thinking it comically simplistic. In retrospect, I now realise that we were tackling subtly different problems: I was trying to embed multiple DSLs within a single normal file while they were identifying whole files as being of one DSL or another. Macro-ish techniques work better for the latter than the former because there is no need for any syntax to distinguish one language from another. In other words, my generalisation was subtly flawed.

Those two examples represent some of the more consequential times that I’ve reconsidered my direction of travel, but there are others. After a while, I came to understand what my “stick or twist?” heuristic is: I now think of it as continually searching for the simplest plausible reason why my current direction is wrong. When I find a sufficiently simple reason, and once I’ve convinced myself the reason is plausibly true, I feel that the best course of action is to change direction. Why a “simple reason”? Because experience has taught me that until and unless I can distill a problem down to a very simple concrete explanation, I’ve not actually understood the problem. Why “plausible”? Because it’s rarely possible to be certain about such matters. Once the possibility of being wrong has become sufficiently high, I’d rather risk abandoning old work that might have succeeded than get endlessly stuck on a problem I can’t solve [6].

As both examples I’ve given might suggest, other people can come to a different conclusion than me. Reasonable people can often disagree about the probability of a risk manifesting, and different people often have very different thresholds for when they consider a risk worth taking. Even the same person can have different thresholds in different areas: I suspect my willingness to take risks in research is higher than average, but in other areas of my life it’s definitely lower than average. So, just because I’ve decided to move on doesn’t mean that other people have made the wrong decision by staying. Indeed, I really hope that they prove me wrong! Wouldn’t it be great if the UML community could improve on textual programming? I might think that outcome unlikely, but if I’m wrong, the resulting dent to my ego will be more than worth it.

Before you worry that I’m indulging in false humility, let me assure you of the true meanness of my spirit by stating that I think not all stick or twist heuristics are equally good. In particular, there is a polar opposite heuristic to mine: continually searching for the most complex reason why my current direction is right. I doubt that anyone would admit to this heuristic out loud, but it is not hard to find people using it, for example within politics or, alas, in some academic disciplines. Why a “complex reason”? There seem to me to be two different causes. Some people seem to be scared that if they’re clear about their thoughts they’ll be exposed as charlatans. Others simply want to win, irrespective of whether their argument is correct or not: a lack of clarity is one weapon they use to assert victory [7].

Inevitably, writing my stick or twist heuristic down is much easier than applying it. First, my ego does its best to block any thoughts that I might be wrong. Second, even when I identify a possible problem with my current direction, and a plausible explanation for it, acknowledging it to the point of taking action requires surprising will-power. Third, hard decisions are generally hard because we lack concrete data to guide us. I have to rely on my own experience or readings to deduce a plausible direction — and, largely, trust to luck. It can feel frightening to make decisions knowing that.

Despite these issues, anyone who’s bored enough to look over my research papers will be able to identify at least two other major changes of direction similar to those noted above, each of which was the result of using the same heuristic. Less obvious are the more numerous minor changes I make using this heuristic: when programming, I can sometimes go through several such decisions in a single day.

Finally, you may have noticed that I’ve used “change direction” in two different senses. In the first instance, I abandoned a research field entirely and moved to another; in the second, I merely abandoned a particular approach to tackling a problem, later trying a different approach to that same problem. In both cases, I had to put aside several years of work, but the first course of action might be seen by some people as the more dramatic of the two. To me, they’re indistinguishable in magnitude: the real difficulty was identifying the problem, simplifying it, and acknowledging it. Personally, I hope I have many changes of direction yet to come!

Acknowledgements: Thanks to Chun Nolan for comments.

2020-10-07 08:00

Footnotes

[1]

I’ve not met anyone who sits precisely at either extreme, but I’ve met people who come surprisingly close.

[2]

You can see this clearly with state diagrams. Small examples are nearly always clearer than the textual equivalent; but once they get just a little too big, the textual equivalent is nearly always superior.

[3]

Alas, this was neither the first nor the last time that parsing has caused me problems — nearly always because of my ignorance about what is and isn’t possible.

[4]

I was wrong: in statically typed languages you often need macros to generate code that the type system would otherwise forbid. This is, of course, the original motivation of Template Haskell but I was too stupid to appreciate this.
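
To make the kind of code meant here concrete, consider selecting the i-th element of an n-tuple in Haskell: no single well-typed function can do this for every tuple size, since each size is a different type, but a Template Haskell macro can generate an appropriately typed lambda on demand. A minimal sketch, with the module and function names my own and purely illustrative:

    {-# LANGUAGE TemplateHaskell #-}
    module Sel (sel) where

    import Language.Haskell.TH

    -- 'sel i n' builds the expression '\(x1, ..., xn) -> xi'.
    -- The type system forbids writing one such function covering all n,
    -- but a macro can generate a correctly typed lambda for whichever
    -- i and n are requested at compile time.
    sel :: Int -> Int -> Q Exp
    sel i n = do
      xs <- mapM (\j -> newName ("x" ++ show j)) [1 .. n]
      lamE [tupP (map varP xs)] (varE (xs !! (i - 1)))

Used from another module with TemplateHaskell enabled (splices cannot appear in the module that defines them), $(sel 2 3) ('a', 'b', 'c') evaluates to 'b'.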

[5]

A senior member of the community stood up and said “he’s right but for the wrong reasons”. Sometimes one has to take praise in whatever form it comes!

[6]

When I was an undergraduate and a PhD student, I was often surprised by older academics whose research seemed stuck in an ancient rut. In fact, “surprised” is a polite way of putting it: I laughed at such people. I’m ashamed to say that one was a gentle, thoughtful lecturer called Malcolm Bird (who died a few years ago). He seemed to us undergraduates to be hopelessly out of date. A couple of years later, after I’d started a PhD, a newly arrived Argentinian PhD student was astonished that our Department had the Malcolm Bird in it. Malcolm, it turns out, had once solved a hard and important problem in computer science. I later found out that as an academic he had continuously taken on more than his fair share of work. In other words, it’s plausible that, in only slightly different circumstances, Malcolm would have had a stellar research career and have been known as one of our subject’s great minds. The realisation that circumstances outside someone’s control can prevent that person realising their “full potential” was a sobering one. It’s too late to apologise to Malcolm in person, so writing about this shameful episode publicly is about as close as I can get.

[7]

What I’m never sure about with such people is whether their external lack of clarity reflects an interior lack of clarity or not.
