What I’ve Learnt So Far About Writing Research Papers

I long flattered myself that I’m only good at one thing: programming. Experience has taught me that I’m not as good at it as I thought – indeed, that there are people much better at it than I – but it’s fair to say that I’m not terrible at it. The problem is that, as a researcher, programming can only ever be part of my job: it’s not enough just to build something interesting, I need to explain it to people too. And explaining it to those further away than my voice can carry requires writing a paper. [1]

There are many people who dismiss paper writing as a distraction from building real systems. I sometimes wondered if it was worthwhile myself but, when you’ve been knocking around as long as I have, the importance of writing things down becomes clear. I’ve lost track of how many conversations I’ve had along these lines: “We saw that problem in System B, and we had a really interesting solution.” “Where can I read about it?” “We never wrote it up.” “Where can I download it then?” “We never released the source code, the binaries are lost, and it runs on hardware that no longer exists.” I encountered this a lot when I was investigating existing language composition systems, with tantalising references ranging from Lisp systems with different macro expansion semantics to novel text editors. Even when the person I was talking to was passionate and knowledgeable about the system in question, I was rarely able to gain the insights I suspect I could have if I’d had a paper to pore over. Doubtless, many of these systems weren’t good enough to worry about them slipping from our memory; but the loss of the best, which may have involved tens or even hundreds of skilled person years of effort, is a tragedy. Writing things up does not guarantee that people in the future will be able to understand everything about the work [2], but at least they stand a sporting chance. Besides, even if people of the future can’t understand things, writing things down increases the number of people who can understand the work today.

How should one go about writing a paper? When I started my PhD, I had no idea of how one might go about this task, and it took me many years to find an approach that worked for me. In the rest of this blog post, I’m going to try and drill down into some of the things I’ve learnt about writing papers, and I’m going to give some examples of where I’ve learnt from my mistakes. However, I’m deliberately not phrasing anything herein as “advice” — indeed, several parts of my approach seem to run contrary to most of the advice I’ve seen. I don’t see this as a problem: people work in different ways, and if you’re unlucky enough to have a mind similar to mine, you might find something useful in here.

Before starting to write

Before I start writing, there are two crucial decisions that I need to make. First, what am I going to write about? Second, what is the profile of the intended reader?

The first question nearly always has an easy answer which is that I’ve done some research which is interesting and complete enough to be worth telling other people about. If I don’t have a good answer to that first question, it means that I’m probably not ready to start writing.

The second question is more tricky. In general, I assume that the intended reader is someone intelligent, interested, and well-versed in the general area. My job is to fill in the things that they don’t know. One can go mad trying to guess precisely what people do and don’t know, so my starting assumption is that the reader knows roughly what I knew when I started doing the research in question. However, this doesn’t work for all types of writing, particularly when a paper brings together two subjects that have previously been studied by disjoint communities. On any points where I’m unsure about what knowledge I can take for granted, I err on the side of caution, and assume the reader knows marginally less than I did. This should not be mistaken for an assumption that the reader is stupid, but simply an acknowledgement that none of us can know everything.

I also make a fundamental assumption that the reader has good intentions, and that my job is to be upfront and honest with them. This allows me to write clearly about the weaknesses and limitations of my work without worrying that I will be punished for doing so. This may sound like an odd thing to be explicit about, but people in my position are largely evaluated by the quantity and perceived quality of our peer reviewed papers. It is therefore tempting to adjust one’s writing to maximise the chances of acceptance by peer reviewers (who are often rushed; occasionally incompetent; or, though rarely, malevolent). I did that in the past, but I found the implicit cynicism corrosive: I decided, perhaps somewhat pompously, that I’d rather “fail” peer review honourably than “succeed” dishonourably. This has led to one or two more rejections than I might otherwise have received, but these are more than worth the longer-term satisfaction with the papers I write.

Starting to write

It is a truism of writing that the hardest thing is a blank page: building sufficient momentum to begin writing is no easy thing and I’ve done my fair share of hoping in vain for inspiration to strike. Of course, the only possible solution is to write something, anything, just to get started. The most common suggestion that I’ve heard is to start writing the bits of a paper that come easier without worrying about the overall order. While this seems to work well for many people, I’ve never found this a satisfying approach.

Instead, the first thing I do is to try and work out what the paper’s overall message is going to be, which means starting with the paper’s abstract. [3] This gives me a frame of reference for all subsequent writing and enables me to be ruthless in answering questions like “does the reader need to know about this particular aspect?” and “is this section in the right place in the paper?”. I first reread suggestions for writing an abstract (the fourth point in the link on paper writing advice) because it reminds me to be concise about what problem I solved and to motivate why my solution is worth someone’s time reading. In well established fields, where one is tackling a well-known problem, little motivation is needed. More experimental or unusual research, however, needs to be clearly motivated. This is often harder to do than it seems.

Let me give a concrete example based on the paper fine-grained language composition: a case study. We started looking at language composition because previous solutions were crude, because we felt we had a new approach worth trying, and because it looked like it would be fun. I had a few explanations about why it was worth doing (enough to convince people to fund us to do it), but they were vague. After developing a language composition editor we realised we needed to be able to run composed programs as well. We started with an extended warmup exercise, which gave us the confidence to try tackling something bigger.

We eventually settled on composing PHP and Python and started looking at the problem in detail. About 6-12 months later, we had created PyHyp, which was the first ever fine-grained language composition of “big” languages: certainly, we’d been able to get much further, and with less effort, than I had expected at the outset. The problem was that we had also outstripped our ability to explain why we’d done what we’d done. The published abstract that I wrote (and I say “I” because I’m almost solely to blame for it) looks as follows:

Although run-time language composition is common, it normally takes the form of a crude Foreign Function Interface (FFI). While useful, such compositions tend to be coarse-grained and slow. In this paper we introduce a novel fine-grained syntactic composition of PHP and Python which allows users to embed each language inside the other, including referencing variables across languages. This composition raises novel design and implementation challenges. We show that good solutions can be found to the design challenges; and that the resulting implementation imposes an acceptable performance overhead of, at most, 2.6x.

The good thing about this abstract is that it’s simple to understand: the bad thing is that it doesn’t say why the problem we tackled is worth tackling. Since language composition has been largely ignored as a research challenge the abstract therefore relies on readers spontaneously realising why language composition is useful and why our research is worth reading — which is far too much to ask.

The frustrating thing is that, while carrying out the research, one of our collaborators had suggested a plausible “killer app”: system migration. The idea is simple. At the moment we have lots of systems written in old languages: we’d like to rid ourselves of the old languages but not the systems. We can either rewrite the systems from scratch (which is expensive and error prone) or use an automatic translator (which produces code that no human wants to maintain). Language composition offers us the potential to compose an old and a new language together and gradually migrate parts of the system piecemeal, at all times having a runnable system. This idea is tucked away in a paragraph near the end of the paper and is easily overlooked. This is an unforced error on my part, because papers are not meant to be like crime novels, with unexpected twists: a good paper should make clear upfront what the paper is about, and gradually fill in detail. Indeed, I struggled to find a good structure for this paper, perhaps in part because of my failure to set out the most compelling case possible in the abstract.

By the time I came to write a blog post on this subject I had realised my mistake and put the migration idea (in expanded form) near the beginning. Alas, anyone who read only the paper would not have seen this implicit mea culpa.

As this example hopefully shows, finding the right research message is often tricky. Fortunately, it is possible to do a decent job: looking back, I think both the storage strategies for collections in dynamically typed languages and Eco: a language composition editor papers do a much better job in their abstracts.

With the abstract out of the way, I draft an introduction. If I’ve got my high-level thinking in the abstract right, then the introduction is a much easier task because, to some extent, it’s simply an expanded abstract. I add more background, including some context about the research’s antecedents, and explain why we’ve done something useful. Sometimes I will explicitly list the paper’s contributions, especially if they might otherwise not pop out (thus the storage strategies paper was clear enough without such a list whereas I felt the fine-grained paper would not be understood without one).

With drafts of the abstract and introduction complete (I invariably come back and, at least, tweak them; sometimes I change them extensively), I then find it possible to move on to the paper’s main content.

Writing and editing

Perhaps unsurprisingly, I tend to write papers more-or-less linearly from front to back, finishing one section before moving on to the next. I find this helpful to give me the same sense of flow as the eventual reader: it means that I’m forced to think about seemingly simple questions like “have I introduced this concept earlier, or does it need to be introduced here?” It’s easy to overlook the high-level structure of a paper, and I’ve read many papers with all the right pieces but in the wrong order. [4] Similarly I’ve read many papers which use too many unconnected examples: each feels sensible in its own right, but each example takes time to understand, destroying the reader’s flow. A single example which can be built up in stages as the paper progresses nearly always makes the paper easier to read, and is most easily done if writing is done semi-linearly.

That said, I spend the majority of my time at a much lower-level: I’m trying to get all the facts that need to be in the paper in there, explained clearly, and in the right order. That means that I spend most of my time thinking about whether a sentence says what I need it to say, without saying anything it shouldn’t say, and whether it does so as concisely as possible. I’m not indulging in false modesty when I say that I don’t think that my first attempt at a sentence has ever satisfied all three. As soon as I’ve typed a sentence in, I’ll reread it, and if it’s really bad (which it normally is), I’ll move it a couple of lines down on the screen and try writing another version immediately. For some complex thoughts I can end up with multiple versions scrolling gradually off the bottom of the screen before I eventually get to something that isn’t obviously flawed. But I’m realistic: I probably won’t get those complex sentences right on the first day, so I try to move on fairly swiftly.

When I start a new day, I go back over what I wrote the previous day and look at it in a fresh light. As well as catching out innumerable typos, I always end up rephrasing and reordering sentences. That’s because, if there’s one thing I’ve learnt about good writing, it’s that it’s the result of good editing. And, as far as I can tell, good editing means extensive editing. It means being able to look at what I wrote and — ignoring my ego, which inevitably says “it’s good enough already” — trying to neutrally judge whether it can be improved. This is easy to say but, at least for me, it was an incredibly difficult skill to learn. I only learnt it when I had to write a grant proposal and my first draft was 11 pages, just a little over the 6 page limit. I had no idea what to do, and my first attempts at cutting things down were, frankly, pathetic. Over about 10 weeks, I accidentally crushed my ego enough to learn how to evaluate my own writing, and from that much followed. The grant was rejected, but I got so much out of the experience of writing it that I still judge it a success.

Order and brevity

Let’s look at a concrete example that shows how important it is to think about the order in which things are presented and how shorter writing is often better. The example is something that I remember spending a long time grappling with: an overview of meta-tracing. This was first needed for the approaches to interpreter composition paper, where we wanted to give a rough idea of meta-tracing to unfamiliar readers (who would otherwise find much of the paper tough to understand). Overview sections are often tricky: they have to summarise and simplify related work in a way that doesn’t overwhelm unfamiliar readers, while being accurate enough not to offend people who know the area in depth. This sort of writing ruthlessly exposes holes in one’s understanding and writing abilities.

I’m going to use the first paragraph from Section 2 of the “approaches” paper as my example. For the points I’m about to make, it hopefully doesn’t matter too much if you understand meta-tracing or not. Here’s the first draft I wrote (minus citations):

Meta-tracing takes an interpreter and creates a VM which contains both the interpreter and a tracing JIT compiler At run-time, user programs running in the VM begin their execution in the interpreter. When a ‘hot loop’ in the user program is encountered, the actions of the interpreter are traced (i.e. recorded), optimised, and converted to machine code. Subsequent executions of the loop then use the fast machine code version rather than the slow interpreter. Guards are left behind in the machine code so that if execution needs to diverge from the path recorded by the trace, execution can safely fall back to the interpreter.

I have long since forgotten how long that took to write, but 10-15 minutes is a reasonable guess. At first glance, this paragraph may seem to do a reasonable job of explaining things, once one ignores minor issues like the missing full stop before “at run-time”. However, there’s one glaring problem which took me or someone else (I’m no longer sure who) some time to realise: not only will most readers be unfamiliar with “meta-tracing”, they’ll be unfamiliar with “tracing” which the above completely fails to mention. That means that this explanation almost entirely fails to achieve its main purpose.

After various minor edits to the section as a whole, eventually I inserted a reference to (non-meta) tracing. Between the first draft above and the final published version below, 24 commits [5] were made to the section as a whole (mostly by me, but with some by Edd Barrett):

This section briefly introduces the concept of meta-tracing. Meta-tracing takes as input an interpreter, and from it creates a VM containing both the interpreter and a tracing JIT compiler. Although tracing is not a new idea, it traditionally required manually implementing both interpreter and trace compiler. Meta-tracing, in contrast, automatically generates a trace compiler from an interpreter. At run-time, user programs running in the VM begin their execution in the interpreter. When a ‘hot loop’ in the user program is encountered, the actions of the interpreter are traced (i.e. recorded), optimised, and converted to machine code. Subsequent executions of the loop then use the fast machine code version rather than the slow interpreter. Guards are left behind in the machine code so that execution can revert back to the interpreter when the path recorded by the trace differs from that required.

The paragraph is now quite a bit longer, but at least it explains tracing. However, it does so poorly (which, as the git history clearly shows, is entirely my fault and not Edd’s). Consider the sentences:

Meta-tracing takes as input an interpreter, and from it creates a VM containing both the interpreter and a tracing JIT compiler. Although tracing is not a new idea, it traditionally required manually implementing both interpreter and trace compiler. Meta-tracing, in contrast, automatically generates a trace compiler from an interpreter.

In order, these three sentences: define meta-tracing; reference tracing without explaining why it’s relevant; and somewhat obliquely explain why meta-tracing is different from tracing. To add insult to injury, the third of those sentences partly repeats the content of the first sentence. The problem with this jumble of concepts isn’t just that it’s poor style: by forcing readers to keep switching the concepts they’re reasoning about, it makes it hard for them to concentrate on the crucial aspects. Unfortunately, I either didn’t notice this problem, or couldn’t work out how to fix it, before publication.

Fortunately for us, the story doesn’t end there. Some time later we wrote a paper called fine-grained language composition: a case study which also needed an overview of meta-tracing. I first copied in the paragraph from the “approaches” paper unchanged [6]. Very soon afterwards (probably because of the benefits of reading the paragraph with a fresh mind) I started addressing the problems noted above. With substantial help from Carl Friedrich Bolz and Edd Barrett, and 28 commits to the section as a whole later, the final published version looks as follows:

Tracing JIT compilers record hot loops (‘traces’) in an interpreted program, optimise those traces, and then compile them into machine code. An individual trace is thus a record of one particular path through a program’s control flow graph. Subsequent executions which follow that same path can use the machine code generated from the trace instead of the (slow) interpreter. To ensure that the path followed really is the same, ‘guards’ are left in the machine code at every possible point of divergence. If a guard fails, execution then reverts back to the interpreter.

Meta-tracing JITs have the same basic model, but replace the manually written tracer and machine code generator with equivalents automatically generated from the interpreter itself.

The explanation is now split into two paragraphs. The first has a concise explanation of tracing; the second briefly explains what’s different about meta-tracing. Not only is this a much easier flow for the reader, it also has a better explanation of guard failure and, as a pleasant bonus, it’s 10% shorter than the “approaches” version. I’m quite proud of this version but notice how long it took us to get there: around 50 commits to the section as a whole (most of which touched the paragraph above). Simplicity and clarity are not easily obtained, but they are always worth striving for.
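
For readers who find code easier to follow than prose, here is a minimal toy sketch of the tracing half of that explanation. It is emphatically not how a real tracing JIT is built: the “machine code” is just a Python closure that replays the recorded operations, the “program” is a single loop body with one branch, and every name in it (HOT_THRESHOLD, GuardFailed, interpret_iteration, compile_trace, run_loop) is made up for illustration.

```python
HOT_THRESHOLD = 3  # hypothetical: iteration index at which the loop counts as "hot"


class GuardFailed(Exception):
    """Raised when execution diverges from the path recorded in the trace."""


def interpret_iteration(x, trace=None):
    """Slow path: interpret one loop iteration, recording it if a trace list is given."""
    is_even = x % 2 == 0
    if trace is not None:
        # A guard is recorded at the one point where paths can diverge.
        trace.append(("guard_even", is_even))
    if is_even:
        if trace is not None:
            trace.append(("halve",))
        return x // 2
    if trace is not None:
        trace.append(("triple_plus_one",))
    return 3 * x + 1


def compile_trace(trace):
    """Stand-in for machine code generation: a closure that replays the trace."""
    def compiled(x):
        for op in trace:
            if op[0] == "guard_even":
                if (x % 2 == 0) != op[1]:
                    raise GuardFailed()
            elif op[0] == "halve":
                x //= 2
            elif op[0] == "triple_plus_one":
                x = 3 * x + 1
        return x
    return compiled


def run_loop(values):
    """Run the loop over values, tracing once it is hot and falling back on guard failure."""
    results = []
    stats = {"interpreted": 0, "compiled": 0, "guard_failures": 0}
    compiled = None
    for i, x in enumerate(values):
        if compiled is not None:
            try:
                results.append(compiled(x))
                stats["compiled"] += 1
                continue
            except GuardFailed:
                # Revert to the interpreter for this iteration.
                stats["guard_failures"] += 1
        # Record a trace only once, on the iteration where the loop becomes hot.
        trace = [] if (compiled is None and i == HOT_THRESHOLD) else None
        results.append(interpret_iteration(x, trace))
        stats["interpreted"] += 1
        if trace is not None:
            compiled = compile_trace(trace)
    return results, stats


results, stats = run_loop([2, 4, 6, 8, 10, 7, 12])
print(results, stats)
```

With that input, the first three iterations are interpreted, the fourth is interpreted and traced, the 10 and the 12 follow the recorded path, and the 7 fails the guard and reverts to the interpreter. A meta-tracing system would generate the equivalents of the recording and compile_trace steps from the interpreter itself rather than have someone write them by hand.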

Writing as part of a team

It’s been a long time since I’ve written a paper on my own. Outside of this blog, nearly all my writing is done in teams of 3–5 people. If my experience is anything to go by, the standard model of collaborating in team writing is for most people to write up the technical bits of a paper they were responsible for, with one person handling the “other” bits (e.g. the introduction and conclusions). Some teams make this model of writing work, but a lot (including some I’ve been part of) don’t, for two different reasons: different people often have jarringly different writing styles [7]; and vital information is often missing, because no individual feels solely responsible for it. It’s worth looking at both reasons in more detail.

Sometimes academic writing advice seems to suggest that there is one true writing style to which we should all aspire. This is nonsense, especially with a language as flexible as English: there are often multiple ways of saying the same thing that are all roughly as good as each other. Which of those alternatives one chooses is largely a matter of personal style. For example, my personal style tends towards fairly short sentences and lots of punctuation. Some people I know prefer long sentences and the complete absence of the colon and semi-colon. Obviously I prefer my own style, but that doesn’t generally mean the alternative is wrong, just different. [8] The problem comes because readers of technical papers are very alert to minor changes in use of language, and such changes are much more common when different authors fail to homogenise their style. For example, referring to “dynamic compilation” in one part of a paper and “adaptive compilation” elsewhere makes a thoughtful reader wonder if the different phrasings denote a real difference or not. Jolting a reader out of their reading flow in order to ponder such a question is not just a waste of time: the more often it happens, the more likely it is that the reader will lose sight of the big picture.

The problem of missing information is both less obvious and more serious. I frequently read papers which, while seemingly sensible at the level of individual sections, don’t make much sense as a whole. In rare cases this is because the paper is technically flawed. More often it’s because there’s a missing part of the explanation that would help me make sense of what I’m reading. Occasionally I’m able to deduce precisely what the missing information must have been but, more often than not, I’m simply left confused.

Missing information is sometimes simply due to an oversight, or because the authors considered it unworthy of inclusion (generally because “everyone knows that”). However, my experience is that, when writing in a team, a third reason is equally common: no individual feels responsible for that piece of information, implicitly assuming that it’ll appear elsewhere in the paper. Inevitably, therefore, the information is never included anywhere. I’ve occasionally seen people try to solve this by cramming huge numbers of definitions into a paper’s introduction. This then makes the start of the paper both a depressing and an obfuscatory read: the start of the paper should outline the story, not try and fill in every blank.

The rewriter approach to team writing

The approach I use to avoid standard team writing problems is for one person (I’ll call them the “rewriter”) to take overall responsibility for the paper’s writing, ensuring that the paper’s content and style are consistent and complete. The only way I’ve found to make this happen is for the rewriter to examine, and in general rewrite, every contribution to the paper. Most obviously this means that the paper ends up written in a single style. Less obviously it means that one person is forced to understand everything in the paper, lessening the chances of authors telling an inconsistent or incomplete story. The rewriter doesn’t need to be a mini-dictator — indeed, it’s best if they try and work by consensus when possible — but they do need to be unflagging in trying to maintain a consistently high standard across the paper.

I now apply the rewriter idea to nearly all the papers I’m involved with (though I’m not always the rewriter!) and I’ve learnt a few things since my first experience with this way of writing [9]. First, it relies on individuals being willing to forgo their individual writing style and to trust that the rewriter will do a good job. Second, it requires people to be much more open than normal to being questioned about what they’ve written, and to be willing to answer each and every question.

Sometimes the rewriter will read something, realise that they don’t understand the explanation, and ask for clarification before rewriting can commence. Sometimes the rewriter will not have the same assumptions as the original author, misunderstand the original text, and then rewrite it incorrectly. The good news is that the rewritten version is nearly always clearly incorrect: the original author then needs to request changes until the rewritten text becomes accurate. [10] And, more often than you’d expect, the original author realises that there was an aspect of the research that they hadn’t previously considered, and they need to do more technical work before answering the query.

Let me give you a simple example which I quickly plucked out of the VM warmup paper. Carl Friedrich drafted the following text (intentionally quickly, as the style we had got used to was “draft something quickly, run it past the rewriter, then take another pass at it”) [11]:

In this case, I knew that there was an additional factor not mentioned in this text that needed including: we had discovered something about lazy loading interfering with our measurements, although I wasn’t very sure what the “something” was. I thus rewrote the text based on that additional factor but, since I didn’t know much about it, I also added a couple of questions at the same time:

Carl Friedrich then replied:

There are two changes in this version. Most obviously, Carl Friedrich responded to my first question with a comment (and notice that both of us are trying to understand what is going on rather than trying to “win” a debate). Less obviously, he responded to my second question by editing the text and deleting the comment. I’ve always encouraged this way of editing: there’s no point in people playing endless comment ping-pong simply because they’re too afraid to take decisive action. In this case, it was my job to review the edit that had been made (I simply look at the relevant diff in gitk); if I was unhappy with it, I would have reinstated the text with an additional comment (perhaps “I’m not sure my original question made sense, so let me try again”). As is generally the case, I had no problem with this particular edit.

There were several rounds of question-and-answer (see e.g. 1, 2, 3, or my personal favourite 4) before I eventually felt I understood what had gone on enough to rewrite the text to the following:

That text, somewhat simplified, is clearly recognisable as that in the final paper. Notice that neither Carl Friedrich nor I would have been able to get the text to that quality on our own: it required working together, being open to questions, and not giving up until we were both satisfied that readers would understand the points we were trying to make.

How long does it take to write a paper?

I remember being told early in my career that, once the research is complete, writing a paper takes about a week. I was tortured by this for several years: I rarely got anything out of the door in under a month. And I’ve got much, much worse as my standards have been raised: I reckon that our major papers (for want of a better term) take at least 8-12 person weeks to write. [12] More complex papers take even longer. As a rough proxy, the VM warmup paper took 999 commits from start to finish (and, as can be seen on arXiv, 7 public releases). Many of those commits (particularly towards the end of writing) are small; some (particularly nearer the beginning of writing) are big. But, either way you look at it, a lot of effort went into the writing. It’s one reason why I’m not a prolific writer of papers: much to the distress of most of the bosses I’ve had, I’d rather try and write 1 good paper a year and fail than succeed at writing 20 mediocre ones.

And while it’s hardest to build the momentum to write the first few words, it’s not trivial to build and maintain the daily momentum needed to complete a paper. On a typical day writing ‘fresh’ prose, I aim to write somewhere around 800-1000 words (I don’t count precisely). Sometimes those words come easily, but sometimes they have to be forced out. I do make sure that I never leave a day empty handed though.

One of the hardest things to know is when a paper is finished. One way is to use an external deadline – e.g. the submission date for a conference – as a proxy for this. In general, this tends to lead to a mad rush in the last few days before the deadline, hugely increasing the chances of mistakes creeping in. I prefer to let papers mature for a lot longer than this, because it continues to amaze me how many mistakes become apparent after a good night’s sleep or two. However, taken too far, this is a recipe for never finishing: I think it’s vital to feel that one is continually moving, even if gradually, towards an end goal. To that end, I try and solidify aspects of the paper as I’m going along, so that I’m not tempted to end up in an endless loop of modifying the same things. Typically, I first satisfy myself that the paper’s structure is stable; then that all the necessary content is in; then that the prose is clear and consistent; and finally that there are no typos left. The movement between those stages is almost exclusively one way: when I’m looking for typos, for example, I won’t suddenly decide to move sections around.

At some point the decision has to be taken that the paper is finished and can be released to the world. If I’ve done my job well, the paper will hopefully find an interested audience. But these days I’m realistic enough to know that the paper won’t be perfect: it won’t be as clear as it should be and it will contain mistakes.

Several years ago, I wrote an overview paper on dynamically typed languages where I tried to collect together a lot of things I’d learnt on the topic. Because I knew that this was an area about which many people have strong opinions, I realised that I would have to do a better job than I’d done on any of my previous papers. I tried to buff the paper to the shiniest of sheens and, when it was published, I patted myself on the back for a job well done. A little while after publication I got the following email [13]:

I’m just reading your article on dynamically typed languages from 2009, and I’m stunned. It’s fantastically written, neutral and smart.

At this point, my ego – never the most fragile of objects – had swollen to roughly the size of a small planet. Then I read the next sentence:

So how on earth could you misspell “growth” as “grown” in the abstract?!

I checked, and, indeed, I had made a typo in the first sentence of the abstract – in other words, the 13th word in the paper was wrong. My ego shrank back down to a more appropriate size. I fixed the mistake in the HTML version but the PDF retains the error, as a reminder to my future self of my own fallibility.

Wrapping up

There are many other things I could say but if I had to boil everything I’ve learnt so far about paper writing into one sentence it would be this: you don’t need to be hugely talented with words (I’m not), but you have to be willing to continually reread, rethink, and reword everything you’ve written. When writing in a team “you” particularly means “the rewriter”, if you choose that model. Ultimately the goal is simple: will the reader be able to understand the points you wanted to get across, without undue effort? Bad writing, I suspect, comes from people who think their first effort is good enough. Good writing comes from people who know that every section, every paragraph, and every sentence will have to be pored over multiple times and that, just when they think they’ve perfected the writing, external feedback might cause them to reevaluate everything. Sometimes this effort can feel like treading in Sisyphus’ footsteps but, in my experience, it’s effort well spent.

Acknowledgements: My thanks to Edd Barrett, Carl Friedrich Bolz-Tereick, Lukas Diekmann, and Sarah Mount for thoughtful comments. Any errors and infelicities are my own.

2018-01-10 08:00
Footnotes

[1]

Blogs like this one have their purpose, but I feel they’re solving a different problem to papers. Put crudely, I use my blog to explain things to people who want a general overview, whereas papers contain all the nitty-gritty details.

[2]

Unless you’re already familiar with Kuhn’s work, Dick Gabriel’s wonderful paper about incommensurability in programming languages is, in my opinion, required reading. In short: when we read others’ work, particularly if it’s a little old, we often lack the context that all contemporary readers shared; we are therefore liable to misunderstand or dismiss important parts of the explanation. A simple example is the word “computer”. Before the 1940s this referred to people who performed calculations; currently it refers to machines. Unless you’re aware of that fact, you’ll get a completely incorrect impression when reading pre-1940s material that uses the term “computer”. As a collection of syllables, ‘incommensurability’ is an annoying mouthful: but the idea is simple, powerful, and describes a concept that is surprisingly frequently encountered. Dick brings it to life with a very interesting concrete example.

[3]

The common advice is to write the abstract last, once you’ve got the rest of the paper written. This clearly works well for many people, but whenever I’ve tried doing so, I’ve found that my writing is aimless and disjointed.

[4]

When I’m reviewing papers, I write a series of small comments in the order that I came across them. It’s quite common for me to write “On p2 I don’t understand X” and then to update it later by saying “Ah, this is explained on p5”.

[5]

I used git blame -L to find the relevant commits.

[6]

This is sometimes referred to as “self-plagiarism” and some authors will write such sections anew each time to ensure they can’t be accused of it. Since background material is generally short and doesn’t really form part of a paper’s contribution, I don’t have a problem with people reusing small amounts of such material: the problem, in my opinion, is when people copy supposedly new technical contributions between papers. In this case I even made clear to myself and my co-authors what was going on in the commit message:

Integrate a brief description of meta-tracing.

The first and last paragraphs are directly swiped from ‘approaches to interpreter composition’. The middle paragraph is new, as I suspect we will need to talk about hints to explain why cross-language optimisations perform well.

[7]

This effect is often exaggerated when different co-authors have different mother tongues: it’s generally not difficult to tell a French author writing in English from a German author, for example. If this sounds like Anglo-snobbery, I really don’t mean it to be: I’m still struggling with one language, and am in awe of people who can express themselves in multiple languages.

[8]

That said, there are a few things I consider universally bad style for technical papers. The chief style crime I see amongst junior authors is a tendency to think that the following idiotic trope (repeated by seemingly every school English teacher) is true: “never repeat a word or phrase you’ve used recently.” Perhaps this is good advice for fiction, but it’s terrible advice in technical papers, where it’s vital to use the same word(s) to refer to the same concept throughout a paper.

[9]

Personally I stumbled across the “rewriter” way of working entirely by accident. I was part of a research project that ultimately turned out to not really need my skills. When it came to write up the resulting research, I felt like a spare limb, so to make myself feel useful I offered my services as a copy-editor. This was accepted by the rest of the team without any of us realising what we’d signed up to. Whenever someone added a new part to the paper, I would read it very carefully. If I couldn’t make any sense of it, I would add questions inline asking for clarification. If I could make sense of most or all of it, I rewrote the whole lot in my style. I viewed it as my job to make sure that the paper flowed, stylistically and technically, from start to finish, and that everything within was adequately explained.

The downside of this approach was that I almost drove my co-authors to murder because none of them had encountered such repeated questioning before. However, after a bit of time had passed (and the paper had done reasonably well), we all agreed it was better written than if we’d taken a traditional writing approach. I hasten to add that this is not because the paper used my style, but simply because it used one person’s style consistently, and that person had also ensured that all the information the paper needed was contained therein.

[10]

My personal tactic is to turn seemingly ambiguous things into definite statements. For example, “This is smaller than some other objects” might become “This is smaller than most other objects” or “This is larger than most other objects” depending on my intuition about which way the ambiguity is best resolved. In the best case, I will have rewritten things more clearly than they were before. In the worst case I have made it both more likely that the original author will spot problems and that they will feel the need to correct things.

[11]

The “screenshots” below are SVGs extracted from the PDF generated from LaTeX. The inline comments are a macro that Roel Wuyts used on a shared article we wrote many moons ago. I’ve shamelessly used it ever since. I can’t remember who added colour to the comment macros, but it really helps it stand out from the normal text. These days I use “how much red is there on a page” as a good proxy for “how close is this text to being finished?”

[12]

How long did it take to write this blog post? Spread over about 8 months, it took about 7 days (partly because I wrote some material that external comments convinced me didn’t really fit in). It’s one reason why I’m not a prolific blogger, although the more significant reason is that I generally lack anything worth my time writing and your time reading.

[13]

I wish I could write emails as succinct and amusing as this one. While I hugely admire laconic humour in others, I’m not generally capable of editing my thoughts down quite so ruthlessly.
