What I’ve Learnt So Far About Writing Research Papers
January 10 2018
I long flattered myself that I'm only good at one thing: programming. Experience
has taught me that I'm not as good at it as I thought – indeed, that there
are people much better at it than I – but it's fair to say that I'm not
terrible at it. The problem is that, as a researcher, programming can only ever
be part of my job: it's not enough just to build something interesting, I need
to explain it to people too. And explaining it to those further away than
my voice can carry requires writing a paper.
There are many people who dismiss paper writing as a distraction from
building real systems. I sometimes wondered if it was worthwhile myself but,
when you've been knocking around as long as I have, the importance of writing
things down becomes clear. I’ve lost track of how many conversations I’ve had
along these lines: “We saw that problem in System B, and we had a really interesting solution.”
“Where can I read about it?” “We never wrote it up.” “Where
can I download it then?” “We never released the source code, the
binaries are lost, and it runs on hardware that no longer exists.” I
encountered this a lot when I was investigating existing language composition systems,
with tantalising references ranging from Lisp systems with different macro
expansion semantics to novel text editors. Even when the person I was talking to
was passionate and knowledgeable about the system in question, I was rarely able
to gain the insights I suspect I could have if I'd had a paper to pore
over. Doubtless, many of these systems weren't good enough to worry
about them slipping from our memory; but the loss of the best, which may have
involved tens or even hundreds of skilled person years of effort, is a tragedy.
Writing things up does not guarantee that people
in the future will be able to understand everything about the work,
but at least they stand a sporting chance. Besides, even
if people of the future can't understand things, writing things down increases
the number of people who can understand the work today.
How should one
go about writing a paper? When I started my PhD, I had no idea of how one might
go about this task, and it took me many years to find an approach that worked for me.
In the rest of this blog post, I'm going to try and drill down
into some of the things I’ve learnt about writing papers, and I'm going to
give some examples of where I’ve learnt from my mistakes. However, I’m deliberately not
phrasing anything herein as “advice” — indeed, several parts of my approach seem
to run contrary to most of the advice I’ve seen. I don’t see this as a
problem: people work in different ways, and if you’re unlucky enough to have a
mind similar to mine, you might find something useful in here.
Before starting to write
Before I start writing, there are two crucial decisions that I need to make. First, what am
I going to write about? Second, what is the profile of the intended reader?
The first question nearly always has an easy answer which is that
I’ve done some research which is interesting and complete enough to be worth
telling other people about. If I don’t have a good answer to that first question, it means that I’m
probably not ready to start writing.
The second question is more tricky. In general, I assume that the intended
reader is someone intelligent, interested, and well-versed in the general area. My job is
to fill in the things that they don’t know. One can go mad trying to
guess precisely what people do and don’t know, so my starting assumption is that the reader knows
roughly what I knew when I started doing the research in question. However, this
doesn’t work for all types of writing, particularly when a paper brings together
two subjects that have previously been studied by disjoint communities. On any
points where I’m unsure about what knowledge I can take for granted, I err
on the side of caution, and assume the reader knows marginally less
than I did. This should not be mistaken for an assumption that the reader is
stupid, but simply an acknowledgement that none of us can know everything.
I also make a fundamental assumption that the reader has good intentions,
and that my job is to be upfront and honest with them. This allows me to write
clearly about the weaknesses and limitations of my work without worrying that I
will be punished for doing so. This may sound like an odd thing to be explicit about, but
people in my position are largely evaluated by the quantity and perceived quality of our
peer reviewed papers. It is therefore tempting to adjust one’s writing to
maximise the chances of acceptance by peer reviewers (who are often rushed;
occasionally incompetent; or, though rarely, malevolent). I did that in the past, but I found
the implicit cynicism corrosive: I decided, perhaps somewhat pompously, that I’d
rather “fail” peer review honourably than “succeed” dishonourably. This has led
to one or two more rejections than I might otherwise have received, but these
are more than worth the longer-term satisfaction with the papers I write.
Starting to write
It is a truism of writing that the hardest thing is a blank page: building
sufficient momentum to begin writing is no easy thing and I've done my fair
share of hoping in vain for inspiration to strike. Of course, the only possible solution is
to write something, anything, just to get started. The most common suggestion that I've heard is
to start writing the bits of a paper that come easier without worrying
about the overall order. While this seems to work well for many people, I've
never found this a satisfying approach.
Instead, the first thing I do is to try and work out what the paper’s
overall message is going to be, which means starting with the paper’s
abstract. This gives me a frame of reference
for all subsequent writing and enables me to be ruthless in answering
questions like “does the reader need to know about this particular aspect?”
and “is this section in the right place in the paper?”. I first reread suggestions for
writing an abstract (the fourth point in the link on paper writing advice) because it reminds
me to be concise about what
problem I solved and to motivate why my solution is worth someone’s time reading. In
well established fields, where one is tackling a well-known problem,
little motivation is needed. More experimental or unusual research, however,
needs to be clearly motivated. This is often harder to do than it seems.
Let me give a concrete example based on the paper
fine-grained language composition: a case study. We started looking at
language composition because previous solutions were crude, because we felt we
had a new approach worth trying, and because it looked like it would be
fun. I had a few explanations about why it was worth doing (enough to convince people to fund us to do
it), but they were vague. After developing
a language composition editor we realised we needed to be able to
run composed programs as well. We started with an extended
warmup exercise, which gave us the confidence to try tackling something bigger.
We eventually settled
on composing PHP and Python and started looking at the problem in detail.
About 6-12 months later, we had created PyHyp, which was the first
ever fine-grained language composition of “big” languages:
certainly, we'd been able to get much further, and with less effort, than I
had expected at the outset. The problem was that we had
also outstripped our ability to explain why we'd done what we'd done. The
published abstract that I wrote (and I say “I” because I’m almost solely to
blame for it) looks as follows:
Although run-time language composition is common, it normally takes the form of
a crude Foreign Function Interface (FFI). While useful, such compositions tend
to be coarse-grained and slow. In this paper we introduce a novel fine-grained
syntactic composition of PHP and Python which allows users to embed each
language inside the other, including referencing variables across languages.
This composition raises novel design and implementation challenges. We show that
good solutions can be found to the design challenges; and that the resulting
implementation imposes an acceptable performance overhead of, at most, 2.6x.
The good thing about this abstract is that it's simple to understand: the bad thing is that it
doesn't say why the problem we tackled is worth tackling. Since language composition
has been largely ignored as a research challenge, the abstract therefore relies
on readers spontaneously realising why language composition is useful and why
our research is worth reading — which is far too much to ask.
The frustrating thing is that, while carrying out the research, one of
our collaborators had suggested a plausible “killer app”: system migration. The
idea is simple. At the moment we have lots of systems written in old languages:
we'd like to rid ourselves of the old languages but not the systems. We can
either rewrite the systems from scratch (which is expensive and error prone) or
use an automatic translator (which produces code that no human wants to
maintain). Language composition offers us the potential to compose an old and a
new language together and gradually migrate parts of the system piecemeal, at
all times having a runnable system. This idea is tucked away in a paragraph
near the end of the paper and is easily overlooked.
This is an unforced error on my part, because papers are not
meant to be like crime novels, with unexpected twists: a good paper
should make clear upfront what the paper is about, and gradually fill in
detail. Indeed, I struggled to find a good structure for this paper,
perhaps in part because
of my failure to set out the most compelling case possible in the abstract.
By the time I came to write a blog
post on this subject I had realised my mistake and put the migration idea
(in expanded form) near the beginning. Alas, anyone who read only the paper
would not have seen this implicit mea culpa.
As this example hopefully shows, finding the right research message is often tricky. Fortunately,
it is possible to do a decent job: looking back, I think both the storage
strategies for collections in dynamically typed languages and Eco:
a language composition editor papers do a much better job in their abstracts.
With the abstract out of the way, I draft an introduction. If I've got my
high-level thinking in the abstract right, then the introduction is a much
easier task because, to some extent, it's simply an expanded abstract. I add
more background, including some context about the research's antecedents, and
explain why we've done something useful. Sometimes I will explicitly
list the paper's contributions, especially if they might otherwise not
pop out (thus the storage
strategies paper was clear enough without such a list
whereas I felt the fine-grained
paper would not be understood without one).
With drafts of the abstract and introduction complete (I invariably come
back and, at least, tweak them; sometimes I change them extensively), I then
find it possible to move on to the paper's main content.
Writing and editing
Perhaps unsurprisingly, I tend to write papers more-or-less linearly from front
to back, finishing one section before moving on to the next. I find this helpful
to give me the same sense of flow as the eventual reader: it means that I’m
forced to think about seemingly simple questions like “have I introduced this
concept earlier, or does it need to be introduced here?” It’s easy to overlook the high-level structure of
a paper, and I’ve read many papers with all the right pieces but in the wrong order.
Similarly, I've read many papers which use too many unconnected examples: each
feels sensible in its own right, but each example takes time to understand,
destroying the reader's flow. A single example which can be built up in stages as the
paper progresses nearly always makes the paper easier to read, and is
most easily done if writing is done semi-linearly.
That said, I spend the majority of my time at a much lower-level: I’m trying
to get all the facts that need to be in the paper in there, explained clearly,
and in the right order. That means that I spend most of my time
thinking about whether a sentence says what I need it to say, without saying
anything it shouldn’t say, and whether it does so as concisely as possible.
I’m not indulging in false modesty when I
say that I don’t think that my first attempt at a sentence has ever satisfied
all three. As soon as I’ve typed a sentence in, I’ll reread it,
and if it’s really bad (which it normally is), I’ll move it a couple of lines
down on the screen and try writing another version immediately. For some complex thoughts I
can end up with multiple versions scrolling gradually off the bottom of the
screen before I eventually get to something that isn’t obviously flawed. But I'm
realistic: I probably won’t get those complex sentences right on
the first day, so I try to move on fairly swiftly.
When I start a new day, I go back over what I wrote the previous day and
look at it in a fresh light. As well as catching out innumerable typos, I always
end up rephrasing and reordering sentences. That’s because, if there’s one thing I’ve
learnt about good writing, it’s that it’s the result of good
editing. And, as far as I can tell, good editing means extensive editing. It
means being able to look at what I wrote and — ignoring my ego, which inevitably
says “it’s good enough already” — trying to neutrally judge whether it
can be improved. This is easy to say but, at least for me, it was an incredibly
difficult skill to learn. I only learnt it when I had to write a grant proposal
and my first draft was 11 pages, just a little over the 6 page limit. I had no
idea what to do, and my first attempts at cutting things down were, frankly,
pathetic. Over about 10 weeks, I accidentally crushed my ego enough to learn how
to evaluate my own writing, and from that much followed. The grant was
rejected, but I got so much out of the experience of writing it that I still
judge it a success.
Order and brevity
Let's look at a concrete example that shows how important it is to think about
the order in which things are presented and how shorter writing is often better.
The example is something that I remember spending a long
time grappling with: an overview of meta-tracing. This was first needed for the approaches
to interpreter composition paper, where we wanted to give a rough idea of
meta-tracing to unfamiliar readers (who would otherwise find much of the paper
tough to understand). Overview sections are often tricky: they have to summarise and simplify
related work in a way that doesn’t overwhelm unfamiliar readers, while being
accurate enough not to offend people who know the area in depth. This sort
of writing ruthlessly exposes holes in one’s understanding and writing abilities.
I'm going to use the first paragraph from Section 2 of the “approaches” paper
as my example. For the points I’m about to make, it hopefully doesn’t matter too
much if you understand meta-tracing or not. Here's the first draft I wrote:
Meta-tracing takes an interpreter and creates a VM which contains both the
interpreter and a tracing JIT
compiler At run-time, user programs running in the VM begin their execution in the
interpreter. When a 'hot loop' in the user program is encountered, the actions
of the interpreter are traced (i.e. recorded), optimised, and converted to
machine code. Subsequent executions of the loop then use the fast machine code
version rather than the slow interpreter. Guards are left behind in the machine
code so that if execution needs to diverge from the path recorded by the trace,
execution can safely fall back to the interpreter.
I have long since forgotten how long that took to write, but 10-15 minutes is a
reasonable guess. At first glance, this paragraph may seem to do a reasonable
job of explaining things, once one ignores minor issues like the missing full
stop before “at run-time”. However, there’s one glaring problem which took
me or someone else (I'm no longer sure who) some time to realise: not only will
most readers be unfamiliar with “meta-tracing”, they’ll be unfamiliar with
“tracing”, which the above completely fails to mention. That means that this
explanation almost entirely fails to achieve its main purpose.
After various minor edits to the section as a whole, eventually I inserted a
reference to (non-meta) tracing. Between the first draft above and the final
published version below, 24 commits
were made to the section as a whole (mostly by me, but with some
by Edd Barrett):
This section briefly introduces the concept of meta-tracing. Meta-tracing takes
as input an interpreter, and from it creates a VM containing both the interpreter and a tracing JIT compiler.
Although tracing is not a new idea, it
traditionally required manually implementing both interpreter and trace
compiler. Meta-tracing, in contrast, automatically generates a trace compiler
from an interpreter.
At run-time, user programs running in the VM begin their execution in the
interpreter. When a 'hot loop' in the user program is encountered, the actions
of the interpreter are traced (i.e. recorded), optimised, and converted to
machine code. Subsequent executions of the loop then use the fast machine code
version rather than the slow interpreter. Guards are left behind in the machine
code so that execution can revert back to the interpreter when the path recorded
by the trace differs from that required.
The paragraph is now quite a bit longer, but at least it explains tracing.
However, it does so poorly (which, as the git history clearly shows, is entirely
my fault and not Edd's). Consider the sentences:
Meta-tracing takes as input an interpreter, and from it creates a VM containing
both the interpreter and a tracing JIT compiler. Although tracing is not a new
idea, it traditionally required manually implementing both interpreter and
trace compiler. Meta-tracing, in contrast, automatically generates a trace
compiler from an interpreter.
In order, these three sentences: define meta-tracing; reference tracing without
explaining why it’s relevant; and somewhat obliquely explain why meta-tracing is
different from tracing. To add insult to injury, the third of those sentences
partly repeats the content of the first sentence. The problem with this jumble
of concepts isn’t just that it’s poor style: by forcing readers to keep
switching the concepts they’re reasoning about, it makes it hard for them to
concentrate on the crucial aspects. Unfortunately, I either didn’t notice
this problem, or couldn’t work out how to fix it, before publication.
Fortunately for us, the story doesn’t end there. Some time later we wrote
a paper called fine-grained
language composition: a case study which also needed an overview of
meta-tracing. I first copied in the paragraph from the “approaches” paper
unchanged. Very soon afterwards (probably because of the
benefits of reading the paragraph with a fresh mind) I started addressing
the problems noted above. With substantial help from Carl Friedrich Bolz and Edd Barrett,
28 commits to the section as a whole later, the final published version looks as follows:
Tracing JIT compilers record hot loops ('traces') in an interpreted program,
optimise those traces, and then compile them into machine code.
An individual trace is thus a record of
one particular path through a program's control flow graph. Subsequent
executions which follow that same path can use the machine code generated from
the trace instead of the (slow) interpreter. To ensure that the path followed
really is the same, 'guards' are left in the machine code at every possible
point of divergence. If a guard fails, execution then reverts back to the
interpreter.
Meta-tracing JITs have the same basic model, but replace the manually written
tracer and machine code generator with equivalents automatically generated
from the interpreter itself.
The explanation is now split into two paragraphs. The first has a concise
explanation of tracing; the second briefly explains what's different about
meta-tracing. Not only is this a much easier flow for the reader, it also has a
better explanation of guard failure and, as a pleasant bonus, it's 10% shorter
than the “approaches” version. I'm quite proud of this version but notice how
long it took us to get there: around 50 commits to the section as a whole (most
of which touched the paragraph above). Simplicity and clarity are not easily
obtained, but they are always worth striving for.
Writing as part of a team
It's been a long time since I've written a paper on my own. Outside of this
blog, nearly all my writing is done in teams of 3–5 people. If my
experience is anything to go by, the standard model of collaborating in team
writing is for most people to write up the technical bits of a paper they were
responsible for, with one person handling the “other” bits (e.g. the
introduction and conclusions). Some teams make this model of writing work, but a
lot (including some I've been part of) don't, for two different reasons: different
people often have jarringly different writing styles; and
vital information is often missing, because no individual feels solely responsible for it.
It's worth looking at both reasons in more detail.
Sometimes academic writing advice seems to suggest that there is one true
writing style to which we should all aspire. This is nonsense, especially with a
language as flexible as English: there are often multiple ways of saying the
same thing that are all roughly as good as each other (a single idea can often
be expressed in several ways, each approximately as good as the other). Which of those
alternatives one chooses is largely a matter of personal style. For example, my
personal style tends towards fairly short sentences and lots of punctuation. Some people I know
prefer long sentences and the complete absence of the colon and semi-colon. Obviously I prefer
my own style, but that doesn't generally mean the alternative is wrong —
just different. The problem comes because readers
of technical papers are very alert to minor changes in use of language, and
such changes are much more common when different authors fail to homogenise
their style. For example, referring to “dynamic compilation” in one part of a paper and
“adaptive compilation” elsewhere makes a thoughtful reader wonder if the
different phrasings denote a real difference or not. Jolting a reader
out of their reading flow in order to ponder such a question is not just a waste
of time: the more often it happens, the more likely it is that the reader will
lose sight of the big picture.
The problem of missing information is both less obvious and more serious. I
frequently read papers which, while seemingly sensible at the level of
individual sections, don’t make much sense as a whole. In rare cases this is
because the paper is technically flawed. More often it’s because there’s a
missing part of the explanation that would help me make sense of what
I’m reading. Occasionally I’m able to deduce precisely what the missing information
must have been but, more often than not, I’m simply left confused.
Missing information is sometimes simply due to an oversight, or because
the authors considered it unworthy of inclusion (generally because “everyone
knows that”). However, my experience is that, when writing in a team, a third
reason is equally common: no individual feels responsible for that piece of
information, implicitly assuming that it’ll appear elsewhere in the paper.
Inevitably, therefore, the information is never included anywhere. I’ve
occasionally seen people try to solve this by cramming huge numbers of
definitions into a paper’s introduction. This then makes the
start of the paper both a depressing, as well as an obfuscatory, read: the start
of the paper should outline the story, not try and fill in every blank.
The rewriter approach to team writing
The approach I use to avoid standard team writing problems is for one person
(I’ll call them the “rewriter”) to take overall
responsibility for the paper’s writing, ensuring that the paper’s content and style
are consistent and complete. The only way I’ve found to make this happen is for
the rewriter to examine, and in general rewrite, every contribution to the
paper. Most obviously this means that the paper ends up written in a single style. Less
obviously it means that one person is forced to understand everything in the
paper, lessening the chances of authors telling an inconsistent or incomplete
story. The rewriter doesn’t need to be a mini-dictator — indeed, it’s best
if they try and work by consensus when possible — but they do need to be
unflagging in trying to maintain a consistently high standard across the paper.
I now apply the rewriter idea to nearly all the papers I'm involved with
(though I'm not always the rewriter!) and I've learnt a few things since my
first experience with this way of writing. First, it relies on
individuals being willing to forgo their individual writing style and to trust that the rewriter will do a good job.
Second, it requires people to be much more open than normal to being questioned
about what they’ve written, and to be willing to answer each and every question.
Sometimes the rewriter will read something, realise that they don’t
understand the explanation, and ask for clarification before rewriting
can commence. Sometimes the rewriter will not have the same
assumptions as the original author, misunderstand the original text, and then
rewrite it incorrectly. The good news is that the rewritten version is nearly
always clearly incorrect: the original author then needs to request
changes until the rewritten text becomes accurate. And, more often than you’d expect, the original author
realises that there was an aspect of the research that they hadn’t previously
considered, and they need to do more technical work before answering the query.
Let me give you a simple example which I quickly plucked out of the VM warmup
paper. Carl Friedrich drafted
the following text (intentionally quickly, as the style we had got used to
was “draft something quickly, run it past the rewriter, then take another pass
at it”):
In this case, I knew that there was an additional factor not mentioned in this
text that needed including: we had discovered something about lazy loading
interfering with our measurements, although I wasn’t very sure what the
“something” was. I thus rewrote the text based on that additional factor but,
since I didn’t know much about it, I also added a
couple of questions at the same time:
Carl Friedrich then replied:
There are two changes in this version. Most obviously, Carl
Friedrich responded to my first question with a comment (and notice that both of
us are trying to understand what is going on rather than trying to “win” a
debate). Less obviously, he responded to my second question by editing the text
and deleting the comment. I’ve always encouraged this way of editing: there’s no
point in people playing endless comment ping-pong simply because they’re too
afraid to take decisive action. In this case, it was my job to review
the edit that had been made (I simply look at the relevant diff in
gitk); if I had been
unhappy with it, I would have reinstated the text with an additional comment
(perhaps “I’m not sure my original question made sense, so let me try again”).
As is generally the case, I had no problem with this particular edit.
There were several rounds of question-and-answer
before I eventually felt I understood what had gone on enough to rewrite the
text to the following:
That text, somewhat simplified, is
recognisable as that in the final paper. Notice that
neither Carl Friedrich nor I would have been able to get the text to that quality
on our own: it required working together, being open to questions, and not
giving up until we were both satisfied that readers would understand the points
we were trying to make.
How long does it take to write a paper?
I remember being told early in my career that, once the research is complete,
writing a paper takes about a week. I was tortured by this for
several years: I rarely got anything out of the door in under a month. And I’ve
got much, much worse as my standards have been raised: I reckon that our major
papers (for want of a better term) take at least 8-12 person weeks to write.
More complex papers take even longer. As a rough
proxy, the VM warmup paper took 999
commits from start to finish (and, as can be seen on
arXiv, 7 public releases).
Many of those commits
(particularly towards the end of writing) are small; some (particularly nearer
the beginning of writing) are big. But, either way you look at it, a lot of
effort went into the writing. It’s one reason why I’m not a prolific writer of
papers: much to the distress of most of the bosses I’ve had, I’d rather try and
write 1 good paper a year and fail than succeed at writing 20 mediocre ones.
And while it’s hardest to build the momentum to write the first few words, it’s
not trivial to build and maintain the daily momentum needed to complete a paper.
On a typical day writing ‘fresh’ prose, I aim to write somewhere around 800-1000
words (I don’t count precisely). Sometimes those words come easily,
but sometimes they have to be forced out. I do make sure
that I never leave a day empty handed though.
One of the hardest things to know is when a paper is finished. One way is to
use an external deadline – e.g. the submission date for a conference
– as a proxy for this. In general, this tends to lead to a mad rush in the
last few days before the deadline, hugely increasing the chances of mistakes
creeping in. I prefer to let papers mature for a lot longer than this, because
it continues to amaze me how many mistakes become apparent after a good
night’s sleep or two. However, taken too far, this is a recipe for never
finishing: I think it’s vital to feel that one is continually moving, even if
gradually, towards an end goal. To that end, I try and solidify aspects of
the paper as I’m going along, so that I’m not tempted to end up in an
endless loop of modifying the same things. Typically, I first satisfy myself that the paper’s structure is stable;
then that all the necessary content is in; then that the prose is clear and
consistent; and finally that there are no typos left. The movement between those
stages is almost exclusively one way: when I’m looking for typos, for example, I
won’t suddenly decide to move sections around.
At some point the decision has to be taken that the paper is finished and can be
released to the world. If I’ve done my job well, the paper will hopefully find
an interested audience. But these days I’m realistic enough to know that the paper won’t
be perfect: it won’t be as clear as it should be and it will contain mistakes.
Several years ago, I wrote an overview paper on dynamically
typed languages where I tried to collect together a lot of things I’d learnt
on the topic. Because I knew that this was an area about which many
people have strong opinions, I realised that I would have to do a better job
than I'd done on any of my previous papers. I tried to buff the paper to the
shiniest of sheens and, when it was published, I patted myself on the back for a
job well done. A little while after publication I got the following email:
I'm just reading your article on dynamically typed languages from
2009, and I'm stunned. It's fantastically written, neutral and smart.
At this point, my ego – never the most fragile of objects – had
swollen to roughly the size of a small planet. Then I read the next sentence:
So how on earth could you misspell "growth" as "grown" in the
I checked, and, indeed, I had made a typo in the first sentence of the
abstract – in other words, the 13th word in the paper was wrong.
My ego shrank back down to a more appropriate size. I fixed the mistake in the
HTML version but the PDF
retains the error, as a reminder to my future self of my own fallibility.
There are many other things I could say but if I had to boil everything I’ve
learnt so far about paper writing into one
sentence it would be this: you don’t need to be hugely talented with words (I’m
not), but you have to be willing to continually reread, rethink, and reword
everything you’ve written. When writing in a team, “you” particularly means “the
rewriter”, if you choose that model. Ultimately the goal is simple: will the
reader be able to understand the points
you wanted to get across, without undue effort? Bad writing, I suspect, comes from
people who think
their first effort is good enough. Good writing comes from people who know that
every section, every paragraph, and every sentence will have to be pored
over multiple times: and, just when they think they’ve perfected the writing,
getting external feedback might cause them to reevaluate everything.
Sometimes this effort can feel like treading in Sisyphus’ footsteps but, in my
experience, it’s effort well spent.
Acknowledgements: My thanks to Edd Barrett, Carl Friedrich
Bolz-Tereick, Lukas Diekmann, and Sarah Mount for thoughtful comments. Any errors and
infelicities are my own.
Blogs like this one have their purpose, but I feel they're solving a different
problem to papers. Put crudely, I use my blog to explain things to people
who want a general overview, whereas papers contain all the nitty-gritty
Unless you're already familiar with Kuhn's work, Dick Gabriel's wonderful paper about
incommensurability in programming languages is, in my opinion, required reading.
In short: when we read others’ work, particularly if it’s a little old, we often
lack the context that all contemporary readers shared; we are therefore liable
to misunderstand or dismiss important parts of the explanation. A simple example
is the word “computer”. Before the 1940s this referred to people who performed
calculations; currently it refers to machines. Unless you’re aware of that fact,
you’ll get a completely incorrect impression when reading pre-1940s material
that uses the term “computer”. As a collection of syllables,
‘incommensurability’ is an annoying mouthful: but the idea is simple, powerful,
and describes a concept that is surprisingly frequently encountered. Dick brings
it to life with a very interesting concrete example.
The common advice is to write the abstract last, once you've got the rest of the
paper written. This clearly works well for many people, but whenever I've tried
doing so, I've found that my writing is aimless and disjointed.
When I’m reviewing papers, I write a series of small comments in the order
that I came across them. It’s quite common for me to write “On p2 I don’t
understand X” and then to update it later by saying “Ah, this is explained on
I used git blame -L to find the relevant commits.
This is sometimes referred to as "self-plagiarism" and some authors will write such
sections anew each time to ensure they can't be accused of it. Since background
material is generally short and doesn't really form part of a paper's
contribution, I don't have a problem with people reusing small amounts of such
material: the problem, in my opinion, is when people copy supposedly new
technical contributions between papers. In this case I even made clear to myself
and my co-authors what was going on in the commit message:
Integrate a brief description of meta-tracing.
The first and last paragraphs are directly swiped from 'approaches to
interpreter composition'. The middle paragraph is new, as I suspect we will need
to talk about hints to explain why cross-language optimisations perform well.
This effect is often exaggerated when different co-authors have different mother
tongues: it's generally not difficult to tell a French author writing in English
from a German author, for example. If this sounds like Anglo-snobbery, I really
don't mean it to be: I'm still struggling with one language, and am in awe of
people who can express themselves in multiple languages.
That said, there are a few things I consider universally bad style for technical
papers. The chief style crime I see amongst junior authors is a tendency to
think that the following idiotic trope (repeated by seemingly every school English teacher) is
true: “never repeat a word or phrase you’ve used recently.” Perhaps this is good
advice for fiction, but it’s terrible advice in technical papers, where it’s
vital to use the same word(s) to refer to the same concept throughout a paper.
Personally I stumbled across the "rewriter" way of working entirely by
accident. I was part of a research project that ultimately turned out to not
really need my skills. When it came to write up the resulting research, I
felt like a spare limb, so to make myself feel useful I offered my services
as a copy-editor. This was accepted by the rest of the team without any of us
realising what we'd signed up to. Whenever someone added a new part to the
paper, I would read it very carefully. If I couldn't make any sense of it, I
would add questions inline asking for clarification. If I could make sense of
most or all of it, I rewrote the whole lot in my style. I viewed it as my job
to make sure that the paper flowed, stylistically and technically, from start
to finish, and that everything within was adequately explained.
The downside of this approach was that I almost drove my co-authors to
murder because none of them had encountered such repeated questioning before.
However, after a bit of time had passed (and the paper had done reasonably
well), we all agreed it was better written than if we'd taken a
traditional writing approach. I hasten to add that this is not because the paper used
my style, but simply because it used one person's style consistently,
and that person had also ensured that all the information the paper needed was
My personal tactic is to turn seemingly ambiguous things into definite
statements. For example, “This is smaller than some other objects” might become “This is
smaller than most other objects” or “This is larger than most other objects”
depending on my intuition about which way the ambiguity is best resolved. In
the best case, I will have rewritten things more clearly than they were before.
In the worst case I have made it both more likely that the original author will
spot problems and that they will feel the need to correct things.
The “screenshots” below are SVGs extracted from the PDF generated from
LaTeX. The inline comments are a macro that Roel Wuyts used on a shared article
we wrote many moons ago. I've shamelessly used it ever since. I can't
remember who added colour to the comment macros, but it really helps them
stand out from the normal text. These days I use “how much red is there
on a page” as a good proxy for “how close is this text to being
How long did it take to write this blog post? Spread over about 8 months, it
took about 7 days (partly because I wrote some material that external comments
convinced me didn't really fit in). It's one reason why I'm not a prolific
blogger, although the more significant reason is that I generally lack anything
worth my time writing and your time reading.
I wish I could write emails as succinct and amusing as this one. While I hugely
admire laconic humour in others, I’m not generally capable of editing my
thoughts down quite so ruthlessly.