Laurence Tratt's Blog


An editor for composed programs

August 20 2014

A few of you might remember a blog post of mine from 2011 titled Parsing: The Solved Problem That Isn't. My hope with parsing has always been that I can take an off-the-shelf approach and use it without having to understand all the details. In that particular post, I was looking at parsing from the angle of language composition: is it possible to glue together two programming language grammars and parse the result? To cut a long story short, my conclusion was none of the current parsing approaches work well for language composition. I was nervous about going public, because I felt sure that I must have missed something obvious in what is a diffuse area (publications are scattered across different conferences, journals, and over a long period of time). To my relief, a number of parsing experts have read, or heard me talk about, the technical facts from that post without barring me from the parsing community [1].

When I wrote the original blog post, I did not know if a good solution to the problem I was considering could exist. I ended by wondering if SDE (Syntax Directed Editing) might be the solution. What I've discovered is that SDE is a wonderful way of guessing someone's age: those under 40 or so have rarely heard of SDE; and those above 45 [2] tend to either convulse in fits of laughter when it's mentioned, become very afraid, or assume I'm an idiot. In other words, those who know of SDE have a strong reaction to it (which, almost invariably, is against SDE). This blog post first gives an overview of SDE before describing a new approach to editing composed programs that we have developed.

Syntax directed editing

Imagine I want to execute a program which adds 2 and 3 times x together. I fire up my trusty text editor, type 2 + 3 * x in, and pass it to a compiler. The compiler first parses the text and creates a parse tree from it, in this case plus(int(2), mul(int(3), var(x)), and then uses that tree to do the rest of the compilation process. In a sense, when programming in a text editor I think in trees; mentally flatten that tree structure into sequences of ASCII characters that I can type into the text editor; and then have the parser recreate the tree structure I had in my head when I wrote the program. The problem with this process is not, as you might be thinking, efficiency—even naive parsing algorithms tend to run fast enough on modern machines. Rather it is that some apparently reasonable programs fail to parse, and the cause of the errors can be obscure, particularly to novice programmers.

In the early 80s, a number of researchers developed SDE tools as a solution to the problem of frustrated and demotivated novices. Instead of allowing people to type in arbitrary text and parsing it later, SDE tools work continuously on trees. SDE tools have the grammar of the language being edited built in, which means they know which trees are syntactically valid. Rather than typing in 2 + 3 * x, one first instructs the SDE tool to create a plus node. A box with a + character appears on screen with two empty boxes to the left and right. One then clicks in the left hand box, types 2, before clicking in the right hand box and typing *. Two more boxes appear on screen into which one enters 3 and x. There is therefore no need for parsing to ever occur: one can directly input the tree in one's head into the SDE tool, rather than indirectly specifying it via a sequence of ASCII characters.

The advantage of SDE is that it guides users in creating correct programs. Indeed, SDE tools actively prevent incorrect programs from being created: at all points the tree is syntactically correct, albeit some holes may be temporarily unfilled. However, this advantage comes at a severe cost: to be frank, SDE tools are generally annoying to use. Novices, like small children, will tolerate nearly anything you throw at them, because they are unaware of better alternatives. When seasoned programmers were asked to use SDE tools, they revolted against the huge decline in their productivity. SDE tools not only make inputting new programs somewhat annoying, but they completely change the way existing programs are edited. Two examples will hopefully give you an idea of why. First, programmers often get half way through an edit in one part of a file (e.g. starting an assignment expression) and realise they need to define something else earlier in a file. They therefore break off from the initial edit, often leaving the program in a syntactically invalid state, and make textual edits elsewhere, before returning to the initial edit. SDE tools tend to forbid such workflows since they need to keep the program in a syntactically valid state [3]. Second, when editing text, it is often convenient to forget the tree structure of the program being edited, and treat it as a simple sequence of ASCII characters. If you ever done something like selecting 2 + 3 from the expression 2 + 3 * x, you have done just this. SDE tools force edits to be relative to a tree. For example, code selections invariably have to be of a whole sub-tree: one can select 2, or +, or 3, or 3 * x (etc.) but not 2 + 3. If this this seems like a minor restriction, you should try using such a tool: I don't think there is a programmer on this earth who will not find the changes required to use such tools hugely detrimental to productivity.

SDE tools thus quickly disappeared from view and within a few years it was almost as if they had never existed. The world would not have been much worse off for this, except for the fact that SDE and language composition are a perfect match. Whereas composing traditional grammars or parsers always involves trade-offs of expressivity or reliability, SDE imposes no such compromise. Since SDE tools are always in control of the languages they are editing, they can allow arbitrary languages to be composed in powerful ways.

I was certainly not the first person to note the potential of SDE for language composition. JetBrains MPS tool is a relatively recent SDE tool which allows composition of powerful editors. To the best of my knowledge, it is the most pleasant SDE tool ever developed. In particular, when entering new text, MPS feels tantalisingly close to a normal text editor. One really can type in 2 + 3 * x as in a normal text editor and have the correct tree created without requiring unusual interactions with the user interface. MPS applies small tree-rewritings as you type, so if you follow 2 with +, the 2 node is shuffled in the tree to make it the left hand child of the +, and the cursor is placed in an empty box to the right hand side of the +. The author of the language being edited by MPS has to enable such transformations, and while this is no small effort, it is a cost that need be paid only once. However, whilst MPS does a good job of hiding its SDE heritage when typing in new text, editing old text has many of the same restrictions as traditional SDE tools. As far as I can tell, MPS has gradually special-cased some common operations to work normally (i.e. as in a text editor), but many common operations have surprising effects. In some cases, edits which are trivial in a traditional editor require special key-presses and actions in MPS to extract and reorder the tree.

MPS is a striking achievement in SDE tools. It lowers SDE to the point that some programmers can be forced, with some encouragement, to use it. Some of the compositions using MPS are truly stunning, and have pushed the envelope far further than anyone has ever managed before. But, if I'm being honest, I wouldn't want to edit a program in MPS, using its current editor, and nor do I think programmers will take to it voluntarily: it makes too many simple things complicated.

A new solution

Simplifying a little, it seemed to us that MPS had made as good a stab as anyone is likely to manage at moving an SDE tool in the direction of a normal text editor; we wondered if it might be possible to move a normal text editor in the direction of SDE. In other words, we hoped to make an editor with all the tree goodness of SDE tools, but which would feel like a normal text editor. So, two years ago, in the Software Development Team at King's College London, we set out to see if such a tool could exist. I am happy to say that we now believe it can, and we (by which I mostly mean Lukas Diekmann) have created a prototype editor Eco based on the principles we have developed. In essence, we have taken a pre-existing incremental parsing algorithm and extended it such that one can arbitrarily insert language boxes at any point in a file. Language boxes allow users to use different language's syntaxes within a single file. This simple technique gives us huge power. Before describing it in more detail, it's easiest to imagine what using Eco feels like.

An example

Imagine we have three modular language definitions: HTML, Python, and SQL. For each we have, at a minimum, a grammar. These modular languages can be composed in arbitrary ways, but let's choose some specific rules to make a composed language called HTML+Python+SQL: the outer language box must be HTML; in the outer HTML language box, Python language boxes can be inserted wherever HTML elements are valid (i.e. not inside HTML tags); SQL language boxes can be inserted anywhere a Python statement is valid; and HTML language boxes can be inserted anywhere a Python statement is valid (but one can not nest Python inside such an inner HTML language box). Each language uses our incremental parser-based editor. An example of using Eco can be seen in Figure 1.

Figure 1: Editing a composed program. An outer HTML document contains several Python language boxes. Some of the Python language boxes themselves contain SQL language boxes. Some specific features are as follows. ❶ A highlighted (SQL) language box (highlighted because the cursor is in it). ❷ An unhighlighted (SQL) language box (by default Eco only highlights the language box the cursor is in, though users can choose to highlight all boxes). ❸ An (inner) HTML language box nested inside Python.

From the user's perspective, after loading Eco, they can create a new file of type HTML+Python+SQL. After the blank page appears, they can start typing HTML exactly as they would in any other editor, adding, altering, removing, or copying and pasting text without restriction. The HTML is continually parsed by the outer language box's incremental parser and a parse tree constructed and updated appropriately within the language box. Syntax errors are highlighted as the user types with red squiggles. The HTML grammar is a standard BNF grammar which specifies where Python+SQL language boxes are syntactically valid by referencing a separate, modular Python grammar. When the user wishes to insert Python code, they press Ctrl+L, which opens a menu of available languages (see Figure 2); they then select Python+SQL from the languages listed and in so doing insert a Python language box into the HTML they had been typing. The Python+SQL language box can appear at any point in the text; however, until it is put into a place consistent with the HTML grammar's reference to the Python+SQL grammar, the language box will be highlighted as a syntax error. Note that this does not affect the user's ability to edit the text inside or outside the box, and the editing experience retains the feel of a normal text editor. As Figure 3 shows, Eco happily tolerates syntactic errors — including language boxes in positions which are syntactically invalid — in multiple places.

Figure 2: Inserting a language box opens up a menu of the languages that Eco knows about. Languages which Eco can be sure are valid in the current context are highlighted in bold to help guide the user.

Figure 3: Editing a file with multiple syntax errors. Lines 6, 8 and 11 contain syntax errors in the traditional sense, and are indicated with horizontal red squiggles. A different kind of syntax error has occurred on line 4: the SQL language box is invalid in its current position (indicated by a vertical squiggle).

Typing inside the Python+SQL language box makes it visibly grow on screen to encompass its contents. Language boxes can be thought of as being similar to the quoting mechanism in traditional text-based approaches which use brackets such as [|...|]; unlike text-based brackets, language boxes can never conflict with the text contained within them. Users can leave a language box by clicking outside it, using the cursor keys, or pressing Ctrl+Shift+L. Within the parse tree, the language box is represented by a token whose type is Python+SQL and whose value is irrelevant to the incremental parser. As this may suggest, conceptually the top-level language of the file (HTML in this case) is a language box itself. Each language box has its own editor, which in this example means each has an incremental parser.

Figure 4: An elided example of an SQL language box nested within an outer Python language box. From the perspective of the outer incremental parser, the tree stops at the SQL token. However, we can clearly see in the above figure that the SQL language box has its own parse tree.

Assuming that there are no syntax errors, at the end of the editing process, the user will be left with a parse tree with multiple nested language boxes inside it as in Figure 4. Put another way, the user will have entered a composed program with no restrictions on where language boxes can be placed; with no requirement to pick a bracketing mechanism which may conflict with nested languages; with no potential for ambiguity; and without sacrificing the ability to edit arbitrary portions of text (even those which happen to span multiple branches of a parse tree, or even those which span different language boxes). In short, Eco feels almost exactly like a normal text editor with the only minor difference being when one enters or exits a language box.

How it works

Language boxes are an embarrassingly simple concept, based on a very simple observation. Context free parsing orders tokens into a tree based solely on the types of tokens: the value is not looked at (doing so would make the parser context sensitive, which is generally undesirable). When the user presses Ctrl+L and selects language L, Eco simply creates a token in the parse tree of type L. From the perspective of each tree, language boxes are normal tokens; the fact that they have sub-trees as values is of no consequence to the parser.


Since, in general, one cannot guarantee to be able to parse as normal text the programs that Eco can write, Eco's native save format is as a tree [4]. This does mean that one loses the traditional file-based notion of most programming languages. Fortunately, other communities such as Smalltalk have long since worked out how to deal with important day-to-day functionality like version control when one moves away from greppable, diffable, files. We have not yet incorporated any such work, but it seems unlikely to be a hugely difficult task to do so.

Some thoughts

We think that Eco is not only the first in a new class of editor, but the first editor which makes editing composed programs painless. Put another way, I wouldn't mind using an Eco-like tool to edit composed programs, and I am extremely fussy about my text editor. The fact that I say Eco-like is not accidental. Eco is a prototype tool, with all that entails: it is hugely incomplete (e.g. we currently have little or no undo functionality, depending on how you view things), buggy, and sometimes unexpectedly slow (though not because of the incremental parser, which is astonishingly fast). What we have learnt in building it, however, is more than sufficient to see how the ideas would scale up to an industrially ready tool. Will such a tool ever appear? I don't know. However, unlike when I wrote my original blog post, I can now state with some confidence that a good solution is possible and practical.

If you're interested in finding out more about Eco, you can read the (academic) SLE paper this blog post is partly based on, or download Eco yourself [5]. The paper explains a number of Eco's additional features—it has a few tricks up its sleeves that you might not expect from just reading this blog.

It is also important to realise that Eco is not magic: it is a tool to make editing composed programs practical. It is still necessary to specify what the composed syntax should mean (i.e. to give it semantics), assuming we want to do something with the result. In the case of the HTML+Python+SQL composition, composed programs can be exported to a Python file and then executed. Outer HTML fragments are translated to print statements; SQL language boxes to SQL API calls (with their database connection being to whatever variable a call to sqlite3.connect was assigned to); and inner HTML fragments to strings. We can thus export files from our composition and run them as normal Python programs. There are many other possible, and reasonable, ways to export a program from Eco, and one can imagine a single composition having multiple exporters. If you want to see how Eco fits into the wider language composition landscape, this video of a recent talk shows some of the directions we're pushing ahead on.

Acknowledgements: This work was generously funded by Oracle Labs without whom there would be no Eco. Lukas Diekmann has been the driving force behind Eco. Edd Barrett and Carl Friedrich Bolz gave insightful comments on a draft of this blog post.


[1] Though some of them quite reasonably disagree with me as to whether some of the limitations or trade-offs I mentioned really rule out some approaches from language composition.

[2] I'm not sure if my sample size of 40-45 year olds is insufficient, or whether they really are the inbetween generation when it comes to SDE. Fortunately, I will sleep soundly enough without knowing the answer to this particular question.

[3] I'm told that some of the original SDE tools were more flexible than this suggests, but these seem to be the exception, and not the rule. It's unclear to me exactly how flexible such tools were in this regard. Alas, the software involved has either disappeared or requires impossibly ancient hardware and software to run.

[4] It currently happens to be gziped JSON, which is an incidental detail, rather than an attempt to appeal to old farts like me (with gzip) as well as the (JSON-loving) kids.

[5] Depending on when you read this, you may wish to check out the git repository, in order to have the latest fixes included.

Link to this entry

The Bootstrapped Compiler and the Damage Done

December 4 2013

These days, it's not uncommon to hear that a language's compiler is written in itself. This phrase often confuses non-language implementers: how can a language's compiler be written in itself? The answer is surprisingly simple: what's being referred to is the second compiler created for the language. In such cases, the first compiler created is a bootstrapping compiler with just enough functionality to allow a compiler to be written in the language being created. In other words, when creating language NL, a basic compiler is first written in existing language OL, allowing a compiler for NL to be written in NL itself [1]. When the new compiler is completed, the language is said to be bootstrapped, and the bootstrapping compiler is thrown away, leaving only the bootstrapped compiler written in NL. Not every language bootstraps its compiler, but it is increasingly common to do so, especially for those languages which target VMs [2]. Indeed, it's become sufficiently common that I've sometimes seen new languages which haven't been bootstrapped looked on with slight suspicion (the argument seems to boil down to either or both of you clearly don't think much of your language if you're not prepared to use it to write its own compiler or your language is so underpowered, it can't even be used to write its own compiler).

There are obvious advantages to creating a bootstrapped compiler. Most notably, it means that the language being designed and implemented is also used early in the implementation process. What might otherwise be abstract musings about the use of the language are made concrete by the need to have the language be good enough to write the bootstrapped compiler in. As the compiler is written, missing or undesirable features in the language are discovered and the language's design extended and changed as appropriate. Bootstrapping thus tends to see the design and implementation of a language progressing in unison.

Traditionally, bootstrapping is seen as an almost universal good (apart from occasional griping over performance issues, which can be ignored). I have bootstrapped languages myself, and I will do so again (attacks by random buses allowing). However, I have recently become aware of a real problem with bootstrapping: it can, and does, distort language design. In other words, by having a bootstrapped compiler as the first – and, often, for some time the biggest – program in a language, the eventual language design is often unintentionally altered and rarely for the better. I suspect this has two root causes.

First, we understand how to design compilers better than any other type of program. There are many books devoted solely to the topic of writing compilers. Many clever researchers have put huge amounts of effort into understanding and optimising almost every aspect of compiler design. We have many example compilers and a huge folklore to draw on. This means that programming a compiler is unlike programming any other system I know. Normal programming – especially when starting – involves a huge degree of uncertainty as to how the program should be constructed; even what the program is expected to do is likely to be shrouded in fog. Not so with compilers: we know exactly what they should do, how their major components should be designed, and how those components should be integrated together. Programming a compiler therefore involves an unusually high degree of certainty right from the start. Compiler writers are able to focus on the essence of the task of the hand, without having to worry that choices taken will later necessitate a fundamental change in the compiler's design.

Second, compilers are an atypical class of program. In essence, a compiler is a simple batch pipeline process. A program is read in and translated to a tree; a series of tree transformations are applied; and eventually one of those trees is saved out as some sort of binary data (e.g. machine code or bytecode). Most of the intermediate tree transformations calculate a relatively simple bit of information about the program and create a slightly modified tree based on it. A few calculations crop up time and time again, such as: maps from variables to scopes or types; and stacks to determine closures. Significantly, and unlike most programs in the real world, there is no interaction with users: the compiler knows all it needs about the outside world from the moment it is called.

Since we know an unusual amount about how to write compilers, and because compilers are such a specific style of program, compilers tend to use programming languages in an unrepresentative fashion. A certain style of programming is common in compilers: pattern matching against trees and creating slightly altered trees from them. Most of the trees can be easily statically typed and are, or can trivially be made to be, immutable, since – relative to the capabilities of modern computers – the trees involved are rather small. If that is the program you are most used to writing, it is perhaps inevitable that your language design will make that use case particularly easy to realise, at the expense of more normal programs. On top of that, the complexity of the language is almost invisible to those bootstrapping the compiler, so there is a continual temptation to add features in to the language or its libraries.

In a sense, my fear is easily summarised: we language designers are all too often making languages which are much better for writing bootstrapped compilers than they are for writing other programs. We're coming up with languages with very odd features that are used relatively little other than by the compiler writer, and with languages so feature heavy that only the language designer can understand how everything fits together. Ultimately, we're thinking about ourselves much more than we are normal programmers. I have certainly been guilty of this myself and I fear I see it in several new languages too (though, thankfully, not all).

There is no guaranteed solution to this problem. We can't force an iron curtain to be drawn between language designer and compiler. But, as language designers, we can at least be aware of the dangers of drawing overly strong conclusions based solely on the experience of writing a compiler in our new language. As language users, we should treat the claims of language designers with scepticism until they, or other people, have written a reasonable number of non-compiler programs in their language. This still won't get us to nirvana, but it will stop us being distracted by shiny language features that solve issues compiler writers have that few other programmers share. If we don't, not only shall the meek inherit the earth, but they may choose to do so with non-bootstrapped languages.

Acknowledgements: My thanks to Edd Barrett, Carl Friedrich Bolz, and Lukas Diekmann for insightful comments on early drafts of this article. All opinions, errors, omissions, or infelicities, are my own.


[1] Converge's bootstrap compiler, for example, was written in Python.

[2] For no particular reason that I can think of, languages which compile to machine code seem less inclined to bootstrap.

Link to this entry

Relative and Absolute Levels

September 24 2013

Recently, I've seen a number of discussions along the lines of what makes a language declarative or imperative? (see, in chronological order, William Cook, Bob Harper, and Neel Krishnaswami). Trying to get a crisp definition of vague terms often involves endlessly going around circles. Just in case that isn't bad enough, in this specific case, some believe that their favourite term connotes superiority, thus setting otherwise reasonable people against each other. Intuitively, the two terms are talking about the level of abstraction of a programming language: declarative means that something provides more what and less how; while imperative is more how and only implicitly what. We have a vague feeling that the terms should be precise opposites, but in practice we keep stumbling across examples which are hard to fit precisely into exactly one of the terms. William and Bob are, I believe, both correct in saying that, if the terms declarative or imperative once had a specific meaning, they have long been robbed of it.

I would suggest that what's really going on is that we're trying to use the terms to express absolutes when they can only meaningfully express the relative relation of one thing to another. Before tackling the original point directly, let me try and give an example that is less likely to offend those of us immersed in programming languages. Imagine there are two people competing against each other in a 4km cycle race. Jim takes 5 minutes and Tim takes 4 minutes 50. We might say Tim is quick but we would be incorrect. Tim is only quick relative to Jim. When Kim — who can do the race in 4 minutes 30 — comes into the room, we can clearly see that Tim is not quick compared to Kim. Thus, while relative comparisons between individuals are safe, absolute statements are not: we can't be sure if we will encounter a person who changes our notions of quickest or slowest. Put more carefully, the absolute statements are only correct for the set of individuals we've sampled. Relative statements (e.g. Tim is quicker than Jim) remain correct even when we encounter new individuals.

With that in mind, let's look at a few examples from programming languages:

  • Absolute: Haskell is a declarative language. Incorrect.
  • Relative: Haskell is a more declarative language than C. Correct.
  • Absolute: C is an imperative language. Incorrect.
  • Relative: C is a more imperative language than Haskell. Correct.

Depending on your background, you may find the two incorrects hard to swallow initially. Consider this little C fragment:

a = 1;
b = 2;
c = 3;
d = a + b + c;
and an example compilation of it into ARM assembler [1].:
MOV R0, #1
MOV R1, #2
MOV R2, #3
ADD R3, R0, R1
ADD R3, R3, R2
STMDB R13!, {R0}
MOV R0, R3
BL print_int
LDMIA R13!, {R0}

Both fragments behave the same, in the sense that they describe the same output (the printing of the number 6). The C version however states much less detail about how that output should occur. The assembler version first has to break the addition down into 2 separate instructions. It then has to save the original value of R0 to prevent it being clobbered by the argument being passed to print. Relative to C, the assembler version feels much less declarative: one might even be moved to say that the assembler version is more imperative.

The fun begins when one realises that the assembly version is not the end of the road. We could, if we wished, go all the way down to electrons when considering a program executing. In the opposite direction, a DSL for expressing Context Free Grammars may feel much more declarative than the Haskell program that is generated from it. As that might suggest, we also need to bear in mind that there isn't a strict hierarchy here. There are multiple dimensions in which different languages can be considered declarative or imperative relative to each other.

This has an interesting implication. For any given two abstraction levels, we can find another in between, or one sideways (and so on). If we wanted to name each, we'd go mad. Thus, proposals such as Neel Krishnaswami's are interesting thought exercises, but we simply can't scale them up (unless you find the prospect of a massive, ever-expanding dictionary appealing).

In summary, trying to define declarative and imperative precisely as absolute terms will never work, if it ever did. Using them as relative terms can feel pedantic, but at least has the virtue of not being inaccurate. Although we really only need one of imperative or declarative, I expect both terms to comfortably outlive me, and for people to use them as vague absolutes in the same way that we use quick and slow as vague absolutes. It's also worth noting that the programming language community is not the only one bedevilled by this issue. The UML folk tie themselves in knots over Platform Independent and Platform Specific Models (PIMs and PSMs). Students of war have spent hundreds of years trying to nail down perfect definitions of strategy and tactics, without success. Humans like the seeming certainty of absolute statements, even in cases where only relative comparisons are safe.

Acknowledgements: My thanks to Carl Friedrich Bolz and Edd Barrett for insightful comments on early drafts of this article. All opinions, errors, omissions, or infelicities, are my own.


[1] It's been 15 years since I last wrote ARM assembler, so my apologies if I've got any minor details wrong.

Link to this entry

General Purpose Programming Languages' Speed of Light

April 9 2013

Having recently returned from a week of talks on programming languages, I found myself wondering where general purpose programming languages might go in the future; soon I wondered where they could go. The plain truth about programming languages is that while there have been many small gains in the last 20 years, there have been no major advances. There have been no new paradigms, though some previously obscure paradigms have grown in popularity; I'm not even aware of major new language features (beyond some aspects of static type systems), though different languages combine features in slightly different ways.

The lack of major advances in a given time period may not seem surprising. After all, major advances are like earthquakes: they occur at unpredictable intervals, with little prior warning before their emergence. Perhaps we are on the cusp of a new discovery and none of us – except, possibly, its authors – are aware of that. That is possible, but I am no longer sure that it's likely. If it doesn't happen, then it seems clear to me that we are getting ever closer to reaching general purpose programming language's speed of light—the point beyond which they cannot go.

Programming languages cannot grow in size forever

The basis of my argument is simple:

  1. Every concept in a programming language imposes a certain cognitive cost upon users.
  2. Our brains ability to comprehend new concepts in a language is inversely proportional to the number of concepts already present.
  3. At a certain number of concepts, we simply can't understand new ones.
Note that I deliberately phrased the first point to reflect the fact that some concepts are more complex to learn than others; but that, at a certain point, the volume of concepts becomes at least as important as the cumulative complexity because of the inevitable interactions between features. In other words, if language L1 has 20 features, of which 6 are complex and 14 simple, the cognitive cost is similar to L2 which has 3 complex and 17 simple features.

This doesn't sound too bad, until one considers the following:

  1. The core of virtually every extant programming language is largely similar.
  2. That core contains many concepts.
  3. Many of those concepts have a high cognitive cost.
In other words, most programming languages are surprisingly similar at their core. Before we can differentiate L1 from L2, we must already have understood features ranging from variables to recursion, from built-in datatypes to evaluation order.

Programming languages' core

The overly earnest teenage boy that lives within all of us wants to believe that programming languages are massively different. At any given moment, someone with a nihilistic bent and an internet connection can find fans of different programming languages fighting pitched battles where language L1 is argued to be always better than L2. The dreary irony, of course, is that what L1 and L2 share in common is nearly always greater than that which differentiates them. How many people use programming languages without function calls? without recursion? without variables? without ...? Nobody does, really. Those languages that are truly unique, such as stack-based languages, remain locked in obscurity.

The reality is that, since the late 1950s, programming languages have steadily been converging towards a stable core of features. That core has proven itself sufficiently good to allow us to develop huge pieces of software. However, it leaves our brain surprisingly little capacity to understand new features. Try and do too much in one language, and problems accrue. We long ago developed the technical ability to create programming languages too complicated for any single person to fully understand. C++ is the standard (though not the only) example of this, and far be it for me not to pick on an easy target. As far as I can tell, not a single person on the planet understands every aspect of C++, not even Stroustrup. This is problematic because one always lives in fear of looking at a C++ program and finding uses of the language one doesn't understand. At best, individuals have to continually look up information to confirm what's going on, which limits productivity. At worse, people misinterpret programs due to gaps in their understanding; this introduces new bugs and allows old bugs to hide in plain view.

In an ideal world, we would be able to understand programming languages in a modular fashion. In other words, if I don't use a feature X, I don't need to know anything about X. Sometimes this works, and organisational coding standards are one way to try and take advantage of this. If this modularity property held fully, it would allow us to add as many modularised features to programming languages as we wanted. However, programming rarely involves creating a program in isolation from the rest of the world, particularly in the modern era. Instead, we pull in libraries from multiple external sources. Even if I don't want to use feature X, a library I use might; and, if it does, it can easily do so in a way which forces me to know about X. Take C++ again: I might not want to use templates in my C++ programs but most modern C++ libraries do. Even if a programming language is designed so that its features are modular, the social use of such a language tends to destroy its modularity. This explains why it's virtually impossible to be a competent programmer without a reasonable understanding of nearly every feature in a language.

Better education?

A reasonable question to ask is whether this ability to exceed the human brain's capacity is a temporary one, due to poor education. As time goes on, we tend to become better at condensing and passing on knowledge. I have heard it said, for example, that what might have baffled an early 20th century physics PhD student is now considered a basic part of an undergraduate education (since I struggled with physics as soon as it became semi-mathematical, I am not in a position to judge). Perhaps, given time, we can teach people the same number of concepts that they know now, but at a lower cognitive cost.

I do expect some movement along these lines, but I'm not convinced it will make a fundamental difference. It seems unlikely that programming students of the future will find it substantially easier to learn recursion or iteration, to pick two semi-random examples. Every language needs to choose at least one of these concepts. Of the two, most people find recursion harder to grasp, but nested iteration doesn't come naturally to many people either. Those of us who've been programming for many years consistently underestimate how difficult newcomers find it. Starting programming education from a young age might change this dynamic, but it's difficult to imagine that sufficient quantities of competent programmers will move to become teachers, at least in the short term.

Better language design

Let's assume, briefly, that you agree that C++ is too complex but are wondering whether its difficulties are due to bad design rather than size. Perhaps all I've really done so far is point out that badly designed languages overload our brains sooner than well designed ones? There is clearly some truth in this but, again, I'm not sure that it fundamentally changes the landscape.

Let's take two statically typed functional languages as an example of more considered language design: Scala and Haskell. Both languages aim to stop users shooting themselves in the foot through the use of expressive type systems. Maybe by constraining users, we can help our brains absorb more general language concepts? Alas, I am not convinced. Some of the things Haskell and Scala's type systems can express are astonishing; but I have also seen each type system baffle world-renowned experts. Type systems, like any other language feature, do not come for free. Even if we could design a perfect programming language, I doubt it could be much larger than such languages are today.

What the above means is that the space for a major advance in general purpose programming languages is limited. The analogy with the speed of light works well here too. As an object's speed approaches the speed of light, the energy required to accelerate it reaches infinity, which is why a normal object can't ever travel at the speed of light. Worse, the energy required is non-linear, so the closer you get to the speed of light, constant increases in energy lead to ever-slower acceleration. Language design is rather like that. Beyond a certain point, every feature – no matter how well designed – has a disproportionate cost. It takes longer to understand, longer to learn how to use idiomatically, and longer to understand its interactions with other features.


The tooling that surrounds programming languages has changed significantly in the last two decades. Most developers now use huge IDEs [1], which offer a variety of ways to understand and manipulate programs. Verification techniques and tools have made major advances [2]. And whole new styles of tools now exist. To take two examples: Valgrind allows one to analyse dangerous run-time behaviour in a way that can significantly tame the danger of even unsafe languages such as C; Moose allows one to gain a reasonable high-level understanding of large programs in a fraction of the time it takes to wade through source code by hand.

Clearly, such tooling will only increase in quantity and quality as we move forwards. But will it help us understand programming languages more effectively? Again, I expect it to help a bit, but not to make a fundamental difference. Perhaps future editors will have the ability to simplify code until we zoom in (rather like an extreme version of folding). Ultimately, however, I don't know how we can avoid understanding the detail of programming languages at some point in the development process.

Moving beyond general purpose

What are we to do? Give up on general purpose programming language design? No—for two reasons. First, because a degree of group-think in language design means that we haven't explored the language design space as well as we should have yet. For all we know, there could be useful discoveries to be made in unexplored areas [3]. Second, because the general purpose part of the name is subtly misleading. A traditional assumption is that every programming language has been expected to be applicable to every problem. This then leads to the my language is better than yours notion, which is clearly nonsense. Anyone who tells you their favoured programming language – or even their favoured paradigm – is always the best is delusional. There is no such thing, and my experience is that programming language designers are almost universally modest and realistic about the areas to which their language is best suited.

In a single day, I can do Unix daemon programming in C, virtual machine development in RPython, system administration in the Unix shell, data manipulation in Python, and websites in a hodge podge of languages. Each has strengths and weaknesses which make it better suited to some situations than others. People who need certain styles of concurrent programming might favour Erlang for that, but more conventional languages for sequential processing. Many use cases are currently uncatered for: languages which target power efficient execution may become of greater interest in the future. All in all, I suspect we will see a greater level of customisation of nominally general purpose languages in the future.

I also suspect we will see greater use of more obviously restricted languages, which are often called domain specific languages [4]. Rather than try and make features that work well for all users – and that must still be understood by those for whom they don't work well – we are likely to have better luck by focussing on specific groups and making their life easier. For example, I do not wish to have my programming language cluttered with syntax for traditional mathematics (beyond the bare essentials of addition and the like), because I don't use it often enough. A programmer crunching data to produce statistics, on the other hand, would be much more productive with such a syntax. My current guess is that we will build such languages by composing smaller ones, but there are other possible routes.

In many ways, neither of these outcomes is quite as exciting as the notion of a perfect language. The contemporary emergence of a wide class of acceptable general purpose languages is an acknowledgement that we can't squeeze every feature we might like into a single language—our brains simply can't handle the result. Rather than try and move general purpose languages beyond the speed of light, we'll probably end up with many different points near it. At the same time, the possibility of restricted domain specific languages may enhance our productivity in narrower domains. This is, perhaps, not as immediately exciting a prospect as targeting all users, but it is its relative modesty that makes it more likely to pay off. Furthermore, unlike general purpose languages, that journey is nearer its beginning than its end. We might be reaching the speed of light for general purpose programming languages, but we've barely started with domain specific languages.

And, of course, there's always the remote possibility of an unforeseeable earthquake hitting programming languages. Unlike in the real world, we are not callous or inhuman for hoping that such an earthquake will hit us, even though many of us are not sure it will ever come.

Acknowledgements: My thanks to Martin Berger, Carl Friedrich Bolz, Lukas Diekmann, and Naveneetha Vasudevan for insightful comments on an early draft of this article. All opinions, errors, and infelicities are my own.


[1] I don't, because I'm a Unix Luddite at heart, but you may well think that's my loss.

[2] Even the simple static analyser in clang catches a host of nasty bugs that previously escaped the eyes of even the best developers. More advanced tools can do even better, though one should not overstate the power of such tools.

[3] As an example of the fun to be had exploring unusual language design, I immodestly offer my experiences on integrating an Icon-like expression system into a Python-like language. If nothing else, it taught me that it is possible to completely subvert expectations of how standard parts of a programming language work. The HOPL series of workshops, and the generally wonderful papers therein, are another excellent source of programming language design ideas.

[4] Note that I'm talking about languages with distinct syntax and semantics; the term is also currently used to describe particular idioms of library design and use in conventional languages, though I suspect that will fade in the future.

Link to this entry

Another Non-Argument in Type Systems

October 25 2012

I find most static vs. dynamic debates to be painful (as I've said before) and, in general, I try to ignore them. However, every so often a new argument is raised which needs consideration. About 18 months ago, Bob Harper asserted that a dynamic language is a straightjacketed static language that affords less rather than more expressiveness. Part of the underlying argument is correct, but it avoids an important part of the story. I've recently seen several restatements of this assertion, so, unfortunately, I feel compelled to write a brief rebuttal explaining why I believe it's yet another non-argument in the ongoing debate about which class of languages is better.

My firm belief is that statically and dynamically typed languages overlap in some ways, but differ in others. Each has tasks they are better suited to, and neither is inherently better than the other. Bob shows in his article one area in which statically typed languages allow things that dynamically typed languages can't; in this article, I'm simply going to point out that this leads to a mirror-image where dynamically typed languages allow things that statically typed languages can't. In that sense, the argument I make here is a restatement of ideas in my dynamically typed languages paper.

Before we can go any further, we need a shared understanding of a concept from statically typed languages such as ML and Haskell (i.e. not Java or its ilk): sum types. Since this concept is unfamiliar to many programmers, I'm going to present a simple definition of the core features; sum types have a lot of extra features (tagging, pattern matching, etc.) that aren't germane to this article's argument. The basic idea is that sum types are a super type that ranges over other types. Whereas in OO languages a sub-class explicitly chooses its super-class(es), sum types (the super class equivalent) choose the types they range over (the subclass equivalent) — the latter have no idea which sum types they are ranged over by. I can therefore take the types T1 and T2 and make a sum type over them S. Whenever my program wants to do anything with a value of static type S, I must then use what I'll call a typecase statement to differentiate whether the dynamic type of the value is actually a T1 or a T2 (readers familiar with the concept will realise that this is intended to be a very simple form of pattern matching). Writing this out in pseudocode can help visualize the important relationships between these concepts:

type T1:

type T2:

sumtype S rangesover T1, T2

func f(x : S):
  typecase x:
    whenoftype T1:
    whenoftype T2:
As this suggests, typecase acts as a dynamic type check, with the only difference from its cousins in dynamically typed languages being that we are forced to write out all of the possible types the sum type ranges over. We could not, for example, leave off the whenoftype T2 clause in the above example.

With that under our belt, we can now look at Bob's argument. He makes two points which can seem to be separate but which I suggest are really just different ways of looking at a single underlying idea:

  1. Dynamic typing is but a special case of static typing, one that limits, rather than liberates, one that shuts down opportunities, rather than opening up new vistas.
  2. this is precisely what is wrong with dynamically typed languages: rather than affording the freedom to ignore types, they instead impose the bondage of restricting attention to a single type! Every single value has to be a value of that type, you have no choice!

Bob points out that, conceptually, a dynamically typed program has a single static sum type ranging over all the other types in the program. We could therefore take a dynamically typed program and perform a conversion to a statically typed equivalent. Although Bob doesn't make it explicit how to do this, we can work out a simple manual method (though it's not clear to me that this would be fully automatable for most real programs). We would first need to see all the points where the program performs operations that use types (perhaps by a sort of type inference; but we'd also need to include dynamic type checks using e.g. isinstance in Python or instanceof in Java). All the types referenced would then be ranged over a single sum type. Whenever a sum type is encountered statically, we would then need to use a typecase to check the dynamic type of the value, and then perform an appropriate action. In this way we can guarantee that our once-dynamically typed program is statically type-safe.

This conceptual argument is quite correct. However, it is then used as the basis for two jumps of logic, both of which I believe to be incorrect. First that this means that dynamic typing is a special case of static typing; and that, because of this, dynamically typed languages only have a single type. I also think that the thought experiment misses the very important factor of whether, if we were to do this for real, the results would be in any way usable. I'll now tackle these points in reverse order.


Sum types are a powerful theoretical tool; when used judiciously, they can also improve code. Used indiscriminately – particularly where a sum type ranges over an unduly large number of other types – they create an unwieldy mess because of the requirement to check all the types that have been ranged over.

The encoding of dynamically typed languages we used above would lead to a huge increase in the size of programs, since a static type system forces us to explicitly express all the checks that a dynamically typed run-time implicitly performs for us. It would also remove the possibility of making use of a dynamically typed system's ability to execute and run programs which are deliberately not statically type safe (a useful property when comprehending and modifying large systems). I therefore assert that while it's a valid thought experiment, it would be deeply impractical for many real systems which make use of the easy dynamicity offered by dynamically typed languages.

Unitypes and multitypes

Bob rests much of his argument on the notion that his encoding shows that dynamically typed languages only have a single type — that they are, therefore, unityped (as opposed, one presumes, to multityped). As his suggested encoding indirectly shows, dynamically typed languages can indeed be thought of as having a single static type (or, as I often tend to think of things, a static type checker which always returns true). However, they can have an arbitrary number of dynamic types. The fundamental difference between statically and dynamically typed languages isn't over how many types they have: it's about when those types are utilised.

Consider a program which tries to add an int to a string (e.g. "abc" + 123). A statically typed language, depending on its type system, is likely to either reject such a statement as untypeable, or coerce the int to a string at compile-time. A dynamically typed language has exactly the same choices as the statically typed language: it is likely to reject such a statement as untypeable, or coerce the int to a string. The only difference is that the choice of rejection or coercion happens upon every execution at run-time, not once at compile-time.

It therefore makes no sense to say that a language is unityped without qualifying whether that relates to its static or dynamic type system. Python, for example, is statically unityped and dynamically multityped; C is statically multityped and dynamically unityped; and Haskell is statically and dynamically multityped. Although it's a somewhat minor point, one can argue (with a little hand waving) that many assembly languages are both statically and dynamically unityped.

The relative expressiveness of dynamically and statically typed languages

Expressiveness is a word that we in programming languages should be wary of, because, like Humpty Dumpty, we often use it to mean whatever we want it to mean. Bob implicitly equates expressiveness with the ability to state more, stronger, static properties of a program. That's one reasonable interpretation of expressiveness, but it's not the only one — let's call it static expressiveness. If we consider dynamic expressiveness, we can see that expressiveness is a multi-dimensional quality, with no clear winners.

Every static type system designer faces a fundamental choice: should the static type system be sound or complete? If the type system is complete then it will be unsound, letting through some programs that will cause run-time type errors. Conversely, if the type system is sound then (because of the expressivity of programming languages) it will be incomplete, rejecting some programs that will run without type errors at run-time. Nearly all statically typed languages therefore aim to be sound rather than complete [1].

The following program (in pseudo-Python) shows a program which will fail to type check in any sound static type system which I know of:

l = [...]  # A list of non-empty strings
if len(l) > 0:
  l = l[0] # Extract the first string from the list
  print(l) # Print the string
if len(l) <= 0:
  l.sort() # The sort operation is valid on lists but not strings
Sound static type systems will reject such programs as type-unsafe, because a) the l variable changes type (from List(String) on line 1, to string on line 3) b) a later use of the type requires it to be of a specific type (List(*)) on line 6). While this program may not be an example of good design, we can trivially see that this program will never lead to a type error at run-time because of the non-overlapping predicates on the if clauses. We could design a sound type system that will deal with this specific case but, because a sound type-system won't be complete, we can always find another similar run-time correct example which the type system will reject [2].

As this example shows, dynamically typed languages will execute to completion every program which contains no dynamic type errors. Statically typed languages will prevent some such programs from executing in the first place (at best, forcing the user to rewrite his already correct program using sum types). On this basis, one can therefore argue that dynamically typed languages are more dynamically expressive. It also provides a mirror-image encoding from above: one can always translate statically typed programs to a dynamically typed equivalent.

All that I've done in the above is to restate a simple principle: statically and dynamically typed languages enforce their types at different points in a program's life-cycle. Depending on whether you view things through a static or dynamic type prism, either class of language can be considered more expressive. Dynamically typed languages will try and execute type-unsafe programs; statically typed languages will reject some type-safe programs. Neither choice is ideal, but all our current theories suggest this choice is fundamental. Therefore, stating that one class of language is more expressive than the other is pointless unless you clearly state which prism you're viewing life through.


In summary, if Bob's article had explicitly stated that he was solely considering static expressivity (as e.g. James Iry's chart of language's different static expressivity does), I would have agreed with it. In this article, I've tried to show that statically and dynamically typed languages overlap in many ways, but differ in others. Neither is exactly a subset of the other nor, as the underlying theory shows, could they be. Arguing whether static or dynamic typing is universally better or more expressive is therefore pointless. One must consider the pros and cons for the tasks one has at hand, and choose accordingly. Personally, I use both styles, and I have no wish to unduly restrict my language toolbox.

Acknowledgements: My thanks to Martin Berger and Christian Urban for insightful comments on early drafts of this article. All opinions, errors, omissions, or infelicities, are my own.

Updated (October 25 2012)

Joseph Cottam pointed out that the code example was incorrect if l contained an empty string at position 0. I've updated the comment in the code accordingly.


[1] Since formally proving this turns out to be rather hard, some seemingly sound type systems have turned out not to be so (e.g. Eiffel). Fixing this after the fact is then problematic because restricting the type system further to make it sound is likely to end up with perfectly correct real-world programs suddenly being identified as type-unsafe.

[2] One can equally imagine static analysis techniques which would let the program snippet through. Static analysis techniques tend to be both unsound and incomplete in order to make wider-ranging analysis possible. They therefore can not replace static type systems, but can augment them.

Link to this entry

Earlier posts


All posts


Last 10 posts

An editor for composed programs
The Bootstrapped Compiler and the Damage Done
Relative and Absolute Levels
General Purpose Programming Languages' Speed of Light
Another Non-Argument in Type Systems
Server Failover For the Cheap and Forgetful
Fast Enough VMs in Fast Enough Time
Problems with Software 3: Creating Crises Where There Aren't Any
Problems with Software 2: Failing to Use the Computing Lever
Problems with Software 1: Confusing Problems Whose Solutions Are Easy to State With Problems Whose Solutions Are Easy to Realise


Tony Clark
Zef Hemel


Mark Delgano
Steven Kelly
Jim Steel


Marc Balmer
Ross Burton
Peter Hansteen
OpenBSD Journal
Ted Unangst


Peter Bell
Gilad Bracha
Tony Clark
Cliff Click
William Cook
Jonathan Edwards
Fabien Fleutot
Martin Fowler
John Goerzen
James Hague
James Iry
Ralf Laemmel
Lambda the Ultimate
Daniel Lemire
Michael Lucas
Bertrand Meyer
Keith Packard
Havoc Pennington
Brown PLT
John Regehr
Diomidis Spinellis
Shin Tai
Markus Voelter
Phil Wadler
Russel Winder
Steve Yegge