| Home | e-mail: laurie@tratt.net github: ltratt twitter: @laurencetratt |
Fast Enough VMs in Fast Enough TimeFebruary 8 2012 [This is a long article because it covers a lot of ground that is little known by the computing mainstream, and easily misunderstood. If you're familiar with the area, you may find yourself wanting to skip certain explanatory sections.]If you can stomach the smell, put yourself briefly in the shoes of a programming language designer. What you want to do is create new programming languages, combining new and old ideas into a fresh whole. It sounds like a fun, intellectually demanding job, and occasionally it is. However, we know from experience that languages that exist solely in the mind or on paper are mostly worthless: it is only when they are implemented and we can try them out that we can evaluate them. As well as a language design, therefore, we need a corresponding language implementation. This need for a corresponding language implementation leads to what I think of as the language designer's dilemma: how much implementation is needed to show that a language design is good?
Putting VM effort into contextI implemented two Virtual Machines (VMs) for Converge, both in C: the first was epically terrible; the second (introduced in Converge 1.0) merely awful. Even the second is extremely, sometimes almost unusably, slow: it is roughly 5-10x slower than CPython, tending towards the slower end of that range. This partly reflects my lack of experience and knowledge about VM implementation when I started; but it also reflects the fact that the Converge VM was a secondary concern to the language design. Despite that, I estimate that I spent around 18 man months on the second VM (intertwined with a lot of non-VM work). If this seems a lot of effort to spend on such a poor quality VM, it's worth bearing a few things in mind: Converge has a hard to optimise expression evaluation system based on Icon (see this paper for more details); C, whilst a fun language in many ways, doesn't lend itself to the fast writing of reliable programs; and most real-world VMs have had a lot more effort put into them.As rough comparisons, CPython (the de-facto standard for Python, so named because it is written in C) has probably had a couple of orders of magnitude more effort put into it; and Java's HotSpot roughly three orders of magnitude more. It's not surprising that the second C Converge VM doesn't do well in such a comparison — in comparison, it's not so much a minnow as a single-celled organism. What all this means is that the second C Converge VM so slow that during demos of some of Converge's advanced features, I learned to make winding up gestures at the side of my laptop to amuse the audience. Even I was not sure whether its combination of features could ever be made adequately efficient. Other approachesWhy did I write my own VM and not use someone else's? The choices are instructive.The traditional route for big compilers (e.g. gcc) is to output machine code (perhaps via assembler). The output is efficient, but the compiler itself requires a large amount of effort. For example, simply learning the intricacies of a processor like the x86 sufficiently well to generate efficient code isn't for the faint of heart. In short, the amount of effort this approach demands is generally prohibitive. An alternative to generating machine code directly is to generate C code and have that compiled into machine code. Several compilers have taken this approach over the years (from Cfront to the original Sather compiler). While still relatively difficult to do, it is certainly much easier than generating machine code directly. However it can often lead to a poor experience for users: not only must they pay the costs of double-compilation, but the translation typically loses a large quantity of useful debugging information (something which Converge pays special attention to). With few exceptions (one of which we'll see later), this approach is now rare. Perhaps the obvious choice is to use an existing VM. The two Goliath's are the JVM (e.g. HotSpot) and Microsoft's CLR (the .NET VM). When I started work on Converge, the latter (in the form of Mono) didn't run on OpenBSD (my platform of choice), so was immediately discounted. HotSpot, however, remained a possibility because of its often stunning performance levels. The reason that I couldn't use it isn't really HotSpot specific. Rather, it is something inherent to VMs: they reflect the languages, or group of languages, they were designed for. If a language fits within an existing VMs mould, that VM will probably be an excellent choice; if not, the semantic mismatch between the two can be severe. Of course, given sufficient will-power, any programming language can be translated to any other: in practise, the real issues are the ease of the translation and the efficiency of programs run using it. Two examples at opposite ends of the spectrum highlight this. Jython (Python running on the JVM) is a faithful implementation of Python; but even with the power of HotSpot behind it, Jython almost never exceeds CPython in performance, and is generally slower because some features (mostly relating to Python's highly dynamic customisability) do not lend themselves to an efficient implementation on the JVM. Scala, on the other hand, was designed specifically for the JVM — to ensure reasonable performance, Scala's language design has in some parts had to be compromised (e.g. due to type erasure). Whether the semantic mismatch is manageable depends on the particular combination of language design and VM. Converge's unusual expression evaluation mechanism was enough on its own to rule out a practical JVM implementation (Jcon, an implementation of Icon for the JVM is still slower than the old C Icon interpreter, which itself is no speed demon). As Converge's development progressed, a number of other features (e.g. its approach to tracing information) have made it increasingly difficult to imagine how it could be practically wedged atop an existing VM. A language for JITing VMsWhat all the above means is that the options for implementing a language are generally unpalatable: they either require an undue amount of work, or compromises in language design, and, too often, both. My suspicion is that this status quo has severely inhibited programming language research: few groups have had sufficient resources to implement unusual languages well enough to prove them usable. And then I came across PyPy. To be more accurate, after a few years of vaguely hearing of PyPy, 6 months ago I unexpectedly bumped into a PyPy developer who convinced me that PyPy's time has come. After porting PyPy to OpenBSD, I investigated further. What I've come to realise is that PyPy is two separate things:
PyPyto cover both, which has confused many a reader (myself included). Henceforth, I shall unambiguously use RPythonto refer to the language for writing VMs in, and PyPyfor the new Python VM written in RPython. So, what is RPython? The obvious facts about it are that it is strict subset of Python whose programs are translated to C. Every RPython program is a valid Python program (which can be run using a normal Python interpreter), but not vice versa. However, RPython is suitably restricted to allow meaningful static analysis. Most obviously, static types (with a type system roughly comparable to Java's) are inferred and enforced. In addition, extra analysis is performed e.g. to assure that list indices don't become negative. Users can influence the analysis with RPython would be of relatively little interest if all it did was subset Python and output C. Though a full programming language, RPython is unlikely to be the next However, in addition to outputting optimised C code, RPython automatically creates a second representation of the user's program. Assuming RPython has been used to write a VM for language L, one gets not only a traditional interpreter, but also an optimising Just-In-Time (JIT) compiler for free. In other words, when a program written in L executes on an appropriately written RPython VM, hot loops (i.e. those which are executed frequently) are automatically turned into machine code and executed directly. This is RPython's unique selling point, as I'll now explain. Traditional VM implementationBecause RPython is unique, it's easy to overlook what's interesting about it: it took me a couple of months of using it before I had built up an accurate understanding. Looking at the traditional approach to VM implementation is perhaps the easiest way to explain what's interesting about RPython. First, let me make a couple of assumptions explicit. Languages like C are well suited to being translated directly to machine code, as they are little more than a thin layer over machine code: everything that is Take a JITing VM such as HotSpot or V8. Such a VM will initially use an interpreter to execute the user's program. Unfortunately for us, When, during execution, the interpreter in a VM such as HotSpot or V8 spots a frequently executed piece of code, it will hand that chunk of code off to a JIT compiler (along with information about the context within which that code is used) to convert it into machine code. The JIT compiler (henceforth just
Of course, all such problems can be solved with sufficient resources, but these explain in large part why major open source languages like Python and Ruby currently ship with JIT-less VMs (i.e. they have only an interpreter). JITs for freeWhat RPython allows one to do is profoundly different to the traditional route. In essence, one writes an interpreter and gets a JIT for free. I suggest rereading that sentence again: it fundamentally changes the economics of language implementation for many of us. To give a rough analogy, it is like moving from manual memory management to automatic garbage collection. RPython is able to do this because of the particular nature of interpreters. An interpreter, whether it be operating on bytecode or ASTs, is simply a large loop: In essence, one need only add two function calls to an RPython program to add a JIT. The first function call ( Now is a good time to get an idea of how RPython generates a JIT from just an interpreter (what RPython calls the language interpreter). As said earlier, RPython automatically layers alongside C code a second representation of the interpreter (the tracing interpreter). The details of how the tracing interpreter is stored are irrelevant to us, except to note that it's in a form that a JIT can manipulate (conceptually it could be an AST-like structure). RPython's JIT is a tracing JIT. When a hot loop is detected, a marker is left such that the next time the loop is about to run, the JIT will enter tracing mode. During tracing mode, a complete execution of the loop is performed and all the actions it takes are traced (i.e. recorded) using the tracing interpreter (which is much, much slower than the language interpreter). After the loop has finished, the trace is then analysed, optimised, and converted into machine code. All subsequent executions of the loop will then call the machine code version. Since subsequent executions may diverge from the recorded trace, RPython automatically inserts guards into the machine code to detect divergence from the machine code version's capabilities. If a guard fails at any point, execution falls back to the interpreter. At this point, it's worth taking a brief side-tour to introduce
Figure 1: Example code and high-level traces.
Figure 1 shows a high-level example of a tracing JIT for a dynamically typed Python-esque language. Let us assume that the code in the first column is part of a hot loop that the tracing JIT decides is worth converting into machine code. On the next execution of the loop, the initial value of The first thing the tracing JIT will create is a guard to ensure that the generated machine code is only executed if Optimising tracesWhile the example in Figure 1 gives a reasonable high-level idea about tracing JITs, it doesn't really explain how the trace is created. RPython badges itself as a meta-tracing system, meaning that the user's end program isn't traced directly (which is what Figure 1 suggests), but rather the interpreter itself is traced. Using the same example code from Figure 1, Figure 2 shows a snippet of the interpreter and the trace of the interpreter that leads to. This trace (though simplified somewhat to make it readable) is indicative of the traces that RPython introduces.
Figure 2: An interpreter fragment and a full trace of that interpreter for the end-user program from Figure 1.
jit_merge_point call: this is the point where the interpreter tells RPython if you have any machine code to execute for the program beginning at position
Initially the trace might seem rather difficult to read, but if you think of it as a flattened record of all of the interpreters actions while executing the user's code, it becomes rather easier. The astute reader will notice that the traces are in Single Static Assignment (SSA) form: all assignments are to previously unused variables. While one would probably not want to write in this style, it has many advantages for optimisers, because it trivially exposes the data flow. In normal programs, we are often unable to determine whether a variable With this knowledge, it should hopefully be fairly simple to see that the trace on the right is simply a record of the instructions the interpreter on the left performed while executing our example program. It's worth thinking about this relationship, as it's key to RPython's approach. Once one has a handle on the trace, the sheer size of it should be a concern: there's a lot of stuff in there, and if it was converted to machine code as-is, one would get disappointing performance gains (around 40%). It is at this point that RPython's trace optimiser kicks in. I'll now try and give an idea of what RPython's trace optimiser can do. The first thing that's obviously pointless about the above is the continual reading of bytecode instructions. If we start from a specific point in the program, and all the guards succeed, we know that the sequence of instructions read will always be the same in a given trace: checking that we've really got an
Figure 3: The trace with calculations related to the program counter optimised away.
append of an object on a list l is followed by a pop without any other operations on l in the interim, we can remove both calls. We can also do something similar for the dictionary stores and lookups in the middle of the trace: since they're not used until the end, storing intermediate values is pointless. This is a bigger win, as dictionary lookups are much more expensive than list lookups. The resulting optimisations are shown in Figure 4.
Figure 4: The trace with list and dictionary optimisations folded away.
v4 is an int then it will still be an int after addition: we can thus remove the second type check. Once that is done, we can easily collapse the two additions of constants (since x + 2 + 3 ≜ x + 5). Figure 5 shows both optimisations.
Figure 5: The trace with type checks and constant additions folded away.
A new Converge VMAfter becoming intrigued by the possibilities of RPython, I decided to use it to implement a new VM for Converge. To allow an apples-to-apples comparison, my initial goal was to maintain 100% compatibility with the old VM, so that the same bytecode could run on both the C and RPython versions of the VM. That goal wasn't quite met, although I came extremely close: for quite some time, bytecode for the RPython VM could be run on the C VM (but not vice versa).First, it's probably useful to give some simple stats about the C VM that I was aiming to replace. It is about 13KLoC (thousand lines of code; I exclude blank lines and purely commented lines from this count). It contains a simple mark-and-sweep garbage collector that is mostly accurate but conservatively collects the C stack (so that VM code doesn't need be clever when dealing with references to objects). It implements full continuations at the C level, copying the C stack to the heap and back again as necessary (this turned out to be much more portable than I had initially feared), so that code at both the VM and Converge program level can be written in similar(ish) style. The VM is ported to a variety of 32 and 64 bit little endian systems: OpenBSD, Linux, OS X, Cygwin, and (native binary) Windows. Overall, the VM works, but is very slow, and, in parts, rather hard to understand. There are no precise records to determine the effort put into it, but I estimate it took between 12 and 24 man months — let's call it 18 man months. I started work on the new VM on September 1st 2011. Before September 1st I had never used RPython, nor had anyone (to the best of my knowledge) outside the core PyPy group used RPython for a VM of this size (at the time, the Happy VM, which implements a subset of PHP, was the closest comparison). Though the Converge VM is obviously not a big VM, it is beyond merely a toy. It also has some unusual aspects (as touched upon earlier in this article) that make it an interesting test for RPython. By December 19th 2011 I had a feature-compatible version of the Converge VM, in the sense that it could run every Converge program I could lay my hands on (which, admittedly, is not a huge number). After an initially slow period of development, mostly because of my unfamiliarity with RPython, progress became rapid towards the end. The resulting VM is about 5.5KLoC (compared to 13KLoC for the C VM). I estimate I was able to dedicate around half of my time to the VM during those 4 months (I started a new job on September 1st and then taught a course on a largely unfamiliar subject). Although the two time estimates (18 man months for the C VM vs. 2-3 man months for the RPython VM) aren't fully comparable, they are useful. While many parts of the RPython VM were a simple translation from the C VM, that itself was partially a reimplementation of a previous C VM (though to a lesser extent). The RPython's VM structure is also substantially different than the C VM (it's far cleaner, and easier to understand), so some aspects of the translation were hardly simple. My best guess is that moving from C (a language which I enjoy a great deal, despite its flaws) to RPython was the single biggest factor. If nothing else, large amounts of the C VM involve faffing about with memory resizing; RPython, as a fully garbage collected language, sweeps all that under the carpet. Status of the VMThe new Converge VM's source is freely downloadable as are binaries for most major platforms (other than Windows, at the time of writing). Eventually this VM will form part of a Converge 2.0 release, although more testing will be needed before it's reached that point. Before you form your opinions about the new VM, it's worth knowing what it is and isn't. It's not meant to be an industrial strength VM, at least not in its current form. Converge is a language for exploring compile-time meta-programming and domain specific languages. Some things which mainstream programming languages need to care greatly about (e.g. overflow checking; Unicode) are partly or wholly ignored. Such matters are a problem for a later day.It's also worth knowing that I haven't spent a huge amount of time optimising the new VM. As soon as it got to PerformanceSo, the RPython VM was created in roughly 1/6 the time it took to create the C VM. What is the performance like? This section will try and give a flavour of the performance, though please note that it's not totally scientific.The December 19th version of Converge (git hash 84bb9d6064 if you wish to look at it) was already usefully faster than the old C VM. One of my simple benchmarks has long been to time Looking at output from
How does it perform on more general types of code? In one sense, this is an impossible question, because no two people share the same definition of I'll start with the
Figure 6: The Stone benchmark.
stone is venerable, it's also so small and artificial that it doesn't necessarily tell us very much. The Richards benchmark is a much more interesting benchmark that models task dispatching (Mario Wolczko explains Richards in depth and also provides various other language implementations). By default it performs 10 iterations; I've also included timings for 100 iterations. Figure 7 shows the timings.
Figure 7: The Richards benchmark.
As a final benchmark, and an example of something which programmers need to do frequently, I chose something which neither the old or new Converge VM has had any sort of optimisations for: sorting. Part of the reason why I expected them to do badly is that neither optimises list accesses.
Figure 8: The sorting benchmark.
Although this section has contained a number of very hard figures, one should be careful about making strong claims about performance from such a small set of benchmarks. My gut feeling is that they are over-generous to the new Converge VM, mostly because there are many areas in the VM which have received no optimisation attention at all: if one of those was used repeatedly, performance would suffer disproportionately. I suspect that, rather than appearing to be much faster than CPython 2.7.2 (as above), its performance on a wider set of benchmarks would probably be on a more even par. Even so, that would still be a huge improvement on the old VM. The interesting thing is that most of the performance gains are from RPython: I only made a few relatively easy changes to increase performance, as we shall now see. Optimising an RPython JITSome RPython VMs lead to a much more efficient JIT than others. The trace optimiser, while clever, is not magic and certain idioms prevent it working to its full potential. The early versions of the Converge VM were naive and JIT-unfriendly: interestingly, I found that a surprisingly small number of tactics hugely improved the JIT.The first tactic is to remove as many instances of arbitrarily resizable lists as possible. The JIT can never be sure when appending an item to such a list might require a resize, and is thus forced to add (opaque) calls to internal list operations to deal with this possibility. Such calls prevent many optimisations from being applicable (and are relatively slow). When this was first pointed out to me, I was horrified: my RPython VM was fairly sizeable and used such lists extensively. Most noticeably, the Converge stack was a global, resizable list. After a little bit of thought, I realised that it's possible to statically calculate how much stack space each Converge function requires (this patch started the ball rolling). I was then able to move from a global resizable stack to a fixed-size stack per function frame (i.e. the frame created upon each function call; these are called continuation frames in Converge, though that need not concern us here). At this point, the relative ease of developing in a fairly high-level language became obvious. If I had tried to do such a far-reaching change in the C VM, it would have taken at least a week to do. In RPython, it took less than a day. Some other arbitrarily resizable lists took a little more thought. After a while it became clear that, even though each function now had its own fixed-size stack, the global stack of function frames, stored of course in a resizable array, was becoming a bottleneck. That seemed hard to fix: unlike the stack size needed by a function frame, there is no way to statically determine how deeply function calls might nest. A simple solution soon presented itself: having each function frame store a pointer to its parent removed the need for a list of function frames (see this patch). The second tactic is to tell the JIT when it doesn't need to include a calculation in a trace at all. The basic idea here is that when creating a trace, we often know that certain pieces of information are fairly unlikely to change in that context. We can then tell the JIT that these are The interesting thing is that I haven't really spent that long optimising the Converge JIT: perhaps a man week in the early days (when I was trying to get a high level picture of RPython) and around two man weeks more recently. As a rough metric, I found that each JIT optimisation I was doing was giving me a roughly 5-10% speedup (though the per-function stack change was much more profitable): the cumulative effect was quite pronounced. Admittedly, I now suspect that I've now picked most of the low-hanging fruit; improving performance further will require increasingly more drastic action (much of it in the Converge compiler, which is likely to prove rather harder to change than the VM). Fortunately, I'm adequately happy with performance as it is. The end result of these optimisations is that the traces produced by the Converge VM are often very efficient: see this optimised trace (randomly chosen — I'm not even sure what piece of code it represents). What's astonishing is that between promoting values, eliding function calls, and optimising traces, many bytecodes now have little or no code attached to them. What particularly amazes me is how one of Converge's most crippling features from an efficiency point of view, is now handled. Failure is how an Icon-base expression evaluation system allows limited backtracking. In my paper on Converge's Icon inheritance, I noted that the Tracing JIT issuesTracing JITs are relatively new and have some limitations, at least based on what we currently know. Mozilla, for example, removed their tracing JIT a few months back, because while it's sometimes blazingly fast, it's sometimes rather slow. This is due to a tracing JIT optimising a single code-path at a time: if a guard fails, execution falls back to the (very slow) tracing interpreter for the remainder of that bytecode (which could be quite long), and then back to the language interpreter for subsequent bytecodes. Code which tends to take the same path time after time benefits hugely from tracing; code which tends to branch unpredictably can take considerable time to derive noticeable benefits from the JIT.The real issue is that we have no way of knowing which code is likely to branch unpredictably until it actually does so. A real program which does this is the Converge compiler: in several points it walks over an AST, calling a function
Figure 9: Inlining code with a call f(6).
In my limited experience, the inherent ability of a tracing JIT to inline code can exacerbate this issue. Consider the simple program in Figure 9. If the tracing JIT disables inlining, the trace looks as in the middle column: the call to However, while inlining is generally a big win, it can sometimes be a big loss. This is chiefly due to the fact that inlining leads to significantly longer traces. Traces are slow to create (due to the tracing interpreter), so the longer we trace, the greater the overhead we impose on the program. If the trace is later used frequently, and in the exact manner it was recorded, the relative cost of the overhead will reduce over time. If the trace is little used, or if guards fail regularly within it, the overhead can easily outweigh the gains. Unfortunately, there's no obvious way to predict when inlining will be a win or loss, because we can't see into a program's future execution patterns. As a heuristic to somewhat counter this problem, RPython has an (adjustable) upper limit to traces: if too long, they are aborted, and the subsequent trace will turn inlining off. This helps somewhat, but what a good value for With luck, future research will start to whittle away at tracing JITs weaknesses. However, it seems likely that, in the medium term at least, most hand-crafted VMs will remain method-based (referred to hereon as How fast can it go?Something that's currently unclear is how fast one can reasonably expect an RPython VM to go. The best guide we currently have to the achievable speed of RPython VMs is PyPy itself. Although it seems that most of the easy wins have now been applied, it's still getting faster (albeit the rate of gains is slowing down), and, more importantly, is increasingly giving good performance for a range of real programs. The PyPy speed centre is an instructive read. At the time of writing PyPy is a bit over 5 times faster than CPython for a collection of real-world programs; for micro-benchmarks it can be a couple of orders of magnitude quicker.It's clear that, in general, an RPython VM won't reach the performance of something like HotSpot, which has several advantages: the overall better performance of method-based JITs; the fact that it's hand-coded for one specific class of languages; and the sheer amount of effort put into it. But I'd certainly expect RPython VMs to get to comfortably within an order of magnitude performance levels as HotSpot. Time will tell, and as people write RPython VMs for languages like Java, we'll have better points of comparison. RPython issuesFrom what I've written above, I hope you get some sense of how interesting and exciting RPython is for the language design and implementation community. I also hope you get a sense of how impressed I am with RPython. Because of that, I feel able to be frank and honest about the limitations and shortcomings of the approach.The major problem anyone creating a VM in RPython currently faces is documentation or, more accurately, a lack of it. There are various papers and fragments of documentation on RPython, but they're not yet pulled together into a coherent whole. New users will struggle to find either a coherent high-level description or descriptions of vital low-level details. Indeed, the lack of documentation is currently enough to scare off all but the most dedicated of language enthusiasts. As such a dedicated enthusiast, I got a long way with Part of the problem probably stems from the strict adherence of the PyPy / RPython development process to Test Driven Development (TDD). PyPy / RPython has roughly the same amount of code for the main system as for the tests. Although this would not have been my personal choice, it appears to have served the PyPy / RPython development process extremely well. It's impossible not to admire the astonishing evolution of the project and the concepts it has developed; TDD must have played an important part in this. Indeed, although the Converge VM has inevitably uncovered a few bugs in PyPy, they have been surprisingly few and far between. Unfortunately, the prioritisation of tests seems to have been at the expense of documentation. As I rapidly discovered – to the initial bemusement of the RPython developers – tests are a poor substitute for documentation, typically combining a wealth of low-level detail with a lack of any obvious high-level intent. In short, with so many tests, it's often impossible to work out what is really being tested or why. I certainly struggled with this lack of documentation: I suspect I could have shaved at least a third off of my development time if RPython was as well documented as other language projects. Fortunately the PyPy chaps are aware of this, and there are now open issues to resolve this, and I hope my experiences will feed in to that. As alluded to above, PyPy / RPython have, since their original conception, changed to a degree unmatched by any other project I can think of. The PyPy project started off as an The The massive evolution PyPy / RPython have taken also has implications for the implementation. In short, RPython does not just have an experimental past, it has an experimental present. The translator is littered with a vast number of assertions, many of which can be triggered by seemingly valid user programs. These are often hard to resolve: one is left wondering what a particular assertion has to do with the program that triggered it. Occasionally, if one is really lucky, an assertion has an associated comment which says something like Because every RPython program is also a valid Python program, RPython VMs can also be run using CPython or PyPy — The alternative is full translation. RPython is a Whole program translation may seem an odd decision, but it has a vital use: RPython uses Python as its compile-time meta-programming language. Basically, the RPython translator loads in an RPython VM and executes it as a normal Python program for as long as it chooses. Once that has finished, the translator is given a reference to the As a final matter, RPython is not just restricted to generating C: at various points it has also had JVM, CLR, and LLVM backends (though, at the time of writing, none of these is currently usable). RPython has thus tried to create its own type system to abstract away the details of these backends, not entirely successfully. This is not a fault unique to RPython. As anyone who's tried porting a C program to a number of platforms will attest, there is no simple set of integer types which works across platforms. Unfortunately, RPython's history of multiple backends and only semi-successful attempts to abstract away low-level type systems means that it has at least 5 type systems for various parts (some of which, admittedly, are hidden from the user). Not only does each have different rules, but the most common combination ( The futureRPython, to my mind, is an astonishing project. It has, almost single-handedly, opened up an entirely new approach to VM implementation. As my experience shows, creating a decent RPython VM is not a huge amount of work (despite some frustrations). In short: never again do new languages need come with unusably slow VMs. That the the PyPy / RPython team have shown that these ideas scale up to a fast implementation of a large, real-world language (Python) is another feather in their cap.An important question is whether the approach that RPython takes is so unique that it is the only possible tool one can imagine using for the job. As my experience with RPython has grown, the answer is clearly If you've got this far, congratulations: it's been a long read, I know! This article is so long because its subject is so worthy. I am a curmudgeon and I find most new developments in software to be thoroughly uninteresting. RPython is different. It's the most interesting thing I've seen in well over a decade. Exactly what its ramifications will be is something that only time can tell, but I think they will be two fold. First, I think new languages will suddenly find themselves able to compete well enough with existing languages that they will be given a chance: I hope this will encourage language designers to experiment more than they have previously felt able. Second, the Acknowledgements: The RPython developers have been consistently helpful and the new VM wouldn't have got this far without valuable help from Carl Friedrich Bolz and Armin Rigo in particular: Maciej Fijalkowski and others on the PyPy IRC channel have also been extremely helpful. Martin Berger, Carl Friedrich Bolz, and Armin Rigo also gave insightful comments on this article. Any remaining errors and infelicities are, of course, my own. Problems with Software 3: Creating Crises Where There Aren't AnyJune 28 2011 [This is number 3 in an occasional series ‘Problems with Software’; read the previous article here.]As an undergraduate in the late 1990s, the part of my course I liked the least was Software Engineering, which started with obviously silly 1970s practices, and went downhill from there. As someone who was already familiar with both programming and, to some extent, the software industry, I pooh-poohed the course; my arrogance was duly rewarded with the lowest mark of any part of my degree, which taught me a useful life lesson. One previously unfamiliar concept which this dreadful course introduced to me was the ‘software crisis’. First promulgated in the late 1960s and early 1970s, it captured the notion that software was nearly always over-budget, behind schedule, and of low quality. Those with an appreciation of history will know that most of the software technologies (programming languages, operating systems, and so on) we now take for granted took form around this time. As with any new field, this was an exciting time, with no technical precedent: none of its participants would have been able to predict the path forward. Rather, experimentation with new ideas and tools led to successes and failures; the latter received greater attention, leading to the idea of a ‘software crisis’. To the best of my knowledge, the ‘software crisis’ was the first wide-spread moment of self-doubt in the history of computing. It reflected the idea that the problems of the day were huge and, quite possibly, perpetual. It is this latter point that, with the benefit of hindsight, looks hopelessly naive: by the early 1990s (and, arguably, many years before), software was demonstrably and continually improving in quality and scope, and has continued doing so ever since. While the massive improvements in hardware certainly helped (by the mid 1990s, hardware imposed constraints only on the most demanding of software), it was surely inevitable that, as a community, we would become better at creating software, given experience. Mercifully, I have not found myself in a discussion about the ‘software crisis’ for 5 or more years; it seems finally to have been put to rest, albeit many, many years after it had any possible relevance. However it seems that, in software, we abhor a crisis vacuum: as soon as one has disappeared, others arise in its place. I've seen a few over the last few years including a ‘skills crisis’ and an ‘outsourcing crisis’, all minor problems at best. However, one ‘crisis’ has gained particularly widespread attention in the last few years: the ‘multi-core crisis’. A quick historical detour is instructive. Although – in accordance with Gordon Moore's famous observation – the number of transistors on a CPU has doubled every 2 years for the last 50 years, by the mid part of this decade, it was proving increasingly difficult to make use of extra transistors to improve the performance of single-core (i.e. traditional) processors. In short, the easy performance increases that the software industry had indirectly relied upon for many decades largely disappeared. Instead, hardware manufacturers used the extra transistors to produce multi-core processors. The thesis of the ‘multi-core crisis’ is that software has to make use of the extra cores, yet we currently have no general techniques or languages for writing good parallel programs. Assuming, for the time being, that we accept that this might be a crisis, there are two things which, I think, we can all agree upon. First, virtually all software ever written (including virtually all software written today) has been created with the assumption of sequential execution. Second, although we feel intuitively that many programs should be paralellisable, we do not currently know how to make them so. In a sense, both these statements are closely related: most software is sequential precisely because we don't know how to parallelise it. One question, which those behind the ‘multi-core crisis’ do not seem to have asked themselves, is whether this will change. For it to do so, we would need to develop languages or techniques which would allow us to easily write parallel software. Broadly speaking, only two approaches have so far seen wide-spread use: processes and threads. The processes approach breaks software up into chunks, each of which executes in its own largely sealed environment, communicating with other chunks via well-defined channels. It has the advantage that it is relatively easy to understand each process; and that processes can not accidentally cause each other to fail. On the downside, processes must be large: the approach is not suited to fine levels of granularity, ruling out most potential uses. The threads approach is based on parallel execution within a single process, using atomic machine instructions to ensure synchronisation between multiple threads. It has the advantage of allowing fine grained parallelisation. On the downside it is notoriously difficult to comprehend multi-threaded programs, and they are typically highly unreliable because of incorrect assumptions surrounding state shared between threads. In short, processes and threads are not without their uses, but they will never get us to parallel nirvana. A number of other approaches have been proposed but have yet to prove themselves on a wide scale. Transactional memory, for example, has serious efficiency problems and is conceptually troubling with respect to common operations such as input / output. There also seems to be an interesting tradeoff between parallel and sequential suitability: languages such as Erlang, which are, relatively speaking, good at parallelism, often tend to be rather lacking when it comes to the mundane sequential processing which is still a large part of any ‘parellel’ system. However, all this is incidental detail. The point about a crisis is not about whether we have solutions to it or not, but whether it is a real, pressing problem. The simple answer is, for the vast majority of the population, “no”. Most peoples computing requirements extend little beyond e-mail, web, and perhaps word processing or similar. For such tasks – be they on a desktop computer or running on a remote server – computers have been more than powerful enough since the turn of the century. If we were able to increase sequential execution speed by an order of magnitude, most people wouldn't notice very much: they are already happy with the performance of their processors, so increasing it again won't make much difference (most users will see far more benefit from an SSD than a faster processor). In those areas where better parallel performance would be a big win – scientific computing, for example, or computer games – people are prepared to pay the considerable premium currently needed to create parallel softwre. Ultimately, we may get better at creating parallel software, but most users' lives will tick on merrily enough either way. The ‘software crisis’ was bound to solve itself given time and the ‘multi-core crisis’ is no such thing: just because we've been given an extra tool doesn't mean that we're in crisis if we don't use it. Eventually (and this may take many years), I suspect people will realise that. But, I have little doubt, it will be replaced by another ‘crisis’, because that seems to be the way of our subject: no issue is too small not to be thought of as a crisis. The irony of our fixation on artificially created crises is that deep, long-term problems – which arguably should be considered crises – are ignored. The most important of these surely relates to security — too much of our software is susceptible to attack, and too few users understand how to use software in a way that minimises security risks. I have yet to come across any major organisation whose computing systems give me the impression that they are truly secure (indeed, I have occasionally been involved in uncovering trivial flaws in real systems). That to me is a real crisis — and it's one that's being exploited every minute of every day. But it's not sexy, it's not new, and it's hard: none of which makes it worthy, to most people, of being thought of as a crisis. Problems with Software 2: Failing to Use the Computing LeverJune 7 2011 [This is number 2 in an occasional series ‘Problems with Software’; read the previous article here.]We all know at least one and I make no apologies for the picture I now paint. Hunch backed, tongue out, index fingers locked rigidly into an uncomfortable ‘r’ shape, and with no sense of the appropriate level of force to put into their actions. The neanderthal figures I refer to are, of course, two fingered typists. In my experience, a typical two fingered typist will struggle to type 30 Words Per Minute (WPM). An average touch typist will manage at least 70 WPM; and for those prepared to put in a bit of effort in, 90-100 WPM is eminently achievable. Whether high typing speeds are useful depends on the person and the tasks they need to achieve. Agricultural workers, for example, probably won’t see a huge return on investment in typing skills. What astonishes me is that I know people who spend their entire working day in front of a computer, day in, day out, yet who still use this grossly inefficient technique. The slow typists I know easily lose a few hours every week due to their poor technique — yet suggesting that a few days spent learnt touch typing will quickly pay off in increased productivity is inevitably met with a reply of “I don’t have the time”. Poor typing technique is a concrete example of a common human tendency — to stop learning as soon as one has a way of performing a task, regardless of whether that is the best way. In typing terms, the slowest typist is perhaps 5 times slower than the fastest: in other areas of human activity, the ratio can be much greater. One of those fields is computing and, by implication, software. At this point, it is instructive to ask oneself a simple question: what is a computer? The standard answers run along the lines of “a CPU, some RAM, a disk”, “a keyboard and a monitor”, or “an operating system and user software.” While correct in their own way, such answers report the mechanisms involved without considering their purpose. The answer I prefer to this question is more abstract: computers are levers. Archimedes, the first person to explain how levers work, famously said “Give me a place to stand, and I will move the Earth.” When asked to prove this by his King, Archimedes used a lever to move a ship single-handedly, something impossible for even the strongest man to do unaided. One of the first real uses of a computer shows how effective a lever they can be. The semi-programmable British Colossus computer of the mid 1940s was able to perform thousands of comparisons per second, cracking the seemingly unbreakable German Enigma code. What the code-breakers of Bletchley Park had realised was that a computer could perform simple, repetitive tasks at a speed impossible for humans. Operations that had taken a group of people several weeks to complete could be done by Colossus in half an hour: such was Colossus’s speed that, once operational, it played a significant part in shortening the course of the war. It is little exaggeration to say that computers have developed into the longest levers available to man. Since Colossus, the rate of progress has been little short of astonishing, and computers have been used to do things that were previously unimaginable: from weather prediction to visual special effects, from DNA sequencing to search engines. Yet, a lever is only truly effective if used from its end: gripped near the pivot, a lever is, in effect, shortened, and its force magnifying effect reduced. Regrettably, most people grip the computing lever extremely close to the pivot. There are many reasons for this and enumerating them all would take a small book. However, a few examples give an indication of the problem.
Problems with Software 1: Confusing Problems Whose Solutions Are Easy to State With Problems Whose Solutions Are Easy to RealiseApril 19 2011 I love computing, in particular software: since I was young, I have devoted the major portion of my life to it. In return, I have gained access to a never‐ending source of challenges; met many great people; and had my ego smashed on the jagged rocks that are off‐by‐one errors. None of this makes me very well qualified to write an occasional series of articles onProblems with Software, but I will do my best. The spirit in which I write these is a positive one, perhaps akin to a parent admonishing a child for his own long‐term benefit. Of course, I am, and always be, a student of software, not its parent; searching, learning, and in awe of this ever‐evolving subject. With that in mind, let us begin. I start with what I consider to be the most common mistake in software; a mistake that no other subject I know of commits so frequently. It is the confusion of problems whose solutions are easy to state with problems whose solutions are easy to realise. A non‐computing example may help explain this. The problem is war: around the world, it ruins the lives of countless people. The easily stated solution is that we should stop people from fighting each other. Regretfully, we know that this solution, while correct, is not worth the space it takes on the page. Telling people they should stop fighting changes nothing, and physically stopping them requires resources far beyond that which can be mustered. None of this is to say that it is not worth tackling the problem of war: but most of us have enough common sense to realise that solutions (for there will surely need to be more than one) will be expensive, long‐term, and come with no guarantee of success. Alas, in software, such common sense is an uncommon thing. In our subject, unworkable solutions are frequently proposed to deep problems; funding is acquired; manpower deployed; and, after an exhausting death march, failure guaranteed. The problem most frequently invoked is the difficulty of creating good software. This clearly is a problem: too much software is expensive, unreliable, and ill‐suited to the task it was created for. The many easily‐stated solutions proposed have had, at best, a tiny effect; at worst, they have impeded progress by suppressing less exciting, but more realistic, truths. Two concrete examples exemplify this. The first example is Aspect Orientated Programming (AOP). The problem that AOP addresses is this: developing a program is hard because all of its constituents aspects — even those which are in no way related — must be considered together at all times. Imagine a health‐care system which records patient data and has separate modules for doctors and patients. One aspect is server communication; both doctor and patient units need to communicate with a central server and cope with certain error conditions. This aspect is jumbled up with, and scattered across, the rest of the program, making it hard to check that all server interactions will be performed as expected. The AOP solution is then simple: rather than developing software monolithically, it should be developed as separate aspects, which can be worked on in isolation, before being composed together to produce the end result. There is no doubt that the AOP solution is intuitive and appealing: the jumbled and scattered nature of current software is a curse, with many deleterious effects. I can remember when I first came across AOP. My response then is my response now (though my argument, I hope, is somewhat better stated these days): it can't work. Furthermore, a simple analogy shows why it can't work. Imagine a hill, composed, horizontally, of different rock strata. Strata can be taken out and put back in, without ever changing the inherent ‘hillness’ of the hill—a hill is still identifiably a hill even if one strata is taken out (though it might have a distinct kink in it) and each strata makes sense in and of itself, whether it is in the hill or not. In this analogy, aspects are the strata; for AOP to work, it is necessary to imagine being able to take a piece of software, pull out an aspect, and have that aspect be coherent in and of itself. It is a beautiful vision, and one which would change programming. Let us now take a different analogy. Instead of a hill, imagine a gravel drive, consisting of hundreds of thousands of tiny stones of varying colours: grey, honey, black, white, and so on. Even Don Quixote would blanch if I asked him to extract all the honey coloured stones, put them to one side, and then later to put them all back in the same place. The unfortunate reality is that software necessarily resembles a gravel drive—just as different coloured stones exist alongside each other, so do `aspects' of a program. The server communication code in the health care system is likely, for example, to form parts of nested The second example is Model Driven Architecture (MDA). For my sins, I played a small part in this monstrosity; in my defence I was young, naive, and needed the money, though such pitiful excuses are unlikely to help me come the day of reckoning. The problem that MDA aims to address is the following: developing software is hard because programming languages force one to work at a level of abstraction that is not much above that of machine code. Indeed, this is a problem; while a high‐level description of a business's requirements for its new system may be a few pages long, the corresponding software may have tens or hundreds of thousands of lines of code. Relating the mess of a low‐level program back to the high‐level requirements is a skill that few people successfully acquire. The MDA solution is then simple: development should be done with high‐level (mostly UML) diagrams. Common sense quickly suggests that this approach can never work. Diagrams are wonderful for expressing static relationships, but, overall, terrible for expressing most dynamic behaviour. Looping behaviour is elegantly captured by a textual What's most frustrating about AOP and MDA is that while both correctly identify real problems, the solutions they present are appealing in their simplicity yet self‐evidently unviable. Huge amounts of money and time have been invested in both, with little prospect of meaningful reward. Indeed, I am not sure that a good solution could exist for either problem: they seem to me largely intractable and inevitable. Of course, this is not to say that people shouldn't try to come up to solutions to these and other such problems—if whinging old codgers like me always had our way, we'd still be living in dark, damp caves and laughing at Barry from the cave next door while he works on his silly sounding ‘wheel’ concept. Ours is a ‘new’ subject, a vast, unexplored continent, and optimism is a vital component of the makeup of successful settlers. But as vital as optimism is realism—expending precious resources on hopeless causes is a guaranteed way to doom settlement. A little dose of common sense — thinking through a couple of common cases, at the very least — when prevented with a wonderful sounding solution to a hard problem would do our subject a lot of good. Other subjects seem to manage it, and I don't see why in software we can't too. Parsing: The Solved Problem That Isn'tMarch 15 2011 Parsing is the act of taking a stream of characters and deducing if and how they conform to an underlying grammar. For example the sentenceBill hits Benconforms to the part of the English grammar noun verb noun. Parsing concerns itself with uncovering structure; although this gives a partial indication of the meaning of a sentence, the full meaning is only uncovered by later stages of processing. Parseable, but obviously nonsensical, sentences like Bill evaporates Benhighlight this (the sentence is still noun verb noun, but finding two people who agree on what it means will be a struggle). As humans we naturally parse text all the time, without even thinking about it; indeed, we even have a fairly good ability to parse constructs that we've never seen before.
In computing, parsing is also common; while the grammars are synthetic (e.g. of a specific programming language), the overall idea is the same as for human languages. Although different communities have different approaches to the practicalities of parsing - C programmers reach for After the creation of programming languages themselves, parsing was one of the first major areas tackled by theoretical computer science and, in many peoples eyes, one of its greatest successes. The 1960s saw a concerted effort to uncover good theories and algorithms for parsing. Parsing in the early days seems to have shot off in many directions before, largely, converging. Context Free Grammars (CFGs) eventually Unfortunately, given the extremely limited hardware of 1960s computers (not helped by the lack of an efficient algorithm), the parsing of an arbitrary CFG was too slow to be practical. Parsing algorithms such as LL, LR, and LALR identified subsets of the full class of CFGs that could be efficiently parsed. Later, relatively practical algorithms for parsing any CFG appeared, most notably Earley's 1973 parsing algorithm. It is easy to overlook the relative difference in performance between then and now: the fastest computer in the world from 1964-1969 was the CDC6600 which executed at around 10 MIPS; my 2010 mobile phone has a processor which runs at over 2000 MIPS. By the time computers had become fast enough for Earley's algorithm, LL, LR, and friends had established a cultural dominance which is only now being seriously challenged - many of the most widely used tools still use those algorithms (or variants) for parsing. Nevertheless in tools such as ACCENT / ENTIRE and recent versions of The general consensus, therefore, is that parsing is a solved problem. If you've got a parsing problem for synthetic languages, one of the existing tools should do the job. A few heroic people - such as Terence Parr, Adrian Johnstone, and Elizabeth Scott - continue working away to ensure that parsing becomes even more efficient but, ultimately, this will be transparently adopted by tools without overtly changing the way that parsing is typically done. Language compositionOne of the things that's become increasingly obvious to me over the past few years is that the general consensus breaks down for one vital emerging trend: language composition.Compositionis one of those long, complicated, but often vague terms that crops up a lot in theoretical work. Fortunately, for our purposes it means something simple: grammar composition, which is where we add one grammar to another and have the combined grammar parse text in the new language (exactly the sort of thing we want to do with Domain Specific Languages (DSLs)). To use a classic example, imagine that we wish to extend a Java-like language with SQL so that we can directly write:
for (String s : SELECT name FROM person WHERE age > 18) {
...
}
Let's assume that someone has provided us with two separate grammars: one for the Java-like language and one for SQL. Grammar composition seems like it should be fairly easy. In practice, it turns out to be rather frustrating, and I'll now explain some of the reasons why.
Grammar compositionWhile grammar composition is theoretically trivial, simply squashing two grammars together is rarely useful in practice. Typically, grammars have a single start rule; one therefore needs to choose which of the two grammars has thestartrule. More messy is the fact that the chances of the two grammars referencing each other is slight; in practice, one needs to specify a third tranche of data - often referred to, perhaps slightly misleadingly, as glue- which actually links the two grammars together. In our running example, the Java-like language has the main grammar; the glue will specify where, within the Java-like expressions, SQL statements can be referenced. For those using old parsing algorithms such as LR (and LL etc.), there is a more fundamental problem. If one takes two LR-compatible grammars and combines them, the resulting grammar is not guaranteed to be LR-compatible (i.e. an LR parser may not be able to parse using it). Therefore such algorithms are of little use for grammar composition. At this point, users of algorithms such as Earley's have a rather smugger look on their face. Since we know from grammar theory that unioning two CFGs always leads to a valid CFG, such algorithms can always parse the result of grammar composition. But, perhaps inevitably, there are problems. TokenizationParsingis generally a two-phase process: first we break the input up into tokens ( tokenization); and then we parse the tokens. Tokens are what we call wordsin everyday language. In English, words are easily defined (roughly: a word starts and ends with a space or punctuation character). Different computer languages, however, have rather different notions of what their tokens are. Sometimes, tokenization rules are easily combined; however since tokenization is done in ignorance of how the token will later be used, sometimes it is difficult. For example, in SQL SELECTmight be a keyword but in Java it is also a valid identifier; it is often hard, if not impossible, to combine such tokenization rules in traditional parsers. Fortunately there is a solution: scannerless parsing (e.g. SDF2 scannerless parsing). For our purposes, it might perhaps better be called Fine-grained compositionIn practice, the simplegluementioned earlier used to combine two grammars is often not enough. There can be subtle conflicts between the grammars, in the sense that the combined language might not give the result that was expected. Consider combining two grammars that have different keywords. Scannerless parsing allows us to combine the two grammars, but we may wish to ensure that the combined languages do not allow users to use keywords in the other language as identifiers. There is no easy way to express this in normal CFGs. The SDF2 paper referenced earlier allows rejectproductions as a solution to this; unfortunately this then makes SDF2 grammars mildly context sensitive. As far as I know, the precise consequences of this haven't been explored, but it does mean that at least some of the body of CFG theory won't be applicable; it's enough to make one a little nervous, at the very least (not withstanding the excellent work that has been created using the SDF2 formalism by Eeclo Visser and others). A recent, albeit relatively unknown, alternative are boolean grammars. These are a generalization of CFGs that include conjunction and negation, which, at first glance, are exactly the constructs needed to make grammar composition practical (allowing one to say things like AmbiguityBy severely restricting what CFGs they accept, grammars which are compatible with traditional parsing algorithms (LL, LR etc.) are always unambiguous (though, as we shall see, this does not mean that all the incompatible grammars are ambiguous: many are unambiguous). Grammar ambiguity is thus less widely understood than it might otherwise have been. Consider the following grammar of standard arithmetic:
E ::= E "+" E
| E "-" E
| E "/" E
| E "*" E
Using this grammar, a string such as 2 + 3 * 4 can be parsed ambiguouslyin two ways: as equivalent to (2 + 3) * 4; or as equivalent to 2 + (3 * 4). Parsing algorithms such as Earley's will generate all possibilities even though we often only want one of them (due to arithmetic conventions, in this case we want the latter parse). There are several different ways of disambiguating grammars, such as precedences (in this example, higher precedences winin the face of ambiguity):
E ::= E "+" E %precedence 1
| E "-" E %precedence 1
| E "/" E %precedence 2
| E "*" E %precedence 3
This might suggest that we can tame ambiguity relatively easily: unfortunately, parsing theory tells us that the reality is rather tricky. The basic issue is that, in general, we can not statically analyse a CFG and determine if it is ambiguous or not. To discover whether a given CFG is ambiguous or not we have to try every possible input: if no input triggers an ambiguous parse, the CFG is not ambiguous. However this is, in general, impractical: most CFGs describe infinite languages and can not be exhaustively tested. There are various techniques which aim to give good heuristics for ambiguity (see Bas Basten's masters thesis for a good summary; I am also collaborating with a colleague on a new approach, though it's far too early to say if it will be useful or not). However, these heuristics are inherently limited: if they say a CFG is ambiguous, it definitely is; but if they can not find ambiguity, all they can say is that the CFG might be unambiguous.
Since theoretical problems are not always practical ones, a good question is the following: is this a real problem? In my experience thus far, defining stand-alone grammars for programming languages using Earley parsing (i.e. a parsing algorithm in which ambiguity is possible), it's not been a huge problem: as the grammar designer, I often understand where dangerous ambiguity might exist, and can nip it in the bud. I've been caught out a couple of times, but not enough to really worry about. However, I do not think that my experience will hold in the face of wide-spread grammar composition. The theoretical reason is easily stated: combining two unambiguous grammars may result in an ambiguous grammar (which, as previously stated, we are unlikely to be able to statically determine in general). Consider combining two grammars from different authors, neither of whom could have anticipated the particular composition: it seems to me that ambiguity is much more likely to crop up in such cases. It will then remain undetected until an unfortunate user finds an input which triggers the ambiguity. Compilers which fail on seemingly valid input are unlikely to be popular. PEGsAs stated earlier, unambiguous parsing algorithms such as LL and LR aren't easily usable in grammar composition. More recently, arediscoveredparsing approach has gathered a lot of attention: Packrat / PEG parsing (which I henceforth refer to as PEGs). PEGs are different than everything mentioned previously: they have no formal relation to CFGs. The chief reason for this is PEGs ordered choice operator, which removes any possibility for ambiguity in PEGs. PEGs are interesting because, unlike LL and LR, they're closed under composition: in other words, if you have two PEGs and compose them, you have a valid PEG. Are PEGs the answer to our problems? Alas - at least as things stand - I now doubt it. First, PEGs are rather inexpressive: like LL and LR parsing, PEGs are often frustrating to use in practise. This is, principally, because they don't support left recursion; Alex Warth proposed an approach which adds left recursion but I discovered what appear to be problems with it, though I should note that there is not yet a general consensus on this (and I am collaborating with a colleague to try and reach an understanding of precisely what left recursion in PEGs should mean). Second, while PEGs are always unambiguous, depending on the glue one uses during composition, the ordered choice operator may cause strings that were previously accepted in the individual languages not to be accepted in the combined language - which, to put it mildly, is unlikely to be the desired behaviour. ConclusionsIf you've got this far, well done. This article has ended up much longer than I originally expected - though far shorter than it could be if I really went into detail on some of these points! It is important to note that I am not a parsing expert: I only ever wanted to be a user of parsing, not - as I currently am - someone who knows bits and pieces about its inner workings. What's happened is that, in wanting to make greater use of parsing, I have gradually become aware of the limitations of what I have been able to find. The emphasis is ongradually: knowledge about parsing is scattered over several decades (from the 60s right up to the present day); many publications (some of them hard to get hold of); and many peoples heads (some of whom no longer work in computing, let alone in the area of parsing). It is therefore hard to get an understanding of the range of approaches or their limitations. This article is my attempt to write down my current understanding and, in particular, the limitations of current approaches when composing grammars; I welcome corrections from those more knowledgeable than myself. Predicting the future is a mugs game, but I am starting to wonder whether, if we fail to come up with more suitable parsing algorithms, programming languages of the future that wish to allow syntax extension will bypass parsing altogether, and use syntax directed editing instead. Many people think parsing is a solved problem - I think it isn't.
|
| Home > Technical Articles | e-mail: laurie@tratt.net github: ltratt twitter: @laurencetratt |
| Copyright © 1995-2012 Laurence Tratt | |