|Home > Technical articles||e-mail: email@example.com github: ltratt twitter: @laurencetratt|
February 8 2012
Updated (February 07 2013): If you enjoy this article, you may also find the more academically-inclined article The Impact of Meta-Tracing on VM Design and Implementation interesting.
[This is a long article because it covers a lot of ground that is little known by the computing mainstream, and easily misunderstood. If you're familiar with the area, you may find yourself wanting to skip certain explanatory sections.]
This need for a corresponding language implementation leads to what I think of as the language designer's dilemma: how much implementation is needed to show that a language design is good?
Finding a balance between these two points is difficult. When I was designing Converge, I came down on the side of trying to get the design right (though the end result has more than its fair share of mistakes and hacks). That decision had consequences, as I shall now describe.
Putting VM effort into contextI implemented two Virtual Machines (VMs) for Converge, both in C: the first was epically terrible; the second (introduced in Converge 1.0) merely awful. Even the second is extremely, sometimes almost unusably, slow: it is roughly 5-10x slower than CPython, tending towards the slower end of that range. This partly reflects my lack of experience and knowledge about VM implementation when I started; but it also reflects the fact that the Converge VM was a secondary concern to the language design. Despite that, I estimate that I spent around 18 man months on the second VM (intertwined with a lot of non-VM work). If this seems a lot of effort to spend on such a poor quality VM, it's worth bearing a few things in mind: Converge has a hard to optimise expression evaluation system based on Icon (see this paper for more details); C, whilst a fun language in many ways, doesn't lend itself to the fast writing of reliable programs; and most real-world VMs have had a lot more effort put into them.
As rough comparisons, CPython (the de-facto standard for Python, so named because it is written in C) has probably had a couple of orders of magnitude more effort put into it; and Java's HotSpot roughly three orders of magnitude more. It's not surprising that the second C Converge VM doesn't do well in such a comparison — in comparison, it's not so much a minnow as a single-celled organism.
What all this means is that the second C Converge VM is so slow that during demos of some of Converge's advanced features, I learned to make winding up gestures at the side of my laptop to amuse the audience. Even I was not sure whether its combination of features could ever be made adequately efficient.
Other approachesWhy did I write my own VM and not use someone else's? The choices are instructive.
The traditional route for big compilers (e.g. gcc) is to output machine code (perhaps via assembler). The output is efficient, but the compiler itself requires a large amount of effort. For example, simply learning the intricacies of a processor like the x86 sufficiently well to generate efficient code isn't for the faint of heart. In short, the amount of effort this approach demands is generally prohibitive.
An alternative to generating machine code directly is to generate C code and have that compiled into machine code. Several compilers have taken this approach over the years (from Cfront to the original Sather compiler). While still relatively difficult to do, it is certainly much easier than generating machine code directly. However it can often lead to a poor experience for users: not only must they pay the costs of double-compilation, but the translation typically loses a large quantity of useful debugging information (something which Converge pays special attention to). With few exceptions (one of which we'll see later), this approach is now rare.
Perhaps the obvious choice is to use an existing VM. The two Goliath's are the JVM (e.g. HotSpot) and Microsoft's CLR (the .NET VM). When I started work on Converge, the latter (in the form of Mono) didn't run on OpenBSD (my platform of choice), so was immediately discounted. HotSpot, however, remained a possibility because of its often stunning performance levels.
The reason that I couldn't use it isn't really HotSpot specific. Rather, it is something inherent to VMs: they reflect the languages, or group of languages, they were designed for. If a language fits within an existing VMs mould, that VM will probably be an excellent choice; if not, the semantic mismatch between the two can be severe. Of course, given sufficient will-power, any programming language can be translated to any other: in practise, the real issues are the ease of the translation and the efficiency of programs run using it. Two examples at opposite ends of the spectrum highlight this. Jython (Python running on the JVM) is a faithful implementation of Python; but even with the power of HotSpot behind it, Jython almost never exceeds CPython in performance, and is generally slower because some features (mostly relating to Python's highly dynamic customisability) do not lend themselves to an efficient implementation on the JVM. Scala, on the other hand, was designed specifically for the JVM — to ensure reasonable performance, Scala's language design has in some parts had to be compromised (e.g. due to type erasure).
Whether the semantic mismatch is manageable depends on the particular combination of language design and VM. Converge's unusual expression evaluation mechanism was enough on its own to rule out a practical JVM implementation (Jcon, an implementation of Icon for the JVM is still slower than the old C Icon interpreter, which itself is no speed demon). As Converge's development progressed, a number of other features (e.g. its approach to tracing information) have made it increasingly difficult to imagine how it could be practically wedged atop an existing VM.
A language for JITing VMs
What all the above means is that the options for implementing a language are generally unpalatable: they either require an undue amount of work, or compromises in language design, and, too often, both. My suspicion is that this status quo has severely inhibited programming language research: few groups have had sufficient resources to implement unusual languages well enough to prove them usable.
And then I came across PyPy. To be more accurate, after a few years of vaguely hearing of PyPy, 6 months ago I unexpectedly bumped into a PyPy developer who convinced me that PyPy's time has come. After porting PyPy to OpenBSD, I investigated further. What I've come to realise is that PyPy is two separate things:
Unfortunately the current literature uses
So, what is RPython? The obvious facts about it are that it is strict subset of Python whose programs are translated to C. Every RPython program is a valid Python program (which can be run using a normal Python interpreter), but not vice versa. However, RPython is suitably restricted to allow meaningful static analysis. Most obviously, static types (with a type system roughly comparable to Java's) are inferred and enforced. In addition, extra analysis is performed e.g. to assure that list indices don't become negative. Users can influence the analysis with
RPython would be of relatively little interest if all it did was subset Python and output C. Though a full programming language, RPython is unlikely to be the next
However, in addition to outputting optimised C code, RPython automatically creates a second representation of the user's program. Assuming RPython has been used to write a VM for language L, one gets not only a traditional interpreter, but also an optimising Just-In-Time (JIT) compiler for free. In other words, when a program written in L executes on an appropriately written RPython VM, hot loops (i.e. those which are executed frequently) are automatically turned into machine code and executed directly. This is RPython's unique selling point, as I'll now explain.
Traditional VM implementation
Because RPython is unique, it's easy to overlook what's interesting about it: it took me a couple of months of using it before I had built up an accurate understanding. Looking at the traditional approach to VM implementation is perhaps the easiest way to explain what's interesting about RPython.
First, let me make a couple of assumptions explicit. Languages like C are well suited to being translated directly to machine code, as they are little more than a thin layer over machine code: everything that is
Take a JITing VM such as HotSpot or V8. Such a VM will initially use an interpreter to execute the user's program. Unfortunately for us,
When, during execution, the interpreter in a VM such as HotSpot or V8 spots a frequently executed piece of code, it will hand that chunk of code off to a JIT compiler (along with information about the context within which that code is used) to convert it into machine code. The JIT compiler (henceforth just
Of course, all such problems can be solved with sufficient resources, but these explain in large part why major open source languages like Python and Ruby currently ship with JIT-less VMs (i.e. they have only an interpreter).
JITs for free
What RPython allows one to do is profoundly different to the traditional route. In essence, one writes an interpreter and gets a JIT for free. I suggest rereading that sentence again: it fundamentally changes the economics of language implementation for many of us. To give a rough analogy, it is like moving from manual memory management to automatic garbage collection.
RPython is able to do this because of the particular nature of interpreters. An interpreter, whether it be operating on bytecode or ASTs, is simply a large loop:
In essence, one need only add two function calls to an RPython program to add a JIT. The first function call (
Now is a good time to get an idea of how RPython generates a JIT from just an interpreter (what RPython calls the language interpreter). As said earlier, RPython automatically layers alongside C code a second representation of the interpreter (the tracing interpreter). The details of how the tracing interpreter is stored are irrelevant to us, except to note that it's in a form that a JIT can manipulate (conceptually it could be an AST-like structure). RPython's JIT is a tracing JIT. When a hot loop is detected, a marker is left such that the next time the loop is about to run, the JIT will enter tracing mode. During tracing mode, a complete execution of the loop is performed and all the actions it takes are traced (i.e. recorded) using the tracing interpreter (which is much, much slower than the language interpreter). After the loop has finished, the trace is then analysed, optimised, and converted into machine code. All subsequent executions of the loop will then call the machine code version. Since subsequent executions may diverge from the recorded trace, RPython automatically inserts guards into the machine code to detect divergence from the machine code version's capabilities. If a guard fails at any point, execution falls back to the interpreter.
At this point, it's worth taking a brief side-tour to introduce
Figure 1 shows a high-level example of a tracing JIT for a dynamically typed Python-esque language. Let us assume that the code in the first column is part of a hot loop that the tracing JIT decides is worth converting into machine code. On the next execution of the loop, the initial value of
The first thing the tracing JIT will create is a guard to ensure that the generated machine code is only executed if
While the example in Figure 1 gives a reasonable high-level idea about tracing JITs, it doesn't really explain how the trace is created. RPython badges itself as a meta-tracing system, meaning that the user's end program isn't traced directly (which is what Figure 1 suggests), but rather the interpreter itself is traced. Using the same example code from Figure 1, Figure 2 shows a snippet of the interpreter and the trace of the interpreter that leads to. This trace (though simplified somewhat to make it readable) is indicative of the traces that RPython introduces.
Hopefully the interpreter code in Figure 2 is mostly self-explanatory, being a simple-minded stack-based interpreter. One thing that needs explanation is the
Initially the trace might seem rather difficult to read, but if you think of it as a flattened record of all of the interpreters actions while executing the user's code, it becomes rather easier. The astute reader will notice that the traces are in Single Static Assignment (SSA) form: all assignments are to previously unused variables. While one would probably not want to write in this style, it has many advantages for optimisers, because it trivially exposes the data flow. In normal programs, we are often unable to determine whether a variable
With this knowledge, it should hopefully be fairly simple to see that the trace on the right is simply a record of the instructions the interpreter on the left performed while executing our example program. It's worth thinking about this relationship, as it's key to RPython's approach.
Once one has a handle on the trace, the sheer size of it should be a concern: there's a lot of stuff in there, and if it was converted to machine code as-is, one would get disappointing performance gains (around 40%). It is at this point that RPython's trace optimiser kicks in. I'll now try and give an idea of what RPython's trace optimiser can do.
The first thing that's obviously pointless about the above is the continual reading of bytecode instructions. If we start from a specific point in the program, and all the guards succeed, we know that the sequence of instructions read will always be the same in a given trace: checking that we've really got an
Our trace has now become a fair bit of smaller, but we need it to get a lot smaller still if we want good performance. Fortunately the SSA form of the trace now comes to the fore. We can follow the flow of operations on a given list l: if an
Our trace is now looking much smaller, but we can still make two further, easy optimisations. First, we know that if
Figure 5: The trace with type checks and constant additions folded away.
At last, we have a highly optimised trace which is suitable for conversion to machine code. Not only is it much smaller than the original trace, but it contains far fewer complicated, slow function calls: the resulting machine code will run massively faster than the original interpreter. Trying to produce small traces like this is one of the skills of writing a VM in a tracing JIT. The above example should give you a flavour of how this is done in RPython, though there are many low-level details that can make doing so difficult. As we shall see later, we often need to help the JIT to produce small traces.
A new Converge VMAfter becoming intrigued by the possibilities of RPython, I decided to use it to implement a new VM for Converge. To allow an apples-to-apples comparison, my initial goal was to maintain 100% compatibility with the old VM, so that the same bytecode could run on both the C and RPython versions of the VM. That goal wasn't quite met, although I came extremely close: for quite some time, bytecode for the RPython VM could be run on the C VM (but not vice versa).
First, it's probably useful to give some simple stats about the C VM that I was aiming to replace. It is about 13KLoC (thousand lines of code; I exclude blank lines and purely commented lines from this count). It contains a simple mark-and-sweep garbage collector that is mostly accurate but conservatively collects the C stack (so that VM code doesn't need be clever when dealing with references to objects). It implements full continuations at the C level, copying the C stack to the heap and back again as necessary (this turned out to be much more portable than I had initially feared), so that code at both the VM and Converge program level can be written in similar(ish) style. The VM is ported to a variety of 32 and 64 bit little endian systems: OpenBSD, Linux, OS X, Cygwin, and (native binary) Windows. Overall, the VM works, but is very slow, and, in parts, rather hard to understand. There are no precise records to determine the effort put into it, but I estimate it took between 12 and 24 man months — let's call it 18 man months.
I started work on the new VM on September 1st 2011. Before September 1st I had never used RPython, nor had anyone (to the best of my knowledge) outside the core PyPy group used RPython for a VM of this size (at the time, the Happy VM, which implements a subset of PHP, was the closest comparison). Though the Converge VM is obviously not a big VM, it is beyond merely a toy. It also has some unusual aspects (as touched upon earlier in this article) that make it an interesting test for RPython.
By December 19th 2011 I had a feature-compatible version of the Converge VM, in the sense that it could run every Converge program I could lay my hands on (which, admittedly, is not a huge number). After an initially slow period of development, mostly because of my unfamiliarity with RPython, progress became rapid towards the end. The resulting VM is about 5.5KLoC (compared to 13KLoC for the C VM). I estimate I was able to dedicate around half of my time to the VM during those 4 months (I started a new job on September 1st and then taught a course on a largely unfamiliar subject).
Although the two time estimates (18 man months for the C VM vs. 2-3 man months for the RPython VM) aren't fully comparable, they are useful. While many parts of the RPython VM were a simple translation from the C VM, that itself was partially a reimplementation of a previous C VM (though to a lesser extent). The RPython's VM structure is also substantially different than the C VM (it's far cleaner, and easier to understand), so some aspects of the translation were hardly simple. My best guess is that moving from C (a language which I enjoy a great deal, despite its flaws) to RPython was the single biggest factor. If nothing else, large amounts of the C VM involve faffing about with memory resizing; RPython, as a fully garbage collected language, sweeps all that under the carpet.
Status of the VMThe new Converge VM's source is freely downloadable as are binaries for most major platforms (other than Windows, at the time of writing). Eventually this VM will form part of a Converge 2.0 release, although more testing will be needed before it's reached that point. Before you form your opinions about the new VM, it's worth knowing what it is and isn't. It's not meant to be an industrial strength VM, at least not in its current form. Converge is a language for exploring compile-time meta-programming and domain specific languages. Some things which mainstream programming languages need to care greatly about (e.g. overflow checking; Unicode) are partly or wholly ignored. Such matters are a problem for a later day.
It's also worth knowing that I haven't spent a huge amount of time optimising the new VM. As soon as it got to
PerformanceSo, the RPython VM was created in roughly 1/6 the time it took to create the C VM. What is the performance like? This section will try and give a flavour of the performance, though please note that it's not totally scientific.
The December 19th version of Converge (git hash 84bb9d6064 if you wish to look at it) was already usefully faster than the old C VM. One of my simple benchmarks has long been to time
Looking at output from
How does it perform on more general types of code? In one sense, this is an impossible question, because no two people share the same definition of
I'll start with the
Figure 6: The Stone benchmark.
One thing worthy of note in Figure 7 is the better all-round performance of PyPy compared to Converge: it is substantially faster.
As a final benchmark, and an example of something which programmers need to do frequently, I chose something which neither the old or new Converge VM has had any sort of optimisations for: sorting. Part of the reason why I expected them to do badly is that neither optimises list accesses.
The terrible performance of the old Converge VM in Figure 8 surprised even me. The most likely explanation is that the large number of elements overloads the garbage collector: at a certain point, it can overflow its stack, and performance then degrades non-linearly. I was also surprised by the PyPy figures, with a larger than expected slowdown on the larger number of elements. This appears to be fixed in the nightly PyPy build I downloaded (which is very close to what will be PyPy 1.8): the timings were 0.21s and 2.22s for the small and large datasets respectively.
Although this section has contained a number of very hard figures, one should be careful about making strong claims about performance from such a small set of benchmarks. My gut feeling is that they are over-generous to the new Converge VM, mostly because there are many areas in the VM which have received no optimisation attention at all: if one of those was used repeatedly, performance would suffer disproportionately. I suspect that, rather than appearing to be much faster than CPython 2.7.2 (as above), its performance on a wider set of benchmarks would probably be on a more even par. Even so, that would still be a huge improvement on the old VM. The interesting thing is that most of the performance gains are from RPython: I only made a few relatively easy changes to increase performance, as we shall now see.
Optimising an RPython JITSome RPython VMs lead to a much more efficient JIT than others. The trace optimiser, while clever, is not magic and certain idioms prevent it working to its full potential. The early versions of the Converge VM were naive and JIT-unfriendly: interestingly, I found that a surprisingly small number of tactics hugely improved the JIT.
The first tactic is to remove as many instances of arbitrarily resizable lists as possible. The JIT can never be sure when appending an item to such a list might require a resize, and is thus forced to add (opaque) calls to internal list operations to deal with this possibility. Such calls prevent many optimisations from being applicable (and are relatively slow). When this was first pointed out to me, I was horrified: my RPython VM was fairly sizeable and used such lists extensively. Most noticeably, the Converge stack was a global, resizable list. After a little bit of thought, I realised that it's possible to statically calculate how much stack space each Converge function requires (this patch started the ball rolling). I was then able to move from a global resizable stack to a fixed-size stack per function frame (i.e. the frame created upon each function call; these are called continuation frames in Converge, though that need not concern us here). At this point, the relative ease of developing in a fairly high-level language became obvious. If I had tried to do such a far-reaching change in the C VM, it would have taken at least a week to do. In RPython, it took less than a day.
Some other arbitrarily resizable lists took a little more thought. After a while it became clear that, even though each function now had its own fixed-size stack, the global stack of function frames, stored of course in a resizable array, was becoming a bottleneck. That seemed hard to fix: unlike the stack size needed by a function frame, there is no way to statically determine how deeply function calls might nest. A simple solution soon presented itself: having each function frame store a pointer to its parent removed the need for a list of function frames (see this patch).
The second tactic is to tell the JIT when it doesn't need to include a calculation in a trace at all. The basic idea here is that when creating a trace, we often know that certain pieces of information are fairly unlikely to change in that context. We can then tell the JIT that these are
The interesting thing is that I haven't really spent that long optimising the Converge JIT: perhaps a man week in the early days (when I was trying to get a high level picture of RPython) and around two man weeks more recently. As a rough metric, I found that each JIT optimisation I was doing was giving me a roughly 5-10% speedup (though the per-function stack change was much more profitable): the cumulative effect was quite pronounced. Admittedly, I suspect that I've now picked most of the low-hanging fruit; improving performance further will require increasingly more drastic action (much of it in the Converge compiler, which is likely to prove rather harder to change than the VM). Fortunately, I'm adequately happy with performance as it is.
The end result of these optimisations is that the traces produced by the Converge VM are often very efficient: see this optimised trace (randomly chosen — I'm not even sure what piece of code it represents). What's astonishing is that between promoting values, eliding function calls, and optimising traces, many bytecodes now have little or no code attached to them.
What particularly amazes me is how one of Converge's most crippling features from an efficiency point of view, is now handled. Failure is how an Icon-base expression evaluation system allows limited backtracking. In my paper on Converge's Icon inheritance, I noted that the
Tracing JIT issuesTracing JITs are relatively new and have some limitations, at least based on what we currently know. Mozilla, for example, removed their tracing JIT a few months back, because while it's sometimes blazingly fast, it's sometimes rather slow. This is due to a tracing JIT optimising a single code-path at a time: if a guard fails, execution falls back to the (very slow) tracing interpreter for the remainder of that bytecode (which could be quite long), and then back to the language interpreter for subsequent bytecodes. Code which tends to take the same path time after time benefits hugely from tracing; code which tends to branch unpredictably can take considerable time to derive noticeable benefits from the JIT.
The real issue is that we have no way of knowing which code is likely to branch unpredictably until it actually does so. A real program which does this is the Converge compiler: in several points it walks over an AST, calling a function
In my limited experience, the inherent ability of a tracing JIT to inline code can exacerbate this issue. Consider the simple program in Figure 9. If the tracing JIT disables inlining, the trace looks as in the middle column: the call to
However, while inlining is generally a big win, it can sometimes be a big loss. This is chiefly due to the fact that inlining leads to significantly longer traces. Traces are slow to create (due to the tracing interpreter), so the longer we trace, the greater the overhead we impose on the program. If the trace is later used frequently, and in the exact manner it was recorded, the relative cost of the overhead will reduce over time. If the trace is little used, or if guards fail regularly within it, the overhead can easily outweigh the gains. Unfortunately, there's no obvious way to predict when inlining will be a win or loss, because we can't see into a program's future execution patterns. As a heuristic to somewhat counter this problem, RPython has an (adjustable) upper limit to traces: if too long, they are aborted, and the subsequent trace will turn inlining off. This helps somewhat, but what a good value for
With luck, future research will start to whittle away at tracing JITs weaknesses. However, it seems likely that, in the medium term at least, most hand-crafted VMs will remain method-based (referred to hereon as
How fast can it go?Something that's currently unclear is how fast one can reasonably expect an RPython VM to go. The best guide we currently have to the achievable speed of RPython VMs is PyPy itself. Although it seems that most of the easy wins have now been applied, it's still getting faster (albeit the rate of gains is slowing down), and, more importantly, is increasingly giving good performance for a range of real programs. The PyPy speed centre is an instructive read. At the time of writing PyPy is a bit over 5 times faster than CPython for a collection of real-world programs; for micro-benchmarks it can be a couple of orders of magnitude quicker.
It's clear that, in general, an RPython VM won't reach the performance of something like HotSpot, which has several advantages: the overall better performance of method-based JITs; the fact that it's hand-coded for one specific class of languages; and the sheer amount of effort put into it. But I'd certainly expect RPython VMs to get to comfortably within an order of magnitude performance levels as HotSpot. Time will tell, and as people write RPython VMs for languages like Java, we'll have better points of comparison.
RPython issuesFrom what I've written above, I hope you get some sense of how interesting and exciting RPython is for the language design and implementation community. I also hope you get a sense of how impressed I am with RPython. Because of that, I feel able to be frank and honest about the limitations and shortcomings of the approach.
The major problem anyone creating a VM in RPython currently faces is documentation or, more accurately, a lack of it. There are various papers and fragments of documentation on RPython, but they're not yet pulled together into a coherent whole. New users will struggle to find either a coherent high-level description or descriptions of vital low-level details. Indeed, the lack of documentation is currently enough to scare off all but the most dedicated of language enthusiasts. As such a dedicated enthusiast, I got a long way with
Part of the problem probably stems from the strict adherence of the PyPy / RPython development process to Test Driven Development (TDD). PyPy / RPython has roughly the same amount of code for the main system as for the tests. Although this would not have been my personal choice, it appears to have served the PyPy / RPython development process extremely well. It's impossible not to admire the astonishing evolution of the project and the concepts it has developed; TDD must have played an important part in this. Indeed, although the Converge VM has inevitably uncovered a few bugs in PyPy, they have been surprisingly few and far between. Unfortunately, the prioritisation of tests seems to have been at the expense of documentation. As I rapidly discovered – to the initial bemusement of the RPython developers – tests are a poor substitute for documentation, typically combining a wealth of low-level detail with a lack of any obvious high-level intent. In short, with so many tests, it's often impossible to work out what is really being tested or why. I certainly struggled with this lack of documentation: I suspect I could have shaved at least a third off of my development time if RPython was as well documented as other language projects. Fortunately the PyPy chaps are aware of this, and there are now open issues to resolve this, and I hope my experiences will feed in to that.
As alluded to above, PyPy / RPython have, since their original conception, changed to a degree unmatched by any other project I can think of. The PyPy project started off as an
The massive evolution PyPy / RPython have taken also has implications for the implementation. In short, RPython does not just have an experimental past, it has an experimental present. The translator is littered with a vast number of assertions, many of which can be triggered by seemingly valid user programs. These are often hard to resolve: one is left wondering what a particular assertion has to do with the program that triggered it. Occasionally, if one is really lucky, an assertion has an associated comment which says something like
Because every RPython program is also a valid Python program, RPython VMs can also be run using CPython or PyPy —
The alternative is full translation. RPython is a
Whole program translation may seem an odd decision, but it has a vital use: RPython uses Python as its compile-time meta-programming language. Basically, the RPython translator loads in an RPython VM and executes it as a normal Python program for as long as it chooses. Once that has finished, the translator is given a reference to the
As a final matter, RPython is not just restricted to generating C: at various points it has also had JVM, CLR, and LLVM backends (though, at the time of writing, none of these is currently usable). RPython has thus tried to create its own type system to abstract away the details of these backends, not entirely successfully. This is not a fault unique to RPython. As anyone who's tried porting a C program to a number of platforms will attest, there is no simple set of integer types which works across platforms. Unfortunately, RPython's history of multiple backends and only semi-successful attempts to abstract away low-level type systems means that it has at least 5 type systems for various parts (some of which, admittedly, are hidden from the user). Not only does each have different rules, but the most common combination (
The futureRPython, to my mind, is an astonishing project. It has, almost single-handedly, opened up an entirely new approach to VM implementation. As my experience shows, creating a decent RPython VM is not a huge amount of work (despite some frustrations). In short: never again do new languages need come with unusably slow VMs. That the the PyPy / RPython team have shown that these ideas scale up to a fast implementation of a large, real-world language (Python) is another feather in their cap.
An important question is whether the approach that RPython takes is so unique that it is the only possible tool one can imagine using for the job. As my experience with RPython has grown, the answer is clearly
If you've got this far, congratulations: it's been a long read, I know! This article is so long because its subject is so worthy. I am a curmudgeon and I find most new developments in software to be thoroughly uninteresting. RPython is different. It's the most interesting thing I've seen in well over a decade. Exactly what its ramifications will be is something that only time can tell, but I think they will be two fold. First, I think new languages will suddenly find themselves able to compete well enough with existing languages that they will be given a chance: I hope this will encourage language designers to experiment more than they have previously felt able. Second, the
Acknowledgements: The RPython developers have been consistently helpful and the new VM wouldn't have got this far without valuable help from Carl Friedrich Bolz and Armin Rigo in particular: Maciej Fijalkowski and others on the PyPy IRC channel have also been extremely helpful. Martin Berger, Carl Friedrich Bolz, and Armin Rigo also gave insightful comments on this article. Any remaining errors and infelicities are, of course, my own.
|Home > Technical Articles||e-mail: firstname.lastname@example.org github: ltratt twitter: @laurencetratt|
|Copyright © 1995-2012 Laurence Tratt|