October 17 2006
Recently I have heard several people musing about the difficulties of evolving a DSL. Actually, "musing" might be rather polite: I have heard several people arguing vehemently that creating a DSL inevitably leads to doom when the requirements for the DSL change. This is a very interesting point because, in my opinion, the ability of a DSL to evolve is critical.
Paul Hudak got a lot right in a sequence of papers he published on DSLs in the mid-1990s. Specifically, he notes that DSLs generally start by tackling a small problem, then need to grow as more aspects of the problem are tackled. [He also noted, and I think he may have been the first to articulate this so eloquently, that DSLs eventually tend to evolve into badly designed general purpose languages, but that's beyond the scope of this entry.] The way in which this evolution of requirements happens is almost always unpredictable, because it is the act of building and using the DSL that gives users the insight to change their requirements. In my mind, Hudak is entirely correct; I believe that the terms "DSL" and "need to evolve" go hand in hand.
The question one has to ask oneself is thus fairly simple: are DSLs hard to evolve? I think the answer is that, yes, today DSLs are hard to evolve. Why is this? Well, there are two types of DSL in common use. The first is standalone (often called "external") DSLs such as Make. The second is integrated (often called "internal") DSLs such as those that Hudak talks about, or those that are frequently talked about in conjunction with Ruby. These two types of DSL are fundamentally different. Standalone DSLs are flexible, but an awful lot of work to create, because in a sense they're a complete implementation of a mini-programming language. Integrated DSLs aren't very much work to create, but they tend to have an unhealthy coupling to their host language, which means they have many limitations imposed upon them both in terms of what they can express and how they can express it. The difference between most standalone and integrated DSLs is so severe that I sometimes wonder if the umbrella term "DSL" is entirely helpful, but that's an argument for another time.
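To make the distinction concrete, here is a minimal sketch of the same tiny dependency language written both ways. It is in Python purely for illustration; nothing in it is taken from Make, Ruby, Hudak's papers, or Converge. The "target: deps" rule syntax, the parse_rules function, and the Target class are all hypothetical names invented for this example.

```python
# Standalone (external) style: the DSL has its own syntax, so we have to
# write the parser ourselves. This hypothetical mini-language accepts lines
# of the form "target: dep1 dep2", plus blank lines and "#" comments.
def parse_rules(text):
    rules = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if ":" not in line:
            raise SyntaxError(f"line {lineno}: expected 'target: deps'")
        target, deps = line.split(":", 1)
        rules[target.strip()] = deps.split()
    return rules


# Integrated (internal) style: the DSL borrows the host language's syntax.
# The hypothetical Target class overloads "<<" to read as "depends on".
class Target:
    registry = {}  # shared table of target name -> list of dependencies

    def __init__(self, name):
        self.name = name

    def __lshift__(self, deps):
        Target.registry[self.name] = list(deps)
        return self


# The same two rules, written in each style.
external = parse_rules("""
# build rules
app: main.o util.o
main.o: main.c
""")

Target("app") << ["main.o", "util.o"]
Target("main.o") << ["main.c"]

# Both styles end up describing the same dependency table.
assert external == Target.registry
```

Even at this size the trade-off is visible. The standalone version controls its own syntax and error messages, but every new construct means more parser code. The integrated version is only a few lines, but it can only use the operators Python already provides, with Python's precedence and Python's error messages.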
The irony is that standalone and integrated DSLs have one thing in common: their implementations are typically hard to change. The reasons for this are rather different. Standalone DSLs have fairly large implementations, and are subject to all the problems that any non-trivial implementation suffers from; for example, the interaction between components is often complex and brittle. Integrated DSLs, on the other hand, are often small but rather hackish in nature; they frequently rely on stretching somewhat obscure language features to near breaking point. At some point, either the language feature can be stretched no further or, worse, the whole hackish facade comes crumbling down. Please don't get me wrong: I enjoy a cunning hack as much as the next man, but cunning hacks are not what I want to base a whole approach on.
In my opinion, for as long as DSLs are implemented in one of these two ways, they will always be hard to evolve. Therefore I agree with those who point out that, at the moment, implementing a DSL is an almost guaranteed way of giving oneself huge problems when evolution rears its inconvenient head. My argument is that the approaches I've outlined above are fundamentally flawed. What one wants is a way of implementing DSLs with relatively little code, but without relying on abuse of language features. Since small, well written programs are generally considered fairly evolvable, this should give one a reasonable chance of having DSLs that are evolvable. This has been one of my goals in the new version of Converge; it's far too early to tell if that's been achieved yet, but I'm fairly convinced already that this approach is, at the very least, no worse than the traditional approaches.