The Importance of Syntax

May 1 2005

For me, programming language syntax may well be the love that dare not speak its name. In the modern era we are often conditioned to believe that "syntax doesn't matter" - that we should only be interested in the underlying concepts that syntax allows us access to. Any particular syntax - sometimes called the concrete syntax - is malleable and can be easily exchanged for another without affecting the underlying meaning - the semantics - of what is being represented. I myself have said more than once that the syntax of various languages is largely irrelevant. Yet I also believe that a language's syntax is very important, and that getting it right is a large part of making a language usable. Somehow these two conflicting ideas manage to exist in my mind without rendering me incapable of functioning.

If one is being impartial, the first question one surely has to ask is: why would anyone say that syntax is unimportant? After all, syntax is our mechanism for reading and writing programs. Clearly, bad syntaxes can make our tasks much more difficult. I have pondered on this question for quite some time. I think there are two very different reasons which together explain this.

The first, and most obvious, is that syntax can become a very personal issue - based on a person's individual tastes and experiences - and conversations about different syntaxes invariably descend into pointless, circular trivialities. There is thus an implicit social contract that we avoid talking about particular syntaxes, to avoid bogging down more important debate (and to try to avoid setting people against each other). If you've ever had the misfortune to step into a 'tab vs. space' debate, or something similar, you will appreciate why such topics are generally best avoided. Note that this is essentially a socio-political decision, and I would say that, on the whole, it's a wise one.

The other reason why we say that syntax is unimportant is that we have so few good examples of it. Look at the widely used programming languages of today, for example. The C family of languages (including Java and C#) has a syntax so ugly that I struggle to believe that its own mother (which, in this case, probably means Ritchie and Thompson) could really love it. The proliferation of curly brackets is an eyesore, whilst the use of semi-colons harks back to the days of horribly primitive lexing and parsing technologies. Outside of programming languages, SQL has a very different style of syntax, but it's very difficult to write long SQL statements in such a way that they are easily read afterwards. I could go on, but I hope the reader gets the general picture.

Surprisingly, one rarely hears criticism of syntaxes such as C's and SQL's, because they have one saving grace: familiarity. We're familiar with them, and even though they're far from perfect, that's enough for us. About the only syntax that everyone feels happy to bash is Perl's. I too have been known to criticise Perl's syntax, but I have always maintained that as an example of what modern parsing technology can achieve it is an incredible feat. Unfortunately human beings' parsing technology has not improved appreciably over the past few decades and, for most of us, Perl's random collection of characters, whitespace-sensitive operators and so on is a recipe for disaster. The problem with the syntaxes we encounter is that because of their general mediocrity, or outright awfulness, it is very difficult for us to visualize what a good syntax might look like. Because of that, we accept what we have, and don't consider that there may be something better out there.

At this point, I must mention LISP's syntax; if I don't, someone else surely will. LISP is remarkable for its minimal syntax - it essentially only has a couple of grammatical constructs. LISP's minimal syntax has allowed some very powerful constructs to be built on top of it. LISP's macro system is a testament to the utility of the minimal syntax. Anyone who has used one of the myriad attempts at a proper macro system in C will be very aware that such systems fall hopelessly short of the target. For decades, LISP's syntax has allowed its users expressive power that is simply unavailable in other languages. From this, LISP's users infer that its syntax is inherently superior. I believe this not to be the case. The small fact that LISP fans often fail to acknowledge is that the power that LISP derives from its syntax is almost certainly a coincidence. I really don't think that McCarthy had macro systems in mind when he designed LISP (a thought confirmed by the fact that the original LISP paper proposed both S-Exps and M-Exps). I think instead that a language with a couple of grammar rules and minimal lexing needs was a good fit for the woefully underpowered, and difficult to develop for, systems of the day. It is to the LISP community's credit that they later managed to make use of decisions imposed by technical limitations, but I feel that they should be aware that they have often been guilty of making a virtue of necessity. LISP's minimal syntax is simply too minimal for most users, and any language with similarly minimal pretensions misses the fundamental lesson of LISP. That lesson is about what you can do with a language; it has little to do with its syntax. Template Haskell, for example, has a macro-esque system of equivalent power to LISP's, but on top of a language with a rich (if ghastly) modern syntax.
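To give a rough sense of how little machinery a "couple of grammar rules and minimal lexing" demands, here is a minimal sketch of an S-expression reader in Python. It is an illustration only, not any real LISP's reader: the function names (`tokenize`, `parse`, `read`, `expand_unless`) are hypothetical, atoms are left as plain strings, and the `unless` rewrite is a toy stand-in for a macro rather than a genuine macro system.

```python
def tokenize(text):
    # Lexing: only "(" and ")" are special, so pad them and split on whitespace.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    # Grammar: expr ::= atom | "(" expr* ")"
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing closing parenthesis")
        tokens.pop(0)  # discard the ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return token  # an atom, kept as a plain string in this sketch


def read(text):
    tokens = tokenize(text)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError("trailing input after expression")
    return tree


def expand_unless(tree):
    # Toy "macro": rewrite (unless c body) into (if (not c) body).
    # Because programs are just nested lists, this is ordinary list surgery.
    if isinstance(tree, list):
        tree = [expand_unless(item) for item in tree]
        if len(tree) == 3 and tree[0] == "unless":
            return ["if", ["not", tree[1]], tree[2]]
    return tree


program = read("(define (square x) (* x x))")
rewritten = expand_unless(read("(unless (ready) (wait))"))
```

The whole reader is a few dozen lines, and the program it produces is a plain nested list that other code can rewrite. The second point is the one that matters for macros, and it depends on having programs available as data rather than on the parentheses themselves.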

The only modern language whose syntax I believe has had any consideration of aesthetics and usability put into it is Python. Its elegant indentation structure is intuitive for those prepared to de-program their curly-bracket infested minds. Furthermore, good Python code is quite simply easy on the eye. Bugs often jump out of the page, instead of remaining buried under a wealth of parentheses. Unfortunately I have recently been forced to wonder whether the general quality of Python's syntax was not part of a grand plan, but a mere coincidence. Recent versions have seen increased syntactic cruft - the recent addition of function decorators means that some Python code can now end up looking positively Perl-esque. That is never a good thing, and perhaps matches a gut feeling I've had for a while, which is that Python reached its peak around version 1.4 or 1.5 and has been on a gentle slide ever since.
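For readers who have not met decorators, the following sketch shows the same wrapping written twice: once with an explicit rebinding after the function definition, which is how such wrapping had to be spelled before decorator syntax existed, and once with the `@` form. The names `counted`, `double_plain` and `double_decorated` are made up for this illustration.

```python
def counted(func):
    # Wrap func so that each call is tallied on the wrapper itself.
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


# Without decorator syntax: define the function, then rebind the name.
def double_plain(x):
    return 2 * x
double_plain = counted(double_plain)


# With decorator syntax: the same rebinding, announced by "@" above the def.
@counted
def double_decorated(x):
    return 2 * x


double_plain(3)
double_decorated(3)
```

The two definitions behave identically; the difference is purely one of concrete syntax. Whether the `@` line reads as a helpful announcement or as line noise, particularly once several are stacked above a single `def`, is exactly the sort of question of taste at issue here.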

I have no answers to offer on this topic. But I hope that one day we will put aside our prejudice against investigating syntax - and reluctance to try anything other than the same old C-style syntaxes - in the hope that new languages will reduce one barrier to entry, and to use. And to anyone who says syntax isn't important, may I point out the vast scale of the cosmetics industry? Like it or not, looks count.
