Timing setjmp, and the Joy of Standards

When designing a C application - the new version of the Converge VM - I had cause to wonder how I should encode exceptions. There are essentially two choices here:

  1. Each function returns a value which, if not NULL, signifies that an error occurred. [In other words, a simple variant of the standard UNIX mechanism, albeit one that doesn’t involve the monstrosity known as errno.]

  2. Using some setjmp / longjmp hackery to emulate the exceptions one finds in many modern OO languages.

The former has the advantage of familiarity, and of forcing the programmer to be very aware of every exceptional condition that can happen in a program. The disadvantage is that it terminally pollutes the API (virtually no function in the API can return a value unrelated to errors, because errors must be percolated back to the top of the call chain), and muddies code by forcing checks at every function call.
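
As a concrete illustration, here is a minimal sketch of this style; the `Error` type and the functions are hypothetical, and not Converge's actual API:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error type: NULL means success, non-NULL describes the error. */
typedef struct {
    const char *msg;
} Error;

static Error div_by_zero = { "division by zero" };

/* Because the return value is reserved for error reporting, the real result
   has to be passed back through an out-parameter. */
static Error *divide(int num, int denom, int *result) {
    if (denom == 0)
        return &div_by_zero;
    *result = num / denom;
    return NULL;
}

static Error *average(int total, int count, int *result) {
    Error *err = divide(total, count, result);
    if (err != NULL)
        return err;   /* percolate the error back up the call chain */
    return NULL;
}

int main(void) {
    int avg;
    Error *err = average(10, 0, &avg);
    if (err != NULL)
        fprintf(stderr, "error: %s\n", err->msg);
    return 0;
}
```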

setjmp / longjmp based exceptions do not suffer from the same disadvantages, and are particularly appealing when one needs to catch exceptions relatively rarely in proportion to the amount of code one has. However, such a scheme is not without potential problems. One that I was worried about concerned performance, as I would need to have exception handlers in some of the most performance critical parts of my code. I could find no mention in documentation or on the web to give me a hint in this matter, so I decided to do a very simple benchmark to determine whether a setjmp / longjmp based system would perform adequately.
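
To make the comparison more concrete, the following is a rough sketch of the kind of setjmp / longjmp scheme in question; the TRY / CATCH / THROW macros are purely illustrative rather than Converge's actual implementation, and this version is neither nestable nor thread safe:

```c
#include <setjmp.h>
#include <stdio.h>

/* A single, global exception context: enough to show the idea. */
static jmp_buf exc_env;

#define TRY   if (setjmp(exc_env) == 0)
#define CATCH else
#define THROW longjmp(exc_env, 1)

static int divide(int num, int denom) {
    if (denom == 0)
        THROW;   /* unwind straight back to the handler */
    return num / denom;
}

int main(void) {
    TRY {
        printf("10 / 2 = %d\n", divide(10, 2));
        printf("10 / 0 = %d\n", divide(10, 0));   /* never completes */
    } CATCH {
        fprintf(stderr, "caught: division by zero\n");
    }
    return 0;
}
```

Note that each TRY costs one call to setjmp, whereas longjmp only executes when an exception is actually thrown.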

The property I was most interested in measuring was the time taken for setjmp to run, as it would be called frequently in my proposed exception handling system, whereas longjmp would be called exceedingly rarely. The performance of longjmp is thus largely irrelevant for my purposes. I started by writing the two following programs, which would give me a very rough idea of how long setjmp takes to execute:

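In rough form, the two programs are as follows; t1 is reconstructed from the sigsetjmp variant shown later in this post, and t2 is assumed to be the same loop calling an empty function f instead of setjmp:

```c
/* t1: time 100,000,000 calls to setjmp (reconstructed). */
#include <setjmp.h>

int main() {
    int i;
    jmp_buf env;

    for (i = 100000000; i > 0; i--) {
        setjmp(env);
    }

    return 0;
}
```

```c
/* t2: time 100,000,000 calls to an empty function f (assumed form). */
void f() {
}

int main() {
    int i;

    for (i = 100000000; i > 0; i--) {
        f();
    }

    return 0;
}
```
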
The idea of these two programs is simply to time a number of ‘empty’ function calls (f) versus setjmp. Each program was run in the simplest possible fashion by executing gcc tx.c && time ./a.out and reading the ‘user’ output of time. When I ran t1 on my main OpenBSD box (a 2.8GHz P4) it took 23.73 seconds to run. When I ran t2 on the same machine it took 0.55 seconds to run. Thus setjmp appeared to take over 40 times longer than an ‘empty’ function call - far too slow for my needs.
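
That is, assuming the source files were named t1.c and t2.c:

```sh
gcc t1.c && time ./a.out   # time t1
gcc t2.c && time ./a.out   # time t2
```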

At this point I decided to quickly run the test programs on another OS, just to see if the figure was about the same. So I moved to a Sun server running Solaris that I have access to (which I seem to remember has 4 processors of approximately 500MHz each). On this machine t1 ran in an astonishing 2.9 seconds, with t2 only marginally faster at a smidgen under 2 seconds. I found the results replicated on a Linux box.

At this point I was confused. Why was OpenBSD’s setjmp so slow compared to other OSes? Given my lack of regard for documentation under Linux, the answer came, to my surprise, from the setjmp man page on the Linux box I had access to:

```
NOTES
     POSIX does not specify whether setjmp will save the signal
     context. (In SYSV  it  will  not. In BSD4.3 it will, and
     there is a function _setjmp that will not.)  If you want
     to save signal masks, use sigsetjmp.
```

In other words, OpenBSD's `setjmp` was actually doing a lot more work than its Linux or Solaris counterparts, since it was saving the signal mask. Worse, this discrepancy between operating systems appears to be perfectly legitimate in POSIX! Returning to OpenBSD's `setjmp` man page, the following suddenly made a lot more sense:

```
CAVEATS
     Historically, on AT&T System V UNIX, the setjmp()/longjmp() functions
     have been equivalent to the BSD _setjmp()/_longjmp() functions and do
     not restore the signal mask.  Because of this discrepancy, the
     sigsetjmp()/siglongjmp() interfaces should be used if portability is
     desired.
```
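
Pulling the two man pages together, the options boil down to something like the following sketch; plain setjmp is the only variant whose signal-mask behaviour differs between systems:

```c
#include <setjmp.h>

jmp_buf jb;
sigjmp_buf sjb;

int main(void) {
    setjmp(jb);          /* may or may not save the signal mask, depending on the OS */
    _setjmp(jb);         /* BSD extension: never saves the signal mask */
    sigsetjmp(sjb, 0);   /* POSIX: does not save the signal mask */
    sigsetjmp(sjb, 1);   /* POSIX: saves the signal mask */
    return 0;
}
```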


While I will admit that I am surprised at the huge amount of extra time saving the signal mask takes, I was happy to uncover the cause of the discrepancy. I was particularly happy because my program does not need to save or restore the signal mask; thus I could sidestep this lengthy operation without penalty.

Rewriting `t1` to the following:

```c
#include <setjmp.h>

int main() {
    int i;
    sigjmp_buf env;   /* sigsetjmp expects a sigjmp_buf rather than a jmp_buf */

    for (i = 100000000; i > 0; i--) {
        sigsetjmp(env, 0);
    }

    return 0;
}
```

forces `sigsetjmp` to do the same amount of work on all platforms. At this point `t1` under OpenBSD took a respectable 0.8 seconds - a relatively small slowdown of a factor of 1.6 over an 'empty' function call. This should be more than adequate for my purposes, since the time needed to call `sigsetjmp` is insignificant given the number of ordinary function calls made for each call to it.

There are two lessons I've taken away from this little exercise, both of which I think we all know, but of which I - at least - need reminding from time to time. Firstly, you can never be sure of anything until you measure it - I would never have guessed that saving and restoring the signal mask was such a time-consuming operation, for example. Secondly, standards don't always determine every aspect of an implementation's behaviour.
2005-07-21 08:00