multitime FAQ

  1. Can multitime guarantee accurate timings?
  2. What use is manipulating stdin?
  3. What’s the point of a batchfile?

Can multitime guarantee accurate timings?

No, and neither can any other timing program. multitime is designed to increase the chances of obtaining accurate timings, e.g. by executing commands multiple times and reporting means and standard deviations, and by randomly interleaving executions. However, the operating system and other tasks running on the system can significantly affect timing measurements. For example, multitime’s timings include the time taken to fork a process and execvp a command, both of which are entirely out of its hands. Short-running tasks can be particularly affected by seemingly minor blips in system activity.
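To get a feel for how much of a measurement can come from process creation alone, one can time a command that does almost nothing. The following is a minimal Python sketch, not multitime's own code; `spawn_timings`, `summarize` and `NOOP_CMD` are hypothetical names, and the no-op command is a Python interpreter, so its start-up cost is included as well.

```python
import statistics
import subprocess
import sys
import time

# Hypothetical near-empty command: a Python interpreter that does nothing.
NOOP_CMD = [sys.executable, "-c", "pass"]


def spawn_timings(cmd, runs=20):
    """Return the wall-clock seconds taken by each of `runs` executions of `cmd`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # The measured interval covers process creation, exec and exit,
        # not just the work the command itself performs.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return the mean and sample standard deviation of a list of timings."""
    return statistics.mean(timings), statistics.stdev(timings)
```

Calling `summarize(spawn_timings(NOOP_CMD))` gives a rough floor for that command on a given machine: timings of real commands that sit close to this floor are likely dominated by overhead and system noise.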

There are methods which can increase the likely accuracy of timing measurements. For example, raising the number of executions (and, depending on your circumstances, the sleep time between executions) reduces the likelihood of temporary blips distorting timing measurements. If comparing the execution times of multiple commands, the use of batchfiles can spread blips out rather than concentrating them on a single command. Increasing the process priority of multitime can decrease the likelihood of other tasks interfering with timings. Ultimately, however, there can never be absolute guarantees of accuracy. Instead, such methods should be thought of as increasing the likelihood that the numbers returned are indicative of the true values. By presenting means and standard deviations, multitime encourages the use of confidence intervals, a statistical technique well suited to this mode of thinking.
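As an illustration of turning a mean and standard deviation into a confidence interval, here is a minimal Python sketch. It assumes a normal approximation and treats the reported standard deviation as a sample standard deviation; neither assumption comes from multitime itself, and `confidence_interval` is a hypothetical helper. For small numbers of runs, a Student's t quantile would give a somewhat wider (more cautious) interval.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, std_dev, runs, level=0.95):
    """Approximate confidence interval for the true mean of `runs` timings."""
    # Two-sided quantile of the standard normal distribution (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The standard error of the mean shrinks with the square root of the run count.
    half_width = z * std_dev / math.sqrt(runs)
    return mean - half_width, mean + half_width


# Values taken from the "real" row of the first sort example in this FAQ
# (mean 0.384, standard deviation 0.011, 10 runs).
low, high = confidence_interval(0.384, 0.011, 10)
```

With those inputs the interval is roughly 0.377 to 0.391 seconds. If the intervals of two commands overlap heavily, the difference between their means may not be meaningful.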

What use is manipulating stdin?

If you are executing a command once (as traditional time does), there is no use in manipulating stdin. If, however, you are timing a command which reads from stdin and you are executing it multiple times, then controlling stdin is vital. Without it, the first execution would typically consume everything available on stdin, leaving later executions with nothing to read.

multitime’s -i switch lets you specify a command whose output is fed to the timed command’s stdin, so you control precisely what the timed command sees on each execution. For example, if you want to execute sort 10 times on a single piece of data and time how long it takes:

$ multitime -i "cat /tmp/data" -n 10 sort
===> multitime results
1: sort
            Mean        Std.Dev.    Min         Median      Max
real        0.384       0.011       0.373       0.382       0.401
user        0.295       0.029       0.230       0.295       0.340
sys         0.051       0.022       0.020       0.045       0.080

If, instead, you want to execute sort 10 times on different pieces of random data, all of the same size:

$ multitime -i "jot -r 1000000 1 100000" -q -n 10 sort
===> multitime results
1: sort
            Mean        Std.Dev.    Min         Median      Max
real        0.386       0.013       0.372       0.379       0.409
user        0.284       0.028       0.240       0.290       0.340
sys         0.060       0.022       0.040       0.050       0.110

Combining -I (an unfortunate name, perhaps, but the precedent for this particular choice of name was set long ago by xargs) and -i allows you to execute sort 10 times on 10 different files named data1 to datan (with n being 10 here):

$ multitime -I{} -i "cat /tmp/data{}" -n 10 sort

Note that multitime’s randomization means that the order in which the data files are passed can’t be predicted: data7 could be followed by data2 and so on.
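The idea of giving every execution its own stdin can be sketched in a few lines of Python. This is only an illustration of the concept, not how multitime is implemented: `time_with_fresh_stdin`, `make_input` and `READER_CMD` are hypothetical names, the runs are not randomized, and the time spent writing the data into the pipe is included in each measurement.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a stdin-reading command such as sort.
READER_CMD = [sys.executable, "-c", "import sys; sys.stdin.read()"]


def time_with_fresh_stdin(cmd, make_input, runs):
    """Run `cmd` `runs` times, supplying fresh stdin contents to each run.

    `make_input(i)` returns the bytes for run number i (counting from 1).
    """
    timings = []
    for i in range(1, runs + 1):
        data = make_input(i)  # prepared before the clock starts
        start = time.perf_counter()
        # Each run gets its own copy of the input, so no run finds stdin
        # already drained by an earlier one.
        subprocess.run(cmd, input=data, stdout=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return timings
```

For instance, passing `["sort"]` as `cmd` and a `make_input` that reads `/tmp/data1`, `/tmp/data2` and so on would loosely mirror the -I example above, minus the random ordering.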

What’s the point of a batchfile?

Let’s say you want to compare the timings of the commands X, Y, and Z over 30 runs. X and Y run fairly fast; Z runs fairly slowly. You use multitime to run X 30 times, then Y 30 times, and finally Z 30 times. In theory, this should give you fairly accurate data which would allow comparisons to be made.

Imagine that you kick the timings off at 10pm. However, the machine you run this on has a cron job scheduled at 3am. The cron job might run for all 30 of Y’s iterations, slowing it down substantially and making any comparison of its timings with X and Z worthless. Of course, you probably won’t realise any of this has happened: the mins and maxes for Y will appear perfectly sensible.

Batchfiles allow a single invocation of multitime to randomly interleave executions of X, Y, and Z. If we have this batchfile:

$ cat batch
X
Y
Z

and then run:

$ multitime -b batch -n 30 -v

(i.e. using the -v switch to see which commands are being run) we will see something like:

Y
Y
X
Z
Y
Z
X

and so on. By randomly interleaving the execution of each command, temporary blips such as the imaginary cron job are less likely to affect timings and, if they do, they’re more likely to be noticed.
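The effect of interleaving can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: every run is treated as occupying one equally sized slot (which ignores Z being slower), the blip covers a contiguous range of slots, and the shuffle is just one plausible way to randomize; how multitime actually randomizes is not described here. All function names are hypothetical.

```python
import random
from collections import Counter


def sequential_schedule(commands, runs):
    """All runs of the first command, then all runs of the second, and so on."""
    return [cmd for cmd in commands for _ in range(runs)]


def interleaved_schedule(commands, runs, seed=None):
    """The same runs as the sequential schedule, but in a random order."""
    schedule = sequential_schedule(commands, runs)
    random.Random(seed).shuffle(schedule)
    return schedule


def hit_by_blip(schedule, start, length):
    """Count, per command, the runs falling in slots [start, start + length)."""
    return Counter(schedule[start:start + length])
```

With `sequential_schedule(["X", "Y", "Z"], 30)`, a blip covering slots 30 to 59 hits all 30 runs of Y and nothing else. With `interleaved_schedule`, the same blip still hits 30 runs, but they are spread across X, Y, and Z, so no single command absorbs the whole disturbance.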