In my opinion, what separates the men from the boys when it comes to programming is debugging. It doesn’t matter how good one is: bugs are an inevitable part of a programmer’s life. The difference in the amount of time it takes different people to notice a bug, track down its cause, and provide a fix can be quite amazing. Maybe I am a troglodyte, but I believe that the best advice I have seen on the subject of debugging is from Brian Kernighan, who once said that the best tool for debugging is printf and common sense. Frankly, I have never found debuggers of any practical use (apart from when using languages so crude that one needs a debugger to view a stack trace).
There is, however, one other technique in my debugging armoury, and it involves the humble grep utility. [For those unfamiliar with grep, it searches one or more files for lines matching a given regular expression.] I use it to hunt for every occurrence of a function name or data type in a large code base while trying to track down a problem. If I have a hunch about a possible problem, this often enables me to track down the offending calling code far faster than any other mechanism I am aware of. A very handy idiom that I use is the following, which finds every file containing Func and loads them all straight into my text editor:
grep -iRl Func . | xargs nedit
This does a case-insensitive (-i), recursive (-R) search and then prints out just the filenames of matching files (-l). Cunning uses of grep’s regular expressions can result in a very powerful debugging aid, which unfortunately seems to be severely underutilized by most people.
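For readers without grep to hand, the same "list the matching files" idea can be sketched in Python. This is only an illustration: the function name files_matching and its parameters are hypothetical, and it searches whole file contents rather than working line by line as grep does.

```python
import os
import re


def files_matching(pattern, root="."):
    """Yield paths under root whose contents match pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" lets binary files be scanned without raising
                with open(path, encoding="utf-8", errors="replace") as f:
                    text = f.read()
            except OSError:
                continue  # unreadable file: skip it and carry on
            if regex.search(text):
                yield path
```

Calling list(files_matching("Func")) gives roughly the list of names that the grep pipeline above hands to xargs.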
Despite my fondness for grep, I have always felt that it is lacking in one important regard: it cannot replace what it matches with another string. I have therefore long had a simple utility in my ~/bin directory of useful little programs, which was a simple wrapper around the sub function in Python’s regular expression library. It essentially did a recursive search through a list of files, replacing every match of the regular expression R with the string S. I have used this utility extensively for debugging and non-debugging purposes and it is incredibly useful. However, it is something of a crude tool. Experience has taught me that a regular expression often matches more things than one intended, and that it is therefore a very good idea to take a backup of all relevant data before running the utility.
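A minimal sketch of what such a wrapper might look like follows. It is not the original utility: the function name replace_in_files and the backup_suffix parameter are assumptions made for illustration, and the automatic backup stands in for the manual backup step described above.

```python
import re
import shutil


def replace_in_files(pattern, replacement, paths, backup_suffix=".orig"):
    """Apply re.sub to each file in place, backing up every file that changes."""
    regex = re.compile(pattern)
    changed = []
    for path in paths:
        # newline="" keeps the file's own line endings untouched
        with open(path, encoding="utf-8", newline="") as f:
            old = f.read()
        new = regex.sub(replacement, old)
        if new != old:
            shutil.copy2(path, path + backup_suffix)  # backup before overwriting
            with open(path, "w", encoding="utf-8", newline="") as f:
                f.write(new)
            changed.append(path)
    return changed
```

The returned list names only the files that were actually rewritten, which makes it easy to see whether the regular expression caught more than was intended.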
Recently I have had much cause to make use of my simple utility on an evolving code base. Continually backing up data and refining a regular expression until it matches only its intended target is highly repetitive and tedious, and in my experience anything that is repetitive and tedious leads, sooner or later, to boredom-induced errors. So I sat down and quickly cooked up a new variant of my utility which I have flippantly named srep (Search and REPlace). srep has one novelty of particular interest: rather than always directly modifying files, it can be told to produce output in a format acceptable to the patch utility, i.e. it outputs diffs. This has some interesting benefits:
- One can inspect the diff output of srep, and check that it has only matched against what was expected.
- One can edit any incorrect changes manually.
- The standard patch utility can be used to actually commit the changes verified by the user to the data in question.
- If for any reason the changes turn out not to be correct, running patch with the -R (‘reverse’) switch backs out the changes from the data in question.
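The idea of emitting a diff instead of touching the file can be sketched with Python’s standard difflib module. This is an illustration under stated assumptions, not srep’s actual code: the function name replacement_diff is hypothetical, and the sketch omits the file timestamps that a real diff header would carry.

```python
import difflib
import re


def replacement_diff(pattern, replacement, path, text):
    """Return a unified diff (as one string) of what re.sub would change in text."""
    new_text = re.sub(pattern, replacement, text)
    diff = difflib.unified_diff(
        text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=path,
        tofile=path,
    )
    # unified_diff yields nothing when the two inputs are identical
    return "".join(diff)
```

Because the original text is never modified, the result can be read, hand-edited, and only then passed to patch. An empty string means the pattern matched nothing.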
Here’s an example of using srep on a code base of C files. The following command executes srep on all .c and .h files in and below the current directory, and outputs a unified diff (-u) into the changes file.
find . | grep "\\.[ch]$" | xargs srep -u Con_Func_Obj Con_Func_Seg > changes
A fragment of the changes file is as follows (the full unified diff can be found here):
--- ./VM.c Sun Oct 2 14:31:38 2005
+++ ./VM.c Sun Oct 2 14:31:38 2005
@@ -183,12 +183,12 @@
 Con_Obj * Con_VM_apply(Con_EC_Obj *ec, Con_Obj *func)
 {
     jmp_buf env;
-    Con_Func_Obj *func_seg;
+    Con_Func_Seg *func_seg;
     Con_Obj *return_obj;
     if ((func->seg_c_class != ec->vm->builtins[CON_BUILTIN_FUNC_CLASS]))
         return NULL;
-    func_seg = (Con_Func_Obj *) func;
+    func_seg = (Con_Func_Seg *) func;
     if (func_seg->pc_type == PC_TYPE_C_FUNCTION) {
         if (sigsetjmp(env, 0) == 0) {
Once I have verified that the changes that will be made are what I expect, I can then apply this diff in the normal fashion:
patch -p0 < changes
srep has a useful variant on this, which is to output files in a unified diff but with additional output in the style of Tim Peters’s ndiff utility. The -n flag tells srep to produce a hybrid unified / ndiff patch such as the following fragment (the full hybrid diff can be found here):
--- ./VM.c Sun Oct 2 14:31:38 2005
+++ ./VM.c Sun Oct 2 14:31:38 2005
@@ -154,12 +154,12 @@
 Con_Obj * Con_VM_apply(Con_EC_Obj *ec, Con_Obj *func)
 {
     jmp_buf env;
-    Con_Func_Obj *func_seg;
?             ^^^
+    Con_Func_Seg *func_seg;
?             ^^^
     Con_Obj *return_obj;
     if ((func->seg_c_class != ec->vm->builtins[CON_BUILTIN_FUNC_CLASS]))
         return NULL;
-    func_seg = (Con_Func_Obj *) func;
?                         ^^^
+    func_seg = (Con_Func_Seg *) func;
?                         ^^^
     if (func_seg->pc_type == PC_TYPE_C_FUNCTION) {
         if (sigsetjmp(env, 0) == 0) {
What srep takes from ndiff is the lines beginning with ?, which show you which characters within a line are affected by the diff. This can be very useful when you are trying to visually track which intra-line changes will be made by applying a diff. Unfortunately the patch utility complains about such lines, so one cannot directly feed such a diff into patch. One can, however, use srep to automatically turn the changes diff into valid patch input by removing all lines starting with ?:
srep "^\\?.*?\n" "" changes
As this example suggests, if srep is run without either the unified (-u) or ndiff (-n) output options, it modifies files in situ.
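To see where the ? lines come from and what removing them amounts to, here is a small sketch using difflib.ndiff from Python’s standard library. Note that plain ndiff output prefixes lines with "- " and "+ " rather than using the hybrid format shown above, and that the use of re.MULTILINE here is an assumption about how a pattern like the one above would need to be applied so that ^ matches at the start of every line.

```python
import difflib
import re

old = ["Con_Func_Obj *func_seg;\n"]
new = ["Con_Func_Seg *func_seg;\n"]

# ndiff adds "?" hint lines marking the characters that differ within a line
hinted = "".join(difflib.ndiff(old, new))

# Remove every line beginning with "?"; MULTILINE makes ^ match at each line start
stripped = re.sub(r"^\?.*\n", "", hinted, flags=re.MULTILINE)

print(hinted)
print(stripped)
```

The first print shows the removed and added lines each followed by a ? line of carets under the changed characters; the second shows the same two lines with the hints gone.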
srep is what I would consider a hack in the best sense of that term. It’s basically a simple idea with a correspondingly simple implementation. I wouldn’t necessarily trust it not to eat my hard disk, although so far I’ve not had any problems; I can, however, tell you with some confidence that it is grossly resource-inefficient, so don’t expect it to run particularly fast. If you’re feeling a touch brave and would like to try out this potentially interesting way to debug and change programs, download the put-together-very-quickly version of srep and feel free to play. And if you think of any new uses for it, please let me know! Perhaps one day someone might make srep a fully fledged utility rather than the slightly ugly hack it currently is.
Updated (October 7 2005): Attributed the paraphrased “common-sense” quote to Kernighan.