Starting with extsmail I seem to have developed the accidental hobby of occasionally developing new tools in the Unix tradition. In this post I’m going to introduce snare, a minimalistic GitHub webhooks runner daemon for those cases where something like GitHub Actions doesn’t give you sufficient control. I’ve been using snare in production for over 2 years for tasks ranging from running CI on specific hardware, to sending out email diffs, to automatically restarting Unix daemons when their configuration has been changed. It’s a simple, but now stable, tool [1].
What is a webhooks runner? Well, whenever something happens to a GitHub repository – for example, a new PR is raised, or a commit is pushed to a PR – GitHub can send an HTTP request to a specified URL informing it of what happened. Configuring a webhook for a given GitHub repository is relatively simple: go to that repository, then Settings > Webhooks > Add webhook.
snare is a simple program which listens for GitHub webhook events and runs
Unix shell commands based on them. You need a server of your own to run snare
on. When you add a GitHub webhook, you then need to specify
http://yourmachine.com:port/ [2], a secret (in essence, a shared
password between GitHub and snare) and then choose which events you wish GitHub
to deliver. For example, the default “Just the push event” works well if you
want to send email diffs whenever someone pushes commits to a repository.
snare needs a configuration file to tell it what to do when an event comes
in. A very simple snare.conf file looks as follows [3]:
    listen = "<ip-address>:<port>";
    github {
      match ".*" {
        cmd = "somecmd";
        secret = "<secret>";
      }
    }
In essence, snare will listen on <ip-address>:<port> for webhook events,
verifying that they were created with the secret <secret>. Each request is
relative to a repository: match blocks match against an “owner/repository”
string using Rust’s regex crate for regular expressions. Thus ".*" matches
against any repository and the Unix shell command somecmd will be run when
(any) event is received for that repository.
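To make the role of the secret concrete, here is a minimal sketch of the check that a webhook receiver performs. GitHub documents that it signs each delivery by computing an HMAC-SHA256 of the request body with the shared secret, and sends the result in the X-Hub-Signature-256 header as "sha256=<hex>". The helper names hmac_sha256_hex and signature_matches are hypothetical, the sketch assumes the openssl command line tool is installed, and it is not snare's own code (this post does not say how snare implements the check).

```shell
#!/bin/sh
# Sketch only: hmac_sha256_hex and signature_matches are hypothetical helpers,
# not part of snare. Assumes the openssl command line tool is available.

# Print the hex HMAC-SHA256 of stdin, keyed with the secret in $1.
hmac_sha256_hex() {
    # openssl prints "<label>= <hex>"; strip everything up to the "= ".
    # Caveat: passing the secret as an argument exposes it in the process list.
    openssl dgst -sha256 -hmac "$1" | sed 's/^.*= *//'
}

# Succeed if the header value in $3 (e.g. "sha256=ab12...") is the signature
# of the payload stored in file $2 under the secret $1.
signature_matches() {
    expected="sha256=$(hmac_sha256_hex "$1" < "$2")"
    # Caveat: this is a plain string comparison, not a constant-time one.
    [ "$expected" = "$3" ]
}
```

A request whose signature does not match was not created with the shared secret and should be rejected before any command is run.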
Let’s imagine we want to send out email diffs when someone pushes to the
repositories “owner/repo1” or “owner/repo2”. We might create a github
block along the lines of the following:
    github {
      match "owner/repo[12]" {
        cmd = "ghemaildiff %o %r %e %j email@example.com";
        secret = "<secret>";
      }
    }
This only matches against the particular repositories we wanted to match. The
command we’re now going to execute is called ghemaildiff (I’ll show an example
of this below) and it takes five or more arguments: the repository’s owner
(%o), name (%r), the GitHub event type (%e), a path to the full JSON of the
GitHub event (%j), and one or more email addresses to send diffs to. As you’ve
probably guessed, snare searches for text like %e and replaces it with other
text; %% escapes percentage characters, should you need to do so.
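The substitution described above can be imitated in a few lines of shell, which is handy for checking what a cmd string will turn into before putting it in snare.conf. This is only a sketch of the idea, not how snare does it internally: expand_cmd is a hypothetical helper, and it assumes the values contain none of the characters "|", "&" or "\", which sed would treat specially.

```shell
#!/bin/sh
# Sketch only: expand_cmd is a hypothetical helper that mimics the documented
# %o / %r / %e / %j / %% substitutions. It is not snare's implementation.
# Usage: expand_cmd <template> <owner> <repo> <event> <json-path>
expand_cmd() {
    # Park "%%" behind a control character first, so that "%%o" ends up as a
    # literal "%o" rather than being expanded to the owner.
    sentinel=$(printf '\001')
    printf '%s\n' "$1" | sed \
        -e "s/%%/$sentinel/g" \
        -e "s|%o|$2|g" \
        -e "s|%r|$3|g" \
        -e "s|%e|$4|g" \
        -e "s|%j|$5|g" \
        -e "s/$sentinel/%/g"
}
```

For instance, expand_cmd 'ghemaildiff %o %r %e %j email@example.com' owner repo1 push /tmp/event.json prints the command line with each modifier replaced; the JSON path here is made up for the example.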
One of the big problems when executing commands like this is when something
goes wrong – it’s easy for the error to sit unnoticed in a log. Instead,
snare allows one to add an errorcmd, which is very similar to cmd, except a)
it’s only executed when cmd fails b) it has an additional %s modifier, which
is a path to a file with the stdout / stderr of the failed command. I typically
use it as follows:
    github {
      match "user/repo[12]" {
        cmd = "ghemaildiff %o %r %e %j email@example.com";
        errorcmd = "cat %s | mailx -s \"snare error: github.com/%o/%r\" email@example.com";
      }
    }
so that if executing a command fails, I’m sent an email that helps me debug the problem.
Security
For most purposes, the example configuration above is enough to use snare in
anger. However, any program which takes input from a network and runs commands
based on it is a security risk. snare tries to reduce these worries by
rejecting incoming requests if any part of the input isn’t exactly as expected.
The % escape sequences available to cmd are guaranteed to:
- satisfy the regular expression [a-zA-Z0-9._-]+
- not be the strings “.” or “..”.
This means that the escape sequences are safe to use as shell arguments and/or to be included in file system paths.
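The two guarantees above are easy to reproduce in your own scripts, for example if a script is also invoked by hand or takes names from the JSON file rather than from snare's escape sequences. The following is a minimal sketch; is_safe_component is a hypothetical helper name, not something snare provides.

```shell
#!/bin/sh
# Sketch only: is_safe_component is a hypothetical helper that applies the
# same two rules to an arbitrary string.
# The function body is a subshell, so setting LC_ALL does not leak out; the
# C locale keeps the a-z / A-Z ranges restricted to ASCII.
is_safe_component() (
    LC_ALL=C
    case "$1" in
        # Reject the empty string and the special directory names.
        '' | . | ..) exit 1 ;;
        # Reject anything containing a character outside [a-zA-Z0-9._-].
        *[!a-zA-Z0-9._-]*) exit 1 ;;
    esac
    exit 0
)
```

A script might then start with something like: is_safe_component "$1" || exit 1.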
However, the user still has to be thoughtful about the commands they run, which boils down to:
- All input (including JSON files) must be treated as potentially suspect: I urge you to accept input only if it precisely matches the format you expect, rather than merely rejecting input if it does something that you happen to recognise as unexpected or bad. The problem with the latter approach is that it’s easy to overlook things that will subsequently turn out to be bad. Put another way: it’s better to be overstrict and relax later.
- Use at least set -euf (and perhaps more) in shell scripts so that errors in subcommands cause your script to immediately terminate rather than limp on in a way that you almost certainly didn’t anticipate.
- Think carefully about who can cause an event to be triggered: for example, if you run webhooks when a pull request is merged, can someone outside your organisation cause a merge to occur?
- If a command fails, think about whether your errorcmd (if you have one) can unintentionally leak private information.
I am deliberately making the above scary sounding, because I want to emphasise that you need to use snare in “don’t trust until proven trustworthy” mode. If you do so, I believe that snare can be used in a way that is wholly secure.
An example command
You can execute whatever command you want with snare, but here’s an example
ghemaildiff script which creates simple, but useful, diffs which it sends via
email:
    #! /bin/sh

    set -euf

    if [ $# -lt 5 ]; then
        echo "Usage: ghemaildiff <owner> <repository> <event> </path/to/JSON> <email_1> [...<email_n>]" >&2
        exit 1
    fi

    # We only generate diffs for push events
    if [ "$3" != "push" ]; then
        exit 0
    fi

    # Extract the commit IDs, stripping the quotes jq puts around JSON strings
    before_hash=$(jq .before "$4" | tr -d '"')
    after_hash=$(jq .after "$4" | tr -d '"')
    # The JSON is untrusted: because of set -e, the script stops here unless
    # both IDs consist solely of hexadecimal digits
    echo "$before_hash" | grep -E "^[a-fA-F0-9]+$" > /dev/null 2>&1
    echo "$after_hash" | grep -E "^[a-fA-F0-9]+$" > /dev/null 2>&1

    owner=$1
    repo=$2
    # Drop the first four arguments so that only email addresses remain
    shift ; shift ; shift ; shift

    git clone "https://github.com/$owner/$repo" repo
    cd repo
    for email in "$@"; do
        git log --reverse -p "$before_hash..$after_hash" | mail -s "Push to $owner/$repo" "$email"
    done
Notice that this script doesn’t check inputs which snare has already validated
(e.g. $1 is snare’s %o and has thus already been validated as a sensible input)
but is careful to check that the git commit IDs extracted via jq satisfy a very
narrow regular expression before passing them on as shell arguments.
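If you want to be stricter still about commit IDs, the check can be narrowed further. The sketch below is a variation on the script's grep check, not a replacement the script requires: is_commit_id and is_null_commit are hypothetical helper names, and the fixed length of 40 assumes SHA-1 repositories. GitHub reports an all-zero before ID when a push creates a new branch, and a script may want to skip that case since there is no earlier commit to diff against.

```shell
#!/bin/sh
# Sketch only: is_commit_id and is_null_commit are hypothetical helpers.

# Succeed only if $1 is exactly 40 hexadecimal digits (a full SHA-1 ID).
is_commit_id() {
    case "$1" in
        # Any character that is not a hex digit disqualifies the string.
        # The digits are spelled out so that no locale can widen a range.
        *[!0123456789abcdefABCDEF]*) return 1 ;;
    esac
    [ "${#1}" -eq 40 ]
}

# Succeed if $1 is the all-zero ID used for "no previous commit".
is_null_commit() {
    [ "$1" = "0000000000000000000000000000000000000000" ]
}
```

In a script running under set -euf, a line such as is_commit_id "$before_hash" || exit 1 rejects anything unexpected.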
Advanced configuration
As you can see from the snare.conf man page, snare doesn’t have a huge number of configuration options. That’s deliberate, because I wanted to keep snare simple: snare doesn’t even provide a builtin way to fetch a repository! However, there are two additional configuration tricks that are worth knowing about.
When a request comes in, snare “executes” all the match statements in the
config file, from top to bottom: later settings override earlier settings [4].
This allows the user to set, or override, defaults in a predictable manner.
Indeed, snare inserts an implicit match block before the user’s configuration:
    match ".*" {
      queue = sequential;
      timeout = 3600;
    }
I’ll explain queue shortly; the timeout is 1 hour. If, for example, the user
has this configuration file:
    github {
      match ".*" {
        cmd = "somecmd";
        errorcmd = "cat %s | mailx -s \"snare error: github.com/%o/%r\" abc@def.com";
        secret = "sec";
      }
      match "a/b" {
        errorcmd = "lpr %s";
      }
    }
then the following repositories will have these settings:
    a/b:
      queue = sequential
      timeout = 3600
      cmd = "somecmd"
      errorcmd = "lpr %s"
      secret = "sec"
    c/d:
      queue = sequential
      timeout = 3600
      cmd = "somecmd"
      errorcmd = "cat %s | mailx -s \"snare error: github.com/%o/%r\" abc@def.com"
      secret = "sec"
You can override settings as many times as you want in a file: it’s a powerful technique!
By default, snare queues requests for any given repository and only executes
the next in the queue when the previous command has finished. This is a safe
default, but can lead to undue work and delay, particularly for repositories
with significant activity. There are two other queue modes. queue = parallel
executes requests in parallel to each other. I’ve not used this much myself,
but there are obvious use cases for it. In contrast, I use queue = evict
extensively: it means a
repository has a maximum queue length of 1, with any new request coming in
replacing the existing queue entry (if it exists). For example, we have many
webhooks which build documentation for a repository after a pull request is
merged. If several pull requests are merged in quick succession (which is
common), there’s no point waiting to build the documentation for all the pull
requests: we might as well only build the documentation relating to the
“latest and greatest” merge. Note that evict does not stop
any currently running job.
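A command that should only act on merged pull requests has to work that out for itself from the event type and the JSON. As a sketch: GitHub's pull_request event reports a merge as action "closed" with pull_request.merged set to true, so a script can test for exactly that and do nothing otherwise. The helper name is_merged_pr is hypothetical, and jq is assumed to be installed (as it is for ghemaildiff).

```shell
#!/bin/sh
# Sketch only: is_merged_pr is a hypothetical helper.
# Usage: is_merged_pr <event> </path/to/JSON>   (i.e. snare's %e and %j)
is_merged_pr() {
    # Anything other than a pull_request event is ignored outright.
    [ "$1" = "pull_request" ] || return 1
    # jq -e takes its exit status from the result: 0 for true, non-zero for
    # false, null, or a parse error, so malformed input counts as "no".
    jq -e '.action == "closed" and .pull_request.merged == true' "$2" > /dev/null
}
```

A documentation-building script could then begin with: is_merged_pr "$1" "$2" || exit 0, which pairs naturally with queue = evict.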
Summary
snare is a niche tool, but I suspect more people could benefit from this niche than currently realise it: certainly, we’ve found ourselves using snare in more ways than I ever expected.
An obvious example is where we use it to automatically
rebuild and release websites when a commit is pushed to a repository. Less
obviously, we frequently pair it with bors and
buildbot. Sometimes that’s because we
need to run actions on specific hardware, but there are other simpler uses too.
For example, we use it to build grmtools documentation and force push it to a
gh-pages branch on every pull request merge: this way the grmtools
documentation is always up to date, but we don’t have to share a GitHub access
token in the globally visible .buildbot.sh file. I’m sure other people can
think of uses for snare which would never have occurred to me!
At a later date, I’ll write a short blog post about my experiences of writing snare in Rust.
Footnotes
[1] Rust’s cargo is, by some distance, the best language package manager I’ve used and, in my opinion, a significant factor in Rust’s success. However, the culture of having many (many!) small dependencies means that it’s not possible to take the traditional Unix approach of OS-level packaging to crates. That means that if I want to make sure that snare users have access to the latest security release of a dependency-of-a-dependency, the expectation is that I release a new version of snare. Most recent updates of snare have thus really just been about updating dependencies.
[2] If, as I recommend, you want to put snare behind https you’ll need
to use a forwarding proxy server. It would be nice if snare could support
https directly, perhaps including automatic certificate support to avoid
problems with untrusted SSL certificates.
[3] The reason for an explicit github block is because I can imagine
snare easily being extended in the future to cope with the webhooks-equivalents
for other sites such as GitLab and the like.
[4] I don’t know who first came up with this style of config file, but it’s certainly become a common idiom in OpenBSD daemons over the years, which is what influenced me.