regex - Henry Spencer's regular expression libraries

Henry Spencer wrote three different regular expression libraries. He originally distributed them via Usenet or FTP, but for convenience I have collected them here.


The first is the "old library", also known as the "book library". This was originally posted to the Usenet group mod.sources on 19 January 1986 and was updated for the book Software Solutions In C, ed. Dale Schumacher, Academic Press, 1994 [ref: Usenet]. This version was obtained from the FTP server at zoo.toronto.edu (now defunct). The man page is dated 5 September 1996 and the original tar archive was dated 4 April 1998.

The library can be obtained from a GitHub repository, or downloaded as regex.old-master.zip.


The second is the "BSD library". This is a POSIX.2-compliant library that was included in 4.4BSD Unix. Spencer wrote that it was basically an alpha release, and pretty slow [ref: Usenet]. For more information on this library, see the README and COPYRIGHT files and the man pages regex(3) and regex(7). This version was also obtained from zoo.toronto.edu and is dated 10 August 1999.
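
Because this library implements the POSIX.2 interface, code written against regcomp(), regexec(), regerror() and regfree() should work with it. The following is only a minimal sketch, assuming the library's header is installed as <regex.h> (or that any other POSIX implementation is being used); the helper name posix_match is hypothetical and is not part of Spencer's code.

```c
#include <regex.h>
#include <stdio.h>

/*
 * posix_match: hypothetical helper, not part of the library.
 * Returns 1 if text matches the extended regular expression,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
int posix_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED selects POSIX EREs; REG_NOSUB skips submatch reporting. */
    rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        char msg[128];

        /* Translate the error code into a readable message. */
        regerror(rc, &re, msg, sizeof msg);
        fprintf(stderr, "regcomp: %s\n", msg);
        return -1;
    }

    /* With REG_NOSUB, the nmatch and pmatch arguments are ignored. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

For instance, posix_match("^a(b|c)+d$", "abcbd") reports a match, while the same pattern against "ad" does not match because the group must occur at least once.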

The library can be obtained from a GitHub repository, or downloaded from the release archives. The alpha3.8 release is the code obtained from zoo.toronto.edu, while subsequent releases contain bug fixes.

I have also prepared a shared library version, using Autoconf etc., to allow installation as a system library in Linux or similar systems. See the WHATSNEW page for details of changes.

This version also has a GitHub repository, or it can be downloaded from the release archives.


The third is the "Tcl library", which was added to Tcl in 1999 (version 8.1) and supports wide-character Unicode. Although Spencer was working on packaging this library as a standalone distribution [ref: Usenet], as far as I know he never released it.

However, a couple of ports are available. Walter Waldo made a C++ library, which can be obtained from a GitHub repository, or downloaded as hsrex-master.zip. Documentation is somewhat lacking for this library, but here's a basic example for the ASCII version: hsrex-char.c. The Tcl library documentation may also be useful.

There's also a Java port by Basis Technology Corporation available on GitHub.