mod_rewrite, win32 and colons

I’ve been spending my evenings divided between frantically studying
history and working on my new project. The latter is soon to debut on
lagen.nu (but it will be in Swedish only).

In Swedish law, all laws are uniquely identified by what’s known as
an “SFS-nummer”, a string consisting of the year the law was enacted,
followed by a colon, followed by an index number for that year (there
are some exceptions to this rule, but I’m ignoring them for now). For
example, the Swedish copyright law is known as 1960:729.

Lagen.nu will contain all Swedish laws together with all manner of
cross-referencing goodness. So, wouldn’t it be great if there were some
URL-rewriting magic at work, so that instead of going to
http://lagen.nu/1960/729.html, you could just go to http://lagen.nu/1960:729?
Turns out this is fairly simple with Apache and mod_rewrite:


RewriteEngine On
RewriteRule ^(\d+):(\d+)$ /$1/$2.html [L]
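
To make the mapping concrete, here is a minimal Python sketch of what the rule above is meant to do to an incoming path. It is only an illustration of the regex and the substitution, not of how mod_rewrite works internally; the names SFS_PATTERN and rewrite are made up for this example.

```python
import re

# Same pattern as in the RewriteRule above (hypothetical name).
SFS_PATTERN = re.compile(r"^(\d+):(\d+)$")


def rewrite(path):
    """Return the rewritten path, or None if the pattern does not match."""
    match = SFS_PATTERN.match(path)
    if match is None:
        return None
    year, number = match.groups()
    # Mirrors the substitution /$1/$2.html
    return f"/{year}/{number}.html"


print(rewrite("1960:729"))
print(rewrite("1960/729.html"))
```

Anything that isn’t digits, a colon, and more digits is left alone, so the real .html files can still be requested directly.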

That rule works on my Unix box, anyway. On Windows, things are a little
more complicated, and since my development machine is a WinXP laptop, I
ran into these complications. I run Apache on the laptop, to have the
same environment on both computers. As you may know, NTFS has a
little-known feature called Alternate Data Streams, which are specified
by appending a colon and the stream name to the file name. That’s why
a colon isn’t allowed in filenames (at least I think that’s why…)

Anyway, Apache has a problem with this on win32. Even though we never
want Apache to look at the disk for a file named 1960:729, somewhere
deep in the Apache core the incoming URL (or at least part of it) is
tested for filename validity, and fails, resulting in a permission
denied error.

So, what to do? IIS to the rescue! It turns out that there is an IIS
plugin closely modelled after mod_rewrite called ISAPI_rewrite, which is
closed-source but free as in beer in its lite version. Good enough for
me. I had some problems with it (I couldn’t use \d to match just
digits, so I had to use RewriteRule ([^:]*):(.*) $1/$2.html instead),
but otherwise it works just fine. Might be worth a look if you’re on
IIS but want to have more control over your URLs.
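
One thing to keep in mind about that fallback pattern is that it is much looser than the \d version: it accepts anything with a colon in it. The Python sketch below compares the two. The [0-9] character class is just a common stand-in for \d in regex engines without that shorthand; I haven’t tested whether ISAPI_rewrite accepts it, and the names LOOSE, STRICT and accepts are made up for this example. It also assumes the pattern has to match the whole path.

```python
import re

# The pattern I ended up using with ISAPI_rewrite.
LOOSE = re.compile(r"([^:]*):(.*)")
# A stricter alternative without \d (assumption: untested in ISAPI_rewrite).
STRICT = re.compile(r"([0-9]+):([0-9]+)")


def accepts(pattern, path):
    """True if the pattern matches the whole path."""
    return pattern.fullmatch(path) is not None


for path in ("1960:729", "foo:bar", "1960/729.html"):
    print(path, accepts(LOOSE, path), accepts(STRICT, path))
```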

Update: I just discovered an interesting bug (I think?) in IE:
if you have a URL of the form <a href="1960:729">, IE
assumes that it’s not a relative URL, but instead an absolute URL to
the server 1960 on port 729. Heh.
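
For what it’s worth, the generic URI syntax (RFC 3986, section 4.2) says that a relative reference whose first path segment contains a colon should be prefixed with "./", precisely so the part before the colon isn’t mistaken for a scheme. A minimal Python sketch of that workaround follows; relative_href is a made-up helper name, and I haven’t verified how IE treats the prefixed form.

```python
from urllib.parse import urljoin


def relative_href(path):
    """Prefix './' when the first path segment contains a colon."""
    first_segment = path.split("/", 1)[0]
    if ":" in first_segment:
        return "./" + path
    return path


href = relative_href("1960:729")
print(href)
# Resolve against a page on the same site, the way a browser would.
print(urljoin("http://lagen.nu/index.html", href))
```

Linking with an absolute path such as /1960:729 should sidestep the ambiguity as well, since the reference then starts with a slash.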