Mod_rewrite, how could I forget?!?

In a recent project at work I was asked how we could change the URL of a page in a website. It’s a ColdFusion website, and everything is generated out of the index page. Therefore, every page in the website looks like blah.com/index.cfm?page=123. However, for the purposes of web analytics, at least according to the marketing consultants this customer is working with, they need to make the pages appear as blah.com/page.

As I thought about it, it seemed like the only time I had done that in the past was with a complex reverse proxy thing. So I told them that’s how I could do it, but that it was a disgusting hack that could only be done in the Apache config, requiring manual configuration every time the customer wanted to change something.

Fast forward two weeks. The consultants from my company conferred with the consultants from their company. They said they have a CF guy in-house who knows how to do this and it’s EASY. They also said that he’s really busy and not available to tell us how to do this right now. So we wait. In the meantime, I take a look at the WordPress documentation for information on customizing “static” pages. In a section on the various options for a custom WordPress installation, they mention something about “permalinks” and “you’ll need to make sure your .htaccess file is writeable.”

Then a light comes on: we could dynamically generate a .htaccess file with a ColdFusion-based admin page that would contain RewriteRule entries to mask the index.cfm?querystring.

Here is the format of the content that needs to be generated for the .htaccess file:

RewriteEngine On
RewriteRule <incoming_request_path> <rewritten_request_path>

In this example, <incoming_request_path> is a regular expression that is matched against the path the client requests, and <rewritten_request_path> is the substitution string the request is rewritten to (it is not itself a regex). The pattern supports regex groups and the like, and the substitution can refer back to them as $1, $2, and so on, so it’s possible to map, say, /search/puke to /search.php?query=puke. In a .htaccess file the leading slash is stripped from the path before matching, so that RewriteRule would look something like:

RewriteRule ^search/(.*)$ /search.php?query=$1
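To see what the pattern and the $1 back-reference are doing, here is a rough Python stand-in for that rule. It is only a sketch of the matching step, not how Apache implements it: the function name rewrite is made up for illustration, and it assumes the rule lives in a .htaccess file, so the path arrives without its leading slash.

```python
import re

# Stand-in for: RewriteRule ^search/(.*)$ /search.php?query=$1
# Assumption: .htaccess context, so the path is matched without its leading slash.
PATTERN = re.compile(r"^search/(.*)$")
SUBSTITUTION = r"/search.php?query=\1"  # \1 plays the role of mod_rewrite's $1


def rewrite(path):
    """Return the rewritten path, or the path unchanged if the pattern does not match."""
    match = PATTERN.match(path)
    if match is None:
        return path
    # expand() fills the captured group into the substitution template
    return match.expand(SUBSTITUTION)


print(rewrite("search/puke"))
```

A path that does not start with search/ falls through untouched, which is also what happens to a request that matches no RewriteRule.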

In our case, we’ll generate a RewriteRule directive for each page that needs to be cloaked. When the customer updates their website, the .htaccess file will be regenerated.
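The generation step might look roughly like this. It is a sketch in Python rather than the ColdFusion the admin page would actually be written in, and the function name, the (slug, page id) input format, and the slug restrictions are all assumptions of mine, not something taken from the real site.

```python
import re

# Assumption: slugs are limited to characters that need no regex escaping.
SAFE_SLUG = re.compile(r"[A-Za-z0-9_-]+")


def build_htaccess(pages):
    """Build .htaccess text with one RewriteRule per (slug, page_id) pair."""
    lines = ["RewriteEngine On"]
    for slug, page_id in pages:
        if SAFE_SLUG.fullmatch(slug) is None:
            raise ValueError("unsafe slug: %r" % (slug,))
        # Match the bare slug (optional trailing slash) and stop processing with [L]
        lines.append(
            "RewriteRule ^%s/?$ /index.cfm?page=%d [L]" % (slug, int(page_id))
        )
    return "\n".join(lines) + "\n"


# Hypothetical pages; the real list would come from the site's database.
print(build_htaccess([("about", 123), ("contact-us", 124)]))
```

Rejecting odd slugs up front keeps anything a customer types into the admin page from ending up as a broken or surprising regex in the generated file.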

Perhaps a cleaner solution would use a script that queries the database on each request (which mod_rewrite can do with the RewriteMap directive), but this would be less efficient, especially if the site has high traffic. RewriteMap also has to be declared in the server or virtual host config rather than in .htaccess.
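For comparison, the script-based approach would use mod_rewrite's RewriteMap directive with a prg: map. Apache starts the program once, writes one lookup key per line to its stdin, and expects one line back on stdout, with the literal string NULL meaning "no match". Here is a minimal sketch of such a program, with a hard-coded dictionary standing in for the database query; the names PAGES, lookup, and serve are hypothetical.

```python
import sys

# Hypothetical slug -> page id table; a real script would query the site's database here.
PAGES = {"about": "123", "contact-us": "124"}


def lookup(key):
    """Return the rewrite target for a slug, or "NULL" when there is no match."""
    page_id = PAGES.get(key.strip())
    if page_id is None:
        return "NULL"
    return "/index.cfm?page=" + page_id


def serve(stdin, stdout):
    """Answer one lookup per input line, flushing each reply so Apache is never left waiting."""
    for line in stdin:
        stdout.write(lookup(line) + "\n")
        stdout.flush()


# When deployed as a map program, the script would end with:
#     serve(sys.stdin, sys.stdout)
```

Every request then costs a round trip to this process (and, in the real thing, to the database), which is the overhead the static .htaccess file avoids.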

Apache documentation on mod_rewrite: http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html

One Response

  1. I wrote some updated information in a more recent post: Mod_rewrite / .htaccess nuances

    Hoss - January 18th, 2007 at 5:02 pm
