Quick guide to URL rewriting with Apache

Redirecting whole websites or single web pages with mod_rewrite is an important technique for search engine optimization (SEO) and for moving a site from one web server or domain to another.

Redirection is usually done with a small block of code added to an .htaccess file in your website's root directory. A typical snippet looks like this:

RewriteEngine on 
RewriteCond %{HTTP_HOST} !^www\.example\.com$ 
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

This snippet is often misunderstood, and bulletin boards contain hundreds of incomplete or wrong assumptions about Apache and mod_rewrite. That's why I decided to write an article about it.

First, some background: mod_rewrite is a module for the Apache web server. It's a kind of plug-in that Apache uses when the administrator (or webmaster) has set up special directives (read: commands) in the server's httpd.conf or in an .htaccess file.
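
Before any of this works, the module has to be loaded and .htaccess files have to be allowed to carry rewrite directives. A minimal sketch of the relevant httpd.conf lines follows. The module path and the directory /var/www/html are assumptions that differ between installations, and many distributions enable the module with their own tooling instead.

# httpd.conf (sketch; both paths are placeholders, adjust to your installation)
LoadModule rewrite_module modules/mod_rewrite.so

<Directory "/var/www/html">
# FileInfo permits mod_rewrite directives in .htaccess files
AllowOverride FileInfo
# per-directory rewriting needs symlink following enabled
Options +FollowSymLinks
</Directory>

Apache has to be restarted or reloaded after httpd.conf changes, whereas .htaccess files are read on every request.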

The first line of the .htaccess snippet, RewriteEngine on, switches the rewrite engine on. That one was simple, wasn't it? Don't worry, the following two lines are easy to understand, too.

RewriteCond %{HTTP_HOST} !^www\.example\.com$

The above code is a rewrite condition, and it says: if HTTP_HOST (the host name the client sent in the request's Host header) is NOT (!) www.example.com, the following RewriteRule (explained below) applies.

%{HTTP_HOST} references the server variable HTTP_HOST. A number of other server variables are available for use in a RewriteCond directive; they are listed at the end of this article.

!^www\.example\.com$ is a regular expression (regex) pattern. The anchors ^ and $ make it match the string "www.example.com" exactly, and the backslashes make each dot a literal dot. The exclamation mark in front of the pattern is mod_rewrite-specific and means "not". That's why any HTTP host other than www.example.com satisfies our rewrite condition.
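
To make the negation concrete, here is a small sketch that traces a few hypothetical Host values through the condition. It also adds the NC ("no case") flag, which is not part of the original snippet; the pattern is case-sensitive by default, so without NC a host written as WWW.Example.COM would count as "different" and be redirected.

# Hypothetical requests checked against the pattern:
#   www.example.com  -> pattern matches, "!" negates it, condition is false, no rewrite
#   example.com      -> pattern does not match, condition is true, the rule applies
#   www.example.net  -> pattern does not match, condition is true, the rule applies
# NC is an optional addition that makes the comparison case-insensitive
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]

Note that a Host value carrying a port, such as www.example.com:8080, does not match the pattern either and would therefore be redirected as well.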

Continuing with the actual rewrite rule:

RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

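^(.*)$ is again a regular expression. It matches any path (even an empty one), and the parentheses capture that path in the backreference $1. In an .htaccess file the path is matched without its leading slash, so a request for /blah yields "blah".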

http://www.example.com/$1 sets the new URL. Please note the $1: the requested path is carried over to the new URL. For example, if we called the server with a URL like www.example.net/blah, it would construct the alias www.example.com/blah from it. I call this an alias because up to this point the code has done no redirection. The redirection is done in the next part of the directive:

[L,R=301] defines two "flags" for the rewrite rule. Flag L means this is the last rule to process, so the rules that follow it are skipped for this request even if their rewrite conditions match. Flag R turns the rewrite into an external redirect and sets its HTTP status, which in this case is 301: a permanent redirect.
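
The same building blocks also cover the single-page case mentioned at the start. The following is only a sketch: all four file names are hypothetical, and the patterns assume the rules sit in the .htaccess file of the site's root directory. The second rule uses R without a status code, in which case Apache sends a temporary 302 redirect instead of a 301.

RewriteEngine on
# Permanent (301) redirect of one page; old-page.html and new-page.html are hypothetical
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [L,R=301]
# R without a status code sends a temporary (302) redirect; sale.html and offers.html are hypothetical
RewriteRule ^sale\.html$ http://www.example.com/offers.html [L,R]

A temporary redirect can be handy while testing, because browsers tend to cache permanent redirects.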

Apache Rewrite Module (mod_rewrite) server variables

HTTP headers: HTTP_USER_AGENT, HTTP_REFERER, HTTP_COOKIE, HTTP_FORWARDED, HTTP_HOST, HTTP_PROXY_CONNECTION, HTTP_ACCEPT

connection & request: REMOTE_ADDR, REMOTE_HOST, REMOTE_USER, REMOTE_IDENT, REQUEST_METHOD, SCRIPT_FILENAME, PATH_INFO, QUERY_STRING, AUTH_TYPE

server internals: DOCUMENT_ROOT, SERVER_ADMIN, SERVER_NAME, SERVER_ADDR, SERVER_PORT, SERVER_PROTOCOL, SERVER_SOFTWARE

date and time: TIME_YEAR, TIME_MON, TIME_DAY, TIME_HOUR, TIME_MIN, TIME_SEC, TIME_WDAY, TIME

specials: API_VERSION, THE_REQUEST, REQUEST_URI, REQUEST_FILENAME, IS_SUBREQ
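
As an illustration of how these variables are used, here is a hedged sketch that tests SERVER_PORT instead of HTTP_HOST to send plain-HTTP visitors to HTTPS. It assumes the site already answers on https://www.example.com with a valid certificate and that unencrypted requests arrive on port 80. Neither assumption comes from the snippet discussed in this article.

RewriteEngine on
# Assumption: unencrypted requests arrive on port 80
RewriteCond %{SERVER_PORT} ^80$
# Same capture-and-append technique as in the host redirect, but with an https:// target
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

Any variable from the lists above can be tested the same way by writing it as %{NAME} in a RewriteCond.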