22 October 2008

Author

Jan Vantomme

Categories

Web Development
SEO

Optimizing Textpattern: .htaccess

After reading Building Findable Websites by Aarron Walter, I decided to test some of his server-side strategies with Textpattern. In this article I’m going to show you how to modify the Textpattern .htaccess file to optimize your website for faster search engine indexing.

Start by opening your .htaccess file or fetch the current one at this URL:
http://textpattern.googlecode.com/svn/releases/4.0.6/source/.htaccess

Fix Canonical URLs with mod_rewrite

If your website is reachable both with and without the www prefix, it may suffer from the Google Canonical Problem: both host names can be indexed as separate sites with duplicate content. You can easily avoid this by adding two mod_rewrite lines to your Textpattern .htaccess file, right after RewriteEngine On. Pick one of the two variants below, depending on which host name you prefer.

Force Apache to use www

RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*) http://www.yourdomain.com/$1 [R=301,L]

Get rid of the www prefix

RewriteCond %{HTTP_HOST} ^www\.yourdomain\.com$ [NC]
RewriteRule ^(.*) http://yourdomain.com/$1 [R=301,L]
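
If you would rather not hard-code the domain, the same redirect can be written against whatever host name the request arrived on. This is a sketch of a common pattern, not something taken from the book, and it assumes the site is served over plain http on the default port. Note that it would also put www in front of any other subdomain that points at the same directory.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Leave requests alone when the host already starts with "www."
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    # Ignore requests that carry no Host header at all
    RewriteCond %{HTTP_HOST} !^$
    # Redirect everything else to the www host, keeping the requested path
    RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
</IfModule>
```

To go the other way and drop the prefix, match the host with ^www\.(.+)$ in the RewriteCond and redirect to http://%1/$1, where %1 is the part of the host name captured by the condition.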

This solves only one part of the duplicate content issue. Take a look at the URLs below. They both point to the same article but Google sees these as two separate pages.

http://domain.com/section/article-title
http://domain.com/section/article-title/

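This one is easy to fix by redirecting each URL to its trailing-slash version with the line below. Place it after the rule that passes existing files and directories through (as in the final file further down), otherwise requests for real files such as stylesheets get a slash appended too.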

RewriteRule ^(.+[^/])$ http://domain.com/$1/ [R=301,L]
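
Used on its own, the trailing slash rule also matches requests for real files, so /css/style.css would be redirected to /css/style.css/. A guarded version might look like the sketch below. The two conditions are assumptions about what you want to exclude: existing files, and form submissions, because a 301 redirect on a POST request usually loses the submitted data.

```apache
# Skip requests that map to an existing file (images, stylesheets, scripts)
RewriteCond %{REQUEST_FILENAME} !-f
# Skip form submissions such as comment posts
RewriteCond %{REQUEST_METHOD} !=POST
# Append the missing slash and redirect permanently
RewriteRule ^(.+[^/])$ http://domain.com/$1/ [R=301,L]
```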

Using mod_headers to set Cache Control

Aarron uses a cache control header of 1 day for HTML and PHP documents. This is fine for static HTML pages, but with dynamically generated pages it can cause problems, because visitors may keep seeing a stale copy after the page has changed. That is why I use 1 second.

<IfModule mod_headers.c>
    # Default: 1 Week
    Header set Cache-Control "max-age=604800, public"
    # 1 Year: ICO, PDF, FLV
    <FilesMatch "\.(ico|pdf|flv)$">
        Header set Cache-Control "max-age=31536000, public"
    </FilesMatch>
    # 1 Month: JPG, PNG, GIF, SWF
    <FilesMatch "\.(jpg|png|gif|swf)$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
    # 1 Month: XML, TXT, CSS, JS
    <FilesMatch "\.(xml|txt|css|js)$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
    # 1 Second: HTML, PHP
    <FilesMatch "\.(html|htm|php)$">
        Header set Cache-Control "max-age=1, public"
    </FilesMatch>
</IfModule>
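
The max-age values are plain seconds, which makes them hard to read at a glance. As a reference, this is how the round numbers work out, assuming a 30-day month and a 365-day year (these are conventions, not something the header requires). The single FilesMatch block is only there as an example of using one of the values.

```apache
# max-age is expressed in seconds:
#   1 day   = 24 * 60 * 60 =    86400
#   1 week  = 7 days       =   604800
#   1 month = 30 days      =  2592000
#   1 year  = 365 days     = 31536000
<IfModule mod_headers.c>
    # Example: cache stylesheets for 30 days
    <FilesMatch "\.css$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
</IfModule>
```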

Using mod_expires

Be very careful with this one. It can ‘break’ the built-in commenting system if you send an expires header of a day for dynamically generated pages.

<IfModule mod_expires.c>
    # Enable expirations.
    ExpiresActive On
    # Cache all files for 1 week
    ExpiresDefault A604800
    # 1 Year: ICO, PDF, FLV
    <FilesMatch "\.(ico|pdf|flv)$">
        ExpiresDefault A31536000
    </FilesMatch>
    # 1 Month: JPG, PNG, GIF, SWF
    <FilesMatch "\.(jpg|png|gif|swf)$">
        ExpiresDefault A2592000
    </FilesMatch>
    # 1 Month: XML, TXT, CSS, JS
    <FilesMatch "\.(xml|txt|css|js)$">
        ExpiresDefault A2592000
    </FilesMatch>
    # 1 Second: HTML, PHP
    <FilesMatch "\.(html|htm|php)$">
        ExpiresDefault A1
    </FilesMatch>
</IfModule>
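
mod_expires also accepts a more readable syntax in place of the A followed by seconds form. The block below is a sketch of the same setup written that way. The grouping of file types into three blocks is a simplification for the example, and the durations follow the comments in the block above.

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # Default: 1 week after the file was requested
    ExpiresDefault "access plus 1 week"
    # Rarely changing files: 1 year
    <FilesMatch "\.(ico|pdf|flv)$">
        ExpiresDefault "access plus 1 year"
    </FilesMatch>
    # Images, stylesheets, scripts and feeds: 1 month
    <FilesMatch "\.(jpg|png|gif|swf|xml|txt|css|js)$">
        ExpiresDefault "access plus 1 month"
    </FilesMatch>
    # Dynamic pages: 1 second
    <FilesMatch "\.(html|htm|php)$">
        ExpiresDefault "access plus 1 second"
    </FilesMatch>
</IfModule>
```

The word access means the time is counted from the moment the client requests the file, which is what the A prefix does in the numeric form.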

The final .htaccess file

Your final .htaccess file looks like this. Download this file to use on your own website.

#DirectoryIndex index.php index.html
#Options +FollowSymLinks
#Options -Indexes
<IfModule mod_rewrite.c>
    RewriteEngine On
    #RewriteBase /relative/web/path/
    # Force Apache to use WWW
    RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
    RewriteRule ^(.*) http://www.yourdomain.com/$1 [R=301,L]
    RewriteCond %{REQUEST_FILENAME} -f [OR]
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^(.+) - [PT,L]
    #add trailing slash
    RewriteRule ^(.+[^/])$ http://www.yourdomain.com/$1/ [R=301,L]
    RewriteRule ^(.*) index.php
    RewriteCond %{HTTP:Authorization}  !^$
    RewriteRule .* - [E=REMOTE_USER:%{HTTP:Authorization}]
</IfModule>
#php_value register_globals 0
AddDefaultCharset utf-8
<IfModule mod_headers.c>
    # Default: 1 Week
    Header set Cache-Control "max-age=604800, public"
    # 1 Year: ICO, PDF, FLV
    <FilesMatch "\.(ico|pdf|flv)$">
        Header set Cache-Control "max-age=31536000, public"
    </FilesMatch>
    # 1 Month: JPG, PNG, GIF, SWF
    <FilesMatch "\.(jpg|png|gif|swf)$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
    # 1 Month: XML, TXT, CSS, JS
    <FilesMatch "\.(xml|txt|css|js)$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>
    # 1 Second: HTML, PHP
    <FilesMatch "\.(html|htm|php)$">
        Header set Cache-Control "max-age=1, public"
    </FilesMatch>
</IfModule>
<IfModule mod_expires.c>
    # Enable expirations.
    ExpiresActive On
    # Cache all files for 1 week
    ExpiresDefault A604800
    # 1 Year: ICO, PDF, FLV
    <FilesMatch "\.(ico|pdf|flv)$">
        ExpiresDefault A31536000
    </FilesMatch>
    # 1 Month: JPG, PNG, GIF, SWF
    <FilesMatch "\.(jpg|png|gif|swf)$">
        ExpiresDefault A2592000
    </FilesMatch>
    # 1 Month: XML, TXT, CSS, JS
    <FilesMatch "\.(xml|txt|css|js)$">
        ExpiresDefault A2592000
    </FilesMatch>
    # 1 Second: HTML, PHP
    <FilesMatch "\.(html|htm|php)$">
        ExpiresDefault A1
    </FilesMatch>
</IfModule>

2 Opinions posted so far.

  1. From: John
    Date: 22 October 2008, 23:03

    Thanks for writing this article. This will be very useful for a new project I’m working on.

  2. From: Jan Vantomme
    Date: 27 October 2008, 22:28

    No problem. I hope this will be useful for everybody using Textpattern.
