Optimizing Textpattern: .htaccess
After reading Building Findable Websites by Aarron Walter, I decided to test some of his server-side strategies with Textpattern. In this article I’m going to show you how to modify the Textpattern .htaccess file to optimize your website for faster search engine indexing.
Start by opening your .htaccess file, or fetch the current one at this URL:
http://textpattern.googlecode.com/svn/releases/4.0.6/source/.htaccess
Fix Canonical URLs with mod_rewrite
If your website can be reached both with and without the www prefix, it may suffer from the Google Canonical Problem: search engines can treat the two host names as separate sites with duplicate content. You can easily avoid this by adding two mod_rewrite lines to your Textpattern .htaccess file, right after RewriteEngine On. Pick one of the two variants below, not both.
Force Apache to use www
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*) http://www.yourdomain.com/$1 [R=301,L]
Get rid of the www prefix
RewriteCond %{HTTP_HOST} ^www\.yourdomain\.com$ [NC]
RewriteRule ^(.*) http://yourdomain.com/$1 [R=301,L]
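If you would rather not hard-code the domain, the same redirects can be written against whatever host name was requested. This is a minimal sketch, not part of the stock Textpattern file. It assumes plain HTTP, as in the rules above, and that every host name served by this .htaccess should be treated the same way. Enable only one of the two variants.

```apache
# Variant A: force the www prefix for any host name that lacks it.
# The first condition skips requests that carry no Host header at all.
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

# Variant B: strip the www prefix.
# %1 is the host name captured by the RewriteCond pattern.
#RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
#RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```

The advantage of this form is that the file can be copied between sites without editing the domain.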
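This solves only one part of the duplicate content issue. Take a look at the URLs below. They both point to the same article, but Google sees them as two separate pages.
http://domain.com/section/article-title
http://domain.com/section/article-title/
This one is easy to fix by redirecting every URL without a trailing slash to the same URL with one. Put the line after the check for existing files and directories, as in the final file below, so that real files such as stylesheets and images are not redirected: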
RewriteRule ^(.+[^/])$ http://domain.com/$1/ [R=301,L]
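When the rule is used on its own, a couple of conditions keep it away from requests it should leave alone. The sketch below is an assumption about what most Textpattern sites need, not a rule from the stock file. It skips real files and form submissions, because a 301 redirect usually turns a POST into a GET and drops the form data, which would affect comment posting.

```apache
# Leave existing files (CSS, images, scripts) alone
RewriteCond %{REQUEST_FILENAME} !-f
# Leave POST requests alone so submitted form data is not lost in the redirect
RewriteCond %{REQUEST_METHOD} !^POST$
# Every other URL without a trailing slash gets one
RewriteRule ^(.+[^/])$ http://domain.com/$1/ [R=301,L]
```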
Using mod_headers to set Cache Control
Aarron uses a Cache-Control lifetime of 1 day for HTML and PHP documents. This is fine for static HTML pages, but with dynamically generated pages visitors can be served a stale copy for up to a day after the content changes. That is why I use 1 second for HTML and PHP.
<IfModule mod_headers.c>
# Default: 1 Week
Header set Cache-Control "max-age=604800, public"
# 1 Year: ICO, PDF, FLV
<FilesMatch "\.(ico|pdf|flv)$">
Header set Cache-Control "max-age=31536000, public"
</FilesMatch>
# 1 Month: JPG, PNG, GIF, SWF
<FilesMatch "\.(jpg|png|gif|swf)$">
Header set Cache-Control "max-age=2592000, public"
</FilesMatch>
# 1 Month: XML, TXT, CSS, JS
<FilesMatch "\.(xml|txt|css|js)$">
Header set Cache-Control "max-age=2592000, public"
</FilesMatch>
# 1 Second: HTML, PHP
<FilesMatch "\.(html|htm|php)$">
Header set Cache-Control "max-age=1, public"
</FilesMatch>
</IfModule>
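If even one second of caching is too much for pages with comment forms, an alternative is to let browsers store the page but make them check with the server before reusing it. This is a minimal sketch, assuming that is the behaviour you want for HTML and PHP responses. It would take the place of the 1 Second block above.

```apache
<IfModule mod_headers.c>
# HTML, PHP: may be stored, but must be revalidated before every reuse
<FilesMatch "\.(html|htm|php)$">
Header set Cache-Control "no-cache, must-revalidate"
</FilesMatch>
</IfModule>
```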
Using mod_expires
Be very careful with this one: it can ‘break’ the built-in commenting system if you send an Expires header of a day for dynamically generated pages.
<IfModule mod_expires.c>
# Enable expirations.
ExpiresActive On
# Cache all files for 1 week
ExpiresDefault A604800
# 1 Year: ICO, PDF, FLV
<FilesMatch "\.(ico|pdf|flv)$">
ExpiresDefault A31536000
</FilesMatch>
# 1 Month: JPG, PNG, GIF, SWF
<FilesMatch "\.(jpg|png|gif|swf)$">
ExpiresDefault A2592000
</FilesMatch>
# 1 Month: XML, TXT, CSS, JS
<FilesMatch "\.(xml|txt|css|js)$">
ExpiresDefault A2592000
</FilesMatch>
# 1 Second: HTML, PHP
<FilesMatch "\.(html|htm|php)$">
ExpiresDefault A1
</FilesMatch>
</IfModule>
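mod_expires also accepts a more readable syntax, and ExpiresByType matches on the MIME type of the response rather than on the file extension. The sketch below mirrors a subset of the rules above. The MIME types are assumptions about how your server labels these files, so check them against your own configuration before relying on it.

```apache
<IfModule mod_expires.c>
ExpiresActive On
# Default: 1 week
ExpiresDefault "access plus 1 week"
# 1 Year: ICO, PDF
ExpiresByType image/x-icon "access plus 1 year"
ExpiresByType application/pdf "access plus 1 year"
# 1 Month: JPG, PNG, GIF, CSS
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType text/css "access plus 1 month"
# 1 Second: HTML, assuming dynamic pages are served as text/html
ExpiresByType text/html "access plus 1 second"
</IfModule>
```

Written this way, the lifetimes no longer have to be converted to seconds by hand, which makes mismatches between a comment and its value easier to spot.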
The final .htaccess file
Your final .htaccess file looks like this. Replace yourdomain.com with your own domain before using it on your website.
#DirectoryIndex index.php index.html
#Options +FollowSymLinks
#Options -Indexes
<IfModule mod_rewrite.c>
RewriteEngine On
#RewriteBase /relative/web/path/
# Force Apache to use WWW
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*) http://www.yourdomain.com/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+) - [PT,L]
# Add trailing slash
RewriteRule ^(.+[^/])$ http://www.yourdomain.com/$1/ [R=301,L]
RewriteRule ^(.*) index.php
RewriteCond %{HTTP:Authorization} !^$
RewriteRule .* - [E=REMOTE_USER:%{HTTP:Authorization}]
</IfModule>
#php_value register_globals 0
AddDefaultCharset utf-8
<IfModule mod_headers.c>
# Default: 1 Week
Header set Cache-Control "max-age=604800, public"
# 1 Year: ICO, PDF, FLV
<FilesMatch "\.(ico|pdf|flv)$">
Header set Cache-Control "max-age=31536000, public"
</FilesMatch>
# 1 Month: JPG, PNG, GIF, SWF
<FilesMatch "\.(jpg|png|gif|swf)$">
Header set Cache-Control "max-age=2592000, public"
</FilesMatch>
# 1 Month: XML, TXT, CSS, JS
<FilesMatch "\.(xml|txt|css|js)$">
Header set Cache-Control "max-age=2592000, public"
</FilesMatch>
# 1 Second: HTML, PHP
<FilesMatch "\.(html|htm|php)$">
Header set Cache-Control "max-age=1, public"
</FilesMatch>
</IfModule>
<IfModule mod_expires.c>
# Enable expirations.
ExpiresActive On
# Cache all files for 1 week
ExpiresDefault A604800
# 1 Year: ICO, PDF, FLV
<FilesMatch "\.(ico|pdf|flv)$">
ExpiresDefault A31536000
</FilesMatch>
# 1 Month: JPG, PNG, GIF, SWF
<FilesMatch "\.(jpg|png|gif|swf)$">
ExpiresDefault A2592000
</FilesMatch>
# 1 Month: XML, TXT, CSS, JS
<FilesMatch "\.(xml|txt|css|js)$">
ExpiresDefault A2592000
</FilesMatch>
# 1 Second: HTML, PHP
<FilesMatch "\.(html|htm|php)$">
ExpiresDefault A1
</FilesMatch>
</IfModule>
From:John
Date:22 October 2008, 23:03
Thanks for writing this article. This will be very useful for a new project I’m working on.
From:Jan Vantomme
Date:27 October 2008, 22:28
No problem. I hope this will be useful for everybody using Textpattern.