The .htaccess ("hypertext access") file is a per-directory configuration file for the Apache web server. It controls how the server responds to requests from browsers and search engines. This file can do a few amazing things to help direct users and search engines to the most meaningful content on your server. In general, using it wisely will help you build on a website's SEO ("Search Engine Optimization").
I recently discovered these techniques while installing Drupal, when I wanted to use the clean URL functionality that is built into Drupal 6. The first challenge I faced was that I couldn't find the .htaccess file included with the Drupal install on my Mac, because the system hides files whose names begin with a dot. For all you Mac users, I have included the code below so you can manually create a new .htaccess file to use with your site: just copy the code into a new file and rename it .htaccess once it is uploaded to the server.
One of the real challenges with .htaccess files is that some operating systems and programs won't let you save a file whose name is nothing but an extension; you end up with .htaccess.rtf or something similar. You can avoid this by first saving the file under a name such as htaccess.html, uploading it to the server, and renaming it .htaccess there. Furthermore, older versions of Dreamweaver (MX 2004 and earlier) don't upload the .htaccess file correctly, and the damaged file will crash your site. I believe this has to do with the FTP transfer mode (a plain-text file like .htaccess should be uploaded in ASCII mode rather than binary), since Dreamweaver doesn't recognize the extension. The good news is that I also tested it with Dreamweaver CS3 and it does work!
In Summary:
On to the real benefits and opportunities of .htaccess files!
.htaccess files can be configured to create 301 redirects, which tell search engines and site users that a file or page has moved permanently. This is important whenever a site is being upgraded or moved, because otherwise users may land on blank old pages or 404 errors. Furthermore, you are wasting old links (which contribute page rank to your site) if there are no longer any files for those links to attach to. It's a waste of potential SEO power!
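For a single moved page, the simplest form of a 301 I know of is the Redirect directive from Apache's mod_alias module. Here is a minimal sketch; the paths /old-page.html and /new-page and the domain www.example.com are placeholders I made up for illustration, not files from any real site.

```apache
# Hypothetical example: /old-page.html and /new-page are placeholder paths.
# Redirect (mod_alias) matches the start of the requested URL path and
# answers with a 301 "Moved Permanently" pointing at the new address.
Redirect 301 /old-page.html http://www.example.com/new-page
```

This assumes mod_alias is enabled on your host, which is common but worth confirming. The rewrite-based rules later in this post do the same job with more flexibility.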
Most people know that you can do 301 redirects in the .htaccess file, but there are many nuances to this feature. Not only can you redirect to new files, you can also tell browsers to always add or drop the www at the beginning of a site URL. This helps to build SEO and page rank because it ensures that all visitors see, and therefore link to, one version of your page! I have seen many sites in my day that have incoming links to both the www and non-www versions of the same pages. What this means is that the link credit is divided between two addresses instead of building up on one!
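To make the www idea concrete, here is a rough sketch of a generic version that does not hard-code a domain name. It assumes mod_rewrite is enabled and that your site has no other subdomains (the first rule would push something like blog.example.com to www.blog.example.com), so treat it as a starting point rather than a drop-in rule.

```apache
RewriteEngine on

# Option A: force the www prefix on any host that lacks it.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

# Option B: strip the www prefix instead. Use ONLY one option.
# %1 is the part of the host name captured by the RewriteCond.
# RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# RewriteRule ^ http://%1%{REQUEST_URI} [L,R=301]
```

Whichever direction you pick, stay consistent so that every incoming link ends up on the same version of each page.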
Another benefit of the www rule described above is that you can also send completely separate URLs to one preferred URL, with or without the www. Imagine you have dogs.com and browndogs.com but you no longer want two sites, or maybe you just have extra URLs. You can use domain forwarding, which most hosts provide, but it is not always that simple.
A couple of weeks ago I was helping a friend transfer her site to Sustainable Online Solutions. She had two old URLs registered with different registrars, and we wanted to organize the account and direct both of them (they pointed to separate sites) to one new website. As I began transferring the URLs from Gate.com and Network Solutions over to GoDaddy, I changed the DNS servers for both URLs to point to the new host and requested domain name forwarding for the second URL, since the first was going to be the primary URL for the site. Then I found out that Gate.com charges extra just for domain name forwarding!
I was bummed out and didn't want to waste money forwarding the domain for the 10 days the transfer would take. On the other hand, I didn't want Google to think the site was down because the domain wasn't forwarded. Worse, I didn't want Google to think we had duplicate content, because both URLs now pointed to the one site and every page could be reached under either URL. This was a nightmare... Fortunately, I learned how to set up a redirect for the second URL in the .htaccess file, as follows. Note to inexperienced users: the # is used in .htaccess files to "comment out" notes.
# To redirect all users to access the site WITH the 'www.' prefix,
# (http://example.com/... will be redirected to http://www.example.com/...)
# [NC] means no case (case doesn't matter); [OR] means "or" (the rule fires
# if this condition OR the next one matches, which allows multiple domains)
RewriteCond %{HTTP_HOST} ^domain1\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^domain2\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.domain2\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain1.com/$1 [L,R=301]
What this does is direct domain1 to www.domain1, and it also takes both versions of domain2 (the www and the non-www) and directs them to www.domain1. Notice the [NC] and [OR] flags. [NC] tells Apache that case doesn't matter when someone types in the URL. [OR] joins a condition to the next one with "or", so the rule fires if any one of the three conditions matches; without it, every condition would have to match at once.
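If the chain of [OR] conditions feels hard to read, the same three host names can be folded into one regular expression. This is just an alternative sketch using the same placeholder domains as above; I believe it behaves the same way, but test it on your own host before relying on it.

```apache
# One condition instead of three: matches domain1.com, domain2.com,
# and www.domain2.com (the "(www\.)?" makes the www prefix optional).
RewriteCond %{HTTP_HOST} ^(domain1\.com|(www\.)?domain2\.com)$ [NC]
RewriteRule ^(.*)$ http://www.domain1.com/$1 [L,R=301]
```

With a single condition there is nothing to join, so no [OR] flag is needed. Note that www.domain1.com is deliberately left out of the pattern, otherwise the rule would redirect to itself in a loop.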
Below I have included a few examples of simple 301 page redirects in .htaccess. This is a useful tool when you are upgrading or changing a site. I recently did this for a customer who had an old-fashioned static HTML site and was upgrading to a Drupal-based PHP site. He had page rank and links going to all of his old pages, but none of those paths or files were going to be used on the new site, because the content management system has its own way of organizing the site. In addition, he had a few files with capital letters in their names and other structural pitfalls that we were happy to leave behind with the new system. Here's how we did it:
#custom redirects
RewriteRule ^index\.html$ http://www.domain1.com [R=301,L]
RewriteRule ^Contact\.html$ http://www.domain1.com/contact [R=301,L]
RewriteRule ^FAQ\.html$ http://www.domain1.com/services [R=301,L]
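Since old sites often have links floating around with inconsistent capitalization, it can help to add the [NC] flag to page redirects too. The sketch below reuses the placeholder domain from above; the catch-all rule at the end is a hypothetical addition of mine, not something from the customer's site, and it assumes you are happy to send every leftover .html address to the home page.

```apache
# [NC] lets one rule catch Contact.html, contact.html, and CONTACT.HTML.
RewriteRule ^contact\.html$ http://www.domain1.com/contact [NC,R=301,L]

# Hypothetical catch-all: any other old .html page goes to the home page.
# Keep it AFTER the specific redirects, because the first matching rule wins.
RewriteRule \.html$ http://www.domain1.com/ [NC,R=301,L]
```

In a Drupal .htaccess file these lines belong above the rule that hands requests to index.php, so that old addresses are redirected before Drupal ever sees them.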
And finally, as promised, here is the code that Drupal 6 users can copy and use if they lost their .htaccess file on their Mac as I did. Remember to save it as .html and then rename it to .htaccess once it's uploaded, both to keep from losing it and because some FTP clients don't recognize the .htaccess extension.
#
# Apache/PHP/Drupal settings:
#

# Protect files and directories from prying eyes.
# NOTE: in the stock file the next line sits inside a <FilesMatch> block whose
# pattern was lost from this listing. Copy that block from the .htaccess that
# ships with Drupal; left bare, "Order allow,deny" would block the whole site,
# so it is commented out here.
# Order allow,deny

# Don't show directory listings for URLs which map to a directory.
Options -Indexes

# Follow symbolic links in this directory.
Options +FollowSymLinks

# Customized error messages.
ErrorDocument 404 /index.php

# Set the default handler.
DirectoryIndex index.php

# Override PHP settings. More in sites/default/settings.php
# but the following cannot be changed at runtime.

# PHP 4, Apache 1.
<IfModule mod_php4.c>
  php_value magic_quotes_gpc 0
  php_value register_globals 0
  php_value session.auto_start 0
  php_value mbstring.http_input pass
  php_value mbstring.http_output pass
  php_value mbstring.encoding_translation 0
</IfModule>

# PHP 4, Apache 2.
<IfModule sapi_apache2.c>
  php_value magic_quotes_gpc 0
  php_value register_globals 0
  php_value session.auto_start 0
  php_value mbstring.http_input pass
  php_value mbstring.http_output pass
  php_value mbstring.encoding_translation 0
</IfModule>

# PHP 5, Apache 1 and 2.
<IfModule mod_php5.c>
  php_value magic_quotes_gpc 0
  php_value register_globals 0
  php_value session.auto_start 0
  php_value mbstring.http_input pass
  php_value mbstring.http_output pass
  php_value mbstring.encoding_translation 0
</IfModule>

# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On
  # Cache all files for 2 weeks after access (A).
  ExpiresDefault A1209600
  # Do not cache dynamically generated pages.
  ExpiresByType text/html A1
</IfModule>

# $Id: .htaccess,v 1.81.2.3 2007/09/21 12:24:22 drumm Exp $

# Various rewrite rules.
RewriteEngine on
# If your site can be accessed both with and without the 'www.' prefix, you
# can use one of the following settings to redirect users to your preferred
# URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
#
# To redirect all users to access the site WITH the 'www.' prefix,
# (http://example.com/... will be redirected to http://www.example.com/...)
# adapt and uncomment the following:
# RewriteCond %{HTTP_HOST} ^example1\.com$ [NC,OR]
# The next two conditions are optional extra domains. If you leave them out,
# change [NC,OR] on the line above to [NC].
# RewriteCond %{HTTP_HOST} ^example2\.com$ [NC,OR]
# RewriteCond %{HTTP_HOST} ^www\.example2\.com$ [NC]
# RewriteRule ^(.*)$ http://www.example1.com/$1 [L,R=301]
#
# To redirect all users to access the site WITHOUT the 'www.' prefix,
# (http://www.example.com/... will be redirected to http://example.com/...)
# adapt and uncomment the following:
# RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]
# Modify the RewriteBase if you are using Drupal in a subdirectory and
# the rewrite rules are not working properly.
#RewriteBase /drupal
#custom redirects
# RewriteRule ^oldsite-index\.html$ http://www.example.com [R=301,L]
# RewriteRule ^2007/12/oldsite-archive\.html$ http://www.example.com/node/10 [R=301,L]
#end custom redirects
# Rewrite old-style URLs of the form 'node.php?id=x'.
#RewriteCond %{REQUEST_FILENAME} !-f
#RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{QUERY_STRING} ^id=([^&]+)$
#RewriteRule node.php index.php?q=node/view/%1 [L]
# Rewrite old-style URLs of the form 'module.php?mod=x'.
#RewriteCond %{REQUEST_FILENAME} !-f
#RewriteCond %{REQUEST_FILENAME} !-d
#RewriteCond %{QUERY_STRING} ^mod=([^&]+)$
#RewriteRule module.php index.php?q=%1 [L]
# Rewrite current-style URLs of the form 'index.php?q=x'.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]