8 Mbits on the left lane
I've recently been working on optimizing my site for search engines, and more specifically on how to get the best possible URLs. Making your site user-friendly and accessible is a good way to rank high in search engines, and clean, well laid-out URLs will certainly help both your users and your ranking. After searching through the forums, I learned a few tricks:
Dynamic URLs: pages such as article.php?id=3 are properly indexed, but search engines prefer static-looking ones like Article-3.html.
URL keywords: keywords in the URL do count, even more so if they are in the domain name itself (but nowadays all the good domains with valuable keywords are taken...). That's why most blogging platforms use the article title to build the URL, even though it makes the URL very hard to type manually.
URL depth counts: it seems search engines penalize pages that are too deep in a directory structure. That was an issue for me because, like many people, I was using mod_rewrite to cleanly pass parameters to scripts (such as script.php/param1/param2/param3/).
Dashes are the word separator: another problem was that I was separating keywords in URLs with underscores. I went this route because it is easier to read in the address bar, but it turns out that Google sees a dash as a keyword separator, whereas the underscore is just another character.
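To make the keyword and dash points concrete, here is a minimal sketch of how an article title could be turned into a dash-separated URL. The function name slugify and the sample title are assumptions for illustration, and anything other than a-z and 0-9 is simply treated as a separator, so this only suits plain ASCII titles.

```php
<?php
// Hypothetical helper (not from any particular blogging platform):
// builds a dash-separated URL slug from an article title.
function slugify(string $title): string
{
    // Lowercase, then collapse every run of characters other than
    // a-z and 0-9 into a single dash.
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($title));
    // Remove any dash left over at either end.
    return trim($slug, '-');
}

echo slugify('Clean URLs & 301 Redirects!') . ".html\n";
// prints "clean-urls-301-redirects.html"
```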
Of course, I could have just changed the URLs and been done with it, with the old ones returning a "404 Not Found" error. However, besides breaking external links, this would hurt ranking in search engines, because the new URLs would be treated as brand-new pages and might even be considered duplicate content.
The fix for this is a permanent redirect, which search engines understand as "the page you are looking for has moved, but is otherwise the same thing". mod_rewrite can do just that, using something like:
RewriteRule ^old_url\.html$ new-url.html [R=301,L]
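When the old and new names differ in a regular way, captured groups let one rule cover a whole family of pages. The sketch below assumes, purely for illustration, old URLs made of exactly two lowercase words joined by an underscore, on a site served from the document root (hence the leading slash in the target):

```apache
RewriteEngine On
# Hypothetical pattern: word_word.html is redirected to word-word.html
RewriteRule ^([a-z0-9]+)_([a-z0-9]+)\.html$ /$1-$2.html [R=301,L]
```

A rule like this only works for URLs with a fixed number of underscores, which hints at where plain regular expressions start to run out of steam.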
You would obviously use regular expressions to handle multiple redirects within this single line. In my situation, regular expressions were not flexible enough to do my rewrite. I needed to process the old URL in something like PHP to find the new URL and do a proper redirect. I found two roads to get there. The first is to use the RewriteMap directive in mod_rewrite, which gives you the option of using an external program of your choice to handle URL rewriting. The other was a bit simpler; it involved changing this in the .htaccess file:
RewriteRule ^regexp_for_old_url$ fixit.php [L]
This silently hands every old URL to fixit.php, a simple PHP script that does the rewriting work:
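<?php
// REQUEST_URI still holds the old URL the visitor asked for,
// even though mod_rewrite internally handed the request to fixit.php.
$newURL = $_SERVER['REQUEST_URI'];
// Process $newURL here to turn the old URL into the new one
// Send the permanent redirect; the third argument sets the 301 status code
header('Location: ' . $newURL, true, 301);
exit;
?>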
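What the processing step looks like depends entirely on your old and new URL schemes. As one possible sketch, assume the only change is that underscores in the path become dashes while the query string stays untouched. The function name mapOldUrl is made up for this example.

```php
<?php
// Hypothetical mapping step: assumes the only difference between old and
// new URLs is that underscores in the path become dashes.
function mapOldUrl(string $requestUri): string
{
    $path  = (string) parse_url($requestUri, PHP_URL_PATH);
    $query = parse_url($requestUri, PHP_URL_QUERY);

    $newPath = str_replace('_', '-', $path);

    // Keep the original query string, if there was one.
    if (is_string($query) && $query !== '') {
        return $newPath . '?' . $query;
    }
    return $newPath;
}

// Only redirect when REQUEST_URI is set, i.e. when running under a web server.
if (isset($_SERVER['REQUEST_URI'])) {
    header('Location: ' . mapOldUrl($_SERVER['REQUEST_URI']), true, 301);
    exit;
}
```

With this in place, a request for /old_page_name.html?id=3 would be redirected to /old-page-name.html?id=3. Keeping the mapping in its own function also makes it easy to check from the command line before putting it live.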
Job's done!
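PS: for comparison, the RewriteMap road would look roughly like the sketch below. RewriteMap has to be declared in the server or virtual host configuration rather than in .htaccess. The map name fixurl and the script path are made-up placeholders.

```apache
# Server or virtual host configuration (RewriteMap is not allowed in .htaccess)
RewriteEngine On
# Hypothetical external program: it must read one lookup key per line on stdin
# and answer with one line on stdout, without buffering its output
RewriteMap fixurl "prg:/usr/local/bin/fixurl.php"
# In this context the pattern sees the full path, leading slash included
RewriteRule "^/(.+_.+\.html)$" "${fixurl:$1}" [R=301,L]
```

The program is started once with the server and stays running, so it has to loop over its input forever instead of handling a single URL and exiting.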