I'm aware this could be done with mod_rewrite; however, none of the servers I have used had it installed, and when I asked the hosts to install it, I was told flat out "NO". So I came up with another option. Here it is.
First, you'll want to modify your .htaccess file (if it doesn't exist, create it). There are many options you can set within this file; what we want to put in there is:
ErrorDocument 404 /404.php
Note: Windows will not allow you to rename a file to .htaccess. You'll have to open Notepad and save it as "c:\foo\bar\.htaccess". Note the quotation marks: you'll need to include them in the save file dialogue box.
This instructs the web server to serve 404.php when a 404 Not Found occurs. Now for the magic 404.php; I'll break it up and explain as we go:
<?php
$svr_proto = $_SERVER['SERVER_PROTOCOL'];
header($svr_proto . " 200 OK");
header("Status: 200 OK");
When a 404 Not Found occurs and 404.php is served, the headers still contain the 404 status code, so search engines will think that nothing is there and will not index the page. For this reason the above code changes the status code back to 200 OK.
$ruri = explode("_", str_replace(".html", "", str_replace("/", "", strtolower($_SERVER['REQUEST_URI']))));
The URI format used is http://www.somesite.tld/file_vars.html. As you can see, the above code lowercases the REQUEST_URI (which would contain /file_vars.html), removes the .html and the forward slash, and explodes the result into an array using an underscore as the delimiter.
So http://www.somesite.tld/products_var.html would store "products" in $ruri[0] and "var" in $ruri[1]. With some modification you could also deal with $_POST, but currently I have no reason to implement that (the idea would be to grab the $_POST variables and put them inside variables too).
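To try the parsing step without a web server, the same three operations can be wrapped in a small function. This is only a sketch: the name parse_friendly_uri is made up for illustration and is not part of the original 404.php.

```php
<?php
// Hypothetical helper (not in the original 404.php): applies the same
// lowercase / strip / explode steps to any request URI string.
function parse_friendly_uri($request_uri)
{
    $clean = strtolower($request_uri);          // "/Products_Var.html" -> "/products_var.html"
    $clean = str_replace("/", "", $clean);      // remove every forward slash
    $clean = str_replace(".html", "", $clean);  // remove the .html suffix
    return explode("_", $clean);                // split on underscores
}

// parse_friendly_uri("/products_var.html")     returns array("products", "var")
// parse_friendly_uri("/gallery_paris_12.html") returns array("gallery", "paris", "12")
```

Note that REQUEST_URI also carries any query string, so a request such as /products_var.html?x=1 would leave "?x=1" attached to the last element.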
switch ($ruri[0]) {
    case "products":
        $from_404_pg = $ruri[1];
        include("products.php");
        break;
    case "gallery":
        $from_404_loc = $ruri[1];
        $from_404_img = $ruri[2];
        include("gallery.php");
        break;
The required file is then included, and it has access to the variables, which now reside in (using the gallery case as an example) $from_404_loc and $from_404_img. I could have simply used $ruri, but the naming of the other variables makes more sense when you are working inside another file.
    default:
        header($svr_proto . " 404 Not Found");
        header("Status: 404 Not Found");
        include("true_404.php");
        break;
}
?>
The final piece of code (the default switch case) is a TRUE 404 error. This means that none of the required conditions were met, so now we must generate a real 404 Not Found. As you can see, the headers are adjusted back to 404 Not Found and true_404.php is included. This just displays a friendly error page, which I will leave up to you to write. You could even write code to read the requested URL and make suggestions as to the correct page, then have it send the webmaster an e-mail with details about the broken link.
Let's look inside products.php to see how we handle the variables passed from 404.php:
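The "make suggestions" idea could look something like the sketch below, which picks the known page name closest to what the visitor typed using PHP's built-in levenshtein(). The function name suggest_page, the $known_pages list, and the distance cut-off of 3 are all assumptions for illustration; the original true_404.php is not shown.

```php
<?php
// Hypothetical helper for true_404.php: returns the entry of $known_pages
// closest to $requested, or null when nothing is within $max_distance edits.
function suggest_page($requested, array $known_pages, $max_distance = 3)
{
    // Older PHP versions limit levenshtein() to 255-character strings,
    // so very long requests are simply not matched.
    if (strlen($requested) > 255) {
        return null;
    }
    $best = null;
    $best_distance = $max_distance + 1;
    foreach ($known_pages as $page) {
        $distance = levenshtein(strtolower($requested), strtolower($page));
        if ($distance < $best_distance) {
            $best = $page;
            $best_distance = $distance;
        }
    }
    return $best;
}

// suggest_page("prodcuts", array("products", "gallery")) returns "products"
```

Inside true_404.php you might pass in $ruri[0] and the case labels from the switch, then print a "Did you mean...?" link when the result is not null.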
if (isset($from_404_pg)) {
    $cpage = $from_404_pg;
} else {
    $cpage = $_GET['pg'];
}
As you can see, we still support the $_GET method (in my case, I still have to support it), but if the variable comes from 404.php we'll use that instead. Now $cpage will contain the required variable (the "var" in http://www.somesite.tld/products_var.html), which you use as normal.
That's it. Even though http://www.somesite.tld/products_ice-cream.html doesn't exist on the server, the server makes it appear as a perfectly valid page (with a valid HTTP status code header), which in turn looks much more friendly to users and spiders alike.
I use the same method to create dynamic, auto-updating sitemaps and robots.txt files.
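My own sitemap and robots.txt code isn't shown here, but a rough sketch of the robots.txt side might look like this. The function name build_robots_txt and the example paths are assumptions for illustration only.

```php
<?php
// Hypothetical builder: turns an assumed list of paths into robots.txt text.
function build_robots_txt(array $disallowed_paths)
{
    $lines = array("User-agent: *");
    foreach ($disallowed_paths as $path) {
        $lines[] = "Disallow: " . $path;
    }
    return implode("\n", $lines) . "\n";
}

// One possible way to wire it into the switch in 404.php (assumption).
// A request for /robots.txt leaves "robots.txt" in $ruri[0]:
//
// case "robots.txt":
//     header("Content-Type: text/plain");
//     echo build_robots_txt(array("/admin/", "/tmp/"));
//     break;
```

This only works if no real robots.txt file exists on the server; otherwise the server serves that file and 404.php never runs.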
Note: Remember that true_404.php can be served for an error at any URL. Say, for example, your true_404.php uses the image "images/error_page/whatever.jpg". If someone causes the error by going to http://www.somesite.tld/1/2/3/4/5/6/7/8/9/hello.html, then the browser will try to load the image from "1/2/3/4/5/6/7/8/9/images/error_page/whatever.jpg", which obviously doesn't exist. So all resources in your true_404.php file must be absolute, i.e. <img src="http://www.somesite.tld/images/error_page/whatever.jpg">, and if you're including PHP files, use the absolute path:
include($_SERVER['DOCUMENT_ROOT'] . "/relative/path/to/your/file.php");
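Whether DOCUMENT_ROOT ends in a slash can differ between servers, so a small helper that joins the two parts with exactly one slash can save some head scratching. This is a sketch; path_from_doc_root is a made-up name, not a built-in PHP function.

```php
<?php
// Hypothetical helper: joins the document root and a site-relative path
// with exactly one slash between them.
function path_from_doc_root($doc_root, $relative)
{
    return rtrim($doc_root, "/") . "/" . ltrim($relative, "/");
}

// Possible usage inside true_404.php (the path is illustrative):
// include(path_from_doc_root($_SERVER['DOCUMENT_ROOT'], "includes/header.php"));
```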