Let’s start with an overview of the basics. When you go to a web page, say cnn.com, there is a lot going on behind the scenes to make that page show up in your browser. Before anything else can happen, a TCP/IP connection is established between your computer and the cnn.com webserver through port 80. Your browser then starts requesting the “assets” it needs to build the page, such as HTML, CSS, Javascript and image files. These assets allow the browser to render the web page you see.
The webserver and your browser talk to each other in a protocol called HTTP (over TCP/IP). The two most common HTTP request methods are GET and POST. Without going into the gory technical details, the difference between a “GET” and a “POST” is that the GET method passes form data in the URL, where it is visible in the browser’s location bar. With the POST method, the form data is sent in the body of the request and is not visible in the URL. The following is an example of HTTP communication showing a GET request for the cnn.com logo, which was sent to the webserver by my browser.
GET /cnn/.element/img/2.0/nav/header_cnn_com_logo.gif HTTP/1.1
Host: i.cdn.turner.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://www.cnn.com/
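The logo request above carries no form data, so it doesn’t show the GET/POST difference described earlier. As a rough sketch of that difference, the Javascript below builds the text of a GET request and a POST request for the same form. The host, path, and field names are made up for illustration, and real browsers send many more headers than this.

```javascript
// Hypothetical form fields, used only for illustration.
const formData = { q: "301 redirect", page: "2" };

// Encode the fields as application/x-www-form-urlencoded text.
function encodeForm(fields) {
  return new URLSearchParams(fields).toString();
}

// GET: the encoded form data is appended to the URL as a query string.
function buildGetRequest(host, path, fields) {
  return [
    `GET ${path}?${encodeForm(fields)} HTTP/1.1`,
    `Host: ${host}`,
    "",
    "",
  ].join("\r\n");
}

// POST: the URL stays clean and the encoded form data travels in the body.
function buildPostRequest(host, path, fields) {
  const body = encodeForm(fields);
  return [
    `POST ${path} HTTP/1.1`,
    `Host: ${host}`,
    "Content-Type: application/x-www-form-urlencoded",
    // URL-encoded text is plain ASCII, so character count equals byte count.
    `Content-Length: ${body.length}`,
    "",
    body,
  ].join("\r\n");
}

console.log(buildGetRequest("www.example.com", "/search", formData));
console.log(buildPostRequest("www.example.com", "/search", formData));
```

In the GET version the fields show up in the request line (and therefore in the address bar); in the POST version they appear only after the blank line that ends the headers.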
Below is the response that is sent back by the Webserver to my browser.
HTTP/1.x 200 OK
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Date: Mon, 16 Jun 2008 13:21:24 GMT
Content-Type: image/gif
Expires: Mon, 16 Jun 2008 14:21:24 GMT
Last-Modified: Sat, 30 Jun 2007 09:29:49 GMT
Accept-Ranges: bytes
Server: Apache
Content-Length: 1607
Cache-Control: max-age=3600, proxy-revalidate
Age: 0
Notice how the webserver responded to the request from my browser with a status code of 200. This means my request succeeded and the response contains the image. My browser received the CNN logo along with other files. For each of the assets, a request is sent and a response is received. You can learn more about the HTTP status codes at the official w3.org site. If you use the Internet, you have most likely come across the status code 404, which means the requested page can’t be found. In this post I’d like to discuss the status code 301.
A 301 status code returned by the webserver tells the browser that the requested URL has moved permanently, and passes the new location back to my browser. So, included in the response is a “Location” header with the new URL. My browser then makes another request for the new URL.
Here is an example of a status code 301 sent by the Webserver:
HTTP/1.x 301 Moved Permanently
Date: Mon, 16 Jun 2008 23:42:26 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: http://www.somehost.com/thenewpage.php
Cache-Control: private
Content-Length: 0
Keep-Alive: timeout=5, max=98
Connection: Keep-Alive
Content-Type: text/plain
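To make the browser’s side of this concrete, here is a small Javascript sketch of how a client might read a response like the one above and decide what to request next. The function names are my own invention, and the parsing is deliberately simplified; it is not how any particular browser is implemented.

```javascript
// Parse the head of a raw HTTP response into a status code and headers.
function parseResponseHead(raw) {
  const lines = raw.split(/\r?\n/);
  // Status line looks like "HTTP/1.1 301 Moved Permanently";
  // some capture tools print the version as "1.x".
  const match = /^HTTP\/\S+\s+(\d{3})\s*(.*)$/.exec(lines[0]);
  if (!match) throw new Error("Not an HTTP status line: " + lines[0]);
  const headers = {};
  for (const line of lines.slice(1)) {
    if (line === "") break; // a blank line ends the header section
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip malformed lines
    // Header names are case-insensitive, so store them in lower case.
    headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
  }
  return { status: Number(match[1]), reason: match[2], headers };
}

// Return the URL to request next for a 301, or null if there is none.
function nextRequestUrl(raw) {
  const { status, headers } = parseResponseHead(raw);
  return status === 301 && headers.location ? headers.location : null;
}

const sample301 = [
  "HTTP/1.x 301 Moved Permanently",
  "Server: Microsoft-IIS/6.0",
  "Location: http://www.somehost.com/thenewpage.php",
  "Content-Length: 0",
].join("\r\n");

console.log(nextRequestUrl(sample301));
```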
Note that with a 301 redirect the browser makes two requests to reach the desired page: one for the old URL and one for the new URL. The new URL can itself be 301 redirected. This is called chaining, and it should be avoided if you want to play nice with the search engine spiders.
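One way to spot chaining before it goes live is to trace your redirect rules on paper, or in a few lines of code. The sketch below assumes the rules are available as a simple lookup table; the URLs and the function names are hypothetical, and no real HTTP requests are made.

```javascript
// Hypothetical redirect table: each key 301-redirects to its value.
const redirects = new Map([
  ["http://mydomain.com/old.html", "http://www.mydomain.com/old.html"],
  ["http://www.mydomain.com/old.html", "http://www.mydomain.com/new.html"],
]);

// Follow redirects from startUrl and return every URL visited, in order.
function traceRedirects(startUrl, table, maxHops = 10) {
  const visited = [startUrl];
  let current = startUrl;
  while (table.has(current)) {
    current = table.get(current);
    if (visited.includes(current)) {
      throw new Error("Redirect loop at " + current);
    }
    visited.push(current);
    // visited.length - 1 is the number of redirects followed so far.
    if (visited.length - 1 > maxHops) {
      throw new Error("Too many redirects");
    }
  }
  return visited;
}

// More than two URLs in the trace means at least one chained redirect.
function isChained(trace) {
  return trace.length > 2;
}

const trace = traceRedirects("http://mydomain.com/old.html", redirects);
console.log(trace.join(" -> "), isChained(trace) ? "(chained)" : "(ok)");
```

In this made-up table the fix would be to point the first rule straight at new.html, so a spider needs only one hop.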
The 301 redirect is search engine friendly to some extent. This means that if a particular URL has “link juice” (it’s a popular link for some keywords), a 301 redirect should transfer the popularity to the new URL. From what I read in Search Engine Optimization with PHP (a fantastic SEO book for the technically savvy), the transfer can take some time to occur, so 301 redirects should be applied with caution to high-ranking URLs.
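Here is how to send a 301 redirect from your own code.
PHP:
<?php
// Send the 301 status line, then the new location.
header('HTTP/1.1 301 Moved Permanently');
header('Location: http://www.mydomain.com/seo_consultant.php');
exit; // stop the script so nothing else is sent after the redirect
?>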
JSP:
<%
response.setStatus(301);
response.setHeader("Location", "http://www.mydomain.com/seo_consultant.php");
response.setHeader("Connection", "close");
%>
Javascript (client side):
<script type="text/javascript">
<!--
window.location = "http://www.mydomain.com/seo_consultant.php";
//-->
</script>
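The Javascript method is not recommended. It does not send a 301 status code at all: the old page loads normally and the browser is then sent elsewhere by the script. At some point spammers were using this method and, as a result, search engines are not in favor of client-side redirection using Javascript.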
Another method for redirecting traffic is to use the .htaccess file. The nice thing about the .htaccess file is that you can use regular expressions when creating your redirection rules. With regular expressions you can do a mass 301 redirect if certain conditions are met. Here is a good .htaccess file tutorial.
Here is an example for redirecting a single page in your .htaccess file:
Redirect 301 /oldpage.html http://www.mydomain.com/newpage.html
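The single-page rule handles one URL at a time. To illustrate the idea of a regular-expression-based mass redirect mentioned earlier, here is a Javascript sketch of the matching logic. It only models the concept and is not .htaccess syntax; the patterns, paths, and function name are invented for the example.

```javascript
// Hypothetical rules: each pattern maps a group of old paths to a new path.
// $1 in the target is replaced by whatever the parentheses captured.
const rules = [
  { pattern: /^\/articles\/(\d+)\.html$/, target: "/posts/$1" },
  { pattern: /^\/old-blog\/(.*)$/, target: "/blog/$1" },
];

// Return the 301 target for a path, or null when no rule matches.
function massRedirect(path, ruleList) {
  for (const { pattern, target } of ruleList) {
    if (pattern.test(path)) {
      return { status: 301, location: path.replace(pattern, target) };
    }
  }
  return null;
}

console.log(massRedirect("/articles/42.html", rules));
console.log(massRedirect("/about.html", rules));
```

The first matching rule wins, so more specific patterns should be listed before broader ones.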
The following lines show canonical redirecting; that is, when someone requests your domain without the www, they will be 301 redirected to the same page with the www.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^mydomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
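If the rewrite syntax looks cryptic, the decision it makes can be written out in a few lines of Javascript. This is only a sketch of the logic, with a function name I made up; Apache does the real work in the rules above.

```javascript
// Return the 301 target for a request, or null when no redirect is needed.
function canonicalRedirect(host, path, bareDomain = "mydomain.com") {
  // Case-insensitive, exact match on the bare domain, like the [NC] condition.
  if (host.toLowerCase() !== bareDomain) {
    return null; // already on www (or another host), so leave it alone
  }
  // Keep the requested path and move it to the www host.
  return { status: 301, location: `http://www.${bareDomain}${path}` };
}

console.log(canonicalRedirect("mydomain.com", "/seo_consultant.php"));
console.log(canonicalRedirect("www.mydomain.com", "/seo_consultant.php"));
```

Because requests that already use the www host return null, the redirected request is not redirected again, which avoids a loop.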
Thank you, Shimon, for giving me the opportunity to post here. I hope to share additional SEO posts here with a technical twist.
Ron Tovbin is a senior-level programmer. Read more of his writing on his blog ITsecPackets.