Do phpBB Pages Get Indexed by Google?
24th October 2006
If you run a phpBB forum and are wondering why its pages aren’t indexed by Google, the most likely cause is session IDs in the URLs. The session ID is the “sid” parameter in the URL query string. A typical URL looks like this: www.mysite.com/forum/viewforum.php?f=7&sid=c76b2a
Google will crawl and index dynamic URLs, even ones with several parameters, but it won’t index phpBB URLs that contain session IDs. Search engines have trouble with parameters like ?SID=abcdefghijklmnop because the string is randomized on each visit, so the same page shows up under a different URL every time a bot crawls it. If the URL for a given piece of content is always the same (i.e., viewtopic.php?t=100 is always viewtopic.php?t=100), then bots (at least Google’s) shouldn’t have any problem, which is why thousands of phpBB boards are indexed by Google without having switched to so-called static URLs.
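To illustrate why the session ID is the problem, here is a minimal Python sketch (not phpBB code) that strips the “sid” parameter from a forum URL. The helper name strip_session_id and the second sid value are made up for the example; the first URL is the one shown earlier, with a scheme added.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_session_id(url, param="sid"):
    """Return the URL with the session-ID query parameter removed.

    Hypothetical helper for illustration; phpBB itself is written in PHP.
    """
    parts = urlsplit(url)
    # Keep every query parameter except the session ID (case-insensitive).
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Two visits to the same forum page, each with a different session ID.
first_visit = "http://www.mysite.com/forum/viewforum.php?f=7&sid=c76b2a"
second_visit = "http://www.mysite.com/forum/viewforum.php?f=7&sid=0f31de"  # made-up sid

print(strip_session_id(first_visit))
print(strip_session_id(first_visit) == strip_session_id(second_visit))
```

With the session ID present, a crawler sees two different URLs for one page; with it removed, both visits collapse to the same address.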
To get phpBB pages indexed by Google, the only change required is to remove the session ID from the URL. That’s it, because Google will crawl and index dynamic URLs. Ideally, though, you would do two things: use mod_rewrite rules to map the dynamic pages to static-looking URLs, and install a MOD that removes the session ID from the URL for search engine user-agents (e.g., Google, Yahoo, MSN).
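The user-agent approach can be sketched roughly as follows. This is an illustrative Python sketch, not the actual phpBB MOD: the function names build_link and is_crawler are hypothetical, and the list of crawler tokens is an assumption that a real forum would need to maintain and extend.

```python
# Substrings assumed to identify search engine crawlers in the
# User-Agent header (Google, Yahoo, MSN). Illustrative, not exhaustive.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_crawler(user_agent):
    """Return True if the User-Agent string looks like a known crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def build_link(url, session_id, user_agent):
    """Append the sid parameter for normal visitors, but not for crawlers."""
    if is_crawler(user_agent) or not session_id:
        return url
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}sid={session_id}"


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)"

print(build_link("viewtopic.php?t=100", "c76b2a", googlebot))
print(build_link("viewtopic.php?t=100", "c76b2a", browser))
```

Ordinary visitors keep their session in the URL, while crawlers always see the same stable address for each page.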
Additional resources:
See Feb 15th. Disabling Session IDs in PHPBB Forums.
MOD to remove Session ID’s
Session ID removal, and static URLs
User-agents & Robots.txt
Webmaster Tool
This entry was posted on Tuesday, October 24th, 2006 and is filed under Main, SEO.
November 10th, 2006
If a bot is visiting a page, would that even generate a session ID? (Forgive me if that’s a stupid question - I’m not a techie)
At the end of last month Vanessa Fox posted on the Google webmaster blog that &id= is fine.
http://googlewebmastercentral.blogspot.com/2006/10/update-to-our-webmaster-guidelines.html … does that address your recommendation here?
Would removing the SID really help get your pages reindexed? (not that I think session IDs should be carried in the URL string in any case)