Getting your Website Deep Crawled
10th April 2006
I’ve been asked: “Why aren’t all my pages indexed by Google?”
Here are my thoughts:
Potential problems:
1) Dynamic URLs
2) Internal links pointing to your dynamic URLs.
3) Spider traps on the site, such as a robots.txt file that blocks crawlers.
4) Duplicate content: the same page content, or the same Title & Description on every page, can throw you into the Supplemental index.
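Problem 4 is easy to check for yourself. Here is a minimal sketch, assuming you already have the HTML of each page in hand; the `pages` dict and the function names are hypothetical, not part of any SEO tool. It groups pages by their `<title>` text and reports any title used more than once.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicate_titles(pages):
    """Map each title shared by two or more pages to the URLs that use it.

    `pages` is a dict of URL -> HTML source (hypothetical input).
    Titles are compared case-insensitively with whitespace collapsed;
    pages without a title are skipped.
    """
    by_title = defaultdict(list)
    for url, html in pages.items():
        parser = TitleParser()
        parser.feed(html)
        title = " ".join(parser.title.split()).lower()
        if title:
            by_title[title].append(url)
    return {t: sorted(urls) for t, urls in by_title.items() if len(urls) > 1}
```

The same grouping idea would work for meta descriptions; only the parser would need to change.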
Potential fixes:
1) Create a Sitemap on your website.
2) Create and use a Google Sitemap.
3) Buy links from authority sites and/or higher-PageRank sites (basically, you’re buying PageRank). PageRank is a great indicator of how deeply Google is crawling your site.
4) Rewrite your URLs so the structure is static, not dynamic. Static URLs typically get crawled more deeply than dynamic ones.
5) Write unique Titles & Descriptions for each page.
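For fix 2, an XML sitemap is just a list of URLs in a small XML wrapper. The sketch below is one way to build it with Python's standard library. The namespace is the one from the sitemaps.org protocol, which is my assumption here since the post does not say which schema to use, and `build_sitemap` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 schema; the post does not name a schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return an XML sitemap string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

You would write the returned string to a file such as sitemap.xml at the site root and then submit it to the search engine.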
In case you’re wondering what Matt Cutts has to say about this issue:
One of the classic crawling strategies that Google has used is the amount of PageRank on your pages.
In general, getting good quality links would probably help us know to crawl your site more deeply. You might also want to look at the remaining unindexed urls; do they have a ton of parameters (we typically prefer urls with 1-2 parameters)? Is there a robots.txt? Is it possible to reach the unindexed urls easily by following static text links (no Flash, JavaScript, AJAX, cookies, frames, etc. in the way)?
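Two of the questions in that quote can be turned into quick checks on your list of unindexed URLs. This is only a rough sketch: the limit of 2 comes from the "1-2 parameters" remark, the function names are hypothetical, and the robots.txt check uses Python's own parser, which may not match Googlebot's behaviour in every edge case.

```python
from urllib.parse import parse_qsl, urlsplit
from urllib.robotparser import RobotFileParser

MAX_PARAMS = 2  # taken from the quoted preference for 1-2 parameters


def count_params(url):
    """Count the query-string parameters in a URL."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def too_many_params(urls, limit=MAX_PARAMS):
    """Return the URLs that carry more than `limit` parameters."""
    return [u for u in urls if count_params(u) > limit]


def blocked_by_robots(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows.

    `robots_txt` is the file's content as a string, so nothing is
    fetched over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]
```

Any URL that shows up in either list is a good first suspect when a page refuses to get indexed.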
This entry was posted on Monday, April 10th, 2006 and is filed under Main, SEO.
April 10th, 2006
The number of pages a search engine has indexed, divided by the number of pages on your site, is called your inclusion ratio. If your site has 1,000 pages and Google has 500 of them indexed (check with site:yoursite.com), you have a 50% inclusion ratio. Not too good. You most likely want all your content indexed to maximize traffic and qualified leads. You wait and wait, submit a sitemap, and still your content is not being indexed. What do you do? Assuming no spider trap is present, you want to focus on your external links. Create good content that will act as link bait (natural linkage). Buy static text or image ads. List in directories. You need those links to get the bots to deep crawl your site regularly!
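The arithmetic in that example, as a tiny sketch (the function name is hypothetical; the two counts are ones you would gather yourself, for instance from a site: query and your own page inventory):

```python
def inclusion_ratio(indexed_pages, total_pages):
    """Fraction of a site's pages that a search engine has indexed."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return indexed_pages / total_pages


# The example from the comment: 500 of 1,000 pages indexed.
print(f"{inclusion_ratio(500, 1000):.0%}")  # prints 50%
```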
April 10th, 2006
In terms of deep crawl, I began strategically placing inbound/outbound link (IBL/OBL) trades on relevant content pages rather than in the typical directory-style resource format. In other words, if I received a link request from a reputable site that developed logos, I matched it to any of the original or republished articles on my website relating to logo design: http://www.visionefx.net/articles/redesign-your-logo.htm
The index PR went from 0 to 5 in a few months, but PR wasn’t my goal. I wanted to provide ‘users’ good content and links versus building links for search engines. I have my own opinions about PageRank though… but that’s another rant.
June 11th, 2006
[...] Inbound Links from other websites. Inbound links to your missing pages from authority sites, or really any other site that gets crawled regularly. It’s important to get your website deep crawled by the search engine spiders. Remember, your inbound links should be static text links. And if you are buying or trading links, there are a few things to be careful about. [...]