We've recently been doing a little search engine optimisation for Freedom Recipes, our archive of about 31,000 special diet recipes. Google had indexed nearly 80,000 pages, which clearly didn't make sense.
The site is powered by WordPress, and we found that the extra pages were due to the recipes being assigned to multiple categories and tags. WordPress creates archive pages of all posts within each category, tag, author, date, etc., and we were showing 6 results per page. To cut to the point, this comes to about 45,000 archive pages.
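To give a feel for how quickly paginated archives add up, here is a minimal Python sketch that counts archive pages from the number of posts in each category or tag. The term names and post counts are made up for illustration and are not the real Freedom Recipes figures; only the 6 results per page comes from our setup.

```python
import math


def archive_page_count(posts_per_term, per_page=6):
    """Return the total number of paginated archive pages across all terms.

    Each term (category, tag, author, date...) gets ceil(posts / per_page)
    pages, because the last page of a listing may be only partly filled.
    """
    return sum(math.ceil(count / per_page) for count in posts_per_term.values())


# Hypothetical post counts per term, for illustration only.
example_terms = {"gluten-free": 13, "dairy-free": 6, "vegan": 1}

# 13 posts -> 3 pages, 6 posts -> 1 page, 1 post -> 1 page
total_pages = archive_page_count(example_terms)
print(total_pages)
```

Because a recipe assigned to several categories and tags is counted once per term, the total grows much faster than the number of recipes.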
The problem becomes apparent when you consider that each of these listings includes a snippet of text from the post itself, and the same listing may occur on multiple archive pages. That's a LOT of duplication, and search engines such as Google generally don't favour sites with large amounts of duplicated content.
The fix is simple: tell Google not to index any archives. Thankfully, the fantastic Ultimate Noindex Nofollow Tool II plugin does just that. We highly recommend installing it, even if your website is only small. Search engine optimisation is important for any site, and this is one of those quick fixes that really helps.
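If you want to check that the fix has taken effect, one option is to look for a robots meta tag containing `noindex` in the HTML of an archive page. The sketch below uses only Python's standard library. It assumes the directive is delivered as a standard `<meta name="robots">` tag, which is the usual approach but not something we have confirmed for this particular plugin, and the sample markup is invented for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def is_noindexed(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical markup for an archive page and a single recipe page.
archive_html = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
recipe_html = "<html><head><title>A recipe</title></head></html>"

print(is_noindexed(archive_html))
print(is_noindexed(recipe_html))
```

In practice you would fetch a category or tag page from your own site and pass its HTML to `is_noindexed`, while confirming that individual posts are still indexable.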
Blog posts written by former QWeb employees are not necessarily an accurate indication of the current opinions of QWeb Ltd and the information provided in tutorials might be biased or subjective, or might become out of date.
This article was migrated from an older version of our website as it may still be useful to some people, but the referenced Freedom Recipes website is no longer online.