Indexing
For web pages to appear in search results, they must be in Google’s index. Search engine indexing is a complex topic that depends on many factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile the indexability advice Google has shared in its Office Hours sessions, to help ensure your website’s important pages are indexed by search engines.
URLs Removed With the URL Removal Tool Will Be Crawled and Indexed But Stripped From Search Results
If you remove URLs with the URL removal tool, they should disappear from search results within a day. However, they are still technically indexed and are recrawled periodically, so when the removal expires they will appear again based on the latest crawl data. John later clarifies that these URLs are just stripped out of the search results at the last minute and are still included in other calculations.
RSS + PubSubHubbub Is Better Than XML Sitemaps
John recommends using RSS with PubSubHubbub as the fastest way to get new content indexed.
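As a rough illustration of what a PubSubHubbub notification looks like, the sketch below builds (but does not send) the form-encoded "publish" ping described in the PubSubHubbub 0.3 specification. The hub and feed URLs are placeholders, the helper name `build_publish_ping` is made up for this example, and individual hubs may expect something different, so check your hub's documentation first.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url: str, feed_url: str) -> Request:
    """Build a publish notification telling a hub that a feed has changed.

    Nothing is sent here; pass the result to urllib.request.urlopen to send it.
    """
    # Form fields follow the PubSubHubbub 0.3 publish ping (assumed to be
    # what the target hub accepts).
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder URLs, not real endpoints.
ping = build_publish_ping(
    "https://hub.example.com/",
    "https://www.example.com/feed.xml",
)
print(ping.get_method(), ping.full_url)
print(ping.data.decode("ascii"))
```

The feed itself also needs to advertise the hub (a link with rel="hub") so that subscribers know where to listen.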
Disallow Doesn’t Prevent Indexing
A URL that is disallowed in robots.txt can still be indexed and shown in results if Google has sufficient external signals.
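To make the distinction concrete, here is a small sketch using Python's standard robots.txt parser. The robots.txt content and URLs are invented for the example. The only question a disallow rule answers is "may this URL be crawled?"; it carries no information about whether the URL may be indexed.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (made up for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt rules allow this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


for url in (
    "https://www.example.com/private/report.html",
    "https://www.example.com/blog/post.html",
):
    # A False here blocks crawling only; the URL can still be indexed
    # from external signals such as links.
    print(url, "crawlable:", may_crawl(url))
```

Because a blocked page is never fetched, a noindex directive placed on that page cannot be seen by the crawler either.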
Google Can Index Without Looking At Content
Google can index pages without ever looking at the content.
Canonicalised Pages Show up in Google’s Index
Canonicalised pages will still show up for site: searches, but that doesn’t mean the canonical tags aren’t working.
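Rather than relying on site: searches, one way to confirm what a page declares is to read its canonical tag directly. The sketch below uses only the standard library; the sample HTML and the `find_canonical` helper are illustrative, and it checks only what the page declares, not which URL Google ultimately selects.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens, so split before matching.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Made-up page for a URL such as /shoes?sort=price.
SAMPLE_HTML = """
<html><head>
  <title>Shoes</title>
  <link rel="canonical" href="https://www.example.com/shoes" />
</head><body></body></html>
"""

print(find_canonical(SAMPLE_HTML))
```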
Good Quality Category and Tag Pages Can Be Indexable
John supports keeping good-quality ‘category’-type tag pages indexable when they are valuable to users.
Indexing Paginated and Search Results Pages
Search results pages can be made indexable. Indexing only the first page of a paginated set is an option, provided you make sure that all the details/product pages can still be reached.
Canonicalise Product Variants
A discussion of when to canonicalise pages to other pages, e.g. for colour variations of product pages.
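If a site does decide to canonicalise colour variants to a single product page, the canonical URL can be derived by dropping the variant parameter. The sketch below is one possible approach, not advice taken from the session: the `colour` query parameter and the shop URLs are assumptions made up for this example, and sites that encode variants in the path would need different logic.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that only select a variant of the same product.
VARIANT_PARAMS = {"colour"}


def canonical_url(url: str) -> str:
    """Return the URL with variant-only parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in VARIANT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


variant = "https://shop.example.com/shirt?colour=red&size=m"
# The tag a variant page could emit in its <head>.
print(f'<link rel="canonical" href="{canonical_url(variant)}">')
```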
Break XML Sitemaps into Small Chunks
Breaking up XML Sitemaps into smaller groups can give you more feedback on indexing issues, which are reported separately for each Sitemap in Webmaster Tools.
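A minimal sketch of splitting one URL list into several smaller Sitemaps is shown below. The file naming scheme, chunk size, and URLs are arbitrary choices for illustration; the sitemaps.org protocol allows up to 50,000 URLs per file, so the groups here are deliberately far smaller than that limit.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterator, List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls: List[str]) -> str:
    """Serialise one group of URLs as a <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_sitemaps(urls: List[str], size: int) -> Dict[str, str]:
    """Map hypothetical file names (sitemap-1.xml, ...) to their XML content."""
    return {
        f"sitemap-{index}.xml": build_sitemap(group)
        for index, group in enumerate(chunk(urls, size), start=1)
    }


# Placeholder URLs; in practice these would come from your own URL inventory.
pages = [f"https://www.example.com/p/{n}" for n in range(5)]
for name, xml in build_sitemaps(pages, size=2).items():
    print(name, xml.count("<loc>"), "URLs")
```

Grouping by site section (for example products versus categories) instead of by fixed size is another option, since the per-Sitemap indexing figures then map onto page types.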
Google Will Rewrite Title Tags
Sometimes Google will rewrite your title tags if they contain a lot of irrelevant keywords (keyword stuffing), if they are heavily duplicated, or if they are too long to be displayed.