Indexing

In order for web pages to be included within search results, they must be in Google’s index. Search engine indexing is a complex topic that depends on a number of different factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile indexability advice Google has released in their Office Hours sessions to help ensure your website’s important pages are indexed by search engines.

Anything Contained on Non-canonical Pages Will Not Be Used for Indexing Purposes

February 7, 2020 Source

When Google picks a canonical for a page, they will understand there is a set of pages, but only focus on the content and links of the canonical page. Anything that is only contained on the non-canonical versions will not be used for indexing purposes. If those pages contain content that you would like to be indexed, John recommends making them sufficiently different from the canonical so that Google treats them as separate pages.


If a Robots.txt File Returns a Server Error for a Brief Period of Time Google Will Not Crawl Anything From the Site

January 31, 2020 Source

If a robots.txt file returns a server error, even for a brief period of time, Google will not crawl anything from the website until it can access the file and crawl normally again. While Google is unable to reach the file, it assumes all URLs are blocked and will therefore flag this in GSC. You can identify when this has occurred by reviewing the robots.txt requests in your server logs and checking the response code and size returned for each request.
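As a rough illustration of that log check, the sketch below (assuming the common combined log format and a hypothetical file named access.log) prints any robots.txt request that did not return a 200:

```python
import re

# Assumes the Apache/Nginx combined log format; adjust the pattern for your setup.
LOG_LINE = re.compile(r'"(?:GET|HEAD) /robots\.txt [^"]*" (?P<status>\d{3}) (?P<size>\S+)')

with open("access.log") as log:
    for line in log:
        match = LOG_LINE.search(line)
        if match and match.group("status") != "200":
            # A 5xx here means Google would pause crawling until robots.txt is reachable again.
            print(f"robots.txt returned {match.group('status')} "
                  f"(response size {match.group('size')}): {line.strip()}")
```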


Google is Able to Display Structured Data Results as Soon as the Page Has Been Re-crawled

January 10, 2020 Source

Once structured data has been added to a page, Google will be able to display structured data results the next time it crawls and indexes that page.
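For illustration, structured data is commonly added to a page as JSON-LD; the snippet below is a minimal sketch that builds an example Article object with Python's json module (the type and field values are placeholders, not specifics from this note):

```python
import json

# Illustrative Article markup; swap in your own page's details.
structured_data = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "datePublished": "2020-01-10",
    "author": {"@type": "Person", "name": "Example Author"},
}

# Embed the output in the page's HTML inside a
# <script type="application/ld+json"> ... </script> block.
print(json.dumps(structured_data, indent=2))
```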


Technical Issues Can Cause Content to be Indexed on Scraper Sites Before Original Site

January 7, 2020 Source

If content from scraper sites is appearing in the index before the original site's version, this could be due to technical issues on the original site. For example, Googlebot might not be able to find main hub pages or category pages, or may be getting stuck in crawl traps by following excessive numbers of parameter URLs.
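One rough, illustrative way to spot parameter-driven crawl traps is to count how many distinct query-string variants of each path appear in your access logs; the sketch below assumes the combined log format and a hypothetical access.log file:

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

REQUEST = re.compile(r'"(?:GET|HEAD) (?P<url>\S+) HTTP')

variants = defaultdict(set)
with open("access.log") as log:
    for line in log:
        match = REQUEST.search(line)
        if match:
            parts = urlsplit(match.group("url"))
            if parts.query:
                variants[parts.path].add(parts.query)

# Paths with hundreds of query-string variants are likely crawl traps.
for path, queries in sorted(variants.items(), key=lambda item: -len(item[1]))[:10]:
    print(f"{len(queries):>6} parameter variants  {path}")
```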


Google Doesn’t Show Preference to Multi-Page Websites Over Single-Page Websites in Rankings

November 12, 2019 Source

Google doesn’t have a preference for ranking websites with lots of pages over single-page websites; the latter can rank well too.


Make Category Pages Indexable & Internal Search Pages Non-indexable

November 12, 2019 Source

To avoid URL duplication and index bloat issues, focus on providing high-quality category pages and making sure these are indexable, and noindex internal search pages, as the many different search combinations often create low-quality pages.
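As a minimal sketch of one way to apply this, the example below (assuming a hypothetical Flask app with an internal search endpoint at /search) adds an X-Robots-Tag: noindex header to search result responses; the same effect can be achieved with a meta robots noindex tag in the search page template:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/search")
def search():
    # Hypothetical internal search page; render your real results here.
    return f"Results for {request.args.get('q', '')}"

@app.after_request
def noindex_internal_search(response):
    # Keep internal search result pages out of the index while
    # leaving category pages indexable.
    if request.path.startswith("/search"):
        response.headers["X-Robots-Tag"] = "noindex"
    return response

if __name__ == "__main__":
    app.run()
```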


Use View Source or Inspect Element to Ensure Hidden Content is Readily Accessible in the HTML

November 1, 2019 Source

If you have content hidden behind a tab or accordion, John recommends using the view source or inspect element tool to ensure the content is in the HTML by default. Content pre-loaded in the HTML will be treated as normal content on the page; however, if it requires an interaction to load, Google will not be able to crawl or index it.
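A quick scripted version of the same check (assuming a hypothetical URL and a phrase you expect inside the hidden tab) is to fetch the raw HTML, without executing any JavaScript, and look for the phrase:

```python
import requests

URL = "https://www.example.com/product"       # hypothetical page with tabbed content
PHRASE = "Delivery and returns information"   # text expected inside the hidden tab

# Fetch the raw HTML, the same view that "view source" gives you.
html = requests.get(URL, timeout=10).text

if PHRASE in html:
    print("Content is in the initial HTML, so Google can crawl and index it.")
else:
    print("Content is missing from the source; it is probably loaded on interaction.")
```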


“Discovered Not Indexed” Pages May Show in GSC When Only Linked in Sitemap

October 29, 2019 Source

Pages may show as “Discovered Not Indexed” in GSC if they have been submitted in a sitemap but aren’t linked to within the site itself.


Google Checks Status Code Pages Before Attempting to Render

October 18, 2019 Source

Google checks the status code of a page before doing anything else, such as rendering content. This helps it identify which pages can be indexed and which pages it shouldn’t render. For example, if your page returns a 404, Google won’t render anything from it.
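To spot-check what status code a page actually returns, here is a small sketch using the requests library with hypothetical URLs:

```python
import requests

# Hypothetical URLs to spot-check; replace with your own.
urls = [
    "https://www.example.com/",
    "https://www.example.com/old-page",
]

for url in urls:
    # allow_redirects=False shows the status Google sees on the first request.
    status = requests.get(url, allow_redirects=False, timeout=10).status_code
    print(f"{status}  {url}")
```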


Related Topics

Crawling, Crawl Budget, Crawl Errors, Crawl Rate, Disallow Directives in Robots.txt, Sitemaps, Last Modified, Nofollow, Noindex, RSS, Canonicalization, Fetch and Render