A sitemap is a list of the live URLs on a site, used to tell search engine crawlers which pages are most important and should therefore be crawled and indexed. There are several things to consider when creating sitemaps, as well as how search engines interpret them. We cover a range of these topics within our SEO Office Hours Notes below, along with best practice recommendations and Google’s advice on sitemaps.

Google Treats XML Sitemaps Differently From HTML Pages

January 11, 2019 Source

Google treats XML sitemaps differently from HTML pages, as they are machine-readable files and not meant to be indexed by search engines.

Image Sitemaps Help Google Understand Which Images You Want to Be Indexed

December 21, 2018 Source

Google can find images to index in the source code, but sitemaps help it confirm which images you want to be indexed.
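A minimal image sitemap entry might look like the sketch below, using Google’s image sitemap namespace (the URLs are hypothetical placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <!-- The page the image appears on -->
    <loc>https://example.com/product-page</loc>
    <!-- One or more images you want considered for indexing -->
    <image:image>
      <image:loc>https://example.com/images/product.jpg</image:loc>
    </image:image>
  </url>
</urlset>
```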

Google News Sitemap is Fastest Way to Get Pages Crawled for Publishers

December 11, 2018 Source

Submitting a Google News sitemap is the fastest way to get Google to crawl pages for publishing sites.
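For reference, a Google News sitemap entry follows the news sitemap namespace and adds publication details to each URL. A minimal sketch (publication name, URL, and date are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/news/article-slug</loc>
    <news:news>
      <news:publication>
        <news:name>Example News</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2018-12-11</news:publication_date>
      <news:title>Example Article Title</news:title>
    </news:news>
  </url>
</urlset>
```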

Nested Sitemap Index Files Aren’t Supported by Google

December 11, 2018 Source

Google doesn’t support nested sitemap index files, where one sitemap index references another one. Instead, set up separate sitemap index files and submit them individually.
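In other words, each entry in a sitemap index should point directly at a sitemap file, never at another index. A minimal flat index might look like this (filenames are hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Each <loc> must reference a sitemap file, not another index -->
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-blog.xml</loc>
  </sitemap>
</sitemapindex>
```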

You Can Use Non-English Location Names in Image Sitemap Geolocation Tags

November 30, 2018 Source

You can use non-English language names for locations in image sitemap geolocation tags. You can test them by searching for the location in Google Maps to see whether Google can figure out where the location is. John thinks that Google’s algorithms may not use this information.
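As a sketch, the geolocation tag sits inside an image sitemap’s `<image:image>` element; the location string below is an illustrative non-English example:

```xml
<image:image>
  <image:loc>https://example.com/images/photo.jpg</image:loc>
  <!-- The location name may be written in a local (non-English) language -->
  <image:geo_location>Dún Laoghaire, Éire</image:geo_location>
</image:image>
```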

It’s Normal for Google to Index XML Sitemap Files

November 27, 2018 Source

If you see an XML sitemap file showing in the search results when you search for a specific URL on your website, this is normal and won’t cause any issues. If you don’t want XML sitemaps to be indexed, add an X-Robots-Tag in the HTTP header.

Only Serve Sitemap Files of Removed URLs Temporarily to Get Them Deindexed

November 16, 2018 Source

Sitemap files are a good temporary solution for getting Google to crawl and deindex lists of removed URLs quickly. However, make sure these sitemaps aren’t being served to Google for too long.
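One common approach is a temporary sitemap listing only the removed URLs (which should now return 404 or 410), with `<lastmod>` set to the removal date so Google prioritises recrawling them. A sketch, with hypothetical URLs and dates:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Removed URL now returns 404/410; lastmod reflects the removal date -->
  <url>
    <loc>https://example.com/discontinued-product</loc>
    <lastmod>2018-11-16</lastmod>
  </url>
</urlset>
```

Once the URLs have dropped out of the index, remove this sitemap rather than serving it indefinitely.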

Use X-Robots-Tag HTTP Header to Noindex Indexed Sitemap Files

October 19, 2018 Source

If sitemap files are indexed for normal search queries, then you can use the X-Robots-Tag HTTP header to noindex all pages ending in .xml or .gz.
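On an Apache server with mod_headers enabled, for example, this could be done with a `FilesMatch` block in the server or virtual host configuration (a sketch, assuming that setup):

```apacheconf
# Send "X-Robots-Tag: noindex" for all .xml and .gz files,
# which keeps sitemap files out of normal search results
<FilesMatch "\.(xml|gz)$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>
```

Note that this pattern also covers any other XML or gzipped files on the site, so scope it more narrowly if those should remain indexable.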

Compressing Sitemaps Saves Bandwidth But Doesn’t Reduce Processing Time

October 19, 2018 Source

Compressing sitemap files using gzip can save bandwidth but doesn’t affect how quickly Googlebot processes these files.
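As a quick illustration of the bandwidth saving, the repetitive XML in a sitemap compresses well with gzip. A minimal Python sketch (the sitemap content is a placeholder):

```python
import gzip

# A tiny stand-in sitemap; real sitemaps with thousands of similar
# <url> entries compress far better than this small sample.
sitemap_xml = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + b''.join(
        b'  <url><loc>https://example.com/page-%d</loc></url>\n' % i
        for i in range(100)
    )
    + b'</urlset>\n'
)

# Compress for serving as sitemap.xml.gz
compressed = gzip.compress(sitemap_xml)

# Decompressing round-trips to the original bytes
assert gzip.decompress(compressed) == sitemap_xml
# The gzipped payload is smaller, saving transfer bandwidth
assert len(compressed) < len(sitemap_xml)
```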

Related Topics

Crawling Indexing Crawl Budget Crawl Errors Crawl Rate Disallow Directives in Robots.txt Last Modified Nofollow Noindex RSS Canonicalization Fetch and Render