Sitemaps

A sitemap is a list of the live URLs on a site, used to inform search engine crawlers of the most important pages and, therefore, which ones should be crawled and indexed. There are several things to consider when creating sitemaps, as well as in understanding how search engines treat them. We cover a range of these topics within our Hangout Notes, along with best practice recommendations and advice from Google.

Google Has a Separate User Agent For Crawling Sitemaps & For GSC Verification

October 1, 2019 Source

Google uses a separate user agent to fetch sitemap files, and another to crawl for Google Search Console verification. John recommends checking that you are not blocking either of these.
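
As a quick way to verify this, you can test whether your robots.txt blocks a given user agent from fetching the sitemap. Below is a minimal sketch using Python's standard library; the user agent tokens and URLs are placeholders rather than names confirmed in the Hangout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agent tokens -- substitute the crawlers you want to check.
USER_AGENTS = ["Googlebot", "Google-Site-Verification"]
SITEMAP_URL = "https://www.example.com/sitemap.xml"

parser = RobotFileParser("https://www.example.com/robots.txt")
parser.read()  # fetch and parse the live robots.txt

for agent in USER_AGENTS:
    allowed = parser.can_fetch(agent, SITEMAP_URL)
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'} for {SITEMAP_URL}")
```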


Internally Link Pages Together to Increase Discoverability & Reduce Reliance on XML Sitemap

September 3, 2019 Source

Internally linking pages together helps Googlebot discover the pages on your site more easily and reduces reliance on XML sitemaps for URL discovery.
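
One way to act on this is to compare the URLs listed in your XML sitemap with the URLs that are actually reachable through internal links, for example from a crawl export. The sketch below assumes such an export exists as a plain text file; the filenames and URLs are placeholders.

```python
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# URLs discovered through internal links, e.g. exported from a site crawl.
with open("crawled_urls.txt") as handle:
    internally_linked = set(handle.read().split())

with urllib.request.urlopen(SITEMAP_URL) as response:
    sitemap = ET.parse(response)

sitemap_urls = {loc.text.strip() for loc in sitemap.findall(".//sm:loc", NS)}

# Pages only discoverable via the sitemap are candidates for internal links.
for url in sorted(sitemap_urls - internally_linked):
    print("No internal links found:", url)
```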


XML Sitemaps Should Include URLs on Same Path Unless Submitted Via Verified Property in GSC

August 23, 2019 Source

By default, an XML sitemap should only contain URLs on the same path as the sitemap file itself. However, URLs in sitemaps submitted via GSC can be for any verified property within your GSC account.
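
A quick sanity check is to confirm that every URL in a sitemap sits on the same host and path as the sitemap file itself, and to flag any that would instead need to be submitted through a verified GSC property. A rough sketch, assuming a local copy of the sitemap; the URLs are placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_LOCATION = "https://www.example.com/blog/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

base = urlparse(SITEMAP_LOCATION)
base_path = base.path.rsplit("/", 1)[0] + "/"  # e.g. "/blog/"

tree = ET.parse("sitemap.xml")  # local copy of the sitemap file
for loc in tree.findall(".//sm:loc", NS):
    url = urlparse(loc.text.strip())
    if url.netloc != base.netloc or not url.path.startswith(base_path):
        print("Outside the sitemap's path:", loc.text.strip())
```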


Missing Sitemap Data in GSC API is a Known Error

August 9, 2019 Source

When sitemap reporting switched over to the new GSC UI in early April 2019, an issue occurred within the API where data stopped updating. The team are looking into this, and John expects they will document the error soon, with advice for those affected.


Google May Index Redirected URLs if Served in Sitemap Files

June 28, 2019 Source

Redirects and sitemaps are both signals that Google uses to select preferred URLs. If a source URL redirects to a destination URL but the source URL is still listed in a sitemap, this gives Google conflicting signals about which URL you want to be shown in search.
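
To catch these cases, you can check whether any URL listed in a sitemap responds with a redirect rather than a 200. A rough sketch using the requests library, assuming a local copy of the sitemap file:

```python
import xml.etree.ElementTree as ET
import requests

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
tree = ET.parse("sitemap.xml")  # local copy of the sitemap

for loc in tree.findall(".//sm:loc", NS):
    url = loc.text.strip()
    # Don't follow redirects, so the status of the listed URL itself is visible.
    response = requests.head(url, allow_redirects=False, timeout=10)
    if response.status_code in (301, 302, 307, 308):
        destination = response.headers.get("Location")
        print(f"{url} redirects to {destination} - list the destination instead")
```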


If One Sitemap URL Has an Error This Shouldn’t Impact the Rest of the XML Sitemap

June 14, 2019 Source

If one individual URL element within an XML sitemap has an error, this will not affect Google's ability to parse and read the sitemap as a whole. However, if the element is broken in a way that affects the parsing of the rest of the sitemap, the XML file becomes unreadable and cannot be used as a sitemap.
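
The difference is between an invalid value inside a well-formed element, which parsers can read past, and malformed markup, which breaks the whole file. A small illustration using Python's standard XML parser:

```python
import xml.etree.ElementTree as ET

# One <lastmod> holds an invalid date, but the XML is well formed,
# so both URL entries can still be read.
well_formed = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/a</loc><lastmod>not-a-date</lastmod></url>
  <url><loc>https://www.example.com/b</loc><lastmod>2019-06-14</lastmod></url>
</urlset>"""
print(len(ET.fromstring(well_formed)))  # 2 URL elements parsed

# An unclosed <url> tag makes the whole file unreadable as a sitemap.
malformed = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/a</loc>
  <url><loc>https://www.example.com/b</loc></url>
</urlset>"""
try:
    ET.fromstring(malformed)
except ET.ParseError as error:
    print("Sitemap unreadable:", error)
```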


Use Accurate Last Modified Dates For Individual Pages in Sitemaps For Faster Recrawling

April 5, 2019 Source

Make sure each individual page in an XML sitemap has its own last modified date so Google can trust that the information is accurate and recrawl updated pages where necessary.
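
For example, when generating a sitemap you can write each page's real modification date into its own <lastmod> element instead of stamping every URL with the date the file was built. A minimal sketch; the page records are placeholders and would normally come from your CMS or file system.

```python
import xml.etree.ElementTree as ET

# Placeholder records: in practice these dates come from your CMS or filesystem.
pages = [
    ("https://www.example.com/", "2019-04-01"),
    ("https://www.example.com/pricing", "2019-03-12"),
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for loc, last_modified in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    # Per-page lastmod, so Google can prioritise recrawling changed pages.
    ET.SubElement(url, "lastmod").text = last_modified

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```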


Use Structured Data & Video Sitemaps to Give Google More Context on Videos

March 19, 2019 Source

You should use structured data to tell Google whether a video was streamed or recorded, and video sitemaps can provide additional context, such as which countries a particular video is available in.
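
As an illustration, a video sitemap entry can carry a country restriction alongside the basic video details. The sketch below simply writes out one example entry; the URLs, titles and country codes are placeholders, and Python is only used here to keep the examples in one language.

```python
VIDEO_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://www.example.com/videos/intro</loc>
    <video:video>
      <video:title>Intro video (placeholder)</video:title>
      <video:description>Placeholder description.</video:description>
      <video:thumbnail_loc>https://www.example.com/thumbs/intro.jpg</video:thumbnail_loc>
      <video:content_loc>https://www.example.com/media/intro.mp4</video:content_loc>
      <!-- Only make the video eligible to appear in these countries. -->
      <video:restriction relationship="allow">GB US</video:restriction>
    </video:video>
  </url>
</urlset>"""

with open("video-sitemap.xml", "w", encoding="utf-8") as handle:
    handle.write(VIDEO_SITEMAP)
```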


A Sitemap File Won’t Replace Normal Crawling

February 5, 2019 Source

A sitemap will help Google crawl a website but it won’t replace normal crawling, such as URL discovery from internal linking. Sitemaps are more useful for letting Google know about changes to the pages within them.


Related Topics

Crawling, Indexing, Crawl Budget, Crawl Errors, Crawl Rate, Disallow, Last Modified, Nofollow, Noindex, RSS, Canonicalization, Fetch and Render