Google Will Crawl Sitemaps That Have Been Removed from GSC
Removing an old sitemap file from GSC is not enough to stop it from being crawled; you also need to remove it from the server so that Google can no longer find and crawl it. Where possible, though, John recommends fixing the sitemap file instead.
Large Hreflang Sets Should Be Included in Sitemap Files
If you have a large set of hreflang tags then John recommends putting these in your sitemap files as this makes them easier to maintain.
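As a rough sketch of what this can look like, the Python snippet below builds a sitemap in which every URL lists its language alternates as xhtml:link elements. The example.com URLs, the language codes and the build_hreflang_sitemap helper are illustrative assumptions rather than anything John specified.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(alternates):
    """Build sitemap XML for one set of equivalent pages.

    alternates: dict mapping an hreflang code to that language's URL
    (hypothetical input shape chosen for this sketch).
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in alternates.values():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # Each URL lists every alternate in the set, including itself.
        for lang, href in alternates.items():
            ET.SubElement(
                entry,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=lang,
                href=href,
            )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_hreflang_sitemap({
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    }))
```

Keeping the language-to-URL mapping in one data structure is what makes a large set easier to maintain: adding a language means adding one entry, not editing the head of every page.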
Sitemaps Submitted Through GSC Will Be Remembered for Longer
Google remembers sitemaps for longer when they are submitted through Google Search Console. Sitemaps that are referenced in robots.txt or pinged anonymously are forgotten once they are removed from robots.txt or if they haven’t been pinged for a while.
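A sitemap is referenced in robots.txt with a Sitemap: line. As a small illustration, the sketch below uses Python's standard urllib.robotparser module (its site_maps() method needs Python 3.8 or later) to list the sitemaps a robots.txt file currently declares. The sitemaps_in_robots function name and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: lines are present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow:\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(sitemaps_in_robots(sample))
```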
Sitemap Files Returning 404s Don’t Cause Issues for Google
Sitemap files that return 404s don’t cause any issues for Google from an SEO perspective; Google simply treats them as 404s.
Sitemaps Are More Critical for Larger Sites with High Churn of Content
Sitemaps are more useful for larger websites that have a lot of new and changing content. It is still best practice to have sitemaps for smaller sites whose content largely stays the same, but there they are less critical for helping search engines find new pages.
Static Sitemap Filenames Are Recommended
John recommends using static sitemap filenames that don’t change every time the sitemaps are generated, so that Google doesn’t waste time crawling sitemap URLs which no longer exist.
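One way to keep the filename stable is to have the generator overwrite the same file on every run and never write date-stamped names. The following is a minimal sketch under that assumption; the publish_sitemap helper and the default sitemap.xml name are hypothetical choices, not something John prescribed.

```python
import os
import tempfile


def publish_sitemap(xml_text, directory, filename="sitemap.xml"):
    """Write the sitemap under a fixed filename, replacing any previous version."""
    final_path = os.path.join(directory, filename)
    # Write to a temporary file first so a crawler never reads a half-written sitemap.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
    # Swap the new file into place; the public URL stays the same on every run.
    os.replace(tmp_path, final_path)
    return final_path
```

Because every run writes to the same path, the URL that Google already knows about keeps resolving, and no stale sitemap URLs are left behind.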
Canonicals Are Chosen by Google Using XML Sitemap URLs
XML sitemap URLs are used to help inform Google’s decision on which URL is chosen to be the canonical.
Make Sure There Is a Clear Connection Between Your Mobile & Desktop Sites
It’s possible to include m-dot (m.) pages in your main sitemap file to help Google discover and crawl them for mobile-first indexing, but if there is a clear connection between the desktop and mobile sites then this won’t be necessary.
Crawl Frequency Attribute in XML Sitemaps Doesn’t Impact Crawl Rate
Google ignores the change frequency (<changefreq>) and priority (<priority>) values in XML sitemaps. Only the last modification timestamp (<lastmod>) will impact crawl rate.
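In practice this means a sitemap generator can leave out the change frequency and priority values and concentrate on emitting an accurate last modification date. The sketch below does this with Python's standard library; the build_sitemap helper, its input shape and the example.com URL are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, modified in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # Only lastmod is written; change frequency and priority are omitted on purpose.
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([("https://example.com/", date(2024, 1, 15))]))
```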