A Guide to Hreflang Best Practice & Implementation

Chapter 3: Getting Your International Web Structure Right

In order to crawl, index and show the right country or language version of your site to users in a particular market, first a search engine needs to know about the different variations you have and how they’re connected. Hreflang is one of the best ways to communicate this, as it describes the intended language of a page and even the country it is intended for, helping you map out the different language versions available for a page.

If you have multiple versions of a page for different languages or regions, tell Google about these different variations. Doing so will help Google Search point users to the most appropriate version of your page by language or region.

Google Search Console Help

With hreflang, you can:

  • Target specific countries
  • Target specific languages
  • Target specific countries and languages
  • Target the same country with different languages

With hreflang, you can’t:

  • Target entire regions such as Europe, Asia and Latin America

Hreflang tags are needed because of businesses’ desires to target the whole world, and the issues of duplication that can arise when you have a particular language version targeting many different countries. These tags are a technical solution for helping Google decide which page should appear where, so they shouldn’t contribute to cannibalisation issues. With the correct implementation, Google will recognise a set of pages as alternates and swap out the URLs between the set depending on the SERP language, so rankings won’t be changed. A page set will share the same collective ranking position.

Hreflang swaps URLs in the search results but doesn’t affect rankings.

-John Mueller, Google Webmaster Hangout

Here’s an example of what hreflang tags should look like across 3 different pages in a cluster:

Hreflang configuration example

All pages in a cluster should be listed within hreflang tags on a page. This is why for an example where there are US, UK and Australian versions, each page version needs to reference the URLs of the other language pages as well as itself.
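As a minimal sketch of the US, UK and Australian cluster described above, the snippet below builds the set of tags that each of the three pages would carry. The `example.com` URLs, the `CLUSTER` mapping and the `hreflang_link_tags` helper are illustrative assumptions, not part of any official tooling.

```python
from html import escape

# Hypothetical cluster: hreflang value -> absolute URL on a placeholder domain.
CLUSTER = {
    "en-US": "https://example.com/us/",
    "en-GB": "https://example.com/uk/",
    "en-AU": "https://example.com/au/",
}


def hreflang_link_tags(cluster):
    """Build the <link> tags that every page in the cluster should carry."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in cluster.items()
    ]


# The same three tags, self-reference included, go on the US, UK and AU pages.
print("\n".join(hreflang_link_tags(CLUSTER)))
```

Because the output is identical for every page in the cluster, the self-reference and the links to the other versions come for free from a single mapping.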

A correct hreflang configuration forms a strong, technical platform from which your content will reach its intended audience and increase potential conversions.

Nothing is more frustrating for users, or more damaging to brand reputation, than a negative user experience. The core function of hreflang is to make sure that the content you’ve optimised, translated, and invested in is shown to the right users, to provide the best brand and user experience possible.

Only once the unsung technical groundwork is in place will the marketing campaigns and creative messaging yield results for the business.

Hreflang is certainly a useful technical tool to have at your disposal for your internationalisation efforts. However, hreflang is arguably one of the most complicated elements of internationalisation and is, in the words of Google’s John Mueller, one of the most complicated things about SEO, full stop.

Despite this complexity, it’s important to fully understand hreflang in order to ensure that it lives up to its full potential in helping your business succeed globally. So, let’s explore the details, from configuration through to best practice and testing.

How to implement hreflang

These are the three main methods you can use to submit hreflang to search engines:

  1. XML sitemaps
  2. HTTP headers
  3. HTML <head>

XML sitemaps

You can add hreflang child sections to each of your sitemap URLs. This is a useful method for sites with many language variations that want to avoid additional on-page code.

Hreflang example configuration in an XML sitemap

Organising your sitemaps in a granular, clear-to-understand way can help in easily identifying problem areas. Make sure your sitemap implementation allows you to clearly analyse country, language, category and page type.
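The sitemap method can be sketched with Python's standard `xml.etree.ElementTree` module: each URL gets its own `<url>` entry, and each entry lists every alternate in the cluster (itself included) as an `xhtml:link` child. The cluster contents and the `build_sitemap` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Emit a default namespace for sitemap elements and the xhtml: prefix for links.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)

# Hypothetical cluster: hreflang value -> absolute URL on a placeholder domain.
CLUSTER = {
    "en-US": "https://example.com/us/",
    "en-GB": "https://example.com/uk/",
    "en-AU": "https://example.com/au/",
}


def build_sitemap(clusters):
    """Return sitemap XML with one <url> per page and all alternates beneath it."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for cluster in clusters:
        for url in cluster.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            # Every entry repeats the full set of alternates, itself included.
            for code, alt_url in cluster.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": code, "href": alt_url},
                )
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([CLUSTER]))
```

Passing one cluster per category or page type would be one way to keep the generated sitemaps as granular as the paragraph above recommends.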

HTTP headers

HTTP headers are returned with your page’s GET response, and you can include hreflang within them. This is useful for non-HTML pages such as PDFs.

Hreflang example configuration in a HTTP header

This is a good option if you have a development team you can rely on. However, it can be more difficult and time-consuming to validate hreflang in the HTTP header than in the HTML head or XML sitemaps.
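For a non-HTML file such as a PDF, the header method comes down to a single `Link` header listing every alternate. The sketch below only builds the header value; how it is attached to the response depends on your server or framework. The file URLs and the `hreflang_link_header` name are hypothetical.

```python
# Hypothetical PDF alternates: hreflang value -> absolute URL.
PDF_ALTERNATES = {
    "en": "https://example.com/guide.pdf",
    "de": "https://example.com/de/guide.pdf",
}


def hreflang_link_header(alternates):
    """Build the value of a Link header listing each alternate version."""
    return ", ".join(
        f'<{url}>; rel="alternate"; hreflang="{code}"'
        for code, url in alternates.items()
    )


# The same header would be returned with both the English and German PDFs.
print("Link: " + hreflang_link_header(PDF_ALTERNATES))
```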

HTML head

Hreflang HTML tags can be added into the head section of a page. This is useful if you don’t have a sitemap or are unable to configure a site’s HTTP headers.

Hreflang configuration example in the HTML head

There is more coding flexibility with this method, but make sure the <head> isn’t being broken by any iframes or div tags, meaning your hreflang could be ignored by search engines. Ideally you should have hreflang above these tags and any JavaScript that modifies the head, but the best solution is to not have these types of scripts or tags in the <head> at all.

Put hreflang tags higher up in the head.

-John Mueller, Google Webmaster Hangout
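One rough way to spot this problem in raw HTML is to look for hreflang tags that come after an element that does not belong in the head. The sketch below uses Python's standard `html.parser`; the `HEAD_BREAKERS` set is an illustrative subset chosen for this example, and the check is a heuristic that does not reproduce how a search engine renders the page or runs JavaScript.

```python
from html.parser import HTMLParser

# Illustrative subset of elements that are not valid inside <head>.
HEAD_BREAKERS = {"iframe", "div", "img", "p", "span"}


class HeadHreflangChecker(HTMLParser):
    """Collect hreflang links that follow a head-breaking tag in the source."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.breaker = None   # first offending tag seen inside the head
        self.at_risk = []     # (hreflang value, offending tag) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if not self.in_head:
            return
        if tag in HEAD_BREAKERS:
            if self.breaker is None:
                self.breaker = tag
        elif tag == "link":
            attr = dict(attrs)
            if attr.get("hreflang") and self.breaker:
                self.at_risk.append((attr["hreflang"], self.breaker))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def hreflang_at_risk(html_text):
    checker = HeadHreflangChecker()
    checker.feed(html_text)
    return checker.at_risk
```

Any pair this returns points to an hreflang tag worth moving higher up in the head, above the offending element.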

Page weight and code file size are considerations when implementing hreflang through the HTML head or HTTP header. These methods are manageable if you have a few language versions, but it can get messy if you have many different variations.

The problem we have with any hreflang implementation other than XML sitemaps is, and this is the big irony I’ve asked Google – “If we’re telling people to reduce code from our pages, why are we also enabling people to put what can be hundreds of lines of code on a page to manage hreflang language?” If you have 5 language variations or fewer, then go for it, but if you have more than that you need to think about how you can map and cross-identify these pages, and there is no better way than an XML sitemap to do that.

Bill Hunt, President of Back Azimuth Consulting

Use whichever hreflang configuration makes most sense for your business, but XML sitemaps are often the easiest to manage, especially for large sites.

Whichever method you choose for implementing hreflang, make sure you use that one consistently. Avoid combining different methods so you’re giving the search engines the clearest signals possible with your hreflang configuration. For more details and assistance on methods for indicating alternate pages, take a look at Google’s guidelines.

Hreflang best practice

When configuring hreflang, these are the top 10 things you need to include in your audit checklist:

  1. Scope out implementation methods and which one works best for your business.
  2. Map out the variations you actually need, rather than implementing all of them.
  3. Language variations have been included for all page versions including the current page.
  4. Reciprocal tags are in place on other pages in a cluster.
  5. The correct language and country codes have been used, in the right order.
  6. The tags match the language of the content on the target page.
  7. Only absolute, canonical URLs have been used.
  8. Submit configurations for both desktop and mobile sites if they are separate.
  9. The configuration has been validated before launch.
  10. The configuration has been tested for any errors once live.

If hreflang isn’t implemented correctly, Google may simply ignore the tags. This is why it’s essential to get even the smallest details right.

Google ignores incorrect language tags.

-John Mueller, Google Webmaster Hangout

Here are some of the details to pay attention to in order to make sure your hreflang tags are respected by Google:

  • Hreflang codes must be separated by dashes rather than underscores.
  • The language code must be followed by the country code, not the other way around.
  • The href attribute needs to include the full protocol.
  • Hreflang tags should match the target language of the page content.
  • There must be reciprocal links between hreflang tags (so, if a French page points to a German page, the German page must also point to the French page).
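The "full protocol" rule in the list above is easy to test mechanically. This is a small sketch using the standard `urllib.parse` module; the helper name is an assumption for illustration.

```python
from urllib.parse import urlsplit


def is_absolute_http_url(href):
    """True only when the href carries both a protocol and a host name."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


for href in ("https://example.com/fr/", "//example.com/fr/", "/fr/"):
    print(href, is_absolute_http_url(href))
```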

Hreflang tags without a reciprocal tag will be ignored.

-John Mueller, Google Webmaster Hangout

Google explains that reciprocal tags are needed “so that someone on another site can’t arbitrarily create a tag naming itself as an alternative version of one of your pages.”
Google Search Console Help
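A reciprocity check can be sketched as follows, assuming you have already collected the hreflang annotations found on each page into a dictionary. The data shape and the `missing_return_links` name are hypothetical.

```python
def missing_return_links(annotations):
    """Find (page, target) pairs where the target does not link back.

    annotations maps each page URL to the {hreflang: URL} set found on it.
    """
    missing = []
    for page, alternates in annotations.items():
        for target in alternates.values():
            if target == page:
                continue  # self-reference, nothing to reciprocate
            # The target must list the page among its own alternates.
            if page not in annotations.get(target, {}).values():
                missing.append((page, target))
    return missing


FR = "https://example.com/fr/"
DE = "https://example.com/de/"

# The French page points to the German page, but not the other way around.
print(missing_return_links({
    FR: {"fr": FR, "de": DE},
    DE: {"de": DE},
}))
```

A target that was never crawled is treated the same as one without a return tag, since in both cases the link cannot be confirmed.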

Hreflang is only used as a signal by Google for determining language versions, and isn’t followed without the correct implementation.

In order to validate and actively use the URLs within hreflang sets, Google needs to be able to crawl them in the first place. It needs to crawl them at least twice, in fact.

Each language version has to be crawled and indexed at least twice for hreflang to work.

-John Mueller, Google Webmaster Hangout

This is why the URLs in your hreflang tags have to be final-destination, canonical URLs that return a 200 status code and aren’t blocked in the robots.txt file, so that search engines are being sent clear signals on your language versions and can crawl and process these pages in the first place.

Hreflang should be included between the canonical versions of pages.

-John Mueller, Google Webmaster Hangout

However, you might be surprised to learn that it is possible to canonicalise URLs to another page of the same language, which Glenn Gabe discovered:

Typically, you should have each page that’s part of an hreflang cluster indexable with self-referencing canonical tags. But some site owners choose to canonicalize alternative URLs in the same language to one, even when the URLs target different countries. With that setup, you would think that since Google is not indexing the canonicalized URLs, then they would never surface in the SERPs. That’s not true, actually! Google can still surface those URLs when it sees users searching from the other countries.

Here is an example of what can happen. The /uk/ url is being canonicalized, but still appears in the SERPs for users searching in the UK:

Glenn Gabe hreflang canonicalisation example

So even if the /uk/ version isn’t indexed, it can still show up in the SERPs when Google sees hreflang tags properly set up and a user searching from England. I ended up asking Google’s John Mueller about this mystery during a webmaster hangout and he confirmed this was the case. John explained that Google can follow hreflang tags even when it chooses one version as the canonical URL (for multiple urls in the same language).

So for international SEO, just remember that if you are using hreflang for URLs in the same language but targeting different countries, Google has a few tricks up its sleeve. The right URLs can appear in the SERPs by country, even when they are being canonicalized to other urls (and not indexed). Strange, but true.

Glenn Gabe, President of G-Squared Interactive

For page alternatives like separate m-dot sites, make sure you submit hreflang for these pages too. These pages especially need to be found following the rollout of Google’s mobile-first index. Not only do separate mobile sites need to have their own hreflang tags, but the tags should only point to other pages within that particular configuration type. This separate configuration was confirmed by John Mueller on Twitter.

This is how hreflang tags should be mapped out across mobile and desktop, as demonstrated by Ashley Berman Hale, Technical SEO Lead at Lumar:

Hreflang mobile and desktop configuration diagram

Ashley Berman Hale, Technical SEO Lead at Lumar
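To check that mobile and desktop annotations stay within their own configuration, you could flag any hreflang target that sits on the other kind of host. The sketch assumes the separate mobile site lives on an `m.` subdomain, which is a convention chosen for this example rather than a requirement.

```python
from urllib.parse import urlsplit


def is_mobile_host(url):
    # Assumption: the separate mobile site is served from an "m." subdomain.
    return (urlsplit(url).hostname or "").startswith("m.")


def cross_config_links(page_url, alternates):
    """Return hreflang targets that leave the page's own mobile/desktop set."""
    page_is_mobile = is_mobile_host(page_url)
    return [
        target
        for target in alternates.values()
        if is_mobile_host(target) != page_is_mobile
    ]


# A mobile UK page whose US alternate wrongly points at the desktop site.
print(cross_config_links(
    "https://m.example.com/uk/",
    {
        "en-GB": "https://m.example.com/uk/",
        "en-US": "https://www.example.com/us/",
    },
))
```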

Language and country codes

Hreflang tags are made up of one or two codes in combination: the language code and the country code, e.g. ‘en-GB’. The language always needs to be specified as the foundational element. You can target just by language, but can also refine this by adding a country. However, you can’t just target a country with hreflang.

There are simple solutions out there for finding the right codes you need. Aleyda Solis’ Hreflang Tag Generator Tool is a must use, which automatically shows you the code combination you need after you input the language and country you want to target. Another tool for managing hreflang annotations at scale is Bill Hunt’s Hreflang Builder tool.

Make sure you double check Google’s official specifications on hreflang codes before implementing them. There are many instances of people mistakenly using ‘uk’ instead of ‘gb’, or using ‘eu’ to target the whole of Europe, for example. Neither is a supported country code.

Hreflang accepts the ISO 639-1 language codes and the ISO 3166-1 Alpha 2 format country codes, so make sure to check the official documentation to get your codes right from the outset. If the language you want to target has different script variations (such as Chinese which has a traditional and simplified version), then you can use ISO 15924 codes.

In cases where you want to set a default language version for pages not explicitly targeted with a language or country code, you can use the x-default hreflang attribute. This works well in the instance of a homepage with a banner for choosing a language.

The x-default hreflang attribute value signals to our algorithms that this page doesn’t target any specific language or locale and is the default page when no other page is better suited.

Google Webmaster Central Blog
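The formatting rules in this section can be sketched as a simple shape check. It only tests the structure of a value (language first, optional script, optional country, dashes as separators) plus the two slips mentioned above. It does not look codes up in the ISO lists, so a reversed pair with a valid shape would still pass. The names used here are illustrative.

```python
import re

# language (2 letters), optional script (4 letters), optional country (2 letters)
HREFLANG_SHAPE = re.compile(
    r"(?P<lang>[a-z]{2})(-(?P<script>[a-z]{4}))?(-(?P<country>[a-z]{2}))?",
    re.IGNORECASE,
)

# Slips called out in this guide; not an exhaustive list.
UNSUPPORTED_COUNTRIES = {
    "uk": "use 'gb' for the United Kingdom",
    "eu": "whole regions such as Europe cannot be targeted",
}


def check_hreflang_value(value):
    """Return a list of problems found in one hreflang value (empty if none)."""
    if value.lower() == "x-default":
        return []  # special fallback value, no language or country
    if "_" in value:
        return ["use a dash, not an underscore"]
    match = HREFLANG_SHAPE.fullmatch(value)
    if match is None:
        return ["expected language, then optional script and country, e.g. en-GB"]
    country = (match.group("country") or "").lower()
    if country in UNSUPPORTED_COUNTRIES:
        return [UNSUPPORTED_COUNTRIES[country]]
    return []


for value in ("en-GB", "en_GB", "en-uk", "zh-Hant", "x-default"):
    print(value, check_hreflang_value(value))
```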

How to test your hreflang configuration

Generating hreflang tags is one thing, but testing them is a different story, especially when dealing with legacy tags.

These are the main errors you’ll need to watch out for when testing the pages within your hreflang configuration:

  • The page isn’t indexable
  • The page isn’t a canonical URL
  • The page is returning an error code
  • The page is redirecting
  • The page is blocked
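If you already have crawl data for each hreflang target, the errors listed above can be mapped onto a simple check. The field names (`status`, `canonical`, `noindex`, `blocked_by_robots`) are hypothetical and stand in for whatever your crawler exports.

```python
def target_issues(page):
    """Return the problems found for one hreflang target.

    page is a dict of crawl facts; the field names are assumptions.
    """
    issues = []
    status = page["status"]
    if 300 <= status < 400:
        issues.append("redirecting")
    elif status >= 400:
        issues.append("error status")
    # A canonical tag pointing elsewhere means this is not the canonical URL.
    if page.get("canonical") and page["canonical"] != page["url"]:
        issues.append("not canonical")
    if page.get("noindex"):
        issues.append("not indexable")
    if page.get("blocked_by_robots"):
        issues.append("blocked")
    return issues


print(target_issues({
    "url": "https://example.com/de/",
    "status": 200,
    "canonical": "https://example.com/",
    "noindex": True,
}))
```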

Luckily, these are elements that Lumar reports on. Our crawling tool helps you analyse pages without hreflang tags, the different hreflang combinations for each page, broken hreflang links, hreflang links pointing to non-indexable pages, and more.

Lumar's hreflang report

Lumar's hreflang changes report

A table of hreflang variations per page in Lumar

If you want more insights into your hreflang configuration, take Lumar for a spin and see what you can discover for your international SEO auditing.

While hreflang is undoubtedly a useful method in international SEO, be mindful that it shouldn’t be the defining element of your strategy. It should be used as one of many correctly implemented signals to point search engines to your different language variations.

Some people get obsessed with hreflang annotations, overlooking that they’re part of a higher number of signals that Google uses to correctly identify a page target, such as unique, better-targeted and localised content, more links from local sites, etc.

Trying to tag every single page for sites with millions of URLs is very time-consuming, when you could start by prioritising the pages that have a higher risk of ranking in non-relevant SERPs: those that share the same language, for example. A good way to identify this is by going to your GSC or GA account and seeing which countries shouldn’t be served by each web version, and which pages are attracting non-relevant rankings and visits and see if you already offer that same content in another relevant website version.

Aleyda Solis, International SEO Consultant at Orainti
