hreflang: The HTML attribute for multilingual websites
Anyone who wants to internationalize their website and offer it in several languages faces a number of tasks: translating all relevant content and adapting the offer to the respective market, finding a suitable (possibly country-specific) solution for the individual domains and domain names, and setting up a server structure that keeps loading times consistent. On top of that, search engine optimization for multiple language versions is a challenging but necessary task. Search engine crawlers can only categorize the different language versions of a website correctly, and direct the appropriate audience to them, if language- or country-specific SEO measures are in place.
One of the most effective SEO tools for this purpose is the hreflang attribute of the link element.
- What is hreflang?
- Structure of the link element with the hreflang attribute
- What is behind hreflang="x-default"?
- How does hreflang implementation work?
- Frequent errors when embedding the hreflang tag
- hreflang not being implemented at the URL level
- One or more of the language versions do not refer to themselves
- Incorrect ISO codes
- Forgoing the hreflang x-default
- Reference to old or non-existent URLs
- Contradictory use of the canonical and hreflang tag
- Settings in Google Webmaster Tools send conflicting signals
- Useful hreflang tools
- Why it’s worth using hreflang
What is hreflang?
In December 2011, Google introduced hreflang as a simple and effective way to inform search engines about the relationships between alternative versions of a website. The attribute signals to search engine crawlers that the current page’s content is available in other language versions. To do this, the attribute is set within a link element in the HTML document, together with the respective language code. For example, to mark up a Spanish page, you use the value ‘es’. The complete element looks as follows:
<link rel="alternate" hreflang="language code-country code" href="website URL" />
If this element is integrated into an English-language page, for example, the search engine can recognize that a Spanish version exists and show that version in the search results of Spanish-speaking users.
hreflang can also be used to distinguish variants of a single language. In this case, the attribute is simply extended by the target region. For the previously mentioned Spanish variant, it is possible to divide users into regional categories such as Spain (hreflang="es-ES") and Mexico (hreflang="es-MX"). The possible language and region codes are defined in the ISO 639-1 and ISO 3166-1 standards.
hreflang is not a redirect, and it can be overridden by other signals. The other areas of international search engine optimization should therefore also be covered adequately, so that the wrong language version does not appear in users’ search results.
Structure of the link element with the hreflang attribute
The link element can be implemented in three ways. Most often, it is integrated into the head area of the respective HTML document. For documents that are not in HTML format (e.g. a PDF file), the element can be sent in the HTTP header instead. Finally, there is the option of integrating the language- or country-specific annotation into the sitemap. The structure of the link element, which in general serves to describe relationships between documents, has already been shown briefly in the above example for a Spanish language version. To clarify the general structure, here is a generic form of the code:
<link rel="alternate" hreflang="language code-country code" href="URL of the website" />
The <link /> element is an empty element whose only purpose is to carry the corresponding attributes. It can only be used in the head area, but as many times as necessary. To link the different language versions with hreflang, the two attributes rel and href are also required. The three components have the following functions:
- rel: rel is a compulsory attribute that specifies the relationship between the underlying document and the linked document. The value ‘alternate’ tells the search engine that the external document contains an alternative version of the website.
- hreflang: hreflang itself describes the language that the linked document is written in, and can optionally indicate the country it is particularly relevant to. The values for language and region must conform to ISO standards 639-1 and 3166-1. If both are specified, they are separated by a hyphen. A clear listing of all possible combinations can be found at lingoes.net.
- href: The href attribute specifies where the alternative language version can be found by giving the absolute URL of the linked document.
The optional region code is usually written in uppercase letters. However, Google also accepts lowercase, so there are no strict rules on capitalization.
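As a rough illustration of the structure described above, the following Python sketch assembles a link element from a language code, an optional region code, and a URL. The function names hreflang_value and link_element are hypothetical helpers chosen for this example, and the /mx/ URL is invented; writing the region in uppercase is only the usual convention, not a requirement.

```python
from html import escape
from typing import Optional


def hreflang_value(language: str, region: Optional[str] = None) -> str:
    """Join a language code and an optional region code with a hyphen."""
    value = language.strip().lower()
    if region:
        # Uppercase region is a convention only; lowercase is accepted as well.
        value += "-" + region.strip().upper()
    return value


def link_element(url: str, language: str, region: Optional[str] = None) -> str:
    """Return one link element for the head area of an HTML document."""
    return '<link rel="alternate" hreflang="{}" href="{}" />'.format(
        hreflang_value(language, region), escape(url, quote=True)
    )


print(link_element("http://exampledomain.de/es/", "es"))
# Hypothetical URL for a Mexican variant, used only for illustration.
print(link_element("http://exampledomain.de/mx/", "ES", "mx"))
```

The URL is passed through html.escape so that characters such as & in query strings do not break the attribute value.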
What is behind hreflang="x-default"?
The main purpose of using hreflang is to guide users to a linguistically appropriate version of your website. However, even with a large selection of languages, some users cannot be assigned to any of the available variants on the basis of their language or country. If these users find your website via a search engine, the ranking decides which of the language variants is displayed. In this case, you may lose potential readers after just a few seconds when they land on a version in the wrong language. With ‘x-default’, Google provides an option you can use to defuse this situation.
This value, which you set in place of an ISO code, tells the search engine that the linked URL is the default for all users for whom no explicitly named language version matches. The default page could, for example, present a language overview from which visitors select their language themselves. To do this, add the following line alongside the other hreflang link elements:
<link rel="alternate" hreflang="x-default" href="URL of the default site" />
How does hreflang implementation work?
To understand exactly how hreflang works, one property of the attribute is of central importance: hreflang links two or more documents bidirectionally, not unidirectionally as a redirect does. It is therefore not sufficient for, say, a German-language page to contain an hreflang link to the Spanish variant if the Spanish page does not also refer back to the German one. The search engine can only recognize the structure of your website and adjust users’ search results accordingly if the hreflang annotations are set in all documents and point to all available language versions.
The following example code for the website exampledomain.de with the language versions German, Italian, Spanish, and English would have to be entered into the head of all four HTML documents:
<link rel="alternate" href="http://exampledomain.de/" hreflang="de" />
<link rel="alternate" href="http://exampledomain.de/it/" hreflang="it" />
<link rel="alternate" href="http://exampledomain.de/es/" hreflang="es" />
<link rel="alternate" href="http://exampledomain.de/en/" hreflang="en" />
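Because every language version needs the identical set of link elements, the block can be generated once and reused. The following is a minimal Python sketch of that idea, assuming the four URLs from the example above are kept in a dictionary; the names ALTERNATES and hreflang_block are hypothetical.

```python
from html import escape

# Language code -> absolute URL, copied from the example above.
ALTERNATES = {
    "de": "http://exampledomain.de/",
    "it": "http://exampledomain.de/it/",
    "es": "http://exampledomain.de/es/",
    "en": "http://exampledomain.de/en/",
}


def hreflang_block(alternates: dict) -> str:
    """Return one link element per language version, one per line."""
    return "\n".join(
        '<link rel="alternate" href="{}" hreflang="{}" />'.format(
            escape(url, quote=True), code
        )
        for code, url in alternates.items()
    )


# The same block goes into the head of every language version,
# so each page references itself and all of its alternatives.
head_by_page = {url: hreflang_block(ALTERNATES) for url in ALTERNATES.values()}

print(head_by_page["http://exampledomain.de/es/"])
```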
Integrating the code into an HTTP header works, in principle, according to the same pattern, because here, too, all existing language versions are referenced. There are only a few minor differences in the syntax. For example, if you offer manuals in PDF format for the four previously listed languages and would like to inform the search engines of this, the code would look as follows:
Link: <http://exampledomain.de/downloads/manuals.pdf/>; rel="alternate"; hreflang="de"
Link: <http://exampledomain.de/it/downloads/manuals.pdf/>; rel="alternate"; hreflang="it"
Link: <http://exampledomain.de/es/downloads/manuals.pdf/>; rel="alternate"; hreflang="es"
Link: <http://exampledomain.de/en/downloads/manuals.pdf/>; rel="alternate"; hreflang="en"
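The header variant follows the same pattern as the HTML variant, only with a different syntax. The sketch below builds the header lines in Python under the assumption that the PDF URLs from the example above are stored in a dictionary; PDF_ALTERNATES and link_header_lines are hypothetical names, and how the lines are attached to a response depends on the web server or framework in use.

```python
# Language code -> URL of the PDF, copied from the example above.
PDF_ALTERNATES = {
    "de": "http://exampledomain.de/downloads/manuals.pdf/",
    "it": "http://exampledomain.de/it/downloads/manuals.pdf/",
    "es": "http://exampledomain.de/es/downloads/manuals.pdf/",
    "en": "http://exampledomain.de/en/downloads/manuals.pdf/",
}


def link_header_lines(alternates: dict) -> list:
    """Return one Link header line per language version."""
    return [
        'Link: <{}>; rel="alternate"; hreflang="{}"'.format(url, code)
        for code, url in alternates.items()
    ]


# Every language version of the PDF is served with all four lines.
for line in link_header_lines(PDF_ALTERNATES):
    print(line)
```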
Declaring the hreflang attribute in an XML sitemap is a particularly sensible alternative to annotating each individual page if you run a large, international web project. If you offer multilingual content on a large scale, the effort involved in the usual HTML implementation would be relatively high. In a sitemap, all language versions must likewise be listed individually with their respective URLs. Each URL entry is supplemented by xhtml:link elements that reference all available variants, including the page itself:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://exampledomain.de/</loc>
    <xhtml:link rel="alternate" hreflang="de" href="http://exampledomain.de/" />
    <xhtml:link rel="alternate" hreflang="it" href="http://exampledomain.de/it/" />
    <xhtml:link rel="alternate" hreflang="es" href="http://exampledomain.de/es/" />
    <xhtml:link rel="alternate" hreflang="en" href="http://exampledomain.de/en/" />
  </url>
  <url>
    <loc>http://exampledomain.de/it/</loc>
    <xhtml:link rel="alternate" hreflang="it" href="http://exampledomain.de/it/" />
    <xhtml:link rel="alternate" hreflang="de" href="http://exampledomain.de/" />
    <xhtml:link rel="alternate" hreflang="es" href="http://exampledomain.de/es/" />
    <xhtml:link rel="alternate" hreflang="en" href="http://exampledomain.de/en/" />
  </url>
  <url>
    <loc>http://exampledomain.de/es/</loc>
    <xhtml:link rel="alternate" hreflang="es" href="http://exampledomain.de/es/" />
    <xhtml:link rel="alternate" hreflang="de" href="http://exampledomain.de/" />
    <xhtml:link rel="alternate" hreflang="it" href="http://exampledomain.de/it/" />
    <xhtml:link rel="alternate" hreflang="en" href="http://exampledomain.de/en/" />
  </url>
  <url>
    <loc>http://exampledomain.de/en/</loc>
    <xhtml:link rel="alternate" hreflang="en" href="http://exampledomain.de/en/" />
    <xhtml:link rel="alternate" hreflang="de" href="http://exampledomain.de/" />
    <xhtml:link rel="alternate" hreflang="es" href="http://exampledomain.de/es/" />
    <xhtml:link rel="alternate" hreflang="it" href="http://exampledomain.de/it/" />
  </url>
</urlset>
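Since the sitemap repeats the same set of references for every URL, it lends itself to being generated rather than written by hand. The following Python sketch uses the standard library module xml.etree.ElementTree to do so; it assumes the four example URLs are held in a dictionary, and the names ALTERNATES and build_sitemap are hypothetical. The output is equivalent in content to the sitemap above, although the serializer emits it without line breaks and without the XML declaration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Language code -> absolute URL, copied from the example above.
ALTERNATES = {
    "de": "http://exampledomain.de/",
    "it": "http://exampledomain.de/it/",
    "es": "http://exampledomain.de/es/",
    "en": "http://exampledomain.de/en/",
}


def build_sitemap(alternates: dict) -> str:
    """Return a sitemap in which every URL lists all language versions."""
    # Serialize the sitemap namespace as default and XHTML with the xhtml: prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for page_url in alternates.values():
        url_el = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url_el, "{%s}loc" % SITEMAP_NS).text = page_url
        # Each entry references every variant, including the page itself.
        for code, alt_url in alternates.items():
            ET.SubElement(
                url_el,
                "{%s}link" % XHTML_NS,
                {"rel": "alternate", "hreflang": code, "href": alt_url},
            )
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap(ALTERNATES))
```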
Frequent errors when embedding the hreflang tag
The hreflang examples for HTML listed above show that implementing a practical, automatic allocation of users for a multilingual website is by no means witchcraft. However, they also show the effort involved and the potential for error, which should not be underestimated, that comes with the numerous cross-references. The resulting errors, small or large, sometimes affect only individual pages, but they can also endanger the functionality of hreflang for the entire website. For this reason, we have summarized some of the most common sources of error:
hreflang not being implemented at the URL level
Since an hreflang annotation always applies to one specific URL, it must be created at that level. In other words, if you only mark the home pages of your different language versions with the hreflang attribute, the automatic user assignment will work only for these start pages and not for the entire website. It is therefore your task to implement the link elements individually for all multilingual URLs or, alternatively, to work with the sitemap variant described above.
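To make the URL-level requirement concrete, the following Python sketch derives the alternates for an individual sub-page instead of only for the start pages. It rests on an assumption that does not hold for every site: that each language version mirrors the same relative paths below its root. LANGUAGE_ROOTS, alternates_for_path, and the contact/ path are hypothetical.

```python
# Root URL of each language version, copied from the earlier example.
LANGUAGE_ROOTS = {
    "de": "http://exampledomain.de/",
    "it": "http://exampledomain.de/it/",
    "es": "http://exampledomain.de/es/",
    "en": "http://exampledomain.de/en/",
}


def alternates_for_path(path: str) -> dict:
    """Map each language code to the URL of the given sub-page.

    Assumption: the same relative path exists in every language version.
    """
    relative = path.lstrip("/")
    return {code: root + relative for code, root in LANGUAGE_ROOTS.items()}


# Hypothetical sub-page; each language version gets its own set of URLs.
for code, url in alternates_for_path("contact/").items():
    print(code, url)
```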
One or more of the language versions do not refer to themselves
Many site owners correctly annotate the URLs of the other language versions with the hreflang attribute but forget that the page must also refer to itself. In this case, the linking structure is incomplete and cannot be interpreted by Google and other search engines.
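A simple consistency check can catch both a missing self-reference and a missing return link. The Python sketch below assumes the annotations have already been collected into a dictionary that maps each page URL to the hreflang links found on that page; the function name find_hreflang_problems and the sample data are hypothetical.

```python
DE = "http://exampledomain.de/"
ES = "http://exampledomain.de/es/"


def find_hreflang_problems(annotations: dict) -> list:
    """Report pages without a self-reference and targets without a return link.

    annotations maps a page URL to {hreflang code: alternate URL} as found
    on that page.
    """
    problems = []
    for page, alternates in annotations.items():
        targets = set(alternates.values())
        if page not in targets:
            problems.append(page + " does not reference itself")
        for target in sorted(targets - {page}):
            # The target page must list this page among its own alternates.
            if page not in annotations.get(target, {}).values():
                problems.append(target + " has no return link to " + page)
    return problems


# Sample data: the Spanish page links to the German one but not to itself.
sample = {
    DE: {"de": DE, "es": ES},
    ES: {"de": DE},
}
print(find_hreflang_problems(sample))
```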
Incorrect ISO codes
When it comes to country and language codes, many SEOs tend to get creative. However, this often fails and leads to faulty hreflang code. The combination ‘en-uk’ appears to be a good choice if the content of the page is explicitly addressed to the UK public; however, the correct code is actually ‘en-gb’. For Danish users, ‘dk-DK’ seems the obvious choice; however, while the ISO country code for Denmark is ‘DK’, the Danish language code defined by ISO 639-1 is ‘da’, which makes the correct value ‘da-DK’.
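Mix-ups of this kind can be caught with a small check before the code goes live. The Python sketch below is deliberately limited: it only verifies the two-letter language and optional two-letter region form described in this article and flags the two mix-ups named above. It does not look values up in the full ISO lists, and the names check_hreflang_value and KNOWN_MIXUPS are hypothetical.

```python
import re

# Two-letter language code, optionally followed by a hyphen and a
# two-letter region code. This is a format check only.
HREFLANG_FORMAT = re.compile(r"[a-z]{2}(-[a-z]{2})?")

# The two mix-ups named in the text; not a complete list.
KNOWN_MIXUPS = {"en-uk": "en-gb", "dk-dk": "da-DK"}


def check_hreflang_value(value: str) -> str:
    """Return a short verdict for a single hreflang value."""
    key = value.strip().lower()
    if key == "x-default":
        return "ok"
    if key in KNOWN_MIXUPS:
        return "use " + KNOWN_MIXUPS[key]
    if HREFLANG_FORMAT.fullmatch(key) is None:
        return "malformed"
    return "ok (format only)"


for candidate in ["en-uk", "dk-DK", "da-DK", "english", "x-default"]:
    print(candidate, "->", check_hreflang_value(candidate))
```

A value that passes the format check can still be wrong, so a full validation would additionally compare both parts against the ISO 639-1 and ISO 3166-1 lists.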
Forgoing the hreflang x-default
Whether you work with an initial language selection menu on the home page, redirects based on the IP address, or automatic forwarding based on the visitor’s browser language, there is in principle no reason to forgo a default page. As a kind of fallback, it helps you win over users whose language or country is not represented on your website. Google also recommends using this default version.
Reference to old or non-existent URLs
A common cause of faulty hreflang code is the use of URLs that do not exist, or no longer exist. This happens primarily when annotations are integrated automatically on all sub-pages, although not all sub-pages are available in every language variant offered. Webmasters often forget to build in the logic that ensures only URLs which actually exist are listed as alternative destinations. Outdated URLs, in turn, arise whenever you change the URL structure and forget to update the link elements accordingly.
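The missing logic mentioned above can be as small as a filter over the candidate URLs. The following Python sketch assumes you can obtain the set of URLs that are actually published, for example from your CMS; the names existing_alternates and published, as well as the pricing/ URLs, are hypothetical.

```python
def existing_alternates(candidates: dict, published_urls: set) -> dict:
    """Keep only the language versions whose URL is actually published."""
    return {
        code: url for code, url in candidates.items() if url in published_urls
    }


# Hypothetical sub-page that has not been translated into Italian.
candidates = {
    "de": "http://exampledomain.de/pricing/",
    "it": "http://exampledomain.de/it/pricing/",
    "en": "http://exampledomain.de/en/pricing/",
}
published = {
    "http://exampledomain.de/pricing/",
    "http://exampledomain.de/en/pricing/",
}

# Only the German and English versions remain as alternative destinations.
print(existing_alternates(candidates, published))
```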
Contradictory use of the canonical and hreflang tag
Many international websites use the canonical tag so that search engine crawlers do not index pages with the same content twice. Although this approach is a great way of working around the duplicate content issue, it conflicts with the hreflang attribute when the canonical tag points to a different page. If a page contains both tags, the search engines receive the following contradictory information:
- Ignore this page and use the following page instead (canonical tag).
- There are alternative pages, which may be better for the user. This page should still be captured and indexed as a possible option (hreflang tag).
On the one hand, this kind of URL refers to itself, but on the other hand it also refers to a different URL, which in turn contains a back-reference. As a result, the search engine ignores both signals and attempts to capture the structure in a different way. Therefore, the hreflang markup should only be used for pages that do not refer to a different page via the canonical attribute.
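The rule stated above can be expressed as a single condition. The Python sketch below assumes the page URL and its canonical URL are already known and compares them as plain strings, without any URL normalization; the function name may_carry_hreflang and the print/ URL are hypothetical.

```python
from typing import Optional


def may_carry_hreflang(page_url: str, canonical_url: Optional[str]) -> bool:
    """True if the page has no canonical tag or its canonical points to itself."""
    return canonical_url is None or canonical_url == page_url


ES = "http://exampledomain.de/es/"

print(may_carry_hreflang(ES, None))  # no canonical tag at all
print(may_carry_hreflang(ES, ES))    # canonical tag pointing to the page itself
# Hypothetical print view whose canonical tag points to the main Spanish page.
print(may_carry_hreflang("http://exampledomain.de/es/print/", ES))
```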
Settings in Google Webmaster Tools send conflicting signals
Anyone who uses Google’s Webmaster Tools (Search Console) can define the international targeting of a domain or URL, provided that a generic top-level domain is used. For country-specific top-level domains, Google does this automatically. As a website operator, you can certainly benefit from this feature: Google uses the information to categorize your site as accurately as possible. However, you should always take these settings into account in your other SEO measures. If you additionally annotate pages with the hreflang attribute, the two must not contradict each other. For example, it can easily happen that a page carries only a language code without a region, while a specific country has been assigned in Webmaster Tools.
Useful hreflang tools
It has already become clear that the integration of the hreflang tag into all relevant HTML pages does require a certain amount of effort, and a lot of code. The more complex your multi-lingual website is, the more likely mistakes will creep into the hreflang implementation, even if you are aware of the potential sources of the problem and keep them in mind the entire time. For this reason, it is recommended to use tools to create the tags and check their functionality at regular intervals. Some interesting options can be found in the following list:
- SISTRIX hreflang Generator: With the free hreflang generator from SISTRIX, you can easily create link elements with an hreflang attribute for the HTML head of your multilingual content. To do this, you simply specify the URLs and the desired country and language codes, and then generate the code by clicking on ‘Create Code’. It is also possible to define a default page.
- SISTRIX hreflang Validator: If you have implemented references for different language versions in your web project, the SISTRIX hreflang validator may be useful to you. The free web service checks whether the set hreflang tags are correct for a given URL.
- flang: The marketing company Dejan SEO provides a free option for verifying your hreflang elements. After a short waiting period, you receive the defined target languages and regions for all entered alternatives and, in the event of a problem, suggestions for fixing the URL in question.
- Google Search Console: Registering your site with Google makes it easier for the search engine to track, and also gives you access to various analysis tools for optimizing your web project. In the ‘International Targeting’ section, the dashboard provides information on the hreflang tags in use, including a list of missing return links.
Why it’s worth using hreflang
The main argument for using attributes such as canonical or hreflang is to avoid duplicate content in multilingual web projects. You are often trying to serve multiple markets without fundamentally changing the content, apart from the actual translation. For countries where the same language is spoken, the situation is even more complicated: the content differs only in a few details due to cultural or regional differences (vocabulary, currency, contact information, etc.) and is otherwise identical. Since the same domain is generally used, it is important to send search engines clear signals to prevent a negative classification.
While the canonical attribute declares one URL to be the dominant variant and excludes all alternative versions from indexing, the hreflang attribute signals which version should be presented in the search results to a specific target group (language and/or country). Because of this, the HTML attribute is always appropriate if you want to run an internationally successful website and make the appropriate multilingual content available to your visitors. Even if the attribute has no direct influence on search engine rankings, using it correctly will pay off in the long term, because both crawlers and users in different countries will have easier access to your website.
Not all search engines use the hreflang attribute. Bing captures the language version of a page by using the content-language attributes in the meta tags.