Rel=canonical: what is it, why do you need it and how to use it
21st June 2019
In our previous article we explained the importance of using hreflang tags for multilingual websites, together with the implications of implementing them incorrectly.
Hreflang tags are usually used in conjunction with rel=canonical link elements, another important piece of HTML markup.
What is a canonical URL and why do you need it?
A canonical URL is the URL of the page that Google thinks is most representative from a set of duplicate pages on your site.
If you have a page on your website which can be accessed via different URLs, or different pages with similar or even identical content (for example regional variants of the same content in the same language), Google sees these as duplicate versions of the same page.
Google will select one URL as the canonical (preferred) version and crawl and index it. The other URLs will be considered duplicate URLs and crawled less often.
The issue here isn’t just with the duplicated content. If you don’t tell Google which URL is your canonical URL (the preferred version), Google will make that choice for you. The page Google picks may not be the page you want to rank in organic search, or the page you want users to see. If search crawlers have to wade through too much duplicated content, they may miss some of your unique content. Additionally, websites with a high percentage of duplicated content may see their ranking signals diluted across the duplicates. This can be problematic for website owners and lead to loss of traffic and revenue.
Google selects the canonical page based on a number of factors, including whether the page is served over http or https; your declared preferred domain; page quality; presence of the URL in a sitemap; and any “rel=canonical” labelling. The best thing to do is not to let Google choose for itself but to tell it which pages are most important.
Examples of duplicate content include:
- Separate URLs for different device types, such as mobile and desktop versions of a page.
- Dynamic URLs with search parameters and session IDs.
- Blog posts filed under multiple categories, where your blog automatically generates a URL for each category.
- The same content served on both the http and https variants of a URL.
- Regional variants that serve essentially the same content in the same language on different URLs.
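To make the idea of duplicate URLs more concrete, here is a minimal Python sketch that reduces several URL variants of one page to a single form. The list of ignored parameters and the example.com URLs are assumptions made up for illustration; they are not rules that Google applies, and a real site would need its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def normalise_url(url: str) -> str:
    """Reduce a URL variant to one comparable form (illustrative rules only)."""
    parts = urlsplit(url)
    # Drop ignored parameters and sort the rest so their order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Treat http and https as the same page, prefer https, and drop fragments.
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(query), ""))


variants = [
    "http://example.com/blog/post?sessionid=abc123",
    "https://example.com/blog/post",
    "https://EXAMPLE.com/blog/post?utm_source=newsletter",
]
unique_pages = {normalise_url(u) for u in variants}
print(unique_pages)
```

All three variants collapse to one URL, which is the kind of grouping a canonical URL is meant to express to search engines.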
Why do you need canonical URLs?
There are many reasons why you should use canonical tags:
- To tell Google which URL you want users to see in search results.
- To consolidate link signals for similar or duplicated pages.
- To avoid wasting crawl budget on duplicate pages.
- To manage syndicated content.
How to use it: best practices
- Add self-referencing canonical tags to all of your unique landing pages. These will include your homepage, service pages and blog/news pages.
- Canonicalise your homepage to address any duplicates.
- Avoid mixed signals – don’t canonicalise pages to broken or redirecting URLs.
- Don’t add self-referencing canonical tags to duplicate pages; point them at the canonical URL instead.
- Ensure that each language version has a canonical link that points back to itself. It should be used in conjunction with hreflang directives.
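As a rough illustration of the last point, the Python sketch below builds the link tags for one language version of a page: a canonical link pointing back to the page itself, followed by hreflang annotations for every language version. The language codes, URLs and the head_links helper are hypothetical and only show the shape of the markup.

```python
from html import escape

# Hypothetical language versions of one page: hreflang code -> URL.
LANGUAGE_VERSIONS = {
    "en-gb": "https://example.com/en/services/",
    "de": "https://example.com/de/leistungen/",
    "fr": "https://example.com/fr/services/",
}


def head_links(current_lang: str, versions: dict[str, str]) -> list[str]:
    """Build the <link> tags for the <head> of one language version."""
    own_url = versions[current_lang]
    # Self-referencing canonical: the page names itself as the canonical URL.
    tags = [f'<link rel="canonical" href="{escape(own_url, quote=True)}">']
    # hreflang annotations list every language version, including this one.
    for lang, url in versions.items():
        tags.append(
            f'<link rel="alternate" hreflang="{lang}" href="{escape(url, quote=True)}">'
        )
    return tags


for tag in head_links("de", LANGUAGE_VERSIONS):
    print(tag)
```

Each language version would call the helper with its own language code, so the canonical link always points to the page it appears on while the hreflang set stays the same across versions.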
You can specify your canonical URL in various ways:
- You can specify your preferred domain in Google Search Console. For example, example.com rather than www.example.com. Use this only when you have two similar sites that differ only by subdomain. Don’t use this for http/https counterpart sites.
- Add a <link> tag in the code for all duplicate pages, pointing to the canonical page. This method means you can map an unlimited number of duplicate pages, but it can add to the size of the page and it can be difficult to maintain on large websites. It only works for HTML pages, not for files such as PDFs.
- Send a rel=canonical header in your page response. This method doesn’t increase page size and allows you to map an unlimited number of duplicate pages, but it can be complex to maintain on larger sites.
- You can specify your canonical pages in a sitemap, but it’s a less powerful signal to Googlebot than the rel=canonical mapping technique.
- Finally, you could use a 301 redirect to tell Googlebot that the redirect target is a better version than the original URL.
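The <link> tag, response header and redirect options above can be sketched in a few lines of Python. The function names and URLs here are hypothetical, and how the header or redirect is actually sent depends on your server or framework, so treat this as an outline of the formats rather than a ready-made implementation.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """<link> element to place in the <head> of an HTML duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """HTTP response header (name, value) carrying rel=canonical."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def permanent_redirect(target_url: str) -> tuple[int, dict[str, str]]:
    """Status code and headers for a 301 redirect to the preferred URL."""
    return (301, {"Location": target_url})


print(canonical_link_tag("https://example.com/services/"))
print(canonical_link_header("https://example.com/downloads/guide.pdf"))
print(permanent_redirect("https://example.com/services/"))
```

The header form is the one to reach for with non-HTML files, since there is no <head> to put a <link> tag in.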
Website localisation is not just about translation. In fact, your translation will be worthless if your customers can’t find you online.
Hreflang tags and canonical links help search engines understand your content and can boost your SEO, as long as they are implemented the right way.