Looking to learn what canonical tags are, and how to use them to avoid dreaded duplicate content issues?
Canonical tags are nothing new. They’ve been around since 2009, the best part of a decade.
Google, Microsoft and Yahoo united to create them. Their aim? To provide website owners with a way to solve duplicate content issues quickly and easily.
Do they work? Yes, perfectly… but only if you know how to use them!
In this guide, you’ll learn:
- What a canonical tag is
- What a canonical tag looks like
- Why canonical tags are important for SEO
- How to implement canonical tags
- How to avoid common canonicalization mistakes
- How to find and fix canonicalization issues
What is a canonical tag?
A canonical tag is a snippet of HTML code that defines the main version for duplicate, near-duplicate and similar pages. In other words, if you have the same or similar content available under different URLs, you can use canonical tags to specify which version is the main one and should therefore be indexed.
What does a canonical tag look like?
Canonical tags use simple and consistent syntax, and are placed within the <head> section of a web page:
<link rel="canonical" href="https://example.com/sample-page/" />
Here’s what each part of that code means in plain English:
- link rel="canonical": The link in this tag is the master (canonical) version of this page.
- href="https://example.com/sample-page/": The canonical version can be found at this URL.
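To see how a crawler or audit script might read this tag, here is a minimal Python sketch using only the standard library. The class name `CanonicalFinder` and the function `find_canonical` are hypothetical names chosen for illustration, and the logic assumes a tag only counts when it sits inside the <head> section, as described above. It is not how any particular search engine parses pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            # rel can hold several space-separated values, so split before checking
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html):
    """Return a list of canonical URLs declared in the page's <head>."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Illustrative page built around the sample tag shown earlier
page = """<html><head>
<title>Sample page</title>
<link rel="canonical" href="https://example.com/sample-page/" />
</head><body><p>Hello</p></body></html>"""

print(find_canonical(page))  # ['https://example.com/sample-page/']
```

Returning a list rather than a single value makes it easy to spot pages that declare no canonical tag at all, or more than one.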
Why are canonical tags important for SEO?
Google doesn’t like duplicate content. It makes it harder for them to choose:
- Which version of a page to index (they’ll only index one!)
- Which version of a page to rank for relevant queries.
- Whether they should consolidate “link equity” on one page, or split it between multiple versions.
Too much duplicate content can also affect your “crawl budget.” That means Google may end up wasting time crawling multiple versions of the same page instead of discovering other important content on your website.
Forcing Google to waste time crawling duplicate content is, of course, something that should be avoided if possible. However, Google states that it isn’t an issue for most sites.
“If new pages tend to be crawled the same day they’re published, crawl budget is not something webmasters need to focus on. Likewise, if a site has fewer than a few thousand URLs, most of the time it will be crawled efficiently.”
Canonical tags solve all these issues. They let you tell Google which version of a page they should index and rank, and where to consolidate any “link equity.”
Fail to specify a canonical URL, and Google will take matters into their own hands.
“If you don’t indicate a canonical URL, we’ll identify what we think is the best version or URL.”
Relying on Google like this isn’t a great idea. They may select a version of your page that you don’t really want to be canonical.
Google states that they usually respect the canonical URL you set, but not always.
“Note that even if you explicitly designate a canonical page, Google might choose a different canonical for various reasons, such as performance or content.”
Using canonical tag best practices will help mitigate the risk of Google seeing an undesirable version of the page as canonical.
But I don’t have duplicate content, do I?
Given that you probably haven’t been publishing the same posts and pages multiple times, it’s easy to assume that your website has no duplicate content.
But search engines crawl URLs, not web pages.
That means that they see example.com/product and example.com/product?color=red as unique pages, even though they’re the same web page with identical or similar content.
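As a rough illustration of the URL-versus-page distinction, the Python sketch below groups crawled URLs by the page they most likely point to. The function names `strip_query` and `group_by_page` are hypothetical, as is the `color=blue` URL. The sketch also assumes that query parameters never change a page's content, which does not hold for every site (pagination and search parameters often do change it).

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def group_by_page(urls):
    """Map each parameter-free URL to every crawled URL that reduces to it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_query(url)].append(url)
    return dict(groups)


# Assumed crawl output: three distinct URLs serve the same product page
crawled = [
    "https://example.com/product",
    "https://example.com/product?color=red",
    "https://example.com/product?color=blue",
    "https://example.com/about",
]

for page, variants in group_by_page(crawled).items():
    print(page, "<-", len(variants), "URL(s)")
```

Any group holding more than one URL is a candidate for a canonical tag pointing at the version you want indexed.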