Earlier this month, we explored how e-commerce and online catalog search engine rankings are affected by offsite duplicate content, which is content that is taken from one source and repeated on other domains. This article, in turn, focuses on correcting a content challenge you have more control over — onsite e-commerce duplicate content.
Onsite duplicate content — i.e., identical or nearly identical pages of content within the same website — makes it difficult for Google to decide which pages should rank for a particular keyword. Not letting Google know what is and isn’t duplicate content can cause all kinds of chaos for your search rankings. For example:
- Google may crawl your site less often because it takes more resources to crawl your duplicate content
- Your website may not rank as quickly when you add new products
- Google may choose to rank the wrong page
- Google may omit some of your pages from search results
The good news is: Onsite duplicate content is generally easier to fix than offsite duplication because onsite duplicate content happens between pages on your website. When it comes to e-commerce onsite duplicate content, it is often the result of your content management system generating identical product pages with different URLs.
Do You Have Duplicate Content?
There’s a free tool called Siteliner that scans your website for duplicate content. Go ahead and run your website through it. The test should only take a few seconds.
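To get a feel for what a duplicate-content scan looks for, here is a minimal sketch that groups pages whose text is identical after trivial cleanup. It is not how Siteliner works internally. The `pages` dictionary, its paths, and the function names are hypothetical placeholders, and the sketch only catches exact matches, not near-duplicates.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't hide duplicates."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_duplicate_groups(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose normalized body text is identical."""
    by_hash = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    # Only groups containing more than one URL count as duplicates.
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


# Hypothetical pages: the paths and text are made up for illustration.
pages = {
    "/outdoor-equipment/widget": "Blue widget.  Ships free.",
    "/bestsellers/widget": "blue widget. ships free.",
    "/about-us": "We sell outdoor equipment.",
}
duplicate_groups = find_duplicate_groups(pages)
```

In this example the two widget paths land in the same group because their text matches once case and spacing are ignored, while the about page stands alone.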
How You Got Duplicate Content
We love e-commerce content management systems. Platforms like Magento and WooCommerce make it easy for non-technical people to add new products and categories to their e-commerce websites. But, while content management systems are great for managing content, they are notorious for generating duplicate product categories.
Let’s pretend your online catalog includes the popular Husqvarna 455 chainsaw. You put the chainsaw in a category called “outdoor equipment,” but you also want it to show up in the categories “bestsellers” and “new products.” Your website would likely generate three different URLs for the same product:
All three URLs refer to pages with the same product with the same content. Now Google has to decide which one is going to show up when a user searches for Husqvarna 455.
How Does Google Address Duplicate Content?
One school of thought believes product categorization duplication isn’t a big deal — Google will just pick one page as primary and ignore the others. Problem solved. But your CMS could be creating two, three, even ten duplicates. All of a sudden your catalog of 10,000 products has 100,000 pages.
Google’s job is to index the pages on your website to determine which ones it will show in search results. Unless your website is Amazon.com or Grainger.com, Google is going to devote only so many resources to crawling your site. Every page containing duplicate content increases Google’s indexing burden as well as the risk that Google will not index all the pages on your website.
Fixing Onsite Duplicate Content
We’ve got two options for solving the problem of duplicate product categorization. The first, using one master URL for each product, is our preferred method and is commonly used by large e-commerce sites like HomeDepot.com.
The second option is to use “canonical” URLs, which is more of a workaround. Why is that? Well, when you introduce multiple navigation paths, each path will have its own URL. That means Google has to index two pages that include the same product and the same content. Many developers solve this by designating one page as the primary or “canonical” URL. Designating a canonical URL tells search engines, “Yes, I know there are multiple pages but I want you to use this one.”
Option 1 – Master URL
Let’s look at an example from HomeDepot.com, which we’ll use to illustrate two variations of our preferred solution. Pretend you’re willing to spend $3,000 on an oven and are considering the Samsung Chef Collection Electric Range. There are multiple paths to get to this product on HomeDepot.com:
- Home > Appliances > Cooking > Ranges > Electric Ranges
- Home > Appliances > Cooking > Ranges > Electric Ranges > Double Oven Electric Ranges
Option 1.1 – Master URL without Category Path
If you open the two Home Depot URLs, you will see that they have eliminated the category path at the product-level URL. No matter how you navigate to a product, it always shows the same URL without the parent categories. This eliminates the risk of duplicate content and prevents people from creating backlinks to multiple URL instances of the same product.
How does one create a single master URL? It varies by content management system, but Amasty has an excellent blog post that explains how to handle Magento Duplicate Content. Go to System > Catalog > Search Engine Optimization and change “Use Categories Path for Product URLs” to “No” and both “Canonical Link Meta Tag” fields to “Yes.”
All of your breadcrumbs, pagination, and filters will still carry over depending on how you navigated to the page, but the URL will be the same no matter which path you take.
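As a rough illustration of the idea, the sketch below collapses any category-path URL into one master URL by keeping only the final product slug. The `/p/` prefix, the example.com domain, and the function name are assumptions made for this example, not the scheme Home Depot or Magento actually uses.

```python
from urllib.parse import urlsplit, urlunsplit


def master_url(url: str, product_prefix: str = "/p/") -> str:
    """Drop the category path and keep only the final product slug."""
    parts = urlsplit(url)
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    # Assumption: query strings and fragments are dropped as well, so that
    # sort or filter parameters never produce a second URL for the product.
    return urlunsplit((parts.scheme, parts.netloc, product_prefix + slug, "", ""))


# Two hypothetical navigation paths to the same product.
via_category = master_url("https://www.example.com/appliances/cooking/ranges/electric-range-x")
via_bestsellers = master_url("https://www.example.com/bestsellers/electric-range-x/?sort=price")
```

Both calls return the same address, which is the point of the master URL approach: the navigation path can vary, but the product URL cannot.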
Option 1.2 – Master URL with Category Path
Let’s go back and look at the Home Depot URL again:
Notice that the URL does not provide any category reinforcements such as “appliances,” “cooking,” or “ranges.” Many SEO experts recommend keeping these category-level keywords in the URL to help search rankings and to give shoppers ready access to the parent categories so they can browse additional products.
To do so, designate the master category, or categories, for each product so that they can be incorporated into product-page URLs. You can still assign products to multiple categories and give visitors multiple navigation paths to get to those products, but once they arrive at a specific product page, the URL will always show the categories you designated as master categories for that product.
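One way to picture this is a product record that stores a single designated master category alongside any other categories it is listed in. The sketch below is a simplified model, not any particular CMS's data structure. The `Product` class, its field names, and the example.com domain are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    slug: str
    # The one category path that is allowed to appear in the product URL.
    master_category: list[str]
    # Extra categories used for navigation only; they never affect the URL.
    other_categories: list[list[str]] = field(default_factory=list)


def product_url(product: Product, base: str = "https://www.example.com") -> str:
    """Always build the URL from the master category, whatever path the shopper took."""
    return "/".join([base, *product.master_category, product.slug])


# Hypothetical product listed in three places but addressed by one URL.
oven = Product(
    slug="electric-range-x",
    master_category=["appliances", "cooking", "ranges"],
    other_categories=[["bestsellers"], ["new-products"]],
)
oven_url = product_url(oven)
```

Because `product_url` never reads `other_categories`, adding the product to more categories cannot create additional URLs, and the category keywords stay in the address.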
For all you Magento fans, there is a paid extension called Unique Product URL that allows you to set a master path for every product on your Magento site so that you can keep those category-level keywords.
Option 2 – Canonical URLs
Some content management systems don’t allow for master categories or single URLs for products. In those situations, you can either upgrade to a more robust content management system or apply canonical tags to your URLs.
Canonical tags work much the same way as creating a master category — i.e., the canonical tag designates one page as the official master URL, thus telling Google to ignore any non-canonical versions of a page with duplicate content.
Canonical tags are pretty easy to implement. Simply go into the code of every non-canonical page and add this tag inside the <head> element:
<link rel="canonical" href="http://www.yourwebsite.com/category/product/">
Now search engines know to ignore any non-canonical pages and forward those links to the official canonical version.
At least that’s the theory. The problem with this solution is that Google is under no obligation to follow your convention. If multiple URLs exist for a page, Google may or may not apply the canonical URL attribution to your non-canonical links.
All content being equal, a competitor that uses master categories pointing to a single URL will likely perform better with search engines than a site that relies on canonical tags to point multiple URL instances of the same page at one canonical page.
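If you do go the canonical route and your templates are generated by code, the tag described in this section can be produced programmatically rather than pasted by hand. The helper below is a minimal sketch. The function name is hypothetical, and how you inject its output into each page's head depends entirely on your CMS.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element for a page's <head>."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


tag = canonical_link_tag("http://www.yourwebsite.com/category/product/")
```

Every duplicate version of a product page would emit the same tag, each one pointing at the single URL you chose as canonical.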
More Ways to Optimize Your E-Commerce Site
No matter which option you choose to solve onsite e-commerce duplicate content, you should not let Google be the exclusive determinant of your website’s architecture. Solving duplicate content issues is just one of the many ways you can get your catalog to perform better with search engines.
For more comprehensive tips, download “Update Your Online Catalog Already – 7 Deadly Catalog Sins That Are Costing You Leads,” a free e-book that will help you optimize your e-commerce site.