When creating a site, or maintaining one that we already manage, we should devote some time to its analysis and proper optimization. This is the perfect moment to prevent errors or simply get rid of existing ones. Google’s search bot is not omniscient: it can deduce certain things, but it needs our help to understand the situation properly.

The most common mistakes

Duplicate content is one of the most frequent mistakes, appearing on both large and small websites. What is it all about? As the name suggests, it’s about repeated content, which may appear in several variants.

The subject of content quality has been discussed by Google’s staff, including Matt Cutts:


"If the vast majority or all of your content is the same content that appears everywhere else, and there’s nothing else to really distinguish it or to add value, that’s something I would try to avoid if you can."

Duplication in website structure

First of all: duplication of the homepage, which can be visited under different URLs. What we mean here is a situation when a site is available both at www.example.com and example.com, as well as at www.example.com/index.html or example.com/index.html. These are several DIFFERENT addresses, but one and the same website. Google will not infer on its own that our preferred, "original" URL is, for example, example.com; instead, the site may appear in search results as www.example.com/index.html.

Solution

If our site appears under several addresses, we must decide on a single one. All remaining URLs should be redirected to it. There are several ways to achieve that, but the easiest one is to set up a 301 redirect.

Status code 301 stands for "Moved Permanently" and informs a search bot that the site was moved to another address. Below you will find examples of code that should be added to the .htaccess file:

Redirects from the URL without www to a URL with www

RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
Redirects from the URL with www to a URL without www

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

Redirects from the address with index.html to the address without index.html

RewriteEngine On
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L,NC]

Kinds of redirects

In fact, there are several types of redirects. The aforementioned 301 redirect points the old address permanently at the new one. It also causes the old URL to be dropped from the index and transfers the majority of its ‘link power’ to the new address.

Let us look at the 302 redirect, which is a temporary solution. If we know that our URL will be "out of order" only for a limited time, this method is worth considering. By using it, we inform Google that the old URL does not work at the moment, but that everything will soon be back to normal. This kind of redirect does not remove the URL from the search engine’s index, which is useful when a site is moved temporarily.
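For illustration, a temporary 302 redirect in the .htaccess file might look like this (the paths below are only hypothetical examples):

Redirect 302 /sale http://www.example.com/sale-coming-soon

Once the original URL works again, the rule is simply removed and the old address keeps its place in the index.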

One should remember the difference between 301 and 302 redirects: if we are migrating to a new website and wish to eliminate duplicate content, we should use the permanent 301 redirect.

One must also be aware that an excessive number of redirects is counterproductive. Google follows three, at best four, consecutive redirects. This is worth keeping in mind when, for example, a portal is being migrated for the third time or more. In such a case, we should edit the previous address mappings and redirect the oldest addresses straight to the newest ones, in order to keep the number of redirection steps low.
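A minimal sketch in .htaccess, with hypothetical paths – instead of leaving a chain such as:

Redirect 301 /first-version http://www.example.com/second-version
Redirect 301 /second-version http://www.example.com/current-version

we point every old address directly at the newest one:

Redirect 301 /first-version http://www.example.com/current-version
Redirect 301 /second-version http://www.example.com/current-version

This way both users and robots reach the final address in a single step.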

Testing/production pages

It may happen that duplicate content arises when a given site exists in two versions: the official one and one built for testing. We sometimes forget to exclude such testing/temporary sites from indexing. If this happens, we will be dealing with yet another variant of duplicate content.

Solution

In such a case, we should exclude the testing pages from indexing with a piece of code in the <head> section. Simply use the proper meta robots tag:

<meta name="robots" content="noindex, nofollow">

Another method is to add code to the robots.txt file:

User-agent: *

Disallow: /


As we can see, this code consists of two lines. The first one specifies which robots the rules below apply to. If we place an asterisk there, the directive in the second line will apply to all robots.

The second line – Disallow – enables us to exclude an entire site from indexing. By using the "/" character after the colon, we state that the entire site should not be visited by robots. If we wish to revert this, we simply leave the value after the colon empty. There is one other way – we may limit access to the testing site to specified IP addresses or secure it with a password.
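For comparison, a robots.txt file that lets all robots crawl the whole site simply leaves the Disallow value empty:

User-agent: *
Disallow: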

Duplication of meta content

Google favors unique content, and this also pertains to meta content, which must be constructed properly. A small change in titles and descriptions may help us position our site better in the search engine, so why not try it?

Solution

For Google, page titles and their descriptions are among the most important elements of site structure, so we should fix any related errors reported in Google Search Console (formerly Webmaster Tools). First of all, let’s open the Search Appearance → HTML Improvements report and check whether our site has duplicate meta descriptions, or ones that are too long or too short. Let’s also make sure that each subpage has its own unique Title and Description. Thanks to this, Google will obtain precise information about the site’s content, and its users will reach our portal quicker.
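As a quick illustration (the wording here is only an example), each subpage’s <head> section could contain its own pair of tags:

<title>Skirts | Women's clothing | ABC store</title>
<meta name="description" content="Women's skirts in many colors and styles. Check the current offer at ABC store.">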

Pagination

Pagination stands for dividing content into subpages. This brings us to a problem: which pages should appear in search results?

Let’s imagine an online clothing store with different categories: women’s and men’s clothes, sweaters, pants etc. Entering a category – let’s say Skirts – we see a long list of products in different colors and styles, sometimes divided into several pages (1, 2, 3… 10). This is what pagination of listings is. From the SEO perspective, the first page is the most important one.

Potential problems with pagination

Pagination itself is NOT a problem – let’s remember that. What causes trouble is duplication of titles and descriptions, as well as of content, because an SEO-oriented piece of text may appear on every subpage. Such text is often the place where the SEO specialist puts keywords. Moreover, each page of a paginated product list should contain unique meta content, so that Google will not "complain" about duplication of these elements.

Solution

The solution to the confusion that pagination causes for Google’s robot is to use the link elements rel="next" and rel="prev", which describe the relation between subpages. The "next" element tells the robot which page comes next; "prev", naturally, which one precedes it. This way we help it identify the first page – i.e. the most important one.

Let’s look at an example published on Google’s official blog.

Here’s an article divided into 4 paginated pages:

On the first subpage, we add the following to the <head> section:

<link rel="next" href="http://www.example.com/article?story=abc&page=2" />

On the second one, two lines of code:

<link rel="prev" href="http://www.example.com/article?story=abc&page=1" />
<link rel="next" href="http://www.example.com/article?story=abc&page=3" />

We repeat it on the third one:

<link rel="prev" href="http://www.example.com/article?story=abc&page=2" />
<link rel="next" href="http://www.example.com/article?story=abc&page=4" />

And on the last subpage:

<link rel="prev" href="http://www.example.com/article?story=abc&page=3" />

Using this simple method we add hierarchy to all subpages within a category.

If we do not wish Google’s robot to index the remaining pages of a pagination, we should use the following meta tag on them:

<meta name="robots" content="noindex, follow">

As for duplication of meta content, one of the easiest methods is to add a page indicator to each Title, e.g.:

Skirts | Women’s clothing | Page 2 | ABC store

Let’s not forget to place similar information in descriptions of our subpages, e.g.:

<meta name="description" content="Shoes – Page 4 – The best offers in one place. Satisfaction guaranteed!" />


SEO-supporting text should be found only on the first page. It should be well-written and natural, as well as eye-catching from the marketing perspective. Its duplication on other pages will be perceived by Google as duplication of content within a site.

Duplication of product description

On the pages of various stores we may find identical data for similar products. Owners of such websites often copy it from manufacturers’ websites. On the one hand, it’s understandable – after all, it is difficult to rephrase the description of a blue cotton sweater.

Solution

We cannot change a product’s parameters, but we can add a unique and valuable description to each product. This is also a great way to improve visibility for long-tail keywords. Adding 500–700 characters of original text to content copied from a manufacturer’s site will minimize the effects of content duplication.

However, if changing descriptions is impossible, another solution is to allow users to voice their opinions. Thanks to product reviews posted by users, we obtain unique, free content.

It is also the most important step towards obtaining review stars in search results.

Naturally, in order to achieve this effect, we must also implement the appropriate structured data – but we will already have what is essential: the content.
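A minimal sketch of such structured data, using the schema.org vocabulary (the product name and rating values below are placeholders), could be added to the page as JSON-LD:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Blue cotton sweater",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "reviewCount": "17"
  }
}
</script>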

Tracking parameters in URLs

What is meant here are affiliate links, which add a tracking parameter to the URL address, e.g.:
www.example.com.pl/?partnerid-7653, where "?partnerid-7653" is the tracking parameter.

Creating such affiliate links may lead to content duplication if Google starts indexing these parameterized URLs.

Solution

One of the easiest solutions is to use the # character instead of ?. We should remember that everything that appears in a URL after the # character is ignored by Google during indexing, so the search engine will only index and display the address up to the # sign.
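For example, using the hypothetical parameter from above:

www.example.com.pl/?partnerid-7653 – the parameter creates a separate, indexable URL
www.example.com.pl/#partnerid-7653 – everything after # is ignored, so only www.example.com.pl/ gets indexed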

Duplication of content is one of the most frequent errors, appearing both within websites and between them. For Google to treat our site as valuable, we must make sure that it contains unique text. There are many ways to combat duplication, and all of them are quite simple. The biggest challenge lies in noticing the problem – this is why a proper audit of a website is so important.

Do you want to order an audit of your website? Do you need assistance in SEO? Contact us!
