How to prevent Google from crawling our product filter?

footsteps

Hi All,

We have a crawler problem on one of our sites, www.sneakerskoopjeonline.nl.

On this site, visitors can specify criteria to filter the available products. These filters are passed as HTTP GET parameters, so the number of possible filter URLs is virtually limitless.

To prevent duplicate content, or an insane number of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we have not found a way to tell crawlers (Google in particular) to skip these URLs.

We've already changed the on-page filter HTML to JavaScript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the JavaScript and crawls the generated URLs anyway.

What can we do to prevent Google from crawling all the filter options?

Thanks in advance for the help.

Kind regards,

Gerwin

footsteps

We have added the following to our robots.txt. Now let's wait and see the results:

User-agent: *
Disallow: /admin/
Disallow: /?
Allow: /?product_date=&product_date2=*
Disallow: /?product_date=&product_date2=&

To check that the robots.txt works, I found a handy website:

http://phpweby.com/services/robots

footsteps

The URL looks like this:

http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=

So just adding:

User-agent: *
Disallow: /*?product_brand

Should do the trick?
The most important thing is that herensneakers itself is still indexed, followed and crawled.
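
One way to sanity-check a rule like this before deploying it is a small script that mimics the wildcard matching Google documents for robots.txt paths (`*` matches any run of characters, a trailing `$` anchors the end). The sketch below is a simplified illustration, not Google's actual matcher: it only handles Disallow patterns and ignores Allow precedence. The helper names, the brand value `nike` and the `product_size` parameter are made up for the example.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern is treated as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path, disallow_patterns):
    """Return True if any Disallow pattern matches the path (plus query string)."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


rules = ["/*?product_brand"]

# The category page itself stays crawlable.
print(is_disallowed("/herensneakers", rules))
# A filter URL that starts its query string with product_brand is blocked.
print(is_disallowed("/herensneakers?product_brand=nike", rules))
# Hypothetical case: product_brand is not the first parameter.
print(is_disallowed("/herensneakers?product_size=42&product_brand=nike", rules))
```

Under these assumptions the first call reports the page as allowed and the second as blocked. The third is not blocked, because the pattern requires `?` directly before `product_brand`, so filter URLs where another parameter comes first would need an additional rule.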

alexhoug

I would use your robots.txt file to prevent Google from crawling those specific query strings and pages. In Google Webmaster Tools you can see all the information Google has on your site and any issues it has found, and you can also specify robots.txt information there. That would be the best route, as Google obeys what is in the robots.txt file. If you want more information about robots.txt, go here.
