Robots.txt in subfolders and hreflang issues

lauralou82

A client recently rolled out their UK business to the US. They decided to deploy with 2 WordPress installations:

UK site - https://www.clientname.com/uk/ - robots.txt location: https://www.clientname.com/uk/robots.txt
US site - https://www.clientname.com/us/ - robots.txt location: https://www.clientname.com/us/robots.txt

We've had various issues with /us/ pages being indexed in Google UK, and /uk/ pages being indexed in Google US.

They have the following hreflang tags across all pages:

We changed the x-default page to .com 2 weeks ago (we've tried both /uk/ and /us/ previously).

Search Console says there are no hreflang tags at all.

Additionally, each site has its own robots.txt file that links to the corresponding sitemap files. However, the robots.txt tester in Search Console shows the robots.txt file for https://www.clientname.com only, for both properties. This is even though navigating to https://www.clientname.com/robots.txt redirects you to either https://www.clientname.com/uk/robots.txt or https://www.clientname.com/us/robots.txt, depending on your location.

Any suggestions on how we can remove UK listings from Google US and vice versa?

Tom-Anthony

Hi there!

Ok, it is difficult to know all the ins and outs without looking at the site, but the immediate issue is that your robots.txt setup is incorrect. There should be one robots.txt file per subdomain, located at its root; crawlers do not look for robots.txt files inside sub-folders:

A **robots.txt** file is a file at the root of your site that indicates those parts of your site you don’t want accessed by search engine crawlers.

From Google's page here: https://support.google.com/webmasters/answer/6062608?hl=en
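To illustrate why the sub-folder files are never read, here is a minimal Python sketch of how the robots.txt location follows from a page URL: only the scheme and host are kept, and the path is always `/robots.txt`. The function name `robots_txt_url` and the example page paths are hypothetical, and this is a simplification of crawler behaviour rather than Google's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url.

    Only the scheme and host are kept; the path of the page is ignored,
    so a page under /uk/ or /us/ maps to the same root-level file.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical example pages on the two WordPress installations.
print(robots_txt_url("https://www.clientname.com/uk/some-page/"))
print(robots_txt_url("https://www.clientname.com/us/some-page/"))
```

Both calls resolve to https://www.clientname.com/robots.txt, which matches what the Search Console tester shows for each property.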

You shouldn't be blocking Google from either site, and attempting to do so may be why your hreflang directives are not being detected. You should move to a single robots.txt file located at https://www.clientname.com/robots.txt, with a link to a single sitemap index file. That sitemap index file should then link to each of your UK and US sitemap files.
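As a rough sketch of that layout, the Python below generates a root-level robots.txt that blocks nothing and points at one sitemap index, plus the sitemap index itself. The sitemap file names (`sitemap_index.xml`, `/uk/sitemap.xml`, `/us/sitemap.xml`) are assumptions for illustration; the real names depend on how each WordPress installation generates its sitemaps.

```python
from typing import Iterable
from xml.etree import ElementTree as ET

SITE_ROOT = "https://www.clientname.com"
# Hypothetical sitemap locations; the thread does not give the real file names.
SITEMAP_INDEX_URL = f"{SITE_ROOT}/sitemap_index.xml"
CHILD_SITEMAPS = [
    f"{SITE_ROOT}/uk/sitemap.xml",
    f"{SITE_ROOT}/us/sitemap.xml",
]


def build_robots_txt(sitemap_index_url: str) -> str:
    """Return robots.txt content that allows all crawling and lists one sitemap index."""
    lines = [
        "User-agent: *",
        "Disallow:",  # an empty Disallow value blocks nothing
        "",
        f"Sitemap: {sitemap_index_url}",
    ]
    return "\n".join(lines) + "\n"


def build_sitemap_index(child_sitemaps: Iterable[str]) -> str:
    """Return a sitemap index document with one <sitemap> entry per child sitemap."""
    root = ET.Element(
        "sitemapindex", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for url in child_sitemaps:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


print(build_robots_txt(SITEMAP_INDEX_URL))
print(build_sitemap_index(CHILD_SITEMAPS))
```

The robots.txt output would be served at https://www.clientname.com/robots.txt without any location-based redirect, so that Google sees the same file regardless of where it crawls from.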

You should ensure you have hreflang directives for every page. Hopefully after these changes you will see things start to get better. Good luck!
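For reference, here is a minimal Python sketch of the hreflang set each page would carry. The tags from the original question are not shown in the thread, so the language codes (`en-gb`, `en-us`), the helper name `hreflang_tags`, and the example path `pricing/` are all assumptions. The key point is that the UK and US versions of a page both output the same full set of tags, including a reference to themselves.

```python
from html import escape
from typing import List, Optional

SITE_ROOT = "https://www.clientname.com"
# Assumed mapping of hreflang codes to sub-folders; not taken from the thread.
LOCALES = {"en-gb": "/uk", "en-us": "/us"}


def hreflang_tags(page_path: str, x_default_url: Optional[str] = None) -> List[str]:
    """Build the <link rel="alternate"> tags for one page.

    page_path is the path relative to the locale folder, e.g. "pricing/".
    The same list should be output on every locale version of the page.
    """
    path = "/" + page_path.strip("/")
    if path != "/":
        path += "/"
    tags = []
    for code, folder in LOCALES.items():
        href = escape(f"{SITE_ROOT}{folder}{path}", quote=True)
        tags.append(f'<link rel="alternate" hreflang="{code}" href="{href}" />')
    if x_default_url is not None:
        href = escape(x_default_url, quote=True)
        tags.append(f'<link rel="alternate" hreflang="x-default" href="{href}" />')
    return tags


# Hypothetical page present on both sites, with x-default pointing at the .com root.
for tag in hreflang_tags("pricing/", x_default_url="https://www.clientname.com/"):
    print(tag)
```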
