Google tries to index non-existent language URLs. Why?

TheHecksler

Hi,

I am working for a SaaS client. They run two language versions on two separate subdomains:
de.domain.com/company for German and en.domain.com for English. Many thousands of URLs have been indexed correctly.

But according to Google Search Console, Google tries to index URLs that have never existed and still do not exist:

de.domain.com/en/company
en.domain.com/de/company

... and a thousand more with /en/ or /de/ in between. We have never used this variant, and requesting these URLs correctly shows a 404 page (but with the wrong response code; we're fixing that). Still, Google tries to index these kinds of URLs again and again, and I couldn't find any source for them. No website uses them as outgoing links, etc.
In our log files we do see that a Screaming Frog installation and moz.rainyclouds.online w opensiteexplorer tried to access them earlier.
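
One way to narrow down who requests these URLs is to group the matching log entries by user agent and referrer, since a non-empty referrer can point to the page that links to them. This is only a minimal sketch, assuming the server writes the common "combined" access log format; the function name `phantom_hits` and the sample lines are invented for illustration and are not taken from our real logs.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# host ident user [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Paths that start with a language folder the site never used.
PHANTOM_PREFIX = re.compile(r'^/(?:en|de)/')


def phantom_hits(lines):
    """Count requests to /en/... or /de/... paths per (user agent, referrer)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and PHANTOM_PREFIX.match(match.group("path")):
            hits[(match.group("agent"), match.group("referer"))] += 1
    return hits


# Invented sample lines, for illustration only.
SAMPLE = [
    '203.0.113.7 - - [10/Mar/2019:10:00:00 +0000] "GET /en/company HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.8 - - [10/Mar/2019:10:00:05 +0000] "GET /company HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Mar/2019:10:01:00 +0000] "GET /en/pricing HTTP/1.1" '
    '200 512 "https://example.org/old-page" "Screaming Frog SEO Spider/11.0"',
]

if __name__ == "__main__":
    for (agent, referer), count in phantom_hits(SAMPLE).most_common():
        print(count, agent, referer)
```

The status column is captured as well, so the same pattern could be extended to confirm whether these requests still get a 200 instead of a 404.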

My question: How does Google come up with these? Where did it get URLs that (to our knowledge) never existed?

Any ideas? Thanks

NickSamuel

Hi Hecksler,

Did you ever resolve this?

A quick idea from me is to double-check ALL versions of your website within Google Search Console. You can now register the entire domain property using DNS: https://searchengineland.com/how-to-set-up-google-search-console-domain-verification-for-site-wide-reporting-data-313256

I found that Google was trying to crawl a very old HTTP sitemap from about five years ago for one of my sites, and thus I was able to delete it.
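
If you do turn up an old sitemap like that, a quick scan for URLs with an unexpected language folder might look like the following. It is a rough sketch, assuming a standard sitemaps.org `urlset` file; the function name `unexpected_language_urls` and the sample sitemap are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def unexpected_language_urls(sitemap_xml, prefixes=("/en/", "/de/")):
    """Return sitemap URLs whose path starts with one of the given prefixes."""
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if urlparse(url).path.startswith(prefixes):
            flagged.append(url)
    return flagged


# Made-up sitemap content, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://de.domain.com/company</loc></url>
  <url><loc>http://de.domain.com/en/company</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in unexpected_language_urls(SAMPLE_SITEMAP):
        print(url)
```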

There are some mixed comments and feelings within the search community about whether or not Googlebot really "guesses" URLs, so it's more than likely they are getting the links from somewhere: https://stackoverflow.com/questions/20855082/googlebot-guesses-urls-how-to-avoid-handle-this-crawling

Look forward to hearing from you,

Nick
