Sitemaps: Best Practice

Fubra

What should and what shouldn't go in the sitemap?

In particular, what about pages like "subscribe to our newsletter" or "unsubscribe from our newsletter"? Is there really any benefit in highlighting those pages to the search engines?

Thanks for any advice or anecdotes.

sarahwalsh

Sometimes people think that adding a sitemap to their company website is very difficult.

For example, they may think they need a web designer to do it for them, yet often you can do it yourself. It's very simple.

If your business has a WordPress website, adding a sitemap can be a piece of cake.

If you use Yoast, which is a free plugin, you can add a sitemap to your website very easily and then submit that sitemap to Google Search Console for indexing.

We did this for a large garden room company in Bristol. It helps Google discover every single page and blog post, although a sitemap on its own does not guarantee that each one gets indexed.

effectdigital

Pages that I like to call 'core' site URLs should go in your sitemap. Basically, these are unique (canonical) pages which are not highly duplicate, and which Google would wish to rank.

I would include core addresses.

I wouldn't include uploaded documents, installers, archives, resources (images, JS modules, CSS sheets, SWF objects), pagination URLs or parameter-based children of canonical pages (e.g. example.com/some-page is OK to rank, but not example.com/some-page?tab=tab3). Parameters are the additional funky stuff added to URLs following "?" or "&".
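As a rough illustration of the exclusions above, here is a minimal Python sketch. The function name, the extension list and the `/page/N/` pagination pattern (a common WordPress convention) are assumptions for the example, not rules from any tool, so adjust them to your own site.

```python
from urllib.parse import urlsplit

# Hypothetical list of extensions treated as documents, installers,
# archives or resources; extend it to match your own site.
RESOURCE_EXTENSIONS = (
    ".pdf", ".doc", ".docx", ".zip", ".exe", ".dmg",
    ".jpg", ".jpeg", ".png", ".gif", ".svg", ".js", ".css", ".swf",
)


def is_sitemap_candidate(url: str) -> bool:
    """Return True if the URL looks like a 'core' page worth listing."""
    parts = urlsplit(url)

    # Anything after "?" is treated as a parameter-based child page.
    if parts.query:
        return False

    path = parts.path.lower()

    # Uploaded documents, installers, archives and page resources.
    if path.endswith(RESOURCE_EXTENSIONS):
        return False

    # Assumed pagination pattern: .../page/2/ (WordPress style).
    segments = [s for s in path.split("/") if s]
    if len(segments) >= 2 and segments[-2] == "page" and segments[-1].isdigit():
        return False

    return True
```

For example, `is_sitemap_candidate("https://example.com/some-page")` passes, while the same address with `?tab=tab3` appended is rejected. Sites that genuinely render canonical content via parameters would need a looser rule.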

There are exceptions to these rules: some sites use parameters to render their on-page content, even for canonical addresses. Those old architecture types are fast dying out, though. If you're on WordPress I would index categories, but not tags, which are non-hierarchical and messy (they really clutter up your SERPs).

Try crawling your site using Screaming Frog. Export all the URLs (or a large sample of them) into an Excel file. Filter the file, and see which types of addresses exist on your site and which technologies are being used. Feed Google the unique, high-value pages that you know it should be ranking.

I have said not to feed pagination URLs to Google, but that doesn't mean they should be completely de-indexed. I just think that XML sitemaps should be pretty lean and streamlined. You can allow things which aren't in your XML sitemap to have a chance of indexation, but if you have used something like a Meta no-index tag or a robots.txt edit to block access to a page, **do not** then feed it to Google in your XML. Try to keep **all** of your indexation modules in line with each other!

No page which points to another, separate address via a canonical tag (thus calling itself 'non-canonical') should be in your XML sitemap. No page that is blocked via Meta no-index or robots.txt should be in your XML sitemap either.
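Those two rules can be checked mechanically. The sketch below is only one way to do it, assuming you have already collected each page's canonical target and no-index status (for example from a crawl export). `PageSignals`, `build_robots` and `belongs_in_sitemap` are hypothetical names; the robots.txt check uses Python's standard `urllib.robotparser`.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.robotparser import RobotFileParser


@dataclass
class PageSignals:
    url: str
    canonical: Optional[str]  # href of the canonical tag, or None if absent
    noindex: bool             # True if the page carries a Meta no-index


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def belongs_in_sitemap(page: PageSignals, robots: RobotFileParser,
                       user_agent: str = "Googlebot") -> bool:
    # Rule 1: no-indexed pages stay out.
    if page.noindex:
        return False
    # Rule 2: pages pointing at a different canonical address stay out.
    # This is an exact string comparison; trailing slashes and letter
    # case are not normalised in this sketch.
    if page.canonical is not None and page.canonical != page.url:
        return False
    # Rule 3: pages blocked in robots.txt stay out.
    return robots.can_fetch(user_agent, page.url)
```

Running every sitemap URL through a check like this is a quick way to spot cases where the sitemap and the other indexation signals disagree.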

If you end up with too many pages, think about creating an XML sitemap index instead, which links through to other, separate sitemap files.
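To make the sitemap index idea concrete, here is a small Python sketch using the standard library. The sitemaps.org protocol caps a single sitemap file at 50,000 URLs, which is what the chunk size reflects; the function names and the file URLs you pass in are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls) -> str:
    """Build one child sitemap file listing the given page URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = page_url
    return ET.tostring(root, encoding="unicode")


def build_sitemap_index(sitemap_urls) -> str:
    """Build the index file that links through to each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = sitemap_url
    return ET.tostring(root, encoding="unicode")
```

You would write each `build_urlset` result to its own file, then submit only the index file in Google Search Console. When saving to disk, add an XML declaration and UTF-8 encoding as well.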

Hope that helps!

Fubra

To follow on from this, we have some parameter URLs in our sitemap which make me uneasy. Should url.com/blah.html?option=1 be in the sitemap? If so, what benefit is that giving us?
