XML Sitemaps and Competitive Intelligence

Graywolf has an interesting post about using sitemaps for competitive intelligence:

…people who are using automated solutions like popularity contest are telling you their most highly trafficked pages. Other people who are generating Sitemap XML files in a more manual fashion, are telling you the pages they want to rank. Chances are good the pages they want to rank for are the “money pages”.

It’s an interesting idea. If you are worried about someone doing this to your sites, you could try to hide your sitemaps by giving them unpredictable names and telling search engines where the sitemap index is located only through the ping technique, rather than listing it in robots.txt, which anyone can read.
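
As a rough sketch of that idea, the Python snippet below generates a hard-to-guess sitemap file name and builds a ping URL for it. The function names, the `sitemap` query parameter, and the `searchengine.example` endpoint are all placeholders of mine, not something from Graywolf's post; each search engine documents its own ping address, so check that before relying on this.

```python
import secrets
from urllib.parse import urlencode


def unpredictable_sitemap_name(prefix="sitemap", nbytes=16):
    """Return a file name such as 'sitemap-<32 hex chars>.xml'.

    secrets.token_hex draws from the OS random source, so the name
    cannot be guessed the way '/sitemap.xml' can.
    """
    return f"{prefix}-{secrets.token_hex(nbytes)}.xml"


def build_ping_url(ping_endpoint, sitemap_url):
    """Build the URL to request in order to announce a sitemap.

    ping_endpoint is an assumption: pass in whatever address the
    search engine documents. The 'sitemap' parameter name is also
    an assumption for illustration.
    """
    return f"{ping_endpoint}?{urlencode({'sitemap': sitemap_url})}"


if __name__ == "__main__":
    name = unpredictable_sitemap_name()
    sitemap_url = f"https://www.example.com/{name}"
    # Hypothetical endpoint; nothing is actually requested here.
    print(build_ping_url("https://searchengine.example/ping", sitemap_url))
```

The name only stays secret as long as nothing links to it and it is not listed in robots.txt, so this is obscurity rather than real protection.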

If you want to find your competitors’ sitemaps you could use search engine queries like this:

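If you want to make sure that search engines don’t index your sitemap files, you might be able to serve them with an X-Robots-Tag HTTP header telling Google and Yahoo not to index them. The engines can still fetch and use the sitemap; the header only asks them to keep the file itself out of their search results.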
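
Here is a minimal sketch of what that could look like, written as a small Python WSGI application that serves a sitemap with an `X-Robots-Tag: noindex` header. The application name and the sample sitemap content are made up for illustration; on a real site you would more likely set the header in your web server configuration.

```python
# Placeholder sitemap body, for illustration only.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
</urlset>
"""


def sitemap_app(environ, start_response):
    """Serve the sitemap and ask crawlers not to index the file itself."""
    body = SITEMAP_XML.encode("utf-8")
    headers = [
        ("Content-Type", "application/xml; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # The header discussed above: crawlers may fetch the file,
        # but are asked not to list it in search results.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Local test server; the port number is arbitrary.
    with make_server("127.0.0.1", 8000, sitemap_app) as server:
        server.serve_forever()
```

You can check the result with any tool that shows response headers; whether a given search engine honors the header for XML files is something to verify against its own documentation.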

EDIT: I removed the idea of blocking the sitemap with a robots.txt noindex directive, because it probably won’t work in this situation.

2 thoughts on “XML Sitemaps and Competitive Intelligence”

  1. Noindex: /sitemap.xml$
    would tell Googlebot NOT to fetch it. That’s experimental syntax and it doesn’t work as expected. Better make use of the X-Robots-Tag.
