MSN Live Search Only Has Partial Support for Wildcards in Robots.txt

Google and Yahoo both support the wildcard (*) and end-of-string ($) characters in robots.txt files.

MSN’s Live Search is more confusing because its support for wildcards in robots.txt files appears to be very limited. Based on its documentation, wildcards look like they are supported. These are valid robots.txt rules for MSN’s Live.com:

User-agent: msnbot
Disallow: /*.PDF$
Disallow: /*.jpeg$ 
Disallow: /*.exe$

However, “MSNdude” recently stated on WebmasterWorld that “Live Search does not support wildcards in robots.txt today; we are thinking about it.”

An asterisk that stands in for an arbitrary run of characters is a wildcard, so this statement conflicts with the rules above and is confusing.
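To make the terminology concrete, here is a minimal Python sketch of how a crawler might interpret the rules above. It assumes the semantics usually described for Google and Yahoo: `*` matches any run of characters, a trailing `$` anchors the end of the URL, matching is case-sensitive, and a rule without `$` is a prefix match. The function names are hypothetical, and this is not a description of how MSN's crawler actually works.

```python
import re

def robots_pattern_to_regex(pattern: str):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics (not taken from any MSN documentation):
    "*" matches any run of characters, a trailing "$" anchors the
    end of the URL, and everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then rejoin them with ".*" where "*" appeared.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path: str, disallow_patterns) -> bool:
    """Return True if any Disallow pattern matches the given URL path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)

# The three rules from the example above.
rules = ["/*.PDF$", "/*.jpeg$", "/*.exe$"]

print(is_disallowed("/docs/manual.PDF", rules))         # True
print(is_disallowed("/docs/manual.PDF?page=2", rules))  # False: "$" requires the URL to end in .PDF
print(is_disallowed("/index.html", rules))              # False
```

Under these assumptions, a rule such as `/*.PDF$` only makes sense if the crawler treats `*` as a wildcard, which is why the two statements are hard to reconcile.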

I think wildcards should be added to the robots.txt standard. They are essential for blocking certain kinds of dynamic URLs, which simple path prefixes cannot single out. The original robots.txt standard should be updated, and MSN should fully get on board.
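As a rough illustration of the limitation, the sketch below applies the original standard's matching rule, where a Disallow value blocks any URL that starts with it. The URLs and the `sessionid` parameter are made up for the example, and it assumes that a crawler without wildcard support treats `*` as an ordinary character.

```python
def blocked_by_prefix(path: str, disallow_prefixes) -> bool:
    """Original-standard matching: a rule blocks any path that starts with it.

    An empty Disallow value blocks nothing, so empty prefixes are skipped.
    """
    return any(path.startswith(prefix) for prefix in disallow_prefixes if prefix)

# Hypothetical dynamic URLs; the first two carry a session parameter.
urls = [
    "/products/view?id=7&sessionid=abc123",
    "/forum/thread?t=42&sessionid=def456",
    "/products/view?id=7",
]

# With "*" treated as a literal character, the rule matches none of the URLs.
print([blocked_by_prefix(u, ["/*sessionid="]) for u in urls])
# [False, False, False]

# Directory prefixes do block the session URLs, but they take the clean URL with them.
print([blocked_by_prefix(u, ["/products/", "/forum/"]) for u in urls])
# [True, True, True]
```

With prefix matching alone, there is no rule that blocks only the URLs containing `sessionid=` while leaving `/products/view?id=7` crawlable.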

One Comment

  1. Posted November 27, 2007 at 9:31 am | Permalink

    Well, aside from the fact that MSNdude hasn’t been the most credible source lately IMO (e.g., the reason he gives for the Live bot referrer spamming our sites), you also need to take into account that at times MSN completely ignores robots.txt. For instance, that same spamming bot is downloading AdSense, which as you can see is clearly blocked by Google:

    http://pagead2.googlesyndication.com/robots.txt

2 Trackbacks

  1. [...] That page also provides a great argument for updating the robots.txt standard to include wildcards (*) and end of line characters ($). It is not possible to block complex dynamic URLs without wildcards and end of line characters. MSN Live, for example, does not fully support wildcards in robots.txt. [...]

  2. By 6 Reasons Why Clean URLs Matter - Pocket SEO on March 22, 2008 at 10:30 am

    [...] doesn’t yet officially support wildcards, though Google, Yahoo, and MSN all do. [UPDATE: MSN doesn’t fully support wildcards in robots.txt] For example, Google could block the duplicate URLs above on YouTube with the following robots.txt [...]
