Robots.txt – Watching the Minor Details

I noticed that Google was indexing one of my blog’s feeds that I thought I had blocked with robots.txt:

[Screenshot: Google indexing RSS feeds]

The indexed URL is pocketseo.com/domains/7/feed. Notice that it doesn’t have a trailing slash.

My robots.txt rule is:

Disallow: /*/feed/

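WordPress URLs often have trailing slashes, which is why the rule ends in one. I don’t want to remove that trailing slash from the rule, because Disallow: /*/feed would then also block any URL below the top level with a path segment that merely starts with “feed”, such as a “feedback” page.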

It’s not a major problem, but here is a quick solution for that one indexed URL:

Disallow: /*/feed/
Disallow: /domains/7/feed
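
To sanity-check rules like these before uploading them, here is a minimal Python sketch of wildcard matching. It assumes Google-style semantics (the pattern matches from the start of the path, * matches any run of characters, and a trailing $ pins the end of the path). The helper names rule_matches and is_blocked, and the /blog/feedback/ path, are made up for illustration; this is not Google's actual matcher and it ignores Allow rules and user-agent groups.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches a URL path.

    Assumed semantics (Google-style): the pattern must match from the
    start of the path, `*` matches any run of characters, and a trailing
    `$` pins the match to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and put ".*" where each "*" was.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


def is_blocked(disallow_rules, path):
    """Return True if any Disallow rule matches the path."""
    return any(rule_matches(rule, path) for rule in disallow_rules)


original = ["/*/feed/"]
fixed = ["/*/feed/", "/domains/7/feed"]
too_broad = ["/*/feed"]  # the rule with its trailing slash removed

print(is_blocked(original, "/domains/7/feed"))    # False: the feed slips through
print(is_blocked(fixed, "/domains/7/feed"))       # True: the extra rule catches it
print(is_blocked(fixed, "/blog/feedback/"))       # False: unrelated page stays crawlable
print(is_blocked(too_broad, "/blog/feedback/"))   # True: over-blocking
```

The last line shows why dropping the trailing slash is risky: the shorter pattern also catches paths that only begin with “feed”.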

This matters little on a site like pocketseo.com, but on large sites robots.txt rules are important, and they can be tough to get right.
