I noticed that Google was indexing one of my blog’s feeds, which I thought I had blocked with robots.txt.

The indexed URL is pocketseo.com/domains/7/feed. Notice that it doesn’t have a trailing slash.
My robots.txt rule is:
Disallow: /*/feed/
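WordPress URLs often have trailing slashes, which is why the rule ends with one. Because the indexed URL has no trailing slash, the rule doesn’t match it. I don’t want to remove the trailing slash from the rule, though: Disallow: /*/feed would then block any URL with a path segment that starts with “feed”, for example a page at /blog/feedback/.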
It’s not a major problem, but here is a quick solution for that one indexed URL:
Disallow: /*/feed/
Disallow: /domains/7/feed
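To see why the original rule misses the URL, here is a minimal Python sketch of wildcard matching as Google documents it for robots.txt: a rule is a prefix match against the URL path, * matches any run of characters, and a trailing $ anchors the end of the URL. The function names rule_matches and is_blocked are my own, made up for illustration. This is not Google's actual implementation, and it ignores Allow rules and user-agent groups.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt Disallow pattern matches the URL path.

    Assumed semantics (Google-style): prefix match, "*" matches any
    sequence of characters, a trailing "$" anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives the prefix behaviour.
    return re.match(regex, path) is not None

def is_blocked(rules, path):
    """Return True if any Disallow pattern in rules matches the path."""
    return any(rule_matches(rule, path) for rule in rules)

original = ["/*/feed/"]
fixed = ["/*/feed/", "/domains/7/feed"]

print(is_blocked(original, "/domains/7/feed"))   # no trailing slash: not blocked
print(is_blocked(original, "/domains/7/feed/"))  # trailing slash: blocked
print(is_blocked(["/*/feed"], "/blog/feedback/"))  # slash removed: over-blocks
print(is_blocked(fixed, "/domains/7/feed"))      # explicit rule: blocked
```

Under the same assumptions, a rule such as /*/feed$ would match /domains/7/feed without also matching /blog/feedback/, but only for crawlers that support the $ anchor.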
This doesn’t matter much on a site like pocketseo.com, but robots.txt rules are important on large sites, and they can be tough to get right.

