Blocking directories in robots.txt with Apache

Dani

We have the valid URLs:

www.daniweb.com/foo
www.daniweb.com/foo/
www.daniweb.com/foo/1
www.daniweb.com/foo/2
www.daniweb.com/foo/3

If I want to disallow them all in robots.txt, are both of these valid and will they do the same thing?

Disallow: /foo
Disallow: /foo/

Will the latter also block the URL www.daniweb.com/foo, or will that be interpreted as a page underneath the root directory rather than within the foo directory? Conversely, will the former be interpreted as blocking only the single page and not the foo directory?

CimmerianX

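Using "Disallow: /foo/" would block the foo directory and everything in it, but not the bare URL www.daniweb.com/foo, because that path does not start with /foo/.

Without the trailing slash, Disallow is not limited to a single item. Rules are matched as path prefixes, so "Disallow: /foo" blocks every URL whose path starts with /foo: the page /foo, the directory /foo/ and everything in it, and also unrelated paths such as /foobar. To block all five of your URLs, use the form without the trailing slash.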

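You can also use "Disallow: /foo*/" to block any directory whose name begins with "foo", along with everything inside it. Note that the * wildcard was not part of the original robots.txt convention; Google and Bing honor it, but older or simpler crawlers may not.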
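To make the wildcard behavior concrete, here is a minimal sketch of how such a pattern could be matched against a URL path. It assumes the Google-style semantics (a * matches any run of characters, a trailing $ anchors the end, everything else is a prefix match). The function name rule_matches is made up for illustration and is not part of any crawler or library.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the given URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and otherwise the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, which gives prefix matching.
    return re.match(regex, path) is not None


for path in ("/foo", "/foo/", "/foobar/1", "/bar/foo/"):
    print(path, rule_matches("/foo*/", path))
```

Under these assumptions, /foo*/ matches /foo/ and /foobar/1 but not the bare /foo, since the pattern still requires a slash somewhere after the "foo" prefix.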

BTW, Google has some webmaster tools available that will test the robots.txt file and report on the results: http://www.google.com/webmasters/
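If you would rather check locally, Python's standard urllib.robotparser module can evaluate the two rules from the question against the example URLs. This is only a sketch: the helper name blocked_paths and the extra /foobar test path are my own additions, and this parser does plain prefix matching without wildcard support, so it may not agree with every crawler.

```python
from urllib.robotparser import RobotFileParser

# The URL paths from the question, plus /foobar as an extra (assumed) test case.
PATHS = ["/foo", "/foo/", "/foo/1", "/foobar"]


def blocked_paths(rule: str) -> list:
    """Return the entries of PATHS that a single Disallow rule blocks."""
    parser = RobotFileParser()
    # Feed the parser a two-line robots.txt that applies to every user agent.
    parser.parse(["User-agent: *", f"Disallow: {rule}"])
    return [
        path
        for path in PATHS
        if not parser.can_fetch("*", f"https://www.daniweb.com{path}")
    ]


for rule in ("/foo", "/foo/"):
    print(f"Disallow: {rule} blocks {blocked_paths(rule)}")
```

With this parser, "Disallow: /foo" blocks all four paths, while "Disallow: /foo/" blocks only /foo/ and /foo/1 and leaves /foo itself crawlable.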
