Blocking directories in robots.txt with Apache

We have the valid URLs:

www.daniweb.com/foo
www.daniweb.com/foo/
www.daniweb.com/foo/1
www.daniweb.com/foo/2
www.daniweb.com/foo/3

If I want to disallow them all in robots.txt, are both of these valid and will they do the same thing?

Disallow: /foo
Disallow: /foo/

Will the latter also block the URL www.daniweb.com/foo, or will that be interpreted as a page underneath the root directory rather than within the foo directory? Conversely, will the former be interpreted as blocking only the single page and not the foo directory?

Dani

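Using "Disallow: /foo/" would block the foo directory and everything in it, but not the URL /foo itself, because /foo does not start with /foo/.

Without the trailing slash, Disallow still works as a prefix match rather than naming a single file. "Disallow: /foo" blocks /foo, /foo/ and everything under it, so it covers all five of your URLs. The catch is that it also blocks anything else whose path starts with /foo, such as /foobar or /foo.html.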
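If you want to check the two rules locally, Python's standard urllib.robotparser applies plain prefix matching to Disallow paths. Here is a minimal sketch; blocked_urls is just a helper name I made up, the http:// scheme is assumed, and /foobar is an extra probe URL that is not in the question.

```python
from urllib.robotparser import RobotFileParser

# URLs from the question, plus /foobar as an extra probe (an assumption,
# not one of the original URLs) to show the prefix behaviour.
URLS = [
    "http://www.daniweb.com/foo",
    "http://www.daniweb.com/foo/",
    "http://www.daniweb.com/foo/1",
    "http://www.daniweb.com/foobar",
]


def blocked_urls(disallow_rule, urls=URLS):
    """Return the URLs a generic crawler may not fetch under one Disallow rule."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(["User-agent: *", f"Disallow: {disallow_rule}"])
    return [url for url in urls if not parser.can_fetch("*", url)]


if __name__ == "__main__":
    for rule in ("/foo", "/foo/"):
        print(rule, "blocks:", blocked_urls(rule))
```

With "/foo" every URL in the list comes back as blocked, including /foobar. With "/foo/" only the URLs that contain the trailing slash are blocked, so /foo and /foobar stay crawlable.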

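You can also use "Disallow: /foo*/" to block any directory whose name begins with "foo" (such as /foo/ or /foobar/). Note that the * wildcard is an extension to the original robots.txt convention: Google and other major crawlers honor it, but not every crawler does.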
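As far as I know, Python's urllib.robotparser does not interpret the * wildcard, so to see what a pattern like /foo*/ would match you need your own matcher. Below is a rough sketch that assumes Google-style semantics (* matches any run of characters, a trailing $ anchors the end of the path, everything else is a prefix match). rule_matches is a hypothetical helper, not part of any library, and other crawlers may behave differently.

```python
import re


def rule_matches(pattern, path):
    """Check a URL path against a Disallow pattern with '*' and '$' support."""
    # A trailing '$' anchors the pattern to the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives prefix semantics.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    for path in ("/foo", "/foo/", "/foo/1", "/foobar", "/foobar/page"):
        print(path, rule_matches("/foo*/", path))
```

Under these assumptions /foo*/ matches /foo/, /foo/1 and /foobar/page, but not /foo or /foobar, since neither of those contains a slash after the "foo" prefix.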

BTW, Google has some webmaster tools available that will test the robots.txt file and report on the results: http://www.google.com/webmasters/

CimmerianX

© 2013 DaniWeb® LLC