
Need mod_security rules to prevent Googlebot from crawling one file

I need mod_security rules to prevent Googlebot from indexing any file named browse.php anywhere on the server, while still allowing Googlebot to access anything else. I figured mod_security will do the trick because it can recognize user-agents and set rules accordingly.

Any ideas?

vectro
Junior Poster in Training
64 posts since Oct 2008
Reputation Points: 10
Solved Threads: 1
 

I did some research on writing mod_security rules and figured this out myself. Here is a server-wide mod_security rule for the main Apache configuration that keeps Googlebot off one particular file. Because LocationMatch treats its argument as an unanchored regular expression, the rule applies to any URL path containing /file.php, in any directory and not only in a domain's root, and it applies to all domains on the server.

<LocationMatch "/file\.php">
    # Return 403 for any request whose User-Agent header contains "Googlebot".
    SecRule REQUEST_HEADERS:User-Agent "@pm Googlebot" "deny,status:403"
</LocationMatch>


Change file.php to the name of the file you want to protect. "Googlebot" can likewise be replaced with any other user-agent string. The @pm operator does a case-insensitive phrase match rather than an exact comparison, so the rule applies whenever the full User-Agent header contains that word anywhere.
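For the browse.php case from the original question, the same idea might look like the sketch below. It assumes ModSecurity 2.7 or later, where every SecRule must carry a unique id. The id value 1000001 and the msg text are arbitrary placeholders, not part of the rule above, so pick an id that is unused on your server. The trailing $ is also an addition: it limits the match to URL paths that end in /browse.php.

```apache
<LocationMatch "/browse\.php$">
    # Hypothetical id and msg values; only the operator and the
    # deny/status actions come from the rule posted above.
    SecRule REQUEST_HEADERS:User-Agent "@pm Googlebot" \
        "id:1000001,phase:2,deny,status:403,msg:'Googlebot blocked from browse.php'"
</LocationMatch>
```

Googlebot requests for browse.php in any directory should then get a 403, while requests for every other file are left alone. Test against a staging server first, since rule syntax differs between ModSecurity versions.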

vectro
 

This question has already been solved
