
As far as I know, my website isn't being crawled at all, or only very infrequently. Does my robots.txt file have something to do with that? My current file looks like this:

User-agent: *
Disallow: /

Is this good or bad for crawls?

4 contributors, 4 replies, 21 views, 4-year discussion span, last post by james.lu.75491856

Dude,

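That file tells every robot not to crawl any page in or below the directory where robots.txt is located. Since robots.txt sits at the site root, Disallow: / blocks the whole site, which is bad if you want to be crawled.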

If you only want to keep spiders out of the images directory, you can give the instruction like this:

User-agent: *
Disallow: /images/
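
To see the difference between the two rule sets, here is a minimal sketch using Python's standard urllib.robotparser. The bot name ExampleBot and the example.com URLs are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The file from the question: blocks the entire site for every robot.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# The narrower rule: only the /images/ directory is off limits.
BLOCK_IMAGES = "User-agent: *\nDisallow: /images/\n"

# "ExampleBot" and example.com are hypothetical placeholders.
for name, rules in (("block all", BLOCK_ALL), ("block images", BLOCK_IMAGES)):
    for path in ("/index.html", "/images/logo.png"):
        allowed = can_crawl(rules, "ExampleBot", "https://example.com" + path)
        print(f"{name:13} {path:18} allowed={allowed}")
```

With the first file nothing is allowed; with the second, only URLs under /images/ are refused.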

The line

User-agent: *

applies to all robots or spiders. I would disallow a crawl from a spider I consider evil, such as Slurp, and allow the rest. So my file would be something like this:

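# An empty Disallow lets every other robot crawl everything.
User-agent: *
Disallow:

# Slurp gets its own group and is blocked from the whole site.
User-agent: Slurp
Disallow: /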

The above stops the Slurp spider from crawling my site.

You must itemize every bot you do not want to allow, each with its own User-agent group.
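
A per-bot block list like this can be checked with Python's standard urllib.robotparser. This is only a sketch: it assumes the catch-all group carries an explicit empty Disallow line, and the example.com URL and the Googlebot comparison are illustrative choices, not part of the original post.

```python
from urllib.robotparser import RobotFileParser

# Assumed file: everyone is allowed except Slurp, which gets its own group.
RULES = """\
User-agent: *
Disallow:

User-agent: Slurp
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# example.com is a placeholder URL.
URL = "https://example.com/page.html"
slurp_allowed = parser.can_fetch("Slurp", URL)
other_allowed = parser.can_fetch("Googlebot", URL)

print("Slurp allowed:", slurp_allowed)
print("Googlebot allowed:", other_allowed)
```

Each additional bot you want to keep out needs its own User-agent group with Disallow: / in the file.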

Edited by veedeoo: info added


Just to add to what has already been said: crawlers are not obliged to follow a robots.txt file. Most of them do, and the crawlers from Google and Yahoo obey its rules, but nothing stops me from writing a crawler that crawls your site and completely ignores the rules you have set.

If you don't want the site crawled for privacy reasons, you will need a method that the server itself enforces, such as access rules in .htaccess.
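
To illustrate what "enforced by the server" means, here is a rough sketch in Python rather than .htaccess: a small WSGI wrapper that refuses requests whose User-Agent contains a blocked name. The names block_crawlers, site, status_for and the blocklist are all hypothetical. Keep in mind that the User-Agent header can be faked, so truly private content still needs authentication.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical blocklist, matched case-insensitively as a substring.
BLOCKED_AGENTS = ("slurp",)


def block_crawlers(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so blocked user agents receive 403 Forbidden."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real website."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def status_for(user_agent: str) -> str:
    """Send one fake request through the wrapper and return the status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []
    block_crawlers(site)(environ, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("Mozilla/5.0 (compatible; Yahoo! Slurp)"))
print(status_for("Mozilla/5.0 (X11; Linux x86_64)"))
```

Unlike robots.txt, the refusal here happens on the server, so it does not depend on the crawler choosing to cooperate.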

Edited by Octet
