Is my robots.txt file stopping bots from crawling my site?

Question

commando1200 0 Newbie Poster

11 Years Ago

To my knowledge, my website isn't getting crawled at all, or only very infrequently. Does my robots.txt file have something to do with that? My current file looks like this:

User-agent: *
Disallow: /

Is this good or bad for crawls?

veedeoo 474 Junior Poster

11 Years Ago

Dude,

What that text means is: do not crawl any pages in or below the directory where the robots.txt is located. Since robots.txt sits at the root of the site, it blocks the whole site for every robot that obeys it.

If you don't want the spider to crawl the image directory, you can give the instruction like this:

User-agent: *
Disallow: /images/
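
One way to see how rules like these are read is Python's standard urllib.robotparser module. This is only a rough sketch: the helper name `allowed`, the bot name ExampleBot, and the example.com URLs are placeholders, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# The file from the question: everything is off limits.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# Only the image directory is off limits.
BLOCK_IMAGES = "User-agent: *\nDisallow: /images/\n"

# ExampleBot and example.com are placeholders, not real names.
print(allowed(BLOCK_ALL, "ExampleBot", "https://example.com/index.html"))         # False
print(allowed(BLOCK_IMAGES, "ExampleBot", "https://example.com/index.html"))      # True
print(allowed(BLOCK_IMAGES, "ExampleBot", "https://example.com/images/logo.png")) # False
```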

This line

User-agent: *

applies to all robots or spiders. I would disallow crawling by an evil spider such as Slurp and allow the rest, so my file would be something like this:

User-agent: Slurp
Disallow: /

# All other robots: an empty Disallow blocks nothing
User-agent: *
Disallow:

The above stops the Slurp spider from crawling my site.

You must itemize every bot that is not allowed.
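
A per-bot block of this kind can be checked the same way with Python's urllib.robotparser. Treat this as a sketch under assumptions: the rule text is one way of writing "block Slurp, allow the rest", and ExampleBot and example.com are made-up placeholders.

```python
from urllib.robotparser import RobotFileParser

# One group per blocked bot, then a catch-all group that blocks nothing.
RULES = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# ExampleBot and example.com are placeholders.
slurp_ok = parser.can_fetch("Slurp", "https://example.com/page.html")
other_ok = parser.can_fetch("ExampleBot", "https://example.com/page.html")

print(slurp_ok)  # False: Slurp matches its own group
print(other_ok)  # True: everyone else falls through to the catch-all group
```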

Octet 45 Newbie Poster Featured Poster · Answer 1 · 2013-08-04T08:23:09+00:00

Just to add to what has already been said, crawlers don't have to follow a robots.txt file. The majority will, and crawlers such as Google's and Yahoo's obey the rules, but there is nothing to stop me writing a crawler that crawls your site and completely ignores the rules you have set.
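
To make that point concrete, here is a small sketch in Python. The function names `polite_urls` and `rude_urls`, the bot name, and the URLs are all invented for illustration. The only difference between the two crawlers is whether they bother to ask urllib.robotparser before fetching.

```python
from urllib.robotparser import RobotFileParser


def polite_urls(rules: str, agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that the robots.txt text permits for `agent`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [u for u in urls if parser.can_fetch(agent, u)]


def rude_urls(rules: str, agent: str, urls: list[str]) -> list[str]:
    """A crawler that never consults the rules keeps every URL."""
    return list(urls)


RULES = "User-agent: *\nDisallow: /\n"
# Placeholder URLs on example.com.
URLS = ["https://example.com/", "https://example.com/private/notes.html"]

print(polite_urls(RULES, "ExampleBot", URLS))  # []
print(rude_urls(RULES, "ExampleBot", URLS))    # both URLs, rules ignored
```

Nothing in the file itself enforces anything; the check is a step the crawler's author chooses to include.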

If you don't want the site crawled for privacy or similar reasons, then you will need a more secure method of stopping crawlers, such as access rules in .htaccess.

james.lu.75491856 0 Junior Poster · Answer 2 · 2013-08-04T23:06:33+00:00

Robots.txt:

IOError

james.lu.75491856 0 Junior Poster · Answer 3 · 2013-08-04T23:07:35+00:00

no robots.txt = crawl as much as you want.
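
As a rough illustration with Python's urllib.robotparser, an empty rule set (used here as a stand-in for a missing file, which is an assumption of this sketch) permits every URL. ExampleBot and example.com are placeholders.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
# No lines at all: stands in for a site that has no robots.txt.
parser.parse([])

# ExampleBot and example.com are placeholders.
no_rules_ok = parser.can_fetch("ExampleBot", "https://example.com/anything.html")
print(no_rules_ok)  # True
```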