
what is robots.txt?

Please define robots.txt. What is it, and how does it work?

smith09
Light Poster
31 posts since Dec 2009
Reputation Points: 9
Solved Threads: 0
 

Robots and spiders crawl your site. The file is there to give these spiders instructions once they visit your site. http://www.seoconsultants.com/robots-text-file/

InsightsDigital
Posting Virtuoso
1,761 posts since Jun 2009
Reputation Points: 68
Solved Threads: 9
 

Hi friends,

Meaning for Robots.txt:
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Obeying robots.txt is by no means mandatory for search engines, although well-behaved crawlers generally honor it.

If you are interested in learning more about it, visit http://www.robotstxt.org/ or you can go straight to the Standard for Robot Exclusion.
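A minimal sketch of the "pages you would like them not to visit" idea, using Python's standard urllib.robotparser module. The rules, the www.example.com URLs, and the "ExampleBot" crawler name are made up for illustration; a real crawler would download the file from the site instead of parsing a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every robot is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is an invented crawler name used only for this sketch.
allowed_home = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
allowed_private = parser.can_fetch("ExampleBot", "http://www.example.com/private/page.html")

print(allowed_home)     # True: no rule matches this path
print(allowed_private)  # False: the path starts with /private/
```

Note that can_fetch only reports what the file asks for. Nothing stops a badly behaved robot from ignoring it.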

Shewag

shewag
Newbie Poster
13 posts since May 2010
Reputation Points: 10
Solved Threads: 2
 

The robots.txt file gives search engine crawlers instructions, such as where the sitemap.xml file is and which parts of the site not to follow.
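As a rough illustration of the sitemap part, here is a sketch that reads a Sitemap line with Python's urllib.robotparser (the site_maps() method needs Python 3.8 or later). The file content and the www.example.com address are assumptions for the example, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with one crawl rule and one sitemap pointer.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of the Sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)  # ['http://www.example.com/sitemap.xml']
```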

infinique
Junior Poster
120 posts since Apr 2010
Reputation Points: 10
Solved Threads: 3
 

The robots.txt file tells search engine crawlers whether they may enter your site. It is like an instruction manual or a map that shows the crawlers where to crawl.

eysiojo23
Light Poster
28 posts since May 2010
Reputation Points: 10
Solved Threads: 1
 

What they're telling you is that the Internet Archive did not archive the site due to a preference set in that site's robots.txt file.

The robots.txt file is checked by spiders, like the Wayback Machine's, to determine whether or not the site's owner wants that particular spider to index their site.
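Rules can be addressed to one particular spider by name. The sketch below assumes the archive's crawler identifies itself as "ia_archiver" (an assumption for this example; check the crawler's own documentation), and uses made-up rules and URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one named spider is shut out completely,
# every other robot is only asked to avoid /cgi-bin/.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

PAGE = "http://www.example.com/about.html"

# The named spider matches its own section, so the page is off limits to it.
archiver_allowed = parser.can_fetch("ia_archiver", PAGE)
# Any other name ("ExampleBot" is invented) falls back to the "*" section.
other_allowed = parser.can_fetch("ExampleBot", PAGE)

print(archiver_allowed)  # False
print(other_allowed)     # True
```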

jhonsadins
Newbie Poster
4 posts since May 2010
Reputation Points: 10
Solved Threads: 2
 

Robots.txt is a file that gives search engine spiders instructions on what to follow and what not to follow.

infinique
Junior Poster
120 posts since Apr 2010
Reputation Points: 10
Solved Threads: 3
 

Karol said it all. There are so many members here who will surely help.

eysiojo23
Light Poster
28 posts since May 2010
Reputation Points: 10
Solved Threads: 1
 

robots.txt basically tells search engine spiders not to index certain areas of your site. You can use it to ask search engines to stay out of private areas, but it does not protect them: the file is public, and only well-behaved crawlers honor it.

mackone
Posting Pro
587 posts since Aug 2008
Reputation Points: 65
Solved Threads: 10
 

Robots.txt is a text file you put on your site to tell search robots which pages you would like them not to visit.

treats
Newbie Poster
5 posts since May 2010
Reputation Points: 10
Solved Threads: 2
 

Robots.txt can help search engines crawl and index your site's pages.

redesignunit
Posting Pro in Training
435 posts since May 2009
Reputation Points: 7
Solved Threads: 10
 

Robots.txt is used to give web robots instructions about your site. It resides in the site's root directory. In my experience, it can also leak information about your site, since it is only a plain text file and is easily found by hackers.
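To make the "leak" point concrete: anyone who can read the file can list the paths it mentions. The helper below is a hypothetical sketch (the function name and sample content are invented) that pulls out the Disallow paths with plain string handling.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every non-empty Disallow path found in the given robots.txt text."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value" at the first colon.
        line = raw_line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file that unintentionally advertises two sensitive folders.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/  # old dumps
"""

exposed = disallowed_paths(SAMPLE)
print(exposed)  # ['/admin/', '/backup/']
```

So anything that really must stay private needs proper access control (a login, for example) rather than a line in robots.txt.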

All the best,

AirForceOne
Posting Pro in Training
457 posts since Jun 2009
Reputation Points: 19
Solved Threads: 15
 

robots.txt is a text file that asks web robots to access your web site only in ways you approve of.
The robots.txt file is a simple text file (no HTML) that must be placed in your root directory, for example:

http://www.yourwebsite.com/robots.txt
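Because the file always sits at the root, its address can be worked out from any page on the same site. A small sketch with Python's urllib.parse; the function name robots_url and the sample page address are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


location = robots_url("http://www.yourwebsite.com/blog/post.html?id=3")
print(location)  # http://www.yourwebsite.com/robots.txt
```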

telugucinimalu
Newbie Poster
5 posts since May 2010
Reputation Points: 10
Solved Threads: 1
 

To make a robots.txt file you don't need any special knowledge, just an ordinary text editor such as Notepad. Try to google this term and you'll find lots of examples with an explanation of what each line means, and you'll easily compose one for yourself.

iflexion
Newbie Poster
12 posts since Oct 2009
Reputation Points: 10
Solved Threads: 2
 

robots.txt is a text file (not HTML) that tells search engine bots what to follow and what not to!

sanasahil
Light Poster
37 posts since May 2010
Reputation Points: 10
Solved Threads: 5
 

If you do not know what robots.txt is, chances are you are better off not creating one for your site, as you could get into trouble by blocking your site's pages from being crawled... and I mean EVERY page.
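The risk is real because a single character separates "block everything" from "block nothing". A hedged sketch using Python's urllib.robotparser; the helper name allows, the "ExampleBot" agent, and the URL are invented for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Report whether the given robots.txt text lets the agent fetch the url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# "Disallow: /" asks robots to stay away from the whole site.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
# An empty "Disallow:" asks for nothing to be blocked.
BLOCK_NOTHING = "User-agent: *\nDisallow:\n"

PAGE = "http://www.example.com/products/widget.html"

page_allowed_when_blocked = allows(BLOCK_EVERYTHING, PAGE)
page_allowed_when_open = allows(BLOCK_NOTHING, PAGE)

print(page_allowed_when_blocked)  # False
print(page_allowed_when_open)     # True
```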

Kilandara
Newbie Poster
18 posts since May 2010
Reputation Points: 10
Solved Threads: 1
 
If you do not know what robots.txt is, chances are you are better off not creating one for your site, as you could get into trouble by blocking your site's pages from being crawled... and I mean EVERY page.


SE's don't need to crawl every page. For example, do they need to crawl your website's terms of service or privacy policy? Or how about a co-branded promo page that you know is only going to be 'live' for a limited period of time?

Not every page needs to be indexed and crawled by the Search Engines. In fact, it would be great for the WWW if webmasters were more selective of the pages they allowed the SE's to access. There would be a lot less glut out there. Think of all the rack space Google and Bing could save! Everyone... Go Green.. Use Robots.txt! ;)

jay 11
The Dude Abides
Moderator
657 posts since Oct 2009
Reputation Points: 20
Solved Threads: 13
 

robots.txt is basically used to tell the crawler where to crawl and which sections you don't want crawled. While optimizing your site, keep this text file in your root directory.

mystryworld
Light Poster
46 posts since Nov 2009
Reputation Points: 10
Solved Threads: 1
 

The information was quite good.

induswebi
Newbie Poster
5 posts since Apr 2010
Reputation Points: 10
Solved Threads: 1
 

The robots.txt file is like a set of instructions to search engine spiders about crawling the site's data.

joelchrist
Posting Whiz
345 posts since Mar 2010
Reputation Points: 2
Solved Threads: 8
 

This question has already been solved
