I've been making websites for many years, and recently I decided to get more into SEO. So here's my question:
I would like to know what happens if I write the following in robots.txt:
User-agent: *
Disallow: /
Sitemap: http://www.example.com/sitemap.xml
And sitemap.xml contains thousands of links to different pages in a directory, for example:
1 => http://www.example.com/site/index.php
2 => http://www.example.com/site/index.php?lang=en
3 => http://www.example.com/site/shopping.php
4 => http://www.example.com/site/picture.php
etc...
Here is how I understand what robots.txt and sitemap.xml do in this case:
First, robots.txt tells search engines (let's talk about Google) not to index any file or folder under the domain example.com. However, Google will also look at sitemap.xml and find that it is being asked to index the links listed there.
What happens in this situation?
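To make the conflict concrete, here is a minimal sketch that feeds the robots.txt above to Python's standard urllib.robotparser and asks whether the sitemap URLs may be fetched. It only shows how one standards-following parser reads the file; I am not assuming Google behaves the same way. The helper name `check` and the "Googlebot" user-agent string are my own illustrative choices.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
Sitemap: http://www.example.com/sitemap.xml
"""

# A few of the URLs listed in sitemap.xml.
SITEMAP_URLS = [
    "http://www.example.com/site/index.php",
    "http://www.example.com/site/index.php?lang=en",
    "http://www.example.com/site/shopping.php",
    "http://www.example.com/site/picture.php",
]


def check(robots_txt, urls, user_agent="Googlebot"):
    """Return the declared sitemaps and a per-URL 'may fetch' verdict."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    allowed = {url: parser.can_fetch(user_agent, url) for url in urls}
    return parser.site_maps(), allowed  # site_maps() needs Python 3.8+


sitemaps, allowed = check(ROBOTS_TXT, SITEMAP_URLS)
print("Sitemaps declared:", sitemaps)
for url, ok in allowed.items():
    print("allowed" if ok else "blocked", url)
```

With this parser, the sitemap is still reported, but every URL in it is blocked for the crawler, which is exactly the contradiction I am asking about.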
Moreover, I would like to know what happens when a page has its meta robots tag set to noindex but also appears in sitemap.xml.
Finally, if I submit the path to my sitemap in Google Webmaster Tools, is that enough for Google to find and check it, or does the sitemap also need to be listed in robots.txt?
Edited by cmps