robots.txt and 302 redirects

Question

Dani 4,675 The Queen of DaniWeb

14 Years Ago

I have page1.html, which is being 302 redirected (temporary redirect) to page2.html.
page2.html is disallowed in my robots.txt file.

Under normal circumstances, when Googlebot encounters a 301 redirect from page1.html to page2.html, it will index page2.html. When Googlebot encounters a 302 redirect from page1.html to page2.html, it will index page1.html.

Since, theoretically, the URL of page1.html is what would be indexed, would it still be indexed even though page2.html is blocked?

http-protocol seo

4 Contributors
5 Replies
982 Views
4 Months Discussion Span
Latest Post 13 Years Ago by joeyoungblood

All 5 Replies

canadafred 220 SEO Alumni

14 Years Ago

I would think that because you redirect page1 to page2, the search engine will include it in a crawl despite a robots.txt instruction to do otherwise.

Dani 4,675 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 1 · 2011-03-14T06:32:13+00:00

Would it just crawl page1.html (because when it first finds page1.html, it is a valid URL), or would it actually index the contents of page2.html, despite a robots.txt rule that disallows crawling or indexing of page2.html?

Dani 4,675 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 2 · 2011-03-15T01:12:40+00:00

It's been a couple of days, and Google Webmaster Tools is now showing me that page1.html is not being crawled due to being blocked in my robots.txt file, even though it is only page2.html that is actually listed in robots.txt.

This is the desired effect, in my case.

sugeshg 0 Newbie Poster · Answer 3 · 2011-07-26T22:17:22+00:00

If you have blocked page2.html in robots.txt, the search engine bots won't crawl that page even though you have redirected page1.html to page2.html with a 302 ('Found' or 'Moved Temporarily').

User-agent: *
Disallow: /page2.html
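
To see how these two rules are matched against each URL, here is a minimal sketch using Python's standard `urllib.robotparser`. The `example.com` host and the helper `content_reachable_via_redirect` are assumptions made up for illustration. It only models robots.txt matching; it does not reproduce how Googlebot actually treats a redirect.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page2.html
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def content_reachable_via_redirect(parser, source_url, target_url, agent="*"):
    """Hypothetical helper: a compliant crawler can only read the redirect
    target's content if it may fetch both the source and the target."""
    return parser.can_fetch(agent, source_url) and parser.can_fetch(agent, target_url)


parser = build_parser(ROBOTS_TXT)

# example.com is a placeholder host, not taken from the thread.
page1 = "http://example.com/page1.html"
page2 = "http://example.com/page2.html"

page1_allowed = parser.can_fetch("*", page1)
page2_allowed = parser.can_fetch("*", page2)
reachable = content_reachable_via_redirect(parser, page1, page2)

print("page1.html fetchable:", page1_allowed)
print("page2.html fetchable:", page2_allowed)
print("page2.html content reachable via page1.html:", reachable)
```

By the rules alone, page1.html is not blocked; only following the redirect runs into the Disallow line.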

Confirm that you have verified your domain name in Webmaster Central.

http://www.google.com/webmasters/

Resubmit your sitemap.xml containing page1.html in Google Webmaster Tools and Bing Webmaster Center. The search engine bots will crawl the URLs given in sitemap.xml and update their index accordingly.

joeyoungblood 0 Newbie Poster · Answer 4 · 2011-07-30T01:26:37+00:00

Blocking the URL in robots.txt doesn't do much good these days. Google will still index the URL, give it whatever title they want, and rank it for what they want. A noindex meta robots tag is far more useful.

You say it worked, but I would keep an eye on it. In late June Google posted about using robots.txt vs. noindex and stated that robots.txt was no longer their endorsed method. http://www.google.com/support/webmasters/bin/answer.py?answer=156449

They have since clarified that they WILL index the URL but not the page or its content. That means you can easily run into duplicate/thin content issues by blocking URLs that might get shared out on the web.
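
For anyone who wants to verify that a page really carries the noindex meta robots tag, here is a rough sketch using Python's standard `html.parser`. The class name `RobotsMetaParser`, the function `has_noindex`, and the sample HTML are all made up for illustration. Keep in mind that a crawler has to be allowed to fetch the page in order to see the tag at all.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a meta robots tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Sample pages invented for this example.
blocked_page = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body>page2</body></html>"
)
normal_page = "<html><head><title>page1</title></head><body>page1</body></html>"

print(has_noindex(blocked_page))
print(has_noindex(normal_page))
```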
