Can anyone tell me how to search for a keyword in a web page using Java? Suppose I enter a query on Google: the results are displayed across many pages. On the first page, I want to search for a keyword beginning with www. ...

To search for the keyword, I first want to download the page. Can anyone locate the problem in the following code?

code:

import java.io.*;
import java.net.*;
public class page
{
    public static void main(String args[]) throws IOException
    {
        // Open the search results page and the local output file.
        BufferedInputStream in = new BufferedInputStream(new URL("http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N").openStream());
        FileOutputStream fos = new FileOutputStream("testing1.htm");
        BufferedOutputStream bout = new BufferedOutputStream(fos, 1024);
        byte data[] = new byte[1024];
        int count;
        // Write only the bytes actually read; the last chunk is usually shorter than the buffer.
        while ((count = in.read(data, 0, 1024)) >= 0)
        {
            bout.write(data, 0, count);
        }
        bout.close();
        in.close();
    }
}

Problem is:

C:\Program Files\Java\jdk1.5.0\bin>javac page.java

C:\Program Files\Java\jdk1.5.0\bin>java page
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1133)
        at java.net.URL.openStream(URL.java:1007)
        at page.main(page.java:8)

C:\Program Files\Java\jdk1.5.0\bin>
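
One possible explanation, not confirmed anywhere in this thread: some servers answer 403 to requests that carry the default Java user agent. The sketch below assumes that is the cause here and sends an explicit User-Agent header before reading the page. The class name PageFetcher, the helper names and the user-agent string are hypothetical, and the server may still refuse automated requests.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

class PageFetcher {
    // Prepare a connection with an explicit User-Agent header.
    // Nothing is sent over the network until the stream is requested.
    static URLConnection openWithUserAgent(String address, String userAgent) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    // Copy the response body to a file, writing only the bytes actually read.
    static void save(URLConnection conn, String fileName) throws IOException {
        try (InputStream in = conn.getInputStream();
             OutputStream out = new FileOutputStream(fileName)) {
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the user-agent value is only an example string.
        URLConnection conn = openWithUserAgent(
                "http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N",
                "Mozilla/5.0");
        save(conn, "testing1.htm");
    }
}
```

The header has to be set on the URLConnection before getInputStream() is called, which is why this uses openConnection() rather than URL.openStream().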


I am now saving that web page to a text file, so that I can avoid that error. Can anyone send me code that finds a keyword starting with www. and ending with .doc (or .htm/.pdf) in that text file? I want to store the URL in a temporary string. For example, if the text file contains a link such as www.cdc.gov/hiv/testing.htm, I want to extract cdc.gov/hiv/ and pass it into my URL string.

No, we're not going to do your (home)work for you.
That's pretty basic functionality; anyone should be able to figure it out for themselves.

Regular expressions to find URLs are scattered all over the web if you want them.
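
As a rough illustration of the regular-expression approach, here is a minimal sketch. It assumes links look like the www.cdc.gov/hiv/testing.htm example from the question and end in .doc, .htm or .pdf. The class name LinkExtractor, the method name and the pattern are made up for this example, and the pattern is deliberately simple rather than a complete URL matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkExtractor {
    // Group 1 captures the host and directory part after "www.",
    // e.g. "cdc.gov/hiv/" from "www.cdc.gov/hiv/testing.htm".
    // Assumption: only .doc, .htm and .pdf file names are of interest.
    private static final Pattern DOC_LINK = Pattern.compile(
            "www\\.([A-Za-z0-9.-]+/(?:[A-Za-z0-9_%~.-]+/)*)"
            + "[A-Za-z0-9_%~.-]+\\.(?:doc|htm|pdf)\\b");

    // Return the host-and-directory part of every matching link in the text.
    static List<String> extractBasePaths(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = DOC_LINK.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "Visit www.cdc.gov/hiv/testing.htm for details";
        System.out.println(extractBasePaths(sample)); // prints [cdc.gov/hiv/]
    }
}
```

To run it against the saved page, read the file into a String first and pass that to extractBasePaths. Links ending in .html are not matched by this pattern, so the alternation would need html added if those matter.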
