Hey everyone.

I'm new to Java and having some problems.

The main idea is to connect to a website and collect information off it and store it in an array.

What I want the program to do is search the website, find a keyword, and store what comes after the keyword.

On the front page of DaniWeb, along the bottom of the site, there is a section called "Tag Cloud" which is filled with tags / short words.

Tag Cloud: "I want to store what is written here"

My idea is to first read in the HTML of the website, then search that file for the keyword followed by the text using Scanner and StringTokenizer, then store the results as an array.

Is there a better / easier way?

Where do you suggest I look for some examples?

Here is what I have so far.

import java.io.*;
import java.net.*;
import java.util.Scanner;
import java.util.StringTokenizer;

public class URLReader {

    public static void main(String[] args) throws Exception {

        URL dweb = new URL("http://www.daniweb.com/");
        URLConnection dw = dweb.openConnection();
        // read from the connection opened above (dw)
        BufferedReader in = new BufferedReader(new InputStreamReader(dw.getInputStream()));
        System.out.println("connected to daniweb");
        String inputLine;

        PrintStream out = new PrintStream(new FileOutputStream("OutFile.txt"));

        try {
            // copy the page line by line into OutFile.txt
            while ((inputLine = in.readLine()) != null) {
                out.println(inputLine);
            }
            System.out.println("printed text to outfile");
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            in.close();
            out.close();
        }

        // the search word came from txtSearch.getText(), a GUI field that is not
        // part of this class; as a stand-in it is taken from the command line here
        String search = args.length > 0 ? args[0] : "Cloud";

        try {
            Scanner scan = new Scanner(new File("OutFile.txt"));
            while (scan.hasNextLine()) {
                String line = scan.nextLine();
                //still working
                StringTokenizer st = new StringTokenizer(line);
                while (st.hasMoreTokens()) {
                    String word = st.nextToken();
                    // compare string contents with equals(); == only compares references
                    if (word.equals(search)) {

                    } else {

                    }
                }
            }
            scan.close();
        } catch (IOException iox) {
            iox.printStackTrace();
        }
    }
}

Any help at all would be very much appreciated!

Why write the HTML page lines to a file? Scan them as they are read from the site.
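
Something along these lines might work as a starting point. It is only a rough sketch: the class name KeywordScanner and the method textAfterKeyword are made up for illustration, and it assumes the keyword and the text you want sit on the same line of the page source, which may not be true for the real DaniWeb HTML. Each line is checked as it is read, so no intermediate file is needed. If no URL is passed on the command line, it runs on a small sample string instead of going to the network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class KeywordScanner {

    // Hypothetical helper: for every line that contains the keyword,
    // collect the trimmed text that follows its first occurrence.
    static List<String> textAfterKeyword(BufferedReader in, String keyword) throws IOException {
        List<String> found = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            int pos = line.indexOf(keyword);
            if (pos >= 0) {
                found.add(line.substring(pos + keyword.length()).trim());
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the keyword appears literally in the page source.
        String keyword = "Tag Cloud:";

        // With an argument, read straight from that URL; otherwise use sample text.
        Reader source = args.length > 0
                ? new InputStreamReader(new URL(args[0]).openStream(), StandardCharsets.UTF_8)
                : new StringReader("some header\nTag Cloud: java php c++\nsome footer");

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(source)) {
            List<String> results = textAfterKeyword(in, keyword);
            System.out.println(results);
        }
    }
}
```

If you really need an array rather than a List, results.toArray(new String[0]) will give you one. To try it against the site, pass the address as the first argument, e.g. java KeywordScanner http://www.daniweb.com/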
