Hey guys. I am new to programming and would appreciate any help I can get. I want to write a program that reads the HTML source of a web page and writes a specific part of it to a file. For example, I want my code to read the source of the www.daniweb.com homepage and write what's in between the <title></title> tags to a text file, which should be "DaniWeb - Technology Publication Meets Social Media." The code below runs, but it outputs the entire HTML source of the page.

import java.io.*;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlReadPageDemo {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.daniweb.com");

            BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
            BufferedWriter writer = new BufferedWriter(new FileWriter("data1.txt"));

            String line;
            while ((line = reader.readLine()) != "<title>") {
                System.out.println(line);
                writer.write(line);
                writer.newLine();
            }
            reader.close();
            writer.close();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }  catch (IOException e) {
            e.printStackTrace();
        }
    }
}

All 5 Replies

Try using this:

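 while ((line = reader.readLine()) != null && !line.equals("<title>"))

The null check is there because readLine() returns null at the end of the stream. That said, I'm not sure the logic you are using will get the desired result, since the loop only stops at a line that is exactly "<title>". Try thinking it over again.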
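
To add some background on why the comparison matters: in Java, `!=` on two strings checks whether they are the same object, not whether they hold the same characters, so in practice a line read from the stream will not compare equal to a literal that way. The sketch below is only an illustration. The class and method names (`LineMatchDemo`, `sameReference`, `isExactly`, `containsTag`) are made up for this example, and `new String(...)` stands in for a line returned by `readLine()`.

```java
class LineMatchDemo {
    // Reference comparison: true only if both variables point to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison; safe when line is null (readLine() returns null at end of stream).
    static boolean isExactly(String line, String tag) {
        return tag.equals(line);
    }

    // True if the tag appears anywhere in the line.
    static boolean containsTag(String line, String tag) {
        return line != null && line.contains(tag);
    }

    public static void main(String[] args) {
        // Stand-in for a line returned by readLine(): a separate String object.
        String fromStream = new String("<title>");
        System.out.println(sameReference(fromStream, "<title>")); // false
        System.out.println(isExactly(fromStream, "<title>"));     // true

        // A more realistic line, with the tag and its content together.
        String realistic = "<title>DaniWeb</title>";
        System.out.println(isExactly(realistic, "<title>"));      // false
        System.out.println(containsTag(realistic, "<title>"));    // true
    }
}
```

The last two lines show why an exact match on "<title>" is unlikely to be enough on a real page: the tag usually shares its line with other text, so a `contains` check is the safer test.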

anyone?

A simple way would be to read the input stream until the starting tag is found and then save what is read until the ending tag is found.

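String line;
String outLine;

// Read line by line until the end of the stream.
while ((line = reader.readLine()) != null) {
    // Assumes the opening and closing tags are on the same line.
    if (line.contains("<title>")) {
        // "<title>" is 7 characters long, so start just after it.
        outLine = line.substring(line.indexOf("<title>") + 7, line.indexOf("</title>"));
        writer.write(outLine);
        writer.newLine();
        System.out.println(outLine);
    }
}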

What if the tags are on different lines?
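
One possible way to handle that, as a rough sketch rather than a definitive answer: read the whole page into a single string first, then search it with a regular expression that is allowed to match across line breaks. The class name `TitleExtractor`, the helper names `readAll` and `extractTitle`, and the sample page in `main` are all invented for this example. A regex is a shortcut here, not a real HTML parser, so it can be fooled by unusual markup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TitleExtractor {
    // DOTALL lets '.' match line breaks, so the tags may sit on different lines.
    // CASE_INSENSITIVE also accepts <TITLE>; "[^>]*" tolerates attributes on the tag.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Reads everything from the reader into one string, keeping line breaks.
    static String readAll(Reader in) throws IOException {
        StringBuilder page = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // Returns the trimmed title text, or null if no complete title element is found.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample page with the tags on different lines.
        String sample = "<html><head>\n<title>\n  Example Page\n</title>\n</head></html>";
        System.out.println(extractTitle(readAll(new StringReader(sample)))); // prints "Example Page"
    }
}
```

In the original program you could pass `new InputStreamReader(url.openStream())` to `readAll` instead of the `StringReader`, and write the result to the file only when `extractTitle` returns something other than null.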
