Hey guys. I am new to programming and would appreciate any help I can get. I want to write a program that reads the HTML source of a web page and writes a specific part of it to a file. For example, I want my code to read the source of the www.daniweb.com homepage and write what's between the <title></title> tags into a text file, which should be "DaniWeb - Technology Publication Meets Social Media." The code below works, but it returns the entire HTML source of the page.

import java.io.*;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlReadPageDemo {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.daniweb.com");

            BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
            BufferedWriter writer = new BufferedWriter(new FileWriter("data1.txt"));

            String line;
            while ((line = reader.readLine()) != "<title>") {
                System.out.println(line);
                writer.write(line);
                writer.newLine();
            }
            reader.close();
            writer.close();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

All 5 Replies

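Try using this:

 while ((line = reader.readLine()) != null && !line.equals("<title>")) 

Compare strings with equals() rather than !=, and check for null so the loop stops at the end of the stream. I'm not sure the logic you are using will give the desired result, though. Try thinking it over again.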
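
To illustrate the point about comparing strings, here is a small sketch. The class name StringCompareDemo and the helper isTag are made up for this example and are not part of the original program. In Java, == and != compare object references, while equals() compares the characters, and a line read from a stream is never the same object as a string literal.

```java
public class StringCompareDemo {

    // Hypothetical helper for illustration: true only when the line is exactly the tag.
    public static boolean isTag(String line, String tag) {
        return line != null && line.equals(tag);
    }

    public static void main(String[] args) {
        // new String(...) stands in for a line returned by readLine():
        // same characters as the literal, but a different object.
        String fromStream = new String("<title>");

        System.out.println(fromStream == "<title>");      // false: different references
        System.out.println(fromStream.equals("<title>")); // true: same characters

        // A real page usually has more on the line than the bare tag,
        // so an exact match on the whole line rarely succeeds.
        System.out.println(isTag("  <title>DaniWeb</title>", "<title>")); // false
    }
}
```

The last call hints at why switching to equals() alone is probably not enough: the comparison has to look inside the line rather than match the whole line.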

anyone?

A simple way would be to read the input stream until the starting tag is found and then save what is read until the ending tag is found.

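String line;
String outLine;

// Read the page line by line until the end of the stream.
while ((line = reader.readLine()) != null) {
    int start = line.indexOf("<title>");
    int end = line.indexOf("</title>");

    // Only handles the case where both tags are on the same line.
    if (start != -1 && end > start) {
        outLine = line.substring(start + 7, end); // 7 = length of "<title>"
        writer.write(outLine);
        writer.newLine();
        System.out.println(outLine);
        break; // title found, no need to read the rest of the page
    }
}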

What if the tags are on different lines?
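
One possible way to cope with tags on different lines is to collect the whole page into a single string first and then search that string, instead of searching line by line. The sketch below is only an illustration under that assumption. The class name TitleExtractor and the method extractTitle are invented for this example, and a regular expression is a rough tool here, not a real HTML parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleExtractor {

    // DOTALL lets "." match line breaks, so the title may span several lines.
    // CASE_INSENSITIVE also accepts <TITLE>. Attributes on the tag are allowed.
    private static final Pattern TITLE = Pattern.compile(
            "<title\\b[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: returns the title text, or null if none is found.
    public static String extractTitle(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        StringBuilder page = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            page.append(line).append('\n'); // keep the line breaks
        }
        Matcher m = TITLE.matcher(page);
        if (!m.find()) {
            return null;
        }
        // Collapse line breaks and indentation inside the title to single spaces.
        return m.group(1).trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) throws IOException {
        String sample = "<html><head>\n<title>\n  Example\n  Page\n</title>\n</head></html>";
        System.out.println(extractTitle(new StringReader(sample))); // prints "Example Page"
    }
}
```

In the original program the same method could be called with new InputStreamReader(url.openStream()) in place of the StringReader, and the returned string written to the file with writer.write().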
