I've been working on an applet that collects all the links from a webpage. So far, I have it fetching the page source. I have found some regular expressions that are supposed to pull the links out of the source, but applying them doesn't change the original source at all.

I also found some examples using HTMLEditorKit, but haven't had any luck with that. I don't think it works with applets. Any ideas on how to parse all of the hyperlinks out of the HTML source? I have the source saved as a String. I have done this before in C#, but I'm completely stuck with Java.
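For reference, one thing that may explain the "no change to the original source" symptom: Java strings are immutable, so methods like replaceAll or split return a new value and never modify the string they are called on. Here is a minimal sketch of collecting href values with java.util.regex instead. The class name, method name and the pattern itself are illustrative assumptions, and a regex like this only copes with simple, well-formed anchor tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names here are not from the original post.
class RegexLinkExtractor {

    // Matches href="..." or href='...' inside an <a ...> tag.
    // Group 1 is the opening quote, group 2 is the link target.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        // find() walks through the input; the input string itself is never modified.
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"http://example.com/a\">A</a> <A HREF='b.html'>B</A></p>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

The links come back as a new list, which sidesteps the question of whether the source string was changed. Unquoted attribute values and links built by scripts would be missed by this pattern.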

Last Post by sfrider0
0

Use an HTML parser of some sort; a web search will turn up several. Adding the parser's JAR to the JNLP resource list or to the applet tag's archive attribute will make it available to the applet from the server.
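The reply does not name a particular parser. As one possible sketch, the JDK's own Swing HTML parser (the one behind the HTMLEditorKit examples mentioned in the question) can be driven directly through ParserDelegator with no extra JAR. The class and method names below are illustrative assumptions, and whether this suits a given applet setup is untested here; a third-party parser would be bundled via the archive or JNLP route described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper; the names here are not from the original thread.
class ParserLinkExtractor {

    static List<String> extractLinks(String html) {
        final List<String> links = new ArrayList<>();

        // The parser calls back for every start tag; only <a href=...> is of interest.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };

        try {
            // 'true' tells the parser to ignore any charset declared in the page,
            // since the source is already decoded into a String.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            // Not expected for an in-memory StringReader; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"http://example.com/a\">A</a></body></html>";
        System.out.println(extractLinks(html));
    }
}
```

Compared with a regular expression, a real parser tolerates odd attribute order, mixed case and sloppy markup, which is the usual reason for recommending one.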

0

Thanks for the link, I'll check those out. I think one of my problems is that when I fetch the source, I read it character by character and then just put those characters into a string. When I try to split the string, it stays as it is except that the letters or spaces I split on are removed, and when I do a println, it prints one char per line. Would a StringBuilder or something similar be able to build a proper string from these characters?
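A StringBuilder is a reasonable fit for this. Below is a minimal sketch that reads everything from a Reader into one String; the class and method names are illustrative assumptions. It is also worth keeping in mind that String.split takes a regular expression, so a delimiter containing regex metacharacters can behave unexpectedly, which is a possible (unconfirmed) cause of the one-char-per-line output.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper; the names here are not from the original thread.
class SourceReader {

    // Reads all characters from the reader and returns them as a single String.
    static String readAll(Reader in) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int n;
            // read() returns the number of chars placed in the buffer, or -1 at end of stream.
            while ((n = in.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String source = readAll(new StringReader("<a href=\"x\">x</a>\n<p>hi</p>"));
        // split() takes a regex; this one splits on line breaks, not on every character.
        for (String line : source.split("\\r?\\n")) {
            System.out.println(line);
        }
    }
}
```

In a real applet the Reader would typically be an InputStreamReader wrapped around the URL's input stream, with an explicit charset.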
