Hello all,
I have created an RSS object and would now like to extract the description of each child item. My knowledge of RSS is extremely limited, so pardon me if I explain this incorrectly. For example, I am connecting to http://rss.cnn.com/rss/cnn_topstories.rss and I would like to read the text that appears below each article's title and date into a string. Does anyone have any ideas? Thanks!

Here is what I have so far:

RssParser parser = RssParserFactory.createDefault();
Rss rss = parser.parse(new URL("http://rss.cnn.com/rss/cnn_topstories.rss"));
Channel channel = rss.getChannel();

I actually started a small HTML parser not long ago. I never got very far with it, but I was hoping to expand it.

Here is a short snippet from the feed you are reading:

<title>Reclusive author J.D. Salinger dies</title>

<guid isPermaLink="false">http://www.cnn.com/2010/SHOWBIZ/books/01/28/salinger.obit/index.html?eref=rss_topstories</guid>
<link>http://rss.cnn.com/~r/rss/cnn_topstories/~3/j6L8eXQWUIk/index.html</link>
<description>J.D. Salinger, author of "The Catcher in the Rye" and other books, has died, according to his literary agent. He was 91.&lt;div class="feedflare"&gt;
&lt;a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=j6L8eXQWUIk:Pqkyy5dEj-8:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=j6L8eXQWUIk:Pqkyy5dEj-8:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=j6L8eXQWUIk:Pqkyy5dEj-8:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=j6L8eXQWUIk:Pqkyy5dEj-8:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=j6L8eXQWUIk:Pqkyy5dEj-8:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=j6L8eXQWUIk:Pqkyy5dEj-8:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=j6L8eXQWUIk:Pqkyy5dEj-8:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt;

&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/rss/cnn_topstories/~4/j6L8eXQWUIk" height="1" width="1"/&gt;</description>
<pubDate>Thu, 28 Jan 2010 14:55:47 EST</pubDate>
<feedburner:origLink>http://www.cnn.com/2010/SHOWBIZ/books/01/28/salinger.obit/index.html?eref=rss_topstories</feedburner:origLink></item>

At a glance, it looks like you want to pull out everything between the <description> and </description> tags.
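
To make that concrete, here is a minimal sketch that uses only the JDK's built-in DOM parser, not the RssParser library from the question. The class name DescriptionExtractor and both method names are made up for illustration. It assumes each description holds escaped HTML like the snippet above, and the regex-based tag stripping is deliberately crude, so treat it as a starting point rather than a finished parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any RSS library.
class DescriptionExtractor {

    // Returns the plain-text description of every <item> in the given RSS XML.
    static List<String> extractDescriptions(String rssXml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));

            List<String> result = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                NodeList descriptions = item.getElementsByTagName("description");
                if (descriptions.getLength() > 0) {
                    // getTextContent() decodes entities such as &lt; back to <
                    result.add(stripMarkup(descriptions.item(0).getTextContent()));
                }
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse RSS XML", e);
        }
    }

    // Crude assumption: anything between < and > is markup and can be dropped.
    static String stripMarkup(String description) {
        return description.replaceAll("<[^>]*>", "").trim();
    }
}
```

If you keep using your existing parser, the stripMarkup step could be applied on its own to whatever description string that library hands back for each item.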

I am about to leave, but perhaps I can get back to this post tomorrow. In the meantime (I'm not sure how helpful this will be to you), check out the thread I created when I was parsing some HTML:

http://www.daniweb.com/forums/thread236349.html

Thanks! This helped a lot. I have my program working now.
