Join Date: Jun 2009
Posts: 1
Reputation: Splower is an unknown quantity at this point 
Solved Threads: 0
Splower
Newbie Poster

RSS feeds to PDF

 
  #1
Jun 25th, 2009
I want to design a website, something like "Feedjournal" and "Tabbloid", that collects important RSS feeds and blogs into a newsletter for our company and then automatically emails it to all staff at set intervals. I am very new to this and want to know how to start. What programming language should I use? PHP? Any suggestions on how to start?
Join Date: Apr 2009
Posts: 761
Reputation: Airshow is on a distinguished road 
Solved Threads: 106
Airshow
Master Poster

Re: RSS feeds to PDF

 
  #2
Jun 25th, 2009
You might like to look at XSLT. It will act directly on RSS feeds, which are just XML conforming to a particular schema.

XSLT is very powerful and can be applied in more than just the normal browser environment. I think you will need to run it server-side along with something that knows how to form PDFs.
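
To illustrate the point that an RSS feed is plain XML, here is a minimal sketch in Python. It is not XSLT and not the PHP asked about above; it only shows that a generic XML parser can read the items. The sample feed and the function name `read_items` are made up for illustration, and the layout assumed is RSS 2.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for this example (RSS 2.0 layout).
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Company Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>Plain text summary</description>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return one dict per <item> with its title, link and description."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # The description often carries escaped HTML; the parser unescapes it.
            "description": item.findtext("description", default=""),
        })
    return items


for entry in read_items(SAMPLE_RSS):
    print(entry["title"], "->", entry["link"])
```

An XSLT stylesheet would select the same `item` elements with XPath expressions; the sketch just makes the structure visible without one.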

quick Google .....

Here you are, FOP from the Apache Project - open source:
http://www.onjava.com/pub/a/onjava/2002/10/16/fop.html

Have fun - it won't be easy - but at least there are sample files to support that article.

Airshow
50% of the solution lies in accurately describing the problem!
Join Date: Jul 2009
Posts: 2
Reputation: k1m is an unknown quantity at this point 
Solved Threads: 0
k1m
Newbie Poster

Re: RSS feeds to PDF

 
  #3
Jul 2nd, 2009
I wanted to do something similar a few months ago and I actually wrote my own free software/open source service in PHP. You can try it out here: http://fivefilters.org/pdf-newspaper/

Source code is available to download and you can modify it however you like. Hope that helps.

Keyvan
Airshow
Master Poster

Re: RSS feeds to PDF

 
  #4
Jul 2nd, 2009
Looks like a good piece of work, Keyvan. The OP could wish for nothing more.

I have bookmarked it for future reference.

From your list of libraries, I guess it's even more complex than I indicated in my post above.

Could you possibly run through the libs and indicate what each one does in this app, please? Having done some RSS work myself (RSS to HTML), I am very interested.

Airshow
k1m
Newbie Poster

Re: RSS feeds to PDF

 
  #5
Jul 2nd, 2009
Thanks Airshow,

The starting point was looking for a free PDF library that had some HTML support. I settled on TCPDF as it seems to be updated fairly regularly and has an example file showing multi-column support (example number 10). Its support for HTML is limited though - you have to pass it well-formed XHTML and it only handles a small set of elements. I was happy with it though, as I was only interested in a few elements - the goal wasn't to create a PDF showing the content as it appeared on the original site.

So a lot of the initial work is actually turning the RSS content into clean XHTML. For each feed item I run its content through HTML Tidy first. The next step is to remove HTML elements that I have no need for. For that I use HTML Purifier - I give it a list of HTML elements and attributes I can deal with and it strips the rest. I then pass the result through SmartyPants to turn the punctuation into a somewhat prettier form (e.g. curly quotes, ellipses, en and em dashes).
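
The clean-up idea described above (keep a short list of elements and attributes, strip the rest, then tidy the punctuation) can be sketched roughly as follows. This is Python using only the standard library, not the PHP libraries named in the post, and it is far less thorough than HTML Purifier or SmartyPants. The allowlist, the class name `AllowlistFilter` and the functions `strip_unwanted` and `smarten` are assumptions made for illustration; the post does not say which elements were actually kept.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist: tag name -> attributes that may be kept on it.
ALLOWED = {"p": set(), "a": {"href"}, "em": set(), "strong": set(), "blockquote": set()}
# Tags whose text content is dropped entirely, not just the tag itself.
SKIP_CONTENT = {"script", "style"}


class AllowlistFilter(HTMLParser):
    """Keep allowlisted tags and attributes; drop other tags but keep their text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skipping = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skipping += 1
        elif tag in ALLOWED:
            kept = "".join(
                f' {name}="{escape(value or "")}"'
                for name, value in attrs
                if name in ALLOWED[tag]
            )
            self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skipping = max(0, self.skipping - 1)
        elif tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skipping:
            # The parser hands over unescaped text, so escape it again on output.
            self.out.append(escape(data, quote=False))


def strip_unwanted(html_text):
    parser = AllowlistFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


def smarten(text):
    """Rough stand-in for the punctuation step: ellipsis and dashes only."""
    return text.replace("...", "\u2026").replace("---", "\u2014").replace("--", "\u2013")


print(strip_unwanted('<div><p style="color:red">Hi <a href="http://example.com/" onclick="x()">there</a></p></div>'))
```

The dash conventions in `smarten` are chosen for this example only and may not match SmartyPants' defaults. Curly quotes are left out because doing them properly needs context about where each quote opens and closes.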

When I've gone through the items I want, I simply pass the XHTML to TCPDF (I've extended the multi-column example for better spacing and formatting) and it does the rest. :)

As for the other libraries: I use SimplePie to parse the feed and loop through the items, and an OPML parser to give users the option of submitting multiple feeds.
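
For readers who have not met OPML: it is an XML outline format, and feed lists conventionally store each feed's address in an `xmlUrl` attribute on an `outline` element. A minimal Python sketch of pulling those addresses out is shown here. It does not reflect how the OPML parser mentioned above works internally; the sample document and the function name `feed_urls` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical subscription list, invented for this example.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.0">
  <head><title>Hypothetical subscriptions</title></head>
  <body>
    <outline text="News">
      <outline text="Example A" type="rss" xmlUrl="http://example.com/a.xml"/>
      <outline text="Example B" type="rss" xmlUrl="http://example.org/b.xml"/>
    </outline>
    <outline text="Example C" type="rss" xmlUrl="http://example.net/c.xml"/>
  </body>
</opml>"""


def feed_urls(opml_text):
    """Return every xmlUrl found on an <outline>, in document order."""
    root = ET.fromstring(opml_text)
    # Outlines can be nested (folders), so walk the whole tree.
    return [o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib]


for url in feed_urls(SAMPLE_OPML):
    print(url)
```

Each address returned could then be handed to whatever feed parser is in use, one feed at a time.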

If you want to see how all the libraries are used, the makepdf.php source ties them all together.
Airshow
Master Poster

Re: RSS feeds to PDF

 
  #6
Jul 2nd, 2009
That's great, Keyvan. Thank you very much.

I wish I had known about these libs when I wrote my RSS reader (as yet unpublished). I did everything in PHP and I know that some of it could be done a lot better.

I'd like to discuss this further but don't want to hijack Splower's topic. I will send a PM later today.

Good luck with your project, Splower.

Must rush, work beckons.

Airshow

©2003 - 2009 DaniWeb® LLC