I want to design a website, kind of like "Feedjournal" and "Tabbloid", that will pull in important RSS feeds and blogs to make a newsletter for our company, and then automatically email it to all staff at set intervals. I am very new to this and want to know how to start. What programming language should I use? PHP? Any suggestions on how to start?
You might like to look at XSLT. It will act directly on RSS feeds, which are just XML conforming to a particular schema.
XSLT is very powerful and can be applied in more than just the normal browser environment. I think you will need to run it server-side along with something that knows how to form PDFs.
quick Google .....
Here you are, FOP from the Apache Project - open source:
Have fun - it won't be easy - but at least there are sample files to support that article.
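To illustrate the point that an RSS feed is just XML, here is a minimal sketch in Python (standard library only) that pulls the title, link and description out of each item. It is not XSLT and not FOP, only a stand-in to show the structure you would be transforming. The sample feed, the example.com URLs and the read_items name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would be downloaded from the blog.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Company Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Hello world</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>More news</description>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return a list of dicts, one per <item> element in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in read_items(SAMPLE_RSS):
        print(entry["title"], "-", entry["link"])
```

Real feeds vary a lot (Atom, namespaces, odd encodings), so a dedicated feed parser is probably the safer choice for anything beyond a quick test.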
I wanted to do something similar a few months ago and I actually wrote my own free software/open source service in PHP. You can try it out here: http://fivefilters.org/pdf-newspaper/
Source code is available to download and you can modify it however you like. Hope that helps.
Looks like a good piece of work Keyvan. The OP could wish for nothing more.
I have bookmarked it for future reference.
From your list of libraries, I guess it's even more complex than I indicated in my post above.
Could you possibly run through the libs to indicate what each one does in this app, please? Having done some RSS work myself (RSS to HTML), I am very interested.
The starting point was looking for a free PDF library that had some HTML support. I settled on TCPDF as it seems to be updated fairly regularly and had an example file showing multi-column support (example number 10). Its support for HTML is limited though - you have to pass it well-formed XHTML and it only handles a small set of elements. I was happy with it though, as I was only interested in a few elements - the goal wasn't to create a PDF showing the content as it appeared on the original site.
So a lot of the initial work is actually turning the RSS content into clean XHTML. For each feed item I run its content through HTML Tidy first. The next step is to remove HTML elements that I have no need for. For that I use HTML Purifier - I give it a list of HTML elements and attributes I can deal with and it strips the rest. I then pass the result through SmartyPants to turn the punctuation into a somewhat prettier form (e.g. curly quotes, ellipses, en and em dashes).
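The "give it a list of elements and attributes and strip the rest" step can be sketched roughly as below. This is Python using only the standard library, not HTML Purifier and not the code from makepdf.php. The ALLOWED list, the WhitelistFilter class and the strip_unwanted function are names invented for the example. Note that it keeps the text inside dropped elements and does no tidying of badly nested markup, so treat it as an illustration of the idea rather than a safe sanitiser.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: element name -> set of attributes to keep.
ALLOWED = {
    "p": set(),
    "a": {"href"},
    "em": set(),
    "strong": set(),
    "blockquote": set(),
}


class WhitelistFilter(HTMLParser):
    """Re-emit only whitelisted tags and attributes; keep all text."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed:
            return  # drop the tag itself; its text content still passes through
        kept = [(k, v) for k, v in attrs
                if k in self.allowed[tag] and v is not None]
        attr_text = "".join(
            ' %s="%s"' % (k, escape(v, quote=True)) for k, v in kept
        )
        self.out.append("<%s%s>" % (tag, attr_text))

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Re-escape text so the output stays well formed.
        self.out.append(escape(data, quote=False))


def strip_unwanted(html_text, allowed=ALLOWED):
    """Return html_text with every non-whitelisted tag and attribute removed."""
    parser = WhitelistFilter(allowed)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    messy = ('<div class="x"><p style="color:red">Hi '
             '<a href="http://example.com" onclick="evil()">link</a> '
             '<font>there</font></p></div>')
    print(strip_unwanted(messy))
```

Keeping the whitelist small is the design choice that matters here: the fewer elements get through, the less the PDF library has to cope with.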
When I've gone through the items I want, I simply pass the XHTML to TCPDF (I've extended the multi-column example for better spacing and formatting) and it does the rest. :)
As for the other libraries: I use SimplePie to parse the feed and loop through the items, and an OPML parser to give users the option of submitting multiple feeds.
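For anyone who hasn't met OPML: it is another small XML format, and a list of feeds is basically a set of outline elements carrying an xmlUrl attribute. A rough Python sketch of pulling the feed URLs out of one is below. It is not the OPML parser used in the project, and the sample document, URLs and the feed_urls name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML export, e.g. from a feed reader.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Staff reading list</title></head>
  <body>
    <outline text="News">
      <outline text="Example Blog" type="rss" xmlUrl="http://example.com/feed.xml"/>
      <outline text="Other Blog" type="rss" xmlUrl="http://example.org/rss"/>
    </outline>
    <outline text="A folder with no feed"/>
  </body>
</opml>"""


def feed_urls(opml_text):
    """Return the xmlUrl of every outline that has one, in document order."""
    root = ET.fromstring(opml_text)
    return [
        outline.get("xmlUrl")
        for outline in root.iter("outline")
        if outline.get("xmlUrl")
    ]


if __name__ == "__main__":
    for url in feed_urls(SAMPLE_OPML):
        print(url)
```

Each URL it returns would then be handed to whatever feed parser you use, one feed at a time.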
If you want to see how all the libraries are used, the makepdf.php source ties them all together.
That's great Keyvan. Thank you very much.
I wish I had known about these libs when I wrote my RSS reader (as yet unpublished). I did everything in PHP and I know that some of it could be done a lot better.
I'd like to discuss this more but don't want to hijack splower's topic. I will send a PM later today.
Good luck with your project splower.
Must rush, work beckons.