Class to read any kind of file format, e.g. .doc, .pdf, .txt
Hi,
My package reads files in .txt format only; it does not read .pdf or .doc files. Is there a single Java library that opens any kind of file format as a stream and then reads or manipulates its contents?
The actual answer to the question as you intend it:
No, and there never will be.
Any application could use anything it wanted for a file format and "manipulate its contents" depends completely on the nature of that application. You cannot expect to have a completely generic solution to an inherently non-generic problem.
Hi,
Thank you both for your replies. If what you are saying is true, does it mean that there cannot be a Java application that reads any kind of file and counts the frequency of a given word in it?
1) I mean, can we not use Java to build an index of keywords?
2) The URL and URLConnection classes give me the HTML contents of a web page. Is there any way to get just the content of the page, such as its text?
What it means is that you need to identify exactly what you want to read and use the appropriate APIs for that content. There are libraries for working with .doc files, .pdf files, etc., but there is no "read anything I might happen to come across" library, because that is a completely unrealistic expectation.
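As a rough illustration of the "identify exactly what you want to read" step, here is a minimal sketch that looks at the first bytes of a file instead of trusting its name. The class and method names (`FormatSniffer`, `sniff`) are made up for this example. It relies on two well-known signatures: PDF files begin with `%PDF-`, and legacy Office files such as .doc begin with the OLE2 compound-file header. The OLE2 header is shared by old .xls and .ppt files too, so a match only narrows things down.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of any library mentioned in this thread.
class FormatSniffer {

    // PDF files start with the ASCII characters "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // OLE2 compound-file header, used by legacy .doc (and also .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns "pdf", "ole2", or "unknown" based on the leading bytes. */
    static String sniff(byte[] header) {
        if (startsWith(header, PDF_MAGIC)) return "pdf";
        if (startsWith(header, OLE2_MAGIC)) return "ole2";
        return "unknown";
    }

    // True when data begins with every byte of prefix, in order.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

You would read the first few bytes of the file, pass them to `sniff`, and then hand the file to whichever format-specific library matches. Plain text has no signature, so it ends up in the "unknown" bucket here.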
There can be a Java application that reads any kind of file and counts the frequency of a given word in it. BUT when the application is given "any" kind of file, it has to recognize the type and parse each type differently. Once the text has been extracted, it can all be treated the same way. An application that deals with "any kind of file" and a single class that deals with "any kind of file" are NOT the same thing.
1. I think I have already given the answer. To be precise: yes, you can.
2. So you want the data from the HTML file without the tags? Then parse the HTML file using DOM or SAX, iterate through the nodes (the tags), and pull out whatever data you want. Another way is to use regular expressions to remove the tags.
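To make the "recognize the type, parse differently, then treat the text the same" idea concrete, here is a minimal sketch. All names in it (`WordCounter`, `TextExtractor`, `countInFile`, and so on) are hypothetical. It recognizes the type naively by file extension and only implements plain text, assuming UTF-8. The .pdf and .doc extractors are left out because they would need a format-specific third-party library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical names throughout; none of this comes from an existing library.
class WordCounter {

    /** Turns one file format into plain text. */
    interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    // One extractor per recognised extension.
    private final Map<String, TextExtractor> extractors = new HashMap<>();

    WordCounter() {
        // Plain text needs nothing beyond the standard library (UTF-8 assumed).
        register("txt", file -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        // "pdf" and "doc" would be registered here with extractors backed by
        // a format-specific third-party library; omitted in this sketch.
    }

    void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Step 1: recognise the type (here, naively, by file extension). */
    static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Step 2: parse with the matching extractor. Step 3: count. */
    int countInFile(Path file, String word) throws IOException {
        TextExtractor extractor = extractors.get(extensionOf(file));
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + file);
        }
        return countInText(extractor.extract(file), word);
    }

    /** Format-independent: counts whole-word, case-insensitive matches. */
    static int countInText(String text, String word) {
        String target = word.toLowerCase(Locale.ROOT);
        int count = 0;
        // Split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(target)) count++;
        }
        return count;
    }
}
```

Only `countInFile` knows about file types. `countInText` works on any string, so adding a new format means registering one more extractor and leaving the counting code alone.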
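For the regular-expression route, a minimal sketch might look like the following. The class name `HtmlText` and its `strip` method are made up for this example. It is deliberately crude: it drops script and style blocks, comments, and tags, decodes only a handful of common entities, and will trip over unusual markup such as a `>` inside an attribute value. Also keep in mind that the standard XML DOM and SAX parsers expect well-formed markup, which many real web pages are not.

```java
import java.util.regex.Pattern;

// Hypothetical helper; a rough approximation, not a real HTML parser.
class HtmlText {

    // <script>...</script> and <style>...</style>, including their contents.
    private static final Pattern SCRIPT_OR_STYLE =
        Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
    // HTML comments.
    private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");
    // Runs of whitespace.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Returns the visible text of the page, approximately. */
    static String strip(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = COMMENT.matcher(text).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Decode a few common entities; "&amp;" goes last so that
        // "&amp;lt;" becomes "&lt;" rather than "<".
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }
}
```

The string you get back from URLConnection can be passed straight to `strip`, and the result can then be fed into whatever word counting or indexing you do for plain text.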