I thought Notepad was a Microsoft program. How do you read it? What do you get when you read the notepad.exe file? Do you mean text files?
The Apache project has some packages (called POI, I think) for reading PDF files.
Google for it.
NormR1
The answer is yes and no. No, if the PDF is actually a batch of previously scanned images that were simply converted from whatever image format to PDF. Those would need to undergo OCR (optical character recognition), which is not 100% accurate.
Yes, if the PDF contains real text. You should be able to do it with iText's PdfReader and get the components each page is built from, or you can use Apache PDFBox, which should be more flexible for extracting data from a PDF.
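To make the "yes and no" concrete, here is a minimal sketch of how a program might decide which case it is in. It assumes the text has already been pulled out of the PDF by a library such as PDFBox or iText (that step is not shown), and simply checks whether almost nothing came back, which is what a scanned, image-only PDF tends to produce. The class name, the method names and the threshold of 20 characters per page are hypothetical, chosen only for illustration.

```java
// Sketch only: decides whether extracted PDF text is so sparse that the
// file is probably a set of scanned images and would need OCR instead.
class OcrHeuristic {
    // Hypothetical threshold: fewer visible characters per page than this
    // suggests the pages are images rather than real text.
    static final int MIN_CHARS_PER_PAGE = 20;

    // Counts the characters that are not whitespace.
    static int visibleChars(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // extractedText: whatever the PDF library returned (may be null or blank).
    // pageCount: number of pages in the document, must be positive.
    static boolean probablyNeedsOcr(String extractedText, int pageCount) {
        if (pageCount <= 0) {
            throw new IllegalArgumentException("pageCount must be positive");
        }
        String text = (extractedText == null) ? "" : extractedText;
        return visibleChars(text) < (long) MIN_CHARS_PER_PAGE * pageCount;
    }

    public static void main(String[] args) {
        // A three-page scan typically yields nothing but whitespace.
        System.out.println(probablyNeedsOcr("  \n\n ", 3));
        // A one-page document with a real text layer.
        System.out.println(probablyNeedsOcr(
                "Invoice 2041: total due 118.50 EUR, payable within 30 days.", 1));
    }
}
```

A check like this is only a rough guess: a PDF can mix scanned pages with real text, so testing page by page may work better than testing the whole document at once.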
@NormR1 POI is for Microsoft Office document formats (Word, Excel, etc.), not PDF.
peter_budo