Convert PDF to text

Question

bhallarahul -4 Light Poster

13 Years Ago

Is there any way to get the content of a PDF row by row/line by line, to help convert PDF to Word using the iText API?

api java pdf

2 Contributors
3 Replies
955 Views
16 Hours Discussion Span
Latest Post 13 Years Ago by peter_budo

All 3 Replies

peter_budo 2,532 Code tags enforcer

13 Years Ago

iText is not intended for converting PDF to other formats. It's only for converting other document formats to PDF.
There are multiple reasons, but the main one is that if the PDF was created from a scanned document, its content is an image that may or may not have gone through OCR (optical character recognition), which is never 100% accurate, so you will never get all the text.
You can always try to read the PDF with PdfReader and extract Chunk, Phrase, Paragraph, List, ListItem, Anchor, Chapter, Section, and Image, but that's about it.

bhallarahul -4 Light Poster · Answer 1 · 2012-01-30T00:22:33+00:00

Thanks for your precious reply. :-)
Can you tell me which API I should choose to convert PDF to text?

peter_budo 2,532 Code tags enforcer Team Colleague Featured Poster · Answer 2 · 2012-01-30T00:31:21+00:00

Apache PDFBox should be better (its description claims PDF-to-text conversion); however, I cannot comment on it since I have never used it.
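
For the line-by-line part of the original question, here is a minimal sketch using only the Java standard library. It assumes some extractor has already returned the PDF text as one String (for PDFBox that would typically be PDFTextStripper.getText, which is not called here because it needs the library on the classpath). The class name PdfTextLines, the method toLines, and the sample text are hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PdfTextLines {

    /**
     * Splits already-extracted PDF text into trimmed, non-empty lines.
     * Assumption: the input comes from a text extractor such as
     * PDFBox's PDFTextStripper (the extraction itself is not shown).
     */
    public static List<String> toLines(String extractedText) {
        List<String> lines = new ArrayList<>();
        if (extractedText == null) {
            return lines;
        }
        // \R matches any line break (\n, \r\n, \r), since extractors differ.
        for (String raw : extractedText.split("\\R")) {
            String line = raw.trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for real extractor output.
        String sample = "Invoice 42\r\n\r\nItem  Qty\nWidget  3\n";
        List<String> lines = toLines(sample);
        for (int i = 0; i < lines.size(); i++) {
            System.out.println((i + 1) + ": " + lines.get(i));
        }
    }
}
```

Note that this only recovers lines as the extractor emitted them. For a scanned PDF without an OCR text layer there may be no text to split at all, as mentioned earlier in the thread.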
