Is there any way to get the content of a PDF row by row / line by line, to help convert PDF to Word using the iText API?

iText is not intended for converting PDF to other formats. It is only for converting other document formats to PDF.
There are several reasons, but the main one is that a PDF created from a scanned document contains only an image of each page. That image may or may not have gone through OCR (optical character recognition), which is never 100% accurate, so you will never get all the text.
You can always try to read the PDF with PdfReader and extract Chunk, Phrase, Paragraph, List, ListItem, Anchor, Chapter, Section, and Image, but that's about it.

Thanks for your precious reply. :-)
Can you tell me which API I should choose to convert PDF to text?

Apache PDFBox should be a better fit (its description claims PDF-to-text conversion); however, I cannot comment on it since I have never used it.
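
For the "line by line" part of the question, here is a minimal sketch of what could happen once the text is out of the PDF. It assumes the text has already been extracted as one String (for example with PDFBox's PDFTextStripper.getText(document), which is not called here), and the class name PdfLines and method splitLines are hypothetical helpers, not part of iText or PDFBox.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of iText or PDFBox.
final class PdfLines {

    private PdfLines() {}

    // Splits already-extracted PDF text into trimmed, non-empty lines.
    // Assumption: extractedText came from a PDF text extractor such as
    // PDFBox's PDFTextStripper.getText(document).
    static List<String> splitLines(String extractedText) {
        List<String> lines = new ArrayList<>();
        // \R matches any line break (\n, \r\n, \r, ...); needs Java 8+.
        for (String raw : extractedText.split("\\R")) {
            String line = raw.trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Each entry of the returned list could then be handled as one row of the document. Keep in mind that plain text extraction generally does not preserve layout such as columns or tables, and it will return nothing useful for scanned pages without OCR.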
