0

I have a bunch of PDF files made from documents scanned into jpeg images. Now, how can i search these pdf documents for text that is in the image.

:!: This does not seem possible to me as well but i'll appreciate any sort of methods. Any modifications that may need to be done to the images before converting them to pdf?? Anything else?

Thanks a lot

3
Contributors
6
Replies
7
Views
10 Years
Discussion Span
Last Post by WaltP
0

It isn't. There is no text in an image. It's just part of the image.

You can try an OCR (optical character reader) program that can read an image (non-compressed, usually). The best one I found is from http://www.transym.com/index.html

Thanks...

Actually I am trying to write a program that'll search the pdfs for keywords. I guess i'll have to associate files with the keywords in the image then. I can think of only this method... Is this the only wayt to do that?

Thanks..

0

Thanks...

Actually I am trying to write a program that'll search the pdfs for keywords. I guess i'll have to associate files with the keywords in the image then. I can think of only this method... Is this the only wayt to do that?

Thanks..

I really have no idea how you plan to "associate files with the keywords in the image" unless you mean connecting text file to an image somehow. Maybe you need to explain in detail what you are trying to accomplish, what type of images (bitmaps, jpg, etc)

0

I have a bunch of images that are a result of scanned newspapers. I need to link the headlines in the newspaper's image to an index. SO, what i was thinking was making a data base to link the image file with keywords of the headlines. I think this is the only way to do that.

0

That makes sense. The only way to do that as far as I know is by hand. As I said, there is no text in an image so you will have to eyeball the headlines and type them in. Even an OCR program won't help that much.

This topic has been dead for over six months. Start a new discussion instead.
Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.