Hello fellow code enthusiasts.

I am currently trying to figure out how to retrieve text from PDF images submitted to my MS Office inbox and put it into an Excel spreadsheet.

At the moment, I plan on using Tesseract as a library (well, as a reference) to do this. Is there a better tool for the job? As you can probably guess, I am trying to stick to free tools.

Thanks in advance!

One thing you could try is VBA, which comes with Office, together with Tesseract. However, from my reading of the docs, Tesseract itself is a command-line program, so you'll have to either run it as an external process from your program or find a .NET wrapper that lets you use it as a library. There are other OCR tools too, try here, but trial and error is pretty much the best way to find out which one works better for your application.
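
To give a rough idea of the "run it from your program" route, here is a minimal sketch in Python rather than VBA (my substitution, purely for illustration). It assumes the tesseract executable is installed and on your PATH, and that each PDF page has already been saved as an image such as a PNG or TIFF, since as far as I know Tesseract does not read PDF files directly. The function names, the CSV layout and the file names are all made up for the example. It writes a CSV file, which Excel can open, instead of a real workbook.

```python
import csv
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the command line: tesseract <image> stdout -l <lang>."""
    # "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file next to the image.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run the tesseract executable and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout


def text_to_rows(source_name, text):
    """Turn OCR output into (source, line number, text) rows, skipping blank lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return [(source_name, number, line) for number, line in enumerate(lines, start=1)]


def write_rows_csv(rows, csv_path):
    """Write the rows to a CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["source", "line", "text"])  # hypothetical column layout
        writer.writerows(rows)


def image_to_csv(image_path, csv_path, lang="eng"):
    """OCR one page image and store its lines in a CSV file."""
    rows = text_to_rows(str(image_path), ocr_image(image_path, lang))
    write_rows_csv(rows, csv_path)
    return len(rows)
```

Calling something like image_to_csv("page1.png", "ocr_output.csv") (both names hypothetical) would then give you one row per recognized line. Getting the attachments out of the inbox and turning the PDF pages into images are separate steps that this sketch does not cover.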
