Hi Friends,
I want to build a resume parser in C#: when we upload resumes (more than 100), it should extract the name, email ID, phone number, and skills from each one.

Please don't tell me that software for this is already available. I tried those programs, but they do not work properly, so I want to write it myself.

Please help me.

The following is Java for extracting text from PDF files with iText:

import java.io.IOException;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

public class PDFToText {

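    public static void main(String[] args) {
        //each command-line argument is treated as the path to a pdf file
        for (int i = 0; i < args.length; i++) {
            try {
                System.out.println(wordsInFile(args[i]));
            } catch (IOException e) {
                System.err.println("Could not read " + args[i] + ": " + e.getMessage());
            }
        }
    }//end main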

    //extracts the text of a given pdf document, one page at a time
    public static String wordsInFile(String pdf) throws IOException {
        PdfReader reader = new PdfReader(pdf);
        PdfReaderContentParser parser = new PdfReaderContentParser(reader);
        TextExtractionStrategy strategy;
        StringBuilder fullText = new StringBuilder();

        //Only a single page in memory at a given time
        for (int i = 1; i <= reader.getNumberOfPages(); i++) {
            strategy = (TextExtractionStrategy) parser.processContent(i, new SimpleTextExtractionStrategy());
            //one page of text, appended to the running result
            fullText.append(strategy.getResultantText()).append('\n');
        }//end loop
        reader.close();

        return fullText.toString();
    }//end method

}//end class

The reason it is relevant is that instead of using iText, you can use iTextSharp, a C# port of iText. I am reading a book called "iText in Action", which covers iText in Java. I think this is where iTextSharp is downloaded.
I am unsure how to parse Word documents, but I seem to remember some CodeProject projects that discuss the process a little. Those projects are quite different from the iText library.
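
Once the text is out of the file, the email, phone number and skills from the original question could be picked out with regular expressions. What follows is only a rough sketch in plain Java with no iText dependency. The class name ResumeFields, the two patterns, and the list of known skills are all my own assumptions for illustration; the patterns are deliberately loose and will need tuning for real resumes. Finding the name is a much harder problem and is not attempted here. The same patterns should carry over to C# with few changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResumeFields {

    // Loose email pattern (an assumption, not a full RFC 5322 check)
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Loose phone pattern: optional +, then at least 10 characters made of
    // digits, spaces, dots, dashes and parentheses, ending in a digit
    private static final Pattern PHONE =
            Pattern.compile("\\+?\\d[\\d ().-]{8,}\\d");

    // Collects every match of the pattern, in order of appearance
    private static List<String> findAll(Pattern pattern, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static List<String> emails(String text) {
        return findAll(EMAIL, text);
    }

    public static List<String> phones(String text) {
        return findAll(PHONE, text);
    }

    // Returns the entries of knownSkills that occur in the text as whole
    // tokens, ignoring case, so "Java" is not reported for "JavaScript"
    public static List<String> skills(String text, List<String> knownSkills) {
        List<String> found = new ArrayList<>();
        for (String skill : knownSkills) {
            Pattern p = Pattern.compile(
                    "(?<![A-Za-z0-9])" + Pattern.quote(skill) + "(?![A-Za-z0-9])",
                    Pattern.CASE_INSENSITIVE);
            if (p.matcher(text).find()) {
                found.add(skill);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the output of wordsInFile
        String text = "Jane Doe\nEmail: jane.doe@example.com\n"
                + "Phone: +91 98765 43210\nSkills: C#, SQL, JavaScript";
        System.out.println(emails(text));
        System.out.println(phones(text));
        System.out.println(skills(text, Arrays.asList("C#", "Java", "SQL", "Python")));
    }
}
```

The skills check only finds words you already listed, so for more than 100 resumes you would probably keep that list in a file and grow it as you see what people write.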
