Dear All,

I have developed a desktop application that searches the contents of Word files in a folder for one or more specific words. It works fine, but it's slow.

I want to apply indexing (or any other technique) to speed up my application. I would like to know how to apply indexing step by step, as I have not used it before.

Thanks in advance ...

Indexing just means you know about the locations of the files ahead of time.
You could:

1a) Have the program scan all directories when it starts and store the full path of every file in a list of strings.
1b) When searching for a particular file, search only that list to get the full path.
1c) You could even make the collection a dictionary that holds the keywords for each path as well.

2a) Do step 1a, then archive that collection, so that whenever the program starts it loads the collection and searches it first.
2b) If the file is not found, it can then search the disk to see if there are new files to add to the collection.

* I'm sure there are other techniques as well, especially if you have stored "document property" information in the Word files.
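
Steps 1a, 1b and 2a could be sketched roughly as follows. This is only an illustration, not a finished implementation: the class name FileListCache, its method names, and the one-path-per-line cache format are assumptions made for the example, and the "*.doc*" pattern assumes the Word files end in .doc, .docx or .docm.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper; all names here are illustrative only.
public static class FileListCache
{
    // Step 1a: scan the root folder once and collect the full paths of Word files.
    // Note: this throws if a subfolder cannot be read (e.g. access denied).
    public static List<string> Build(string rootFolder)
    {
        return Directory
            .EnumerateFiles(rootFolder, "*.doc*", SearchOption.AllDirectories)
            .ToList();
    }

    // Step 2a: archive the list to disk, one path per line.
    public static void Save(IEnumerable<string> paths, string cacheFile)
    {
        File.WriteAllLines(cacheFile, paths);
    }

    // Step 2a: reload the list at startup; an empty list means "no cache yet".
    public static List<string> Load(string cacheFile)
    {
        return File.Exists(cacheFile)
            ? File.ReadAllLines(cacheFile).ToList()
            : new List<string>();
    }

    // Step 1b: look a file up in the in-memory list instead of on disk.
    public static IEnumerable<string> FindByName(IEnumerable<string> paths, string fileName)
    {
        return paths.Where(p => string.Equals(
            Path.GetFileName(p), fileName, StringComparison.OrdinalIgnoreCase));
    }
}
```

For step 2b, one option is to call Build again when FindByName returns nothing and save the refreshed list.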

Thanks for the reply,

I don't just search the file names; I also search the file contents, so I need to open each file and search inside it. This makes the application slow.

So I need a technique that speeds up the performance.

I understand. 1c addresses that, sort of.
You would still need to scan the files before they are searched.

Which part is slow? Building the file list or searching the file? I assume you are using one Word instance to do the searching.

I don't know which part is slow.
What do you mean by using one Word instance to do the searching?

Also, how slow is slow?
...over how many files?

By "one Word instance", I mean something like ...

Word.ApplicationClass wordApp = new Word.ApplicationClass(); // one instance, created before the loop
// For each file in list of doc files
// Open file and
// Search for text and ...

Instead of ...

// For each file in list of doc files
Word.ApplicationClass wordApp = new Word.ApplicationClass(); // a new instance for every file
// Open file and
// Search for text and ...

You might find this useful ... http://goo.gl/vZhrY

To find out which part is slow, refactor your code so you can test building the file list and searching the doc files separately.
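
One way to time the two phases separately is System.Diagnostics.Stopwatch. The sketch below is only a suggestion: PhaseTimer is a made-up helper, and BuildFileList and SearchFiles in the comments stand in for whatever your own methods are called.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for timing one phase of the program.
public static class PhaseTimer
{
    // Runs the given action once and returns how long it took.
    public static TimeSpan Measure(Action phase)
    {
        Stopwatch watch = Stopwatch.StartNew();
        phase();
        watch.Stop();
        return watch.Elapsed;
    }
}

// Usage sketch (BuildFileList and SearchFiles are placeholders for your own code):
//   List<string> files = null;
//   TimeSpan listTime   = PhaseTimer.Measure(() => files = BuildFileList(folder));
//   TimeSpan searchTime = PhaseTimer.Measure(() => SearchFiles(files, "keyword"));
//   Console.WriteLine("List: {0}  Search: {1}", listTime, searchTime);
```

Whichever number dominates tells you where to spend the optimization effort.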

I hope you find this helpful.

Would you mind showing us your code?
Thanks in advance.

Non-sequential hard drive access is slow. Really slow. So if you have a lot of files and no idea what's in them (the locations of the words aren't known ahead of time), there's little you can do about this bottleneck.

As thines said, if you want to do the indexing you could use a dictionary. I would use something like this:

Dictionary<string, List<string>> keywordsToFilePaths;

You would need a background worker that watches the directory structure for changes, then parses these files and loads each word into the dictionary. I would automatically skip files that aren't plain text (or at least don't contain a lot of plain text). The dictionary could be stored in a file on the hard drive, with each keyword hashed for quick lookup when needed. This might be useful if it starts consuming a lot of memory (likely, especially if you are indexing a whole drive).
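
A minimal in-memory sketch of that dictionary might look like the following. It assumes the text of each document has already been extracted (for example through Word); the class name KeywordIndex and its methods are made up for illustration, and the background worker and on-disk storage are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical inverted index: keyword -> paths of the files containing it.
public class KeywordIndex
{
    private readonly Dictionary<string, List<string>> keywordsToFilePaths =
        new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);

    // Assumption: the caller has already extracted the document's plain text.
    // Calling this twice for the same file would record the path twice.
    public void AddDocument(string filePath, string text)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match word in Regex.Matches(text, @"\w+"))
        {
            if (!seen.Add(word.Value))
                continue; // record each word only once per file

            List<string> paths;
            if (!keywordsToFilePaths.TryGetValue(word.Value, out paths))
            {
                paths = new List<string>();
                keywordsToFilePaths[word.Value] = paths;
            }
            paths.Add(filePath);
        }
    }

    // Returns the files containing the keyword (case-insensitive), or an empty list.
    public IReadOnlyList<string> Find(string keyword)
    {
        List<string> paths;
        return keywordsToFilePaths.TryGetValue(keyword, out paths)
            ? paths
            : new List<string>();
    }
}
```

With this in place a search becomes a dictionary lookup rather than opening every file, at the cost of building the index up front and keeping it current.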

Windows does this type of thing for you when indexing is enabled, although I have no idea how to tap into its cache (I've never tried). I bet Google might help you with that one.
