I am working on an application and would like to be able to calculate the efficiency of various algorithms and choose the best one.
The application operates on a list of records that are stored in a file. Operations include adding records, deleting records, and searching for records based on any field in the record.
I know that the File I/O and searching has cost and I am having trouble find a balance between them. Is it more desirable to minimize search time or File I/O? For example, I could use the search from the STL, but that would require reading all of the records from the file before starting the search, so I think that would be bad. The application currently uses a simple search, but I am wondering if there is a better way and generally, how to balance the cost of File I/O and search operations.

//simple search for "name"
while (!File.end()) {
Record temp(File.readLine());
if (name.toUpper() == temp.name().toUpper())
return temp;
}

What kind of algorithms should I look at for this?

Recommended Answers

All 3 Replies

It would depend on the size of the file and how the file was written. If the file is relatively small (less than 100 meg or so) and the computer has lots of free ram then reading the file into memory and searching would be the fastest.

If each record of the file contains multiple fields, such as name, age, street, city, zip etc. etc, then a fix-length binary file can be read a lot faster than a simple text file becuse an entire record can be read at one time. And if the file is already sorted by the search field then you can use binary search algorithms on it. But sorting the file before searching would probably not be very efficient unless there were multiple searches on the same field.

Thank you for your insight. The file will be relatively small, so I can try to optimize my search after reading a list of all records.
The problem that I am running in to while looking for a search algorithm is that I need to be able to do searches for different data fields. If the data field I'm search is a string, in some cases I will need to find out if the search string is a subset of that field for any list item. Sometimes I will also have to try to match the search string to any data field.
I've looked at the STL search algorithms and haven't come up with anything that would allow me to specify both a search string and a match criteria.

Any suggestions?

Ok, here is a simple example that I whipped up in just a few minutes. It creates a vector of people structures then calls std::search_b() to find one of the objects.

#include <vector>
#include <iostream>
#include <algorithm>
#include <string>
using namespace std;

struct person
{
    string fname;
    string lname;
    void init(const char* f, const char*l)
    {
        fname = f;
        lname = l;
    }
};

bool compare1(person& p1, const char* value)
{
    return p1.fname == value;
}

bool compare2(person& p1, const char* value)
{
    return p1.lname == value;
}

int main()
{
    vector<person> people;
    person p;
    p.init("Jones","Tom");
    people.push_back(p);
    p.init("smith","Mary");
    people.push_back(p);
    p.init("Adams","Tom");
    people.push_back(p);
    p.init("Baker","Henry");
    people.push_back(p);
    vector<person>::iterator it = search_n(people.begin(),people.end(),1,"smith",compare1);
    if( it == people.end() )
    {
        cout << "Not found\n";
    }
    else
    {
        cout << "Found\n";
    }
	return 0;
}
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.