I have an issue with my code ... my below code is createing databases(database here is a kd tree) and indexing one image per database. i have two classes LastDatabaseTndexingPolicy and another forwardingDatabaseaccessor.cpp .

I am calling the GetDatabaseToAccess() function from forwardingDatabaseAccessor.cpp class .GetDatabaseToAccess() function is present in LastDatabaseTndexingPolicy class and it returns the database created as a pointer and using that database pointer we call another function which actually indexes the image to the database .

Now my issue is i am not able to have multiple threads act on the following functions as DatabaseAccessor_ptr db which is in the following file is coupled with two functions and however i put locks in the LastDatabaseTndexingPolicy file as below i end up getting synchronization issue ..........

Hence now i am using a single lock in forwardingDatabaseAccessor.cpp and serializing the code,.can anyone suggest me how can i change my design to parallelize this code .........

In ForwardingDatabaseAccessor.cpp we are calling function from LastDatabaseTndexingPolicy as shown below:-

DatabaseAccessor_ptr db is something which needs to be synchroinized. I tried createing 256 databases with one image each and when i run this code i ended up creating 175 databses and though i was restricting in code with locks that every database has only one image i ended up having two images in single database ..... ideally i had to get only one image per database but i got two images in few of them hence instead of 256 database this code created 175 or so databases.

 indexing::IndexReturnValue ForwardDatabaseAccessor::index(cv::Mat image,
            const std::string& imageName, features::Camera::Type indexCameraType,
            features::Camera::Type recogCameraType) {  
        DatabaseAccessor_ptr db = this->IndexingPolicy_ptr->GetDBToIndex();

            return db->index(image, imageName, indexCameraType, recogCameraType);
    }




    In file LastDatabaseTndexingPolicy we have this function GetDatabaseToIndex as follows:----




 DatabaseAccessor_ptr LastDatabaseIndexingPolicy::GetDBToIndex()
         {
            int numDBs;
            std::string db_Name;
            size_t totNoOfIndexedImagesInDB;
            DatabaseAccessor_ptr db;
            std::vector<std::string>::iterator endIter;
            {
                READING_LOCK(lock, adbMutex);
                numDBs = this->associateDBs->GetNumberOfAssociateDBs();
                endIter = this->associateDBs->end();
                endIter--;

            }
            if (numDBs == 0) {

                std::string newDBName = this->CreateAssociateDBs(numDBs);
                {
                    WRITING_LOCK(lock, dbMutex);
                    db = dbGroup.getDatabase(newDBName);
                    return db;
                }

            }

            else {


                    WRITING_LOCK_ADV(writeLock, dbMutex);
                    totNoOfIndexedImagesInDB = dbGroup.getDatabase(*endIter)->size();
                    if (totNoOfIndexedImagesInDB >= (this->maxNumberOfImagesDBCFG))
                    {
                        writeLock.unlock();
                        std::string newDBName = this->CreateAssociateDBs(numDBs);
                        {
                        WRITING_LOCK(lock, dbMutex);
                        db = dbGroup.getDatabase(newDBName);
                        return db;
                        }
                    }
                    else
                    {

                        db = dbGroup.getDatabase(*endIter);
                        return db;
                        writeLock.unlock();

                    }
                    return db;
                }
            }






        std::string IndexToLastDB::CreateAssociateDBs(int numDB) {



                std::ostringstream convert;
                convert << (numDB + 1);
                std::string newDBName = this->dbName + "_server_fwd_" + convert.str();
                std::string db_properties = "type=ripe";
                LOG(info,dbName) << "Name of the new database: " << newDBName;
                {
                    WRITING_LOCK(lock, dbMutex);
                    dbGroup.createDatabase(newDBName, db_properties);
                }

                {
                    WRITING_LOCK(lock, adbMutex);
                    this->associateDBs->AddAssociateDB(newDBName);
                }
                {
                    WRITING_LOCK(fileLock, fileMutex);
                    util::serialization::save<indexing::forward::AssociateDBCollection>(
                            *this->associateDBs, dbFile.string(), serializationFormat);
                }

                return newDBName;

            }

WRITING_LOCK" is implementing boost lock guard with shared mutex and its designed in such a way such that it gets destroyed when its out of scope .

WRITING_LOCK_ADV is boost unique lock with shared mutex which is implemented in our code such that it allows us to unlock before it gets out of scope , but we cannot unlock WRITING_LOCK before it gets out of scope

But my real issue is :-

 indexing::IndexReturnValue ForwardDatabaseAccessor::index(cv::Mat image,
            const std::string& imageName, features::Camera::Type indexCameraType,
            features::Camera::Type recogCameraType) {  

        DatabaseAccessor_ptr db = this->IndexingPolicy_ptr->GetDBToIndex();

            return db->index(image, imageName, indexCameraType, recogCameraType);
    }

there would be race condition however i lock the database creation code in LastDataIndexingPolicy class as "DatabaseAccessor_ptr db " in the above code in ForwardDatabaseAccessor.cpp class where i would be calling GetDBToIndex() is associated with two functions calls . GetDBToIndex() and Index()

like for example thread 1 is returned a database1 pointer and another thread would also be returning datbase pointer then instead of thread one indexing in database1 it indexs in db2 hence there would be two images in database2 though we have restricted in code for having only one image in each databases. Actually i am sorry i am not able to articulate my issue properly .

Here is what I think is happening:

thread-1              thread-2
LOCK(READ)            
read numDBs == X      
UNLOCK(READ)          
YIELD         --->    LOCK(READ)
                      read numDBs == X
                      UNLOCK(READ)
                      LOCK(WRITE)
                      create DB: "Y_server_fwd_X"
                      result <- getDB("Y_server_fwd_X")
                      UNLOCK(WRITE)
                      return result
LOCK(WRITE)   <---    YIELD
create DB: "Y_server_fwd_X"
get DB("Y_server_fwd_X")
result <- getDB("Y_server_fwd_X")
UNLOCK(WRITE)
return result

As you can see, the problem is in the fact that two threads might, concurrently, find the same amount of databases in the group, leading to both threads creating a database with the same name. Presumably, when you "get" the database pointer again by looking it up by name, you will get the same pointer in both cases, leading to the indexing of multiple images to the same database.

You need to have a lock over the entire operation. In this case, the simpler solution seems to be the best, just lock the entire operation with one mutex, instead of this complicated scheme.

thanks a lot for your reply. Yes what you are saying is right , hence if i put a lock to the entire operation then i am making the indexing operating serial and not utilizing the parallel threads .But it is impossible with this code to have parallel threads do indexing . Is there no way i can redesign this to make my indexing parallel???

You could do something like this:

thread-1              thread-2
LOCK(GETNAME)            
DBName = getNewDBName()
DBName == "Y_server_fwd_10"
UNLOCK(GETNAME)          
YIELD         --->    
                      LOCK(GETNAME)
                      DBName = getNewDBName()
                      DBName == "Y_server_fwd_11"
                      UNLOCK(GETNAME)
                      pDB = new DB(DBName)
                      pDB->index(..)
              <---    YIELD
pDB = new DB(DBName)
pDB->index(..)
LOCK(ADDDB)
addDBToGroup(pDB)
UNLOCK(ADDDB)
YIELD         --->    
                      LOCK(ADDDB)
                      addDBToGroup(pDB)
                      UNLOCK(ADDDB)

The point here is that the two operations that you really need to "lock" are (1) obtaining a new and unique name for a new database, and (2) adding the new database to the group of databases. The creation of the database (construction, and allocation with "new") is something that is independent between threads, since each thread deals with its own database. Similarly for the indexing of the image into the database. So, you could lock a function that returns a new unique name for a new database (that also increments the "count" that is used to generate it), and then lock a function to add a database to the group. And then, everything else can be done concurrently (presumably, the most expensive operations are the creation of the database and the indexing of an image, not those two small operations that are locked).

This article has been dead for over six months. Start a new discussion instead.