hello!
i am in a great dilemma.iam a student.ilove writing programs and love experimenting.

i was writing a code in C to enter all directories and their files(in my computer) into a hash table and then to search for the presence of a file.if the file is present,itmust display the pathname.the hash function i have used is sha1.i can reach the directories but am not been able to reach the files.the hash table entry for a file corresponds to its pathname.so as i reach a directory which contains files,it must be entered to the hash table with its pathname as the key.do i need to send my code?
i hope i made myself clear.
plzzzzzzzz help.

Recommended Answers

All 9 Replies

Let me get this straight because your communication skills are lacking. You have a hash table with an exceptionally slow hash function (FNV would be faster, and since you need to handle collisions due to duplicates anyway, a cryptographic hash is a little much for table lookup). You're keying the table on file and directory names, and the value paired with the key is the path of the file or directory name?

>i can reach the directories but am not been able to reach the files
I don't understand this at all. Are you having trouble traversing the file structure to get to each file? Or are you having trouble searching the hash table for the files?

Could a super mod move this thread to the C and C++ forum, please?

yes,i am having trouble in traversing the file structure to get to each file.
i need to enter the files and directories in the hash table.and then i need to search the hash table for the files.my logic was to check the contents of each directory,starting from /.
if its a file,then file is entered to the hash table and the corresponding key to be hashed is its pathname.that is i pair filename(data)and pathname(key).
if its another subdir.,theni should traverse that subdir.,traverse its files/subdir.,insert if files and the procedure should go on recursively.

after all entry is made,i would search for the file and it should give me the pathname.

i dont know about FNV.i decided on sha1 or md5.also,i want to know about a better searching algorithm.
thanks for the reply.please suggest me a better metod to implement if my logic seems to be wrong.


Let me get this straight because your communication skills are lacking. You have a hash table with an exceptionally slow hash function (FNV would be faster, and since you need to handle collisions due to duplicates anyway, a cryptographic hash is a little much for table lookup). You're keying the table on file and directory names, and the value paired with the key is the path of the file or directory name?

>i can reach the directories but am not been able to reach the files
I don't understand this at all. Are you having trouble traversing the file structure to get to each file? Or are you having trouble searching the hash table for the files?

Could a super mod move this thread to the C and C++ forum, please?

>yes,i am having trouble in traversing the file structure to get to each file.
How are you trying to traverse the file structure?

>file is entered to the hash table and the corresponding key to be hashed is its pathname
Why? It makes more sense for the file name to be the key and the paths for each occurance of the file to be the value.

>i dont know about FNV
http://www.eternallyconfuzzled.com/tuts/hashing.html#fnv

i used getwd() to get the current working directory.by using scandir()
listed all the files and directories in the current working directory.i called a function to check whether the listed component is a directory or not.if its a file i entered it to the hashtable(pathname being its dir appended to parent dir .....and eventually to its current working dir).if its a dir again i called scandir() to list its components but its not working.
yes , its a good idea to enter the file as key.
please suggest a correct logic if i am wrong.
thanks.

>yes,i am having trouble in traversing the file structure to get to each file.
How are you trying to traverse the file structure?

>file is entered to the hash table and the corresponding key to be hashed is its pathname
Why? It makes more sense for the file name to be the key and the paths for each occurance of the file to be the value.

>i dont know about FNV
http://www.eternallyconfuzzled.com/tuts/hashing.html#fnv

>but its not working
Have you tested it in isolation? If so, and it still fails to work, post your code and we can look it over for you.

You didn't mention the operating system you are using. If MS-Windows, then you can use _findfirsrst() and _findnext() to transverse through the file system. When a sub-directory is found, just merly make a recursive call to the function.

*nix is somewhat different. It uses opendir() and readdir(). I don't know *nix enough to explain more, but you progably goole for those functions and you should be able to find code.

Here is the basic algorithm for searching MS-Windows file system. If you use it, you will have to modify it to suit your application -- such as collect filenames in a hash table.

//
// include files
#include <stdio.h>
#include <io.h>
#include <string.h>

void FileFinder(const char* path)
{
	struct _finddata_t data;
	char buf[255];
	// Crete the initial filespec string, which may contain * or ?
	// wild cards.  To specify all files, use "*.*".  You can also limit
	// the search to certain file extension, for example to find all the
	// files with .txt extension the filespace is "*.TXT" -- MS-Windows is
	// not case sensetive, so *.TXT is the same as *.txt and *.tXt.
	strcpy(buf,path);
	strcat(buf,"\\*.*");
	long n = _findfirst(buf,&data);
	if(n >= 0)
	{
		do {
			// check if not current or parent directories
			if(strcmp(data.name,".") != 0 && strcmp(data.name,"..") != 0)
			{
				if(data.attrib & _A_SUBDIR)
				{
					// this is a directory.  So make recursive call
					// to search it.
					strcpy(buf,path);
					strcat(buf,"\\");
					strcat(buf,data.name);
					FileFinder(buf);
				}
				else if( data.attrib & (_A_NORMAL | _A_HIDDEN | _A_ARCH))
				{
					// normal filename, so just display it
					printf("%s\\%s\n",path,data.name);
				}


			}
		// get next file or directory name
		}while(_findnext(n,&data) == 0);
		// close the find handle
		_findclose(n);
	}
}


int main(int argc, char* argv[])
{
	FileFinder("c:\\windows");
	return 0;
}

>You didn't mention the operating system you are using.
I think it's safe to assume Linux since Windows doesn't have a / directory (though I suppose C:\ serves that purpose on a default installation), nor is scandir() a function that I've ever seen on a Windows box.

>It uses opendir() and readdir().
Which is just a more verbose and flexible way to get the same effect as calling scandir(). By the way, you can google with "man function" if you aren't on a Linux or Unix system to see how they work. It's easier than saying "I don't know about this, but here's an identical solution that I heard someone talk about once".

>>It uses opendir() and readdir().
Which is just a more verbose and flexible way to get the same effect as calling scandir(). By the way, you can google with "man function" if you aren't on a Linux or Unix system to see how they work. It's easier than saying "I don't know about this, but here's an identical solution that I heard someone talk about once".

I never heard of scandir(), but I'm glad *nix os writers got smart and made it a little easier. :p And glad you mentioned how to google for man pages -- good to know. :idea:

yes we are using linux platform.
we are sending the code along.
its just a scratch,please help us develop it.

#include <sys/types.h>
#include <sys/dir.h>
#include <sys/param.h>
#include <stdio.h>
#include<strings.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<limits.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/stat.h>
#include<dirent.h>

 
#define FALSE 0
#define TRUE !FALSE



#define MAXCOMMAND 100
#define HASHELEMENTS 10007

struct htab {
   struct htab *child;
   struct htab *parent;
   char key[MAXCOMMAND];
   char data[HASHELEMENTS];
};

unsigned int hash(char *);
struct htab *addhash(char *, char *);
//struct htab *findhash(char *);
struct htab *hashtab[HASHELEMENTS];
//newstart(char *,char *); 

 struct htab *curhash;
   struct htab *newhash;

//struct direct **files;
                //int file_select();




 
extern  int alphasort();
 
char pathname[MAXCOMMAND];
char pathname2[MAXCOMMAND]; 

unsigned int hash (char *s) {
   //int i, n;                  // Our temporary variables
   unsigned int hashval;


   hashval  = 0;                // Initialize our variables



  for(; *s !='\0';s++)

  hashval=*s+(hashval<<5)-hashval;

      return hashval % HASHELEMENTS;
}



struct htab *addhash (char *key, char *data)
 {
   struct htab *newhash;
   struct htab *curhash;
   unsigned int i, hashval;
    

   newhash = (struct htab *)(malloc(sizeof(struct htab)));
   if (newhash == NULL) {
   }

   for(i = 0; i <= strlen(key); i++) {
      newhash->key[i] = key[i];
   }
   for(i = 0; i <= strlen(data); i++) {
      newhash->data[i] = data[i];
   }

   hashval = hash(key);

   if (hashtab[hashval] == NULL) {
      hashtab[hashval] = newhash;
      hashtab[hashval]->parent = NULL;
      hashtab[hashval]->child = NULL;
   }


else {
      curhash=hashtab[hashval];
      while(curhash->child != NULL) {
         curhash=curhash->child;
      }
      curhash->child = newhash;
      newhash->child = NULL;
      newhash->parent = curhash;
   }

   return newhash;
}


int dircheck( char *vut[100])
{
        int s=0;
        struct  stat statinfo;
        //vut=(char *)(malloc(sizeof(char)));

          stat(vut,&statinfo);
                if(statinfo.st_mode&S_IFDIR)
                       {
                        s=1;
                        printf("is a directory\n");
                       }                        
                else    printf(" is not a directory\n");
                        return s;

}


main()   
{ 
   struct htab *curhash;
   struct htab *newhash; 
                  int count,i,s;
char *key,*data,*key1;
		struct direct **files;
		int file_select();
 
		if (getwd(pathname) == NULL )
			{ printf("Error getting path$\backslash$n");
					exit(0);
						}
				printf("\nCurrent Working Directory = %s\t\n",pathname);
                               
                                     
                                  key=pathname;
                           //strcpy(key,pathname);
                             // printf("\n%s\t\n",key);
                                 key1=pathname;    
                             // printf("\n%s\t\n",key1);
                             //  printf("pathname %s\t\n",pathname);         
                                   newstart(pathname,key);
                                         }

               newstart1(char *pathname,char *key)
                           {
                               // printf("\n%s\t\n",pathname);
                               // char *tst;
                              int count;
                               int i;
                                 struct direct **files;
                                     int file_select();
                                count =scandir(pathname, &files, file_select, alphasort);

                                /* If no files found, make a non-selectable menu item */
                                if (count <= 0)
                                        {
                                          printf("\nNo files in this directory\n");
                                                                
                                        }
                printf("\n\n Number of components THIS directory= %d",count);
                                        for(i=1;i<count+1;i++)
                                          {
                                            //  printf("\n%s",files[i-1]->d_name);
                                              //tst= files[i-1]->d_name;
                                               recur(files[i-1]->d_name,key);
                                                    }
                            }




              newstart(char *pathname,char *key)
                           {
                               // printf("\n%s\t\n",pathname);
                               // char *tst;                               
				   int count;
                               int i;
                                 struct direct **files;
                                     int file_select();
                         	count =scandir(pathname, &files, file_select, alphasort);
 
				/* If no files found, make a non-selectable menu item */
				if (count <= 0)
		                	{		
                                          printf("\nNo files in this  directory\n");
					 		
				        }
	       	printf("\n\n Number of components in MAIN  directory= %d\n",count);
                                        for(i=1;i<count+1;i++)
                                          {
                                             // printf("\n%s",files[i-1]->d_name);
                                              //tst= files[i-1]->d_name;
                                               recur(files[i-1]->d_name,key); 
                                                    }                                  
                            }
                             
                                
              /* compo(int count,char *key )                  
                                   {
                                        printf("%d",count);
                                        printf("%s\t",key); 
                                 	int i;
                                         struct direct **files;
                              		for (i=1;i<count+1;++i)
                                        {    printf("%s",files[i-1]->d_name);
                                            // recur(i,key);
                                        }
                                   }*/
                                     
               recur(char *data,char *key)
                                    {
                                          // printf("%s",data);
                                           // printf("%s",key);
                                     char *k1,*k2,*key2;
                                               int s;
                                                //struct direct **files;
                                               // int file_select();   
                                             //    printf("\n are u");
                                         
                                               //  printf("\n still?");
				
                                               //  printf("\n yes i am");
                                           //data=files[i-1]->d_name;
                                                 //printf("\n%s",data);
                                               curhash = addhash(key,data);
                                             
                            printf("\n\nAdded \'%s\' to \'%s\'\n", curhash->data, curhash->key);
                            printf("curhash: %lx\n", curhash);
                            if (curhash->parent == NULL) {
                                printf("No Parent!\n");
                                    }
                               else {
                            printf("Parent: %lx\n", curhash->parent);
                                    }
                            if (curhash->child == NULL) {
                            printf("No Child!\n");
                                                        }
                                else {
                             printf("Parent: %lx\n", curhash->child);
                                     }
                                       
                           s = dircheck(*data);   
                         //   printf("value of s is %d",s);
                                     if(s==1)
                               
                            {         
                                      key2=key;
                                     
                                      k1  = strcat(key2,"/");
                                      k2  = strcat(k1,data);
                                    // printf("\n concatn:::");
                                    //  printf("%s",k2);
                                     *key2=*k2;
                                     strcpy(pathname2,k2);
                                  //    printf("\n pathnamesss %s",pathname);
                                      
                                      newstart1(pathname2,key2);
                          
                             }           
                                      
                          }
				//printf("$\backslash$n"); /* flush buffer */
                                  
		



int file_select(struct direct   *entry)
 
		{if ((strcmp(entry->d_name, ".") == 0) ||
						(strcmp(entry->d_name, "..") == 0))
						 return (FALSE);
				else
								return (TRUE);
		}
/*int file_select(struct direct   *entry)
 
		{
                       char *ptr;
				 
                       char *rindex(char *s, char); 
                 if ((strcmp(entry->d_name, ".")== 0) ||(strcmp(entry->d_name, "..") == 0))
		 return (FALSE);
 
				 /* Check for filename extensions */
		/*ptr = rindex(entry->d_name, '.');
		  if ((ptr != NULL) && ((strcmp(ptr, ".c") == 0)||  (strcmp(ptr, ".h") == 0) || (strcmp(ptr, ".o") == 0)))
						    return (TRUE);
				                              else
						 return(FALSE);
		}*/

<< moderator edit: added code tags: [code][/code] >>

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.