In my program, I pick a pivot p from a database S of strings, compute the median r of the distances of the string objects to p, and then divide the objects into roughly equal-sized subsets S1 and S2 as follows:
S1 = {o ∈ S \ {p} | d(p,o) < r} and
S2 = {o ∈ S \ {p} | d(p,o) >= r}
where d(p,o) is the distance of the database object o to pivot p. Thus, the objects in S1 are inside the ball of radius r around p, while the objects in S2 are outside this ball. Applying this rule recursively leads to a binary tree, where a pivot object is stored in each internal node, with the left and right subtrees corresponding to the subsets inside and outside the corresponding ball, respectively.

Below is my function for the above criteria, but when I call it from main it gives a floating point exception. I would really appreciate it if someone could help me find the errors in my code and show me where I am going wrong:

void  query_ball(TreeNode *t, TreeNode *n,int N)
{

string str;
int *arr=new int[1000];
char *B=new char[40]; // array to store  strings from text.txt
char *C=new char[40]; // C is pivot
int v;
vector<string> myStrings(N);
int r,n1,n2;
ifstream file("text.txt"); // this file contains the string objects

for(p=0;p<N;p++)
{      
   getline(file,str);
   strcpy(B,str.c_str());
   myStrings[p]=B;
}

 int pos=0;
 pos=rand()%N;
 strcpy(C,myStrings[pos].c_str());

for(p=0;p<N;p++)
{
strcpy(B,myStrings[p].c_str());
arr[p]=edit(B,C); // arr contains the edit distances of the strings to pivot C 
//cout<<arr[p]<<endl;
}
       
           
n1=N/2;
n2=(N/2)+1;
r=int((arr[n1]+arr[n2])/2); // r is the median of string distances to C
   
TreeNode *np=new TreeNode;
 np->val=C;
if(t==NULL)
  t=np;
else  
   t->np;
 
for(p=0;p<N;p++)
{
strcpy(B,myStrings[p].c_str());
 if(edit(B,C)>=r)  // string objects outside ball of radius r
  {   n->val=B;
      if(t->RChild==NULL)
         {
            t->RChild=n;}
 
       else
           { query_ball(t->RChild,n,int(N/2));}
   }   
   if(edit(B,C)<r)  // string objects inside ball of radius r
      { n->val=B;
        if(t->LChild==NULL)
         { t->LChild=n;}
       else
         {  query_ball(t->LChild,n,int(N/2)); }
 
      }
}

PreOrder(Root);  
  
}
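
A note on the crash, inferred from the code above rather than confirmed anywhere in this thread: on many platforms an integer division or modulo by zero is reported as a "floating point exception", so rand()%N is a likely culprit once the recursive calls have halved N down to 0. Separately, arr is never sorted, so arr[N/2] is not the median, and for an even count the two middle elements of a zero-based array are N/2-1 and N/2. A minimal sketch of both steps follows; pick_pivot_index and median_distance are hypothetical helper names, not part of the original program.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Returns a random index in [0, n). The caller must not pass n == 0,
// because rand() % 0 is undefined and typically raises SIGFPE.
std::size_t pick_pivot_index(std::size_t n) {
    assert(n > 0);
    return static_cast<std::size_t>(std::rand()) % n;
}

// Median of the distances. The vector is taken by value so the caller's
// copy keeps its original order; the local copy is sorted first.
double median_distance(std::vector<int> d) {
    assert(!d.empty());
    std::sort(d.begin(), d.end());
    const std::size_t mid = d.size() / 2;
    if (d.size() % 2 == 1) return d[mid];   // odd count: single middle value
    return (d[mid - 1] + d[mid]) / 2.0;     // even count: mean of the two middle values
}
```

With this, the recursion would stop before calling pick_pivot_index whenever the subset it receives is empty.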

Given my problem, how can I call the function recursively to create a binary tree so that a pivot object is stored in each internal node, with the left and right subtrees corresponding to the inside and outside of the ball of radius r, respectively? I have a feeling my code is wrong. I would welcome it if somebody could outline the steps of the algorithm.

Any help much appreciated.

Let's work through the binary tree.
Here is what your algorithm says:

I pick a pivot p from a database S of strings, compute the median r of the distances of the string objects to p, and then divide the objects into roughly equal-sized subsets S1 and S2 as follows:
S1 = {o ∈ S \ {p} | d(p,o) < r} and
S2 = {o ∈ S \ {p} | d(p,o) >= r}
where d(p,o) is the distance of the database object o to pivot p. Thus, the objects in S1 are inside the ball of radius r around p, while the objects in S2 are outside this ball.

First, analyze your problem in detail.
How to find a pivot is the first question that comes to my mind. If I find the pivot by dividing the entire database S of strings by 2, that means I need to know its bounds, am I right? Or let's forget about dividing by 2: if we choose a random pivot, what if the index goes out of bounds? You are doing the right thing by taking it modulo N.
Then we need to calculate the median. You can use a simple formula over the vector indices to calculate it (once the distances are sorted).

Next, calculate the distance from the pivot to each element in the string database S and store the distances in some collection. If you know vector, use a vector instead of plain arrays.

OK, let's come to the binary tree.
I don't know what your edit function does.
First insert the median node.
Then traverse the entire vector and insert the elements using the simple BST insert algorithm.
The left subtree contains the objects inside the ball and the right subtree contains the objects outside the ball.
Hope this helps.

The nodes of my tree should be objects and not distances. So the root of the tree should be a pivot object, while the left subtree should hold the objects whose distance from the pivot is less than the radius r. Similarly, the right subtree should hold the objects whose distance from the pivot is greater than or equal to r. My edit function computes the edit distance between the pivot and the other string objects. The edit distance between two strings is the minimum number of insertions, deletions and/or replacements of characters needed to change one string into the other.
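
The edit function itself is not shown in this thread. As a stand-in, here is a minimal sketch of the standard dynamic-programming (Levenshtein) computation of the edit distance described above. The name edit_distance and the std::string parameters are assumptions for illustration; the original edit() takes char pointers.

```cpp
#include <algorithm>
#include <cassert>   // used only by the check in the usage note below
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Minimum number of insertions, deletions and replacements that turn a into b.
// Keeps only two rows of the usual (|a|+1) x (|b|+1) table.
int edit_distance(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = static_cast<int>(j);             // "" -> b[0..j): j insertions
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);              // a[0..i) -> "": i deletions
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int replace = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1,        // delete a[i-1]
                               cur[j - 1] + 1,     // insert b[j-1]
                               replace});          // keep or replace
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}
```

For instance, assert(edit_distance("kitten", "sitting") == 3); holds: two replacements and one insertion.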
Now I am not sure whether I am doing lines 38-41 correctly. What I am trying to do there is check whether the tree node is null; if it is, the pivot object is assigned to the tree node. I guess I don't need the else branch, right? Another problem I am having is with recursion. I need to update r. Am I doing it right in lines 53 and 60? For example, in line 53 I have
query_ball(t->RChild,n,int(N/2)), where I pass N/2 to update r in successive rounds. Is this logic right?
In other words, I badly need help with the recursion to build a binary tree where a pivot object is stored in each internal node, with the left and right subtrees corresponding to the objects inside and outside the corresponding ball of radius r, respectively.
Any help much appreciated.
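
One way to organise the recursion asked about above is to hand each call the subset of strings it is responsible for, so that r is recomputed from that subset instead of being derived from N/2. What follows is a sketch under assumptions, not the poster's code: BallNode, build and count_nodes are hypothetical names, the distance function is passed in as a parameter, and the pivot is simply the last element of the subset (rather than a random one) to keep the example deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct BallNode {
    std::string pivot;                   // each node owns a copy of its string
    double radius = 0.0;                 // median distance r for this node
    std::unique_ptr<BallNode> inside;    // objects with d(p,o) <  r
    std::unique_ptr<BallNode> outside;   // objects with d(p,o) >= r
};

template <class Dist>
std::unique_ptr<BallNode> build(std::vector<std::string> items, Dist dist) {
    if (items.empty()) return nullptr;   // base case: nothing to store

    std::unique_ptr<BallNode> node(new BallNode);
    node->pivot = items.back();          // assumption: last element as pivot
    items.pop_back();                    // S \ {p}
    if (items.empty()) return node;      // leaf: no objects left to split

    // Distances of the remaining objects to the pivot, in input order.
    std::vector<int> d;
    for (const std::string& s : items) d.push_back(dist(node->pivot, s));

    // Median of a sorted copy becomes the radius r of this node.
    std::vector<int> sorted = d;
    std::sort(sorted.begin(), sorted.end());
    const std::size_t mid = sorted.size() / 2;
    node->radius = (sorted.size() % 2 == 1)
                       ? sorted[mid]
                       : (sorted[mid - 1] + sorted[mid]) / 2.0;

    // Split into S1 (inside the ball) and S2 (outside the ball).
    std::vector<std::string> s1, s2;
    for (std::size_t i = 0; i < items.size(); ++i)
        (d[i] < node->radius ? s1 : s2).push_back(items[i]);

    node->inside = build(std::move(s1), dist);
    node->outside = build(std::move(s2), dist);
    return node;
}

// Number of nodes in the tree; should equal the number of input strings.
std::size_t count_nodes(const BallNode* n) {
    return n ? 1 + count_nodes(n->inside.get()) + count_nodes(n->outside.get()) : 0;
}
```

Because every call removes its pivot before splitting, each subset passed down is strictly smaller than the one received, so the recursion terminates even when all distances are equal.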

OK, I have modified my code as below, but when I print the tree in preorder it shows just 3 strings although I have 1000 strings in my database.
Please tell me how I can modify my code to print all 1000 strings in preorder. Here it is:

void query_ball()
{
string str;
int i=0,j=0,p,num;
int *arr=new int[1000];
char *B=new char[40]; // array to store each string from text.txt
char *C=new char[40]; // C is pivot
vector<string> myStrings(1000);
double r;
int n1,n2,temp;   
ifstream file("text.txt"); // this file contains the string objects
 
for(p=0;p<1000;p++)
{
   getline(file,str);
   strcpy(B,str.c_str());
   myStrings[p]=B;
}
 int pos=0;
 pos=rand()%1000;
 strcpy(C,myStrings[pos].c_str());

for(p=0;p<1000;p++)
{
strcpy(B,myStrings[p].c_str());
arr[p]=edit(B,C); // arr contains the edit distances of the strings to pivot C 
}

n1=1000/2;
n2=(1000/2)+1;
r=(arr[n1]+arr[n2])/2; // r is the median of string distances to C 

TreeNode *np=new TreeNode;
TreeNode *n=new TreeNode;
 np->val=C;
if(Root==NULL)
 { Root=np;

 }
for(p=0;p<1000;p++)
{
strcpy(B,myStrings[p].c_str());
 if(edit(B,C)>=r)  // string objects outside ball of radius r
  {
           
     n->val=B;
      if(Root->RChild==NULL)
         {
            Root->RChild=n;}

       else
           {Root->RChild=Root->RChild->n;
 
           }
    }
   if(edit(B,C)<r) // string objects inside ball of radius r
      {
        n->val=B;
        if(Root->LChild==NULL)
         { Root->LChild=n;}
       else
         { Root->LChild=Root->LChild->n;
   
          }
      }
      
}
            
}

I don't know what your TreeNode does or what its members are.
One visible problem is that you have 1000 data elements,
but you create only 2 nodes, outside the loop. What about the other 998 nodes? You are not allocating space for them.

Your code:

for (int p=0; p<1000; ++p)
if (Root == 0)
{
   Root = new TreeNode();
   Root->val = C;
}
else
{
   TreeNode* temp = Root;
   while (temp)
   {
      if (edit(B,C)<r)
      {
         if (temp->LChild)
            temp = temp->LChild;
         else
         {
            TreeNode* t = new TreeNode();
            t->val = B;
            temp->LChild = t;
            break;
         }
      }

      if (edit(B,C)>=r)
      {
         if (temp->RChild)
            temp = temp->RChild;
         else
         {
            TreeNode* t = new TreeNode();
            t->val = B;
            temp->RChild = t;
            break;
         }
      }
   }
}

Hope this helps.

I added braces around the for loop body of your code and ran it. It just prints the same string 1000 times. What could be wrong? Please help.
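
A probable reason for the identical output: t->val = B stores the address of the one shared buffer B in every node, so all nodes print whatever B holds at the time of printing, whether or not strcpy is called inside the loop. A minimal sketch of the difference follows, using hypothetical AliasNode and OwningNode types (not the TreeNode from this thread) and a local buffer size chosen only for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

struct AliasNode  { const char* val; };   // stores only an address
struct OwningNode { std::string val; };   // stores its own copy of the text

// Mimics the loop above: one reusable buffer, one node per input string.
// Returns what each AliasNode would print after the loop has finished.
std::vector<std::string> print_aliased(const std::vector<std::string>& in) {
    char buf[64];
    std::vector<AliasNode> nodes;
    for (const std::string& s : in) {
        assert(s.size() < sizeof buf);       // leave room for the terminator
        std::strcpy(buf, s.c_str());
        nodes.push_back(AliasNode{buf});     // every node points at buf
    }
    std::vector<std::string> out;
    for (const AliasNode& n : nodes) out.push_back(n.val);
    return out;
}

// Same loop, but each node copies the characters out of the buffer.
std::vector<std::string> print_owned(const std::vector<std::string>& in) {
    char buf[64];
    std::vector<OwningNode> nodes;
    for (const std::string& s : in) {
        assert(s.size() < sizeof buf);
        std::strcpy(buf, s.c_str());
        nodes.push_back(OwningNode{buf});    // std::string makes its own copy
    }
    std::vector<std::string> out;
    for (const OwningNode& n : nodes) out.push_back(n.val);
    return out;
}
```

Given the inputs "AAB", "BBA", "ABA", print_aliased yields "ABA" three times, while print_owned returns the three inputs unchanged.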

I figured out that the line strcpy(B,myStrings[p].c_str()); is missing in Ahmed's code. Even if I include it as the first line of the while statement, it prints the same string 1000 times. I have adapted his code to my previous code, now allocating space for 1000 nodes, but still only two strings get printed. By the way, the definition of TreeNode is as follows:

class TreeNode {
public:
   TreeNode *LChild;
   char *val;
   TreeNode *RChild;
   TreeNode *t;
   TreeNode() {
      LChild=NULL;
      RChild=NULL;
   }
};

Below is my modified code, for which only two strings get printed:

for (p=0; p<1000; p++)
{
  //strcpy(B,myStrings[p].c_str());
  if (Root == NULL)
{
   Root = new TreeNode();
   Root->val = C;
 }
else
{
  TreeNode* temp = Root;
  if(temp!=NULL)
    {   strcpy(B,myStrings[p].c_str());

        if (edit(B,C)<r)
        {

           if (temp->LChild==NULL)
            { TreeNode* t = new TreeNode();
              t->val = B;

              temp->LChild=t;

            }
       else
           {
                     TreeNode* t = new TreeNode();
                     t->val = B;
                      temp->LChild = temp->LChild->t;

           }
        }
  if (edit(B,C)>=r)
        {
 
           if (temp->RChild==NULL)
              {   TreeNode* t = new TreeNode();
                   t->val = B;

                 temp->RChild=t;
              }
          else
           {
              TreeNode* t = new TreeNode();
              t->val = B;
               temp->RChild = temp->RChild->t;

           }
        }
   }
}
        
}
}

I would really appreciate it if somebody could help out with the code.

Thank you

Somebody please help me with the code for the binary tree so that I can print all 1000 strings of the tree in preorder. At the moment I can print only two strings in preorder. Please see my previous posts for my most recent code.

Help badly needed.

Will appreciate any help.

change line

12:   if(temp!=NULL)

to

12:     while (temp != NULL)

Regards.

When I change line 12 to while (temp != NULL), nothing gets printed and the program doesn't terminate either. What could be the problem? Somebody please help!
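
With while (temp != NULL) around the body posted earlier, nothing inside the loop moves temp to a child or breaks out, so the condition can never become false. A sketch of a descent that terminates follows: each pass either attaches the new node and returns, or moves temp one level down. SketchNode and insert_below are hypothetical names, and the inside flag stands for the result of edit(B,C)<r. Note that testing every string against the same pivot and radius, as the loop above does, produces two long chains rather than a balanced tree.

```cpp
#include <cassert>
#include <string>

struct SketchNode {
    std::string val;        // owns its text, so nodes never share a buffer
    SketchNode* LChild;
    SketchNode* RChild;
    explicit SketchNode(const std::string& s)
        : val(s), LChild(nullptr), RChild(nullptr) {}
    ~SketchNode() { delete LChild; delete RChild; }   // frees the subtree
    SketchNode(const SketchNode&) = delete;
    SketchNode& operator=(const SketchNode&) = delete;
};

// Attaches s below root: on the left chain if inside is true, otherwise on
// the right chain.
void insert_below(SketchNode* root, const std::string& s, bool inside) {
    assert(root != nullptr);
    SketchNode* temp = root;
    while (true) {
        SketchNode*& next = inside ? temp->LChild : temp->RChild;
        if (next == nullptr) {
            next = new SketchNode(s);   // free slot found: attach and stop
            return;
        }
        temp = next;                    // otherwise descend one level
    }
}
```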

Paste your string database file and let me give you the solution.

My database text.txt of strings is not that interesting. The strings consist of A's and B's and are between 20 and 40 characters long. This is a requirement from my professor. Below I paste the first 37 lines of my database file to give you an idea. I don't think it is worth pasting all 1000 lines.

ABBBBAABBABABBAAAAABABB
AABBBBAAABBBABABBBBABA
BABABAABAAABBBABABABBBAB
BABAABABAAAAABBABAAAAB
AAABBAAABBBABAAABAABBBABABBBBBBB
BBABBABABBAABABBBAAABABABABAB 
BBAAABABAAAAABABBBBBA
BBABBABABBBBBAABBABAABBBAAAAAAAB
BAABABABABAABBAABAAABBBAABAABBAAAABAB
BAABAAAAABAAAAABAAAAA
BAABBBBABAABBBBABBBAABAAAAAAAAABBABABAA
BBAAABABBABBBBBABBABBABAAAAAABAB
AAAAABBABAAABAAAABBBAABAABBBAA
AABABBBBABBAABAAAAABAAABAAAAAABABBB
AABAABAAABBABBBBBAABAABABBAA
BAAABAAAAABBBBAABABBABBBABABBABBAB    
BBBAABBBBAABBABAAABBAABAABBBB 
ABABBAAAABAABAAAAAAAAAABAABAB
BBBBAAABAAABABBABBAB
AAAAAABAABBBBBABBBABBBBBABA
BABAABAABBBBAABBBBBAABBBAAAABBB
BABBBBBBAAAABBAAABBAA
AABAAABBBAAABBBBABBBAAAABBBABBABBBABAB
BAAABABAABABBAABBBABABAAABABBBBBBABBAA
ABBAAAABBBBBBABBAABBABBBABABBABBAABAA
AAABBAAABAABAAABBAABABABAABAAAB
AAABAABBABAAAABBBBBBABABBBBBBAABAAAAAB
ABABBBAAAABBAABBBAABABBBBBAABBABAAAB
BABBBABABBBABAAABBAABABBABBABBAAB
BAABABAABBABAABAAAAABABBAAAAB
AABABABBABBAABABBBABBBBAABAAABBAABABB
BBAABBBAAABABAAAAABABBAABABBBABABBABA
BBBABAABAAAAABBABBABA
BBABBBABABAABAABBBBBBBAAAABABA   
ABBBAABAAAAABABAAABBBBABBABBBBABAB
BBABBBBAAABBABABABBBABAAABAAABBBABABA
BBBAABBABBAABAABAABAABB

I appreciate your help, Ahmed.
Thank you

Referring to the previous code, I see that when I print the children of the left subtree, I get a segmentation fault. Can somebody please point out why the cout statement in the code below causes a segmentation fault:

if (edit(B,C)<r)
{
   if (temp->LCh==NULL)
   {
      t1->val = B;

      temp->LCh=t1;
   }
   else
   {
      TreeNode* t = new TreeNode();
      t->val = B;

      temp->LCh=temp->LCh->t;
      cout<<temp->LCh->val<<endl; // gives segmentation fault
   }
}

Any help much appreciated
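
Judging from the TreeNode class posted earlier (and assuming it is unchanged apart from the renamed members), the constructor sets only the two child pointers, so the member t read by temp->LCh->t is never initialized. Assigning that value to temp->LCh and then dereferencing it would explain the segmentation fault; the freshly allocated local t is never linked into the tree at all. A small sketch with hypothetical names (SafeNode, attach_left, left_value) in which every pointer starts out null and is checked before use:

```cpp
#include <cassert>
#include <string>

struct SafeNode {
    std::string val;
    SafeNode* LCh = nullptr;   // every pointer member has a defined start value
    SafeNode* RCh = nullptr;
};

// Links an existing node as the left child of parent.
void attach_left(SafeNode& parent, SafeNode* child) {
    assert(child != nullptr);
    parent.LCh = child;
}

// Returns the left child's value, or a placeholder when there is none,
// instead of dereferencing a pointer that may not point anywhere.
std::string left_value(const SafeNode& parent) {
    return parent.LCh ? parent.LCh->val : std::string("(none)");
}
```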

I am still struggling with this program. I have modified some of the code within query_ball() and I think the tree is being built, but I have problems printing out the tree.

As it may be noted, I have a class called TreeNode which I have posted earlier.
I am defining query_ball() function within another class called Tree. The prototype of this function is TreeNode* query_ball() which returns temp (equivalent to Root, a pointer to TreeNode).

In main() function I have:

int main(void)
{
   Tree s;
   TreeNode *t=new TreeNode();
   t=s.query_ball();
   cout<<t->val<<endl;
   s.PreOrder(t);
}

The cout within main() prints BBABAABA¨&{4BBBAAABB`§T{4$ as the root, which should actually be BBABAABABABAAABABBBAAABBAABBABBBABAA.
The PreOrder() function just prints a bunch of garbage values.

Where could I be going wrong? Please help.

Thank you
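
Two things in the code posted so far could produce this kind of garbage, though the thread does not confirm which one applies. First, the 40-byte buffers B and C are one byte too small for a 40-character line plus its terminator, and some lines in the sample also carry trailing spaces. Second, val is a raw pointer into buffers that are shared between nodes or may no longer be valid once query_ball() has returned. A sketch that sidesteps both by letting each node own a std::string, with a preorder traversal that writes to any output stream; OwnedNode, pre_order and pre_order_text are hypothetical names.

```cpp
#include <cassert>   // used only by the check in the usage note below
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

struct OwnedNode {
    std::string val;                     // owns its characters, no fixed size
    std::unique_ptr<OwnedNode> LChild;   // children are freed automatically
    std::unique_ptr<OwnedNode> RChild;
    explicit OwnedNode(const std::string& s) : val(s) {}
};

// Preorder: the node itself first, then its left subtree, then its right.
void pre_order(const OwnedNode* n, std::ostream& os) {
    if (n == nullptr) return;
    os << n->val << '\n';
    pre_order(n->LChild.get(), os);
    pre_order(n->RChild.get(), os);
}

// Convenience wrapper that returns the whole traversal as one string.
std::string pre_order_text(const OwnedNode* root) {
    std::ostringstream os;
    pre_order(root, os);
    return os.str();
}
```

For a root "P" with left child "AAB" and right child "BBA", assert(pre_order_text(&root) == "P\nAAB\nBBA\n"); holds. Passing std::cout to pre_order prints the same lines to the console.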
