Hi,

I'm trying to compare two text files (the output from a database and the corresponding spatial table in a GIS) to check for errors. Basically, I'm assuming that the output from the database is correct, so any missing or repeated numbers in the spatial table are errors and should be reported. I've written the part of the program that searches for a specific number in the spatial output and reports the errors, but I can't work out how to change the number it's searching for.

Here's what I've done so far:

#include <string>
#include <fstream>
#include <iostream>
#include <stdlib.h>
using namespace std;

fstream DBinput("DB.txt", ios::in);
fstream SPinput("SP.txt", ios::in);
fstream output("Errors.txt", ios::out);

int i = 0;
string correct;
string word;


void count(void);
void SPcompare(void);

void main(void)
	{
		while(DBinput >> correct)
			{
 //////// Presumably this is where the 'find number to search for' part should come in ////////
			SPcompare();
			}
		
	}

void SPcompare(void) 
	{
		while(SPinput >> word) 
			{
				count();
  			}

		if(i == 0)
			output << "Error - '" << word << "' has no matches in spatial database" << '\n';
		else if(i > 1)
			output << "Error - '" << word << "' has " << i << " matches in spatial database" << '\n';
	}


void count(void)
	{
		if(word == "5")
		i++;
	}

As you can see, so far it only counts how many times the specific number 5 appears in the SP.txt file.
What I'd like it to do is read the DB.txt file, find the first number to search for, find out how many times it appears in SP.txt, then move on to the next number in DB.txt and repeat until the end, generating a text file of all the errors. The SP.txt file has every number contained within inverted commas, and the DB.txt file has each number at the beginning of a new line, followed by a lot of "-delimited text.

Any help would be greatly appreciated, as would any 'you're making this a lot more complicated than it needs to be, here's how you can solve your problem in fifteen seconds' :)

P.S. Apologies for the length.

On the subject of an easier solution, is there some way that I can use string compare, i.e. something like:

if(strcmp(correct, word) == 0)
cout << "Number has no errors" << '\n';

Or will this try to compare the entire document, or a whole line, or something? How can I narrow it down to comparing only individual values within the file?
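On the strcmp question, a minimal sketch: strcmp() takes C-style strings (const char*), so it cannot be handed two std::string objects directly; either compare the strings with == or pass c_str() to strcmp(). Because `DBinput >> correct` and `SPinput >> word` each read one whitespace-delimited token, either comparison only ever sees individual values, not a whole line or file. The function names sameToken and sameTokenC are made up for illustration.

```cpp
#include <cstring>
#include <string>

// Compare two tokens as std::string values; operator== compares their contents.
bool sameToken(const std::string& correct, const std::string& word)
{
    return correct == word;
}

// The same check with strcmp(), which needs C-style strings via c_str().
bool sameTokenC(const std::string& correct, const std::string& word)
{
    return std::strcmp(correct.c_str(), word.c_str()) == 0;
}
```

Note that this is a text comparison, so "5" and "05" count as different values.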

Can we see a couple of lines of example for each file? Sounds like you want to

open db
while lines left in db
    read next line of db
    trim off everything after the leading number
    open sp
    while lines left in sp
        read next line of sp
        (is there one number per line?  several?)
        parse next/only number from that line
        COMPARE NUMBER FROM SP WITH NUMBER FROM DB
        (go on to next number in the same line?)
    end
    close sp
end
close db
print totals

Seems like the easiest part of this is the compare! :-)

That's pretty much exactly what I want to do, except there's no guarantee that the two numbers that need comparing will be on the same line, so using the DB file as a master it should:

while lines left in db
    read next line of db
    trim off everything after the leading number
    open sp
    while lines left in sp
        search sp for the leading number from db
    end
    close sp
    if the number appears exactly once, go on to the next number in db;
    otherwise output an error to a txt file
end

SP file is in the form:

"6853"
"6854"
"6855"
"6856"
"6857"
"6858"
"6859"
"6860"
"6861"
"6862"
"6863"
"6864"
"6865"
"6866"
"6867"
"6868"
"6869"
"6870"
"6871"

DB file is in the form:

299	"Planning"	"Land Use"	"Urban Land Uses"	"Industry & Commerce"	"Retailing"	"n/a"	"1"	"n/a"	"n/a"	"n/a"	0	0	"n/a"	"n/a"	""	""	"n/a"	"28/04/2003"	""	0	0	0	0	0	0	"Land use baseline report"
300	"Planning"	"Land Use"	"Urban Land Uses"	"Industry & Commerce"	"Industry"	"n/a"	"2"	"n/a"	"n/a"	"n/a"	0	0	"n/a"	"n/a"	""	"Associated carpark"	"n/a"	"28/04/2003"	""	0	0	0	0	0	0	"Land use baseline report"
etc

And I'm trying to extract the 299, 300, etc. numbers to search for in the other one (the real files contain many thousands of entries to be compared). Also, the SP numbers will not necessarily be in order (almost certainly won't be, actually). That's pretty much the nature of the beast...

Hope that made sense... :)

Since the numbers in the DB file are at the BEGINNING of the line, you could just use the library function 'atoi()' to get the number:

string lineFromDBFile;
. . .
int numberToLookFor = atoi( lineFromDBFile.c_str() );

For the SP file, if you KNOW that the numbers are always at the start of the line, AND they are always within quotes like shown, you could also use atoi():

int numberInSPFile = atoi( lineFromSPFile.c_str() + 1 ); // +1 skips the leading quote
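As a rough sketch of those two atoi() calls wrapped into helpers (the names leadingNumberFromDB and numberFromSP are made up here), assuming DB lines start with the bare number and SP lines hold one number that may or may not be wrapped in quotes:

```cpp
#include <cstdlib>
#include <string>

// DB lines start with the bare number, followed by tab-separated fields.
// atoi() stops converting at the first character that is not a digit.
int leadingNumberFromDB(const std::string& line)
{
    return std::atoi(line.c_str());
}

// SP lines look like "6853"; skip the opening quote if there is one.
int numberFromSP(const std::string& line)
{
    const char* p = line.c_str();
    if (*p == '"')
        ++p;
    return std::atoi(p);
}
```

One caveat: atoi() returns 0 when it finds no digits at all, so a malformed line cannot be told apart from a genuine 0.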

Then you can say

bool numberFoundAlready = false;
...
<fetch next line from SP file>
...
if (numberInSPFile == numberToLookFor)
{
    if (numberFoundAlready)
    {
        // do what you want when a duplicate is found
    }
    else
    {
        numberFoundAlready = true;
    }
}
<go back for more SP lines>

Cheers for that, I'll give it a whirl this evening when I'm near a C compiler (or I sort it out at work...).

In the meantime, I've made some changes to the database so that it outputs to this program a lot more neatly, so now both text files contain numbers in the form:

1
2
3
4
5
6
7
etc

with no useless data or " marks in them.

With that in mind, does anyone have any ideas why this doesn't work:

#include <string>
#include <fstream>
#include <iostream>
#include <stdlib.h>
using namespace std;

fstream DBinput("DB.txt", ios::in);
fstream SPinput("SP.txt", ios::in);
fstream output("Errors.txt", ios::out);

int i = 0;
string correct;
string word;

void main(void)
	{
		while(DBinput >> correct)
			{
				while(SPinput >> word)
					{
					if(correct == word)
					i++;
					}	
					

				output << "There are "<< i << " matches for " << correct << "." << '\n';
				i = 0;
			}
		
	}

When it runs it makes an output file something like:

There are 1 matches for 1.
There are 0 matches for 2.
There are 0 matches for 3.
There are 0 matches for 4.
There are 0 matches for 5.
There are 0 matches for 6.
There are 0 matches for 7.
etc

So it's only counting the first number from DB.txt, i.e. not updating the 'correct' string in the comparison if statement, and I don't understand why not, seeing as it's updating it in the output statement.

Any ideas?

It looks to me like you read a value into correct from DBinput and then read every word from SPinput into word. When you reach the end of SPinput, the stream stays at the end (in a failed state). You then grab the next word from DBinput into correct, but since SPinput is still at the end, nothing is read into word and the inner while loop never executes again; it only runs the first time. I think you need to rewind SPinput after you've looked through it.

I might try something like this.

#include <string>
#include <fstream>
#include <iostream>

using namespace std;

int main(void)
{
   ifstream DBinput("DB.txt");
   ofstream output("Errors.txt");

   string correct;
   while ( DBinput >> correct )
   {
      ifstream SPinput("SP.txt");
      int i = 0;
      string word;
      while ( SPinput >> word )
      {
         if ( correct == word )
            i++;
      }
      cout << correct << ": " << i << endl; // for debugging
      output << "There are "<< i << " matches for " << correct << "." << '\n';
   }
   return 0;
}

[edit] And don't use void main(); main should return int.
