We're a community of 1076K IT Pros here for help, advice, solutions, professional growth and fun. Join us!
1,075,560 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Start New Discussion Reply to this Discussion

eliminating int duplicates

Write an EFFICIENT program that reads in the name of a file containing positive integers that are less than 1,000,000. The program prints out the integers in the file, in the order that they appear, eliminating duplicates. So if the file contained 8 3 1 9 2 8 15 9 8 4, the program would print 8 3 1 9 22 15 4.

Here I am not even sure how to approach. Do we read in the file name as an int array, and then eliminate duplicates and save the result in that same file?
Any suggestions would be great! thanks.

6
Contributors
14
Replies
1 Day
Discussion Span
5 Months Ago
Last Updated
15
Views
marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

Yes, you can do.

Store the values inside an array, then send the reference (pointer) of this array to a function that checks to see if the value is unique. (You can even construct a temp array to store the unique values inside) then you can print these to a file.

phorce
Master Poster
738 posts since Jul 2011
Reputation Points: 63
Solved Threads: 91
Skill Endorsements: 16

First, define "efficient". I can think of solutions that are efficient in terms of speed, and in terms of memory usage.

As to solving the probem, yes, an array will almost certainly be needed for any solution. It's a matter of what you put in the array, and what you do with it.

If you store all the inputs, you then have to search through it in some manner and eliminate duplicates. You could set them to be a negative value, which will be skipped in the printing phase. But, you would have to make many passess through the array to find duplicates. Also, how big do you make the input array - your problem only state the limit on the size of the values, not the size of the file!

I would look for a way to test each value as it's read in, determine if you've already seen it or not, print it if it's unique. That way you don't worry about storing all the input or accessing it multiple times. There's an easy way to do this.

vmanes
Postaholic
2,015 posts since Aug 2007
Reputation Points: 1,283
Solved Threads: 242
Skill Endorsements: 6

thanx for replies!! been very helpful.
i think i am in overdrive when simple seams complex. i'll get back to it.

marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

I am back on this exercise, and this is what I came up with. Does it make any sense?

    #include <iostream>
    #include <fstream>
    using namespace std;

    int main() {
        string nameFile;
        int k, n, a[k], b[k];
        k=0;
        cin>>nameFile;
        ifstream integerFile (nameFile.c_str());
        while (integerFile>>n) //loop to find number of integers in the file
            k++;
        integerFile.close();

        fstream integerFile (nameFile.c_str(), ios::in |ios::out);

        for (int i=0; i<k; i++) {
            integerFile>>a[i];
            for (int j=0; j<i; j++) {
                if (a[i]!=a[j]) b[i]=a[i]; // b[] is an array that holds all non-duplicated integers
            }
        }

        for (int i=o; i<k; i++)
            integerFile<<b[i]; //will this override information stored in that file? 

        integerFile.close();
        return 0;
    }
marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

Here's a fairly memory efficient example. The idea is basically to create an array where each bit in the array represents an integer. If the bit representing that integer is set then the function returns false, otherwise it sets the corresponding bit and returns true.

bool add2intArray(unsigned int* iarray, int number)
{
    int index = number >> 3; // same as number / 8
    unsigned char bit = number & 7;

    if ((iarray[index] & 1 << bit) == 1 << bit)
        return false;

    iarray[index] |= 1 << bit;

    return true;
}

int main()
{
    const int size = 1000000/8 + 1;
    unsigned int arr[size] = {0};

    int test[ ] = {1 , 2, 8, 9, 7, 3, 203, 9, 2, 16, 54, 54, 99, 4, 6, 2, 9, 97, 8, 98};

    for (int i : test)
    {
        if (add2intArray(arr, i))
            std::cout << i << ", ";
    }

    getchar();

    return 0;
}
nullptr
Junior Poster
154 posts since Mar 2012
Reputation Points: 46
Solved Threads: 32
Skill Endorsements: 0

I just realised that I messed up slightly in the above example. The fixed code is as follows.

bool add2intArray(unsigned char* iarray, int number)
{
    int index = number >> 3;
    unsigned char bit_value = 1 << (number & 7);

    if ((iarray[index] & bit_value) == bit_value)
        return false;

    iarray[index] |= bit_value;

    return true;
}

int main()
{
    const int size = 1000000/8;
    unsigned char arr[size] = {0};

    int test[ ] = {1 , 2, 8, 9, 7, 3, 203, 9, 999999, 2, 16, 54, 54, 99, 4, 6, 2, 9, 97, 8, 98, 999999, 999991};

    for (int i : test)
    {
        if (add2intArray(arr, i))
            std::cout << i << ", ";
    }

    getchar();

    return 0;
}

Write an EFFICIENT program that reads in the name of a file containing positive integers that are less than 1,000,000. The program prints out the integers in the file, in the order that they appear, eliminating duplicates. So if the file contained 8 3 1 9 2 8 15 9 8 4, the program would print 8 3 1 9 22 15 4.

The way I read the question, you simply need to read the file and print the content minus any duplicate values. There's no need to then save the non-duplicate list back to a file.

nullptr
Junior Poster
154 posts since Mar 2012
Reputation Points: 46
Solved Threads: 32
Skill Endorsements: 0

thanks for your answer. not sure what you're doing in your code though...
here is my code, adjusted to save non-duplicates in a new file instead of overwriting the initial one.

    #include <iostream>
    #include <fstream>
    using namespace std;

    int main() {
            string nameFile, nameFile2;
            int k, n, a[k];
            k=0;
            cin>>nameFile;
            cin>>nameFile2;
            ifstream integerFile (nameFile.c_str());
            ofstream integerFile2 (nameFile2.c_str());

            while (integerFile>>n) // loop to find number of integers in the file
                k++;

            for (int i=0; i<k; i++) {
                integerFile>>a[i];
                for (int j=0; j<i; j++) {
                    if (a[i]!=a[j]) 
                          integerFile2<<a[i]; 
                }
            }

            integerFile2.close();
            integerFile.close();
            return 0;
        }
marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

Why do you read the file to count items, then use that count to limit the for loop?

Why not just use that while in place of the first for? In fact, I don't think your code will successfully run as is, since you read the file to end, then start reading it again without resetting in any manner.

Your array a[]. You set its size with variable k, which has no value. So how big is your array?

This bit

for (int j=0; j<i; j++) {
    if (a[i]!=a[j])
        integerFile2<<a[i]; 

Is going to write the newly read value a[i] to the output many times, each time it is tested against a number it doesn't match. You should instead keep track of finding a match, if not, then output the number.

As I mentioned earlier, efficiency has many ways of being measured. In terms of time, you have something that's in the n^2 category, about the same time efficency as bubble sort. And you have the problem of not knowing how bit to make your array a[]. Is there a limit to how many numbers can be in the input file? 100, 1,000, 1,000,000?

vmanes
Postaholic
2,015 posts since Aug 2007
Reputation Points: 1,283
Solved Threads: 242
Skill Endorsements: 6

Thanks a lot for your comment!
1. The number of integers in the file is not specified. That's why I thought to use while loop to see the size of an array i need. Actually, you are right, I should declare an array after while loop, not before.
2. How do I reset file? Close it and re-open again?
2. I see what you are saying about testing for duplicates.

Modified code:

    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;

    int main() {
            string nameFile, nameFile2;
            int k, n, count;
            k=0; count=0;
            cin>>nameFile;
            cin>>nameFile2;
            ifstream integerFile (nameFile.c_str());
            ofstream integerFile2 (nameFile2.c_str());

            while (integerFile>>n) // loop to find number of integers in the file
                k++;

            int a[k];

            integerFile.close();
            ifstream integerFile (nameFile.c_str());

            for (int i=0; i<k; i++) {
                integerFile>>a[i];
                for (int j=0; j<i; j++) {
                    if (a[i]==a[j]) 
                          count++;
                    if (count==0)
                          integerFile2<<a[i]; 
                }
            }

            integerFile2.close();
            integerFile.close();
            return 0;
        }
marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

Just realized that I made a mistake in part checking for duplicates. It should be like this:

    for (int i=0; i<k; i++) {
            integerFile>>a[i];
            for (int j=0; j<i; j++) {
                if (a[i]==a[j]) 
                      count++;
            }
            if (count==0)
                integerFile2<<a[i]; 
   }
marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

You can always use the sorted set from STD:

#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <set>
using namespace std;

int main() {
    set<int> a;
    ifstream fin ("filename.txt", ios::in);
    int nr;
    string line;
    while (getline(fin, line)){
        stringstream token(line);
        cout<<line<<endl;
        token>>nr;
        a.insert(nr);
    }
    set<int>::iterator it;
    for (it=a.begin();it!=a.end();it++)
        cout<<*it<<" ";
    cout<<endl;
    return 0;
}
Lucaci Andrew
Practically a Master Poster
649 posts since Jan 2012
Reputation Points: 91
Solved Threads: 91
Skill Endorsements: 11

I had thought to tell him that but the problem states to print the numbers in the order they appear. If his compiler supports c++11 he could use an unordered map and use the value for the key.

NathanOliver
Posting Virtuoso
1,513 posts since Apr 2009
Reputation Points: 269
Solved Threads: 277
Skill Endorsements: 3

marnum, you're still making a few classic mistakes, and writing code that loses on the efficiency standpoint.
Assuming you make array a[] big enough for any possible input;
1. be sure to reset count to 0 for each new pass in the for loop.
2. don't write duplicate numbers to the array, you'll just be comparing to them numerous times.

How about:

   end = 0;
   while( integerFile >> temp ) 
    {
        count = 0;
        for (int j = 0; j < end; j++) 
        {
           if (temp == a[j]) 
               count++;
        }
        if (count == 0)
        {  
           a[end] = temp;
           integerFile2 << a[end]; 
        }
   }

This is still not the most time efficient method, but I hope these corrections to your chosen path help.

vmanes
Postaholic
2,015 posts since Aug 2007
Reputation Points: 1,283
Solved Threads: 242
Skill Endorsements: 6

i'll have to look up more on stringstream token, set and unordered_map - haven't used them yet.
yes, numbers are not supposed to be sorted.
thanx everybody! i'll get back to this code, hopefully, after my finals..

marnun
Light Poster
28 posts since Dec 2012
Reputation Points: 0
Solved Threads: 0
Skill Endorsements: 0

This article has been dead for over three months: Start a new discussion instead

Post: Markdown Syntax: Formatting Help
 
You
View similar articles that have also been tagged:
 
© 2013 DaniWeb® LLC
Page rendered in 0.1063 seconds using 2.8MB