Hello, I'm trying to have my program locate any duplicate strings in an array. Say there were two instances of the word 'man': it would add 1 to the Word Amount of that word and move the duplicate to the back of the array in some way. The problem is, I cannot figure out for the life of me how to do this correctly. As of right now, when I run the program, it finds the first duplicate and adds 1 to that word's Word Amount, but then it goes wonky: it adds an empty line as an entry in the array and won't recognize any more repeated words. Here's what I've got so far. The beginning is a selection sort to sort the array; the else if statement is where I'm having problems.

void sortArray(WordStruct Words[1000], int &itemamount)
{
    int i, minIndex;
    string temp;

    for (i = 0; i < itemamount; i++)
    {
        minIndex = i;
        for (int j = i + 1; j < itemamount; j++)
        {
            if (Words[minIndex].Word > Words[j].Word)
            {
                temp = Words[minIndex].Word;
                Words[minIndex].Word = Words[j].Word;
                Words[j].Word = temp;
            }
            else if (Words[minIndex].Word == Words[j].Word)
            {
                Words[minIndex].WordAmt++;
                temp = Words[j].Word;
                Words[j].Word = Words[999].Word;
                Words[999].Word = temp;
                itemamount--;
            }
        }
        cout << Words[i].Word << endl << Words[i].WordAmt << endl;
    }
}

It's hard to do two things at once.

Try doing the sort first.
Then look for the duplicates.

Unless you use something like a C++ vector, removing an item from an array will not adjust the array. You have to do that yourself, otherwise you have an empty spot as you saw.

I just need them swapped to the end of the array so they are no longer counted in the for loop. The sorting works fine, it's the removing duplicates part that's the problem.

Did you split the functionality, or are you going to insist on doing the two things together?

Use std::sort() and std::unique() in <algorithm>

#include <algorithm>
#include <iostream>

int main()
{
    int a[] = { 0, 1, 0, 2, 3, 4, 3, 3, 5, 2 } ;
    enum { SIZE = sizeof(a) / sizeof(*a) } ;
    std::sort( a, a+SIZE ) ;
    const int size_unique = std::unique( a, a+SIZE ) - a ;
    std::cout << "size_unique: " << size_unique << '\n' ; // size_unique: 6
    for( int i=0 ; i<size_unique ; ++i ) std::cout << a[i] << ' ' ; // 0 1 2 3 4 5 
}
commented: You learn nothing from using language extensions. -3
commented: I don't see any language extensions here. This code is standard C++. +6

@ vijayan121

I think he wants his own program to do these things; he doesn't want library functions.

@Secone

Why are you replacing j with 999? Is there any specific reason that helps you later in the program?

You can replace j with j+1, shifting one place and decreasing the amount by 1.

Also, as WaltP said, you should split the functionality: doing both at once makes the program logic more complex, and makes it easy to get the array indices wrong.

I've gotten it to APPEAR to work: it sorts the array and adds to the WordAmt as it should. But now when I output the results, it gets most of the words but sometimes repeats a word at the end of the loop. Also, if a word in the text file, such as 'a', repeats a LOT, it shows up numerous times in the output. I don't know how to easily split the functionality without making two separate functions, and this function has to do both the sorting and the duplicate removal. Here's where I'm at:

void sortArray(WordStruct Words[1000], int itemamount)
{
    int counter, minIndex;
    string temp;

    for (counter = 0; counter < itemamount; counter++)
    {
        minIndex = counter;
        for (int wordCounter = counter + 1; wordCounter < itemamount; wordCounter++)
        {
            if (Words[minIndex].Word > Words[wordCounter].Word)
            {
                temp = Words[minIndex].Word;
                Words[minIndex].Word = Words[wordCounter].Word;
                Words[wordCounter].Word = temp;
            }
            else if (Words[minIndex].Word == Words[wordCounter].Word)
            {
                Words[minIndex].WordAmt++;
                temp = Words[wordCounter].Word;
                Words[wordCounter].Word = Words[itemamount - 1].Word;
                Words[itemamount - 1].Word = temp;
                itemamount--;
            }
        }
    }
}

void outputResults(string fileLocation, int charnumber, int lineamount, WordStruct Words[1000], int itemamount)
{
    cout << "File: " << fileLocation << endl
        << "Characters: " << charnumber << endl
        << "Lines: " << lineamount << endl;
    cout << "Word          Word Amount" << endl;
    for (int counter = 0; counter < itemamount; counter++)
    {
        cout << Words[counter].Word << "          " << Words[counter].WordAmt << endl;
        if (Words[counter].WordAmt >= 2)
        {
            itemamount--;
        }
    }
    return;
}

I think he wants his own program to do these things; he doesn't want library functions.

He has a choice; and I would let him decide for himself.

I personally tend to agree with this view (Koenig and Moo):

Our approach is possible only because C++, and our understanding of it, has had time to mature. That maturity has let us ignore many of the low-level ideas that were the mainstay of earlier C++ programs and programmers.
The ability to ignore details is characteristic of maturing technologies. For example, early automobiles broke down so often that every driver had to be an amateur mechanic. It would have been foolhardy to go for a drive without knowing how to get back home even if something went wrong. Today's drivers don't need detailed engineering knowledge in order to use a car for transportation. They may wish to learn the engineering details for other reasons, but that's another story entirely.

We define abstraction as selective ignorance--concentrating on the ideas that are relevant to the task at hand, and ignoring everything else--and we think that it is the most important idea in modern programming. The key to writing a successful program is knowing which parts of the problem to take into account, and which parts to ignore. Every programming language offers tools for creating useful abstractions, and every successful programmer knows how to use those tools.
http://www.acceleratedcpp.com/details/preface.html

IMNSHO, one doesn't have to first learn to be a mechanic before one starts learning how to drive.
