Hey all,

I was wondering about this piece of code on how to make it simpler to find the needle in the haystack for example "abbaabbaabb", "abba" - the needle, "abba" is found 2 times in the haystack "abbaabbaabb".

I have this code:

unsigned int substringCount(const std::string& strng, const std::string& subStrng)
{
    if (subStrng.length() == 0)
    {
        cout << "There is nothing in your string" << endl;
        return 0;
    }
    else
    {
        int count = 0;
        for (size_t offset = strng.find(subStrng); offset != std::string::npos;offset = strng.find(subStrng, offset + subStrng.length() ) )
        {
            ++count;
        }
        return count;
    }
}

BUT it would be great if it was simplified because to be honest I have no idea what is going on here...

Thanks! :)

Recommended Answers

All 8 Replies

First of all this code is wrong. But before that let me describe how one expects this to work.
The find function in string class returns the index where the substring starts in the original string. The finding starts from the index starting at the second argument which is optional. find returns the constant string::npos if string is not found.
So what this code does is finds the first occurence of substring and then keeps looping till we don't find the substring. The starting position is updated each time using: offset = strng.find(subStrng, offset + subStrng.length() ), i.e new start position = offset + subStrng.length().

Now this updation of start position seems to be correct at first but consider this:
substring = abab
and string = sjskabababc

Here count should be 2 but your code will give 1 as it advance past second ab after finding it once.

The way I have done it is like:

size_t pos = 0;
int counter = 0;
// loop untill substrng is not found.
while ((pos = strng.find(subStrng, pos)) != std::string::npos)
{
    counter++;
    pos++;  // increment pos so we dont duplicate the last search.
}

I am writing this from memory so there might be a bug I am forgetting about.

Ok so I managed with this code:

int n = 0;
string::size_type strSize = 0;

while ( (strSize = haystack.find (needle,strSize) ) != string::npos  )
{
    strSize++; 
    n++;

    //case sensivity?
}

return n;

BUT: its not case sensitive. Meaning when I want to find "TH" in "The black panther is a huge cat, don't you think?" - it will not find it?

How can I change so that its not case sensitive??

You need to transform the strings to all lower case. You can use this to do that.

string input = "AbAbbAbBBaAA";
transform(input.begin(), input.end(), input.begin(), ::tolower);

Yeah, what I realised is that "Th" should be considered the same as "th"... Will that code still work in that case?

If you call transform on both haystack and needle and have them both lower case then it should work.

#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cctype>

using namespace std;

int main()
{
    /* Assumption: substring may overlap. E.g: "aaa" contains "aa" two times. */
    const string HAYSTACK = "abbaabbaabb";
    const string NEEDLE   = "abba";
    int counter = 0;

    /* A custom compare function for case-insensitive comparing */
    function<bool(const char, const char)> compare_fun = [](const char l, const char r) 
        { return toupper(l) == toupper(r); };

    string::const_iterator h_it(HAYSTACK.begin()), h_end(HAYSTACK.end());
    string::const_iterator n_it(NEEDLE  .begin()), n_end(NEEDLE  .end());

    h_it = search(h_it, h_end, n_it, n_end, compare_fun);

    while (h_it != h_end)
    {
        /* Perform tasks here, just a counter in this case */
        counter++;

        /* If assuming that substrings may not overlap, you'd do something like */
        /* advance(h_it, distance(n_it, n_end));                                */
        h_it = search(h_it + 1, h_end, n_it, n_end, compare_fun);
    }

    cout << "\"" << HAYSTACK << "\" contains \"" << NEEDLE << "\" " << counter << " times.\n";
    return 0;
}

It was as simple as converting both strings into upper or lowercase and then compare it.

Huge thanks to all who contributed!

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.