Finding a string within a string (Needle and Haystack c++)

Question

Joemeister 0 Light Poster

9 Years Ago

Hey all,

I was wondering about this piece of code on how to make it simpler to find the needle in the haystack for example "abbaabbaabb", "abba" - the needle, "abba" is found 2 times in the haystack "abbaabbaabb".

I have this code:

unsigned int substringCount(const std::string& strng, const std::string& subStrng)
{
    if (subStrng.length() == 0)
    {
        cout << "There is nothing in your string" << endl;
        return 0;
    }
    else
    {
        int count = 0;
        for (size_t offset = strng.find(subStrng); offset != std::string::npos;offset = strng.find(subStrng, offset + subStrng.length() ) )
        {
            ++count;
        }
        return count;
    }
}

BUT it would be great if it was simplified because to be honest I have no idea what is going on here...

Thanks! :)

c++

4 Contributors
8 Replies
2K Views
5 Days Discussion Span
Latest Post 9 Years Ago Latest Post by Joemeister

All 8 Replies

tapananand 13 Junior Poster

9 Years Ago

First of all this code is wrong. But before that let me describe how one expects this to work.
The find function in string class returns the index where the substring starts in the original string. The finding starts from the index starting at the second argument which is optional. find returns the constant string::npos if string is not found.
So what this code does is finds the first occurence of substring and then keeps looping till we don't find the substring. The starting position is updated each time using: offset = strng.find(subStrng, offset + subStrng.length() ), i.e new start position = offset + subStrng.length().

Now this updation of start position seems to be correct at first but consider this:
substring = abab
and string = sjskabababc

Here count should be 2 but your code will give 1 as it advance past second ab after finding it once.

NathanOliver 429 Veteran Poster

9 Years Ago

If you call transform on both haystack and needle and have them both lower case then it should work.

Edited 9 Years Ago by NathanOliver

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

NathanOliver 429 Veteran Poster Featured Poster · Answer 1 · 2014-07-29T19:33:18+00:00

The way I have done it is like:

size_t pos = 0;
int counter = 0;
// loop untill substrng is not found.
while ((pos = strng.find(subStrng, pos)) != std::string::npos)
{
    counter++;
    pos++;  // increment pos so we dont duplicate the last search.
}

I am writing this from memory so there might be a bug I am forgetting about.

Joemeister 0 Light Poster · Answer 2 · 2014-07-30T12:03:09+00:00

Ok so I managed with this code:

int n = 0;
string::size_type strSize = 0;

while ( (strSize = haystack.find (needle,strSize) ) != string::npos  )
{
    strSize++; 
    n++;

    //case sensivity?
}

return n;

BUT: its not case sensitive. Meaning when I want to find "TH" in "The black panther is a huge cat, don't you think?" - it will not find it?

How can I change so that its not case sensitive??

NathanOliver 429 Veteran Poster Featured Poster · Answer 3 · 2014-07-30T12:19:49+00:00

You need to transform the strings to all lower case. You can use this to do that.

string input = "AbAbbAbBBaAA";
transform(input.begin(), input.end(), input.begin(), ::tolower);

Joemeister 0 Light Poster · Answer 4 · 2014-07-30T12:29:35+00:00

Yeah, what I realised is that "Th" should be considered the same as "th"... Will that code still work in that case?

Gonbe 32 Newbie Poster · Answer 5 · 2014-07-30T13:40:57+00:00

#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cctype>

using namespace std;

int main()
{
    /* Assumption: substring may overlap. E.g: "aaa" contains "aa" two times. */
    const string HAYSTACK = "abbaabbaabb";
    const string NEEDLE   = "abba";
    int counter = 0;

    /* A custom compare function for case-insensitive comparing */
    function<bool(const char, const char)> compare_fun = [](const char l, const char r) 
        { return toupper(l) == toupper(r); };

    string::const_iterator h_it(HAYSTACK.begin()), h_end(HAYSTACK.end());
    string::const_iterator n_it(NEEDLE  .begin()), n_end(NEEDLE  .end());

    h_it = search(h_it, h_end, n_it, n_end, compare_fun);

    while (h_it != h_end)
    {
        /* Perform tasks here, just a counter in this case */
        counter++;

        /* If assuming that substrings may not overlap, you'd do something like */
        /* advance(h_it, distance(n_it, n_end));                                */
        h_it = search(h_it + 1, h_end, n_it, n_end, compare_fun);
    }

    cout << "\"" << HAYSTACK << "\" contains \"" << NEEDLE << "\" " << counter << " times.\n";
    return 0;
}

Joemeister 0 Light Poster · Answer 6 · 2014-08-03T21:02:44+00:00

It was as simple as converting both strings into upper or lowercase and then compare it.

Huge thanks to all who contributed!

Finding a string within a string (Needle and Haystack c++)

Recommended Answers Collapse Answers

All 8 Replies

Recommended Answers