Hello Daniweb!
I am learning with the Computer Sciences Circle, in chapter 8.
There are two strings, for example "sse" and "assessement", or "an" and "trans-Panamanian bananas", and we just need to count the occurrences of the first in the second.
My incorrect code is:

# This exercise consists of counting the occurrences of a given
# substring within a string.

# Read the two string variables.
needle = input("segment = ")
haystack = input("string = ")
# Name their respective lengths.
l = len(needle)
h = len(haystack)
# Convert the variables to lists.
nou = list(needle)
haye = list(haystack)
print("The minor list is: ", nou)
print("The major list is: ", haye)
# Initialize the list of candidate lists.
nouvelle = []
total = 0
for item in haye[0: h - l]:
    p = haye.index(item)

    candidat = haye[p: p + l]
    print("candidat = ", candidat)
    if candidat == nou:
        # Add one element to the list named nouvelle; the element
        # is itself a sub-list, named candidat.
        nouvelle = nouvelle + (candidat)
        print(nouvelle)
        total += 1
    print(total)

The problem is that for "an" in "trans panamanian bananas" it outputs 7 instead of 6.
What is the problem?
Next question: I tried to create a list of lists with nouvelle + candidat, but the program puts the items of candidat into nouvelle instead of adding candidat itself as a single item. I'm just a beginner...
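A possible explanation of both symptoms, offered as a sketch rather than the exercise's official solution: haye.index(item) returns the position of the first occurrence of that character, so every "a" in the loop re-tests the slice that starts at the first "a" and is counted as a match. Also, list + list concatenates the items, whereas append() adds the sub-list as one item. The function name count_overlapping and its return value are made up for illustration.

```python
def count_overlapping(needle, haystack):
    """Return (total, matches) for overlapping occurrences of needle.

    Hypothetical helper: it walks over positions instead of characters,
    so repeated characters no longer map back to their first index.
    """
    matches = []
    # The + 1 keeps the last possible starting position in the loop.
    for p in range(len(haystack) - len(needle) + 1):
        candidate = haystack[p:p + len(needle)]
        if candidate == needle:
            # append() stores the sub-list itself as a single item.
            matches.append(list(candidate))
    return len(matches), matches


total, found = count_overlapping("an", "trans panamanian bananas")
print(total)      # -> 6
print(found[0])   # -> ['a', 'n']
```

Iterating with enumerate(haye) would be another way to get the real position of each character.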

All 13 Replies


''' str_find_sub_index.py
s.find(sub[ ,start[,end]]) returns index or -1
'''

text = "trans panamanian bananas"
sub = "an"

start = 0
count = 0
while True:
    ix = text.find(sub, start)
    if ix < 0:
        break
    # move up start in function find()
    start = ix + 1
    count += 1
    #print(ix, count)  # test

print("'{}' appears {} times in '{}'".format(sub, count, text))

''' output -->
'an' appears 6 times in 'trans panamanian bananas'
'''

>>> text = "trans panamanian bananas"
>>> text.count('an')
6

>>> import re
>>> len(re.findall(r'an', text))
6

To @snippsat:
Can I do this?

text = input()
subtext = input()

I ask because the following program did not pass the test:

# Read the two variables.
needle = input()
haystack = input()
# Name their respective lengths.

I'm sorry, I feel that this is very easy, but I'm stuck.

Can I do this?

I think that should be fine, though I have not read your assignment yet.
Here is a test run.

#Python 3.4.2 
>>> text = input('Enter some text: ')
Enter some text: trans panamanian bananas
>>> subtext = input('Enter subtext to find: ')
Enter subtext to find: an
>>> text.count(subtext)
6

>>> text = input('Enter some text: ')
Enter some text: assessement
>>> subtext = input('Enter subtext to find: ')
Enter subtext to find: sse
>>> text.count(subtext)
2

I read this thread, and of course I would use the count method, but I ended up solving this task with yet another method and thought I would share it.

def count(haystack, needle):
    return sum(haystack[n:].startswith(needle)
           for n in range(len(haystack) - len(needle) + 1))

print(count('assessement', 'as'))
# -> 1
print(count('assessement', 'sse'))
# -> 2
print(count('trans panamanian bananas', 'an'))
# -> 6
commented: Nice one +5

To @snippsat:
Your code does not work for me:
for "sses" in "assesses", it gives me 1 instead of 2! What is the secret?
Very strange. Possibly the count method removes each "sses" it finds, so we cannot count the second "sses": after the first one is removed, only "ses" is left. Maybe?

Python 3.4.0 (default, Apr 11 2014, 13:05:18) 

In [1]: # python 3.4.2

In [2]: text = input("Enter some text: ")

Enter some text: assesses

In [3]: subtext = input("Enter subtext to find:")

Enter subtext to find:sses

In [4]: text.count(subtext)
Out[4]: 1

In [5]: 

To pyTony:
The solution seems very good (i.e. correct).
I'm surprised that we can add up the True/False results of the .startswith method. Nice find!
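For anyone else surprised by this, here is a small sketch of why summing works: in Python, bool is a subclass of int, so True behaves like 1 and False like 0. The variable name flags is only illustrative.

```python
# bool is a subclass of int, so True behaves like 1 and False like 0.
print(issubclass(bool, int))  # -> True
print(True + True)            # -> 2

haystack = "assesses"
needle = "sses"
# One True/False flag per possible starting position in the haystack.
flags = [haystack[n:].startswith(needle)
         for n in range(len(haystack) - len(needle) + 1)]
print(flags)       # -> [False, True, False, False, True]
print(sum(flags))  # -> 2
```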

I took the liberty to time some of the approaches:

''' str_count_sub_timing_hperf.py
timing functions that count the number of sub_strings in a string
using high performance time.perf_counter()
new in Python 3.3 and higher
'''

import time

def count_tony(text, sub):
    return sum(text[n:].startswith(sub)
           for n in range(len(text) - len(sub) + 1))

def count_snee(text, sub, start=0):
    count = 0
    while True:
        ix = text.find(sub, start)
        if ix < 0: break
        start = ix + 1
        count += 1
    return count

text = "this person assesses your performance"
sub = "sses"

# returned value is in fractional seconds
start = time.perf_counter()
result1 = count_tony(text, sub)
end = time.perf_counter()

elapsed = end - start
print("count_tony('{}', '{}') --> {}".format(text, sub, result1))
print("elapsed time = {:0.6f} micro_seconds".format(elapsed*1000000))

start2 = time.perf_counter()
result2 = count_snee(text, sub)
end2 = time.perf_counter()

elapsed2 = end2 - start2
print("count_snee('{}', '{}') --> {}".format(text, sub, result2))
print("elapsed time = {:0.6f} micro_seconds".format(elapsed2*1000000))

''' result (Python 3.4.1 64bit)-->
count_tony('this person assesses your performance', 'sses') --> 2
elapsed time = 38.228700 micro_seconds
count_snee('this person assesses your performance', 'sses') --> 2
elapsed time = 5.119915 micro_seconds
'''
commented: time well spent! :D +5
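A single perf_counter measurement can be noisy, so as a complementary sketch (not part of the original timings), the standard timeit module can repeat each call many times. The repeat count of 10000 is an arbitrary assumption, and the two functions are copied from the post above so that the block runs on its own.

```python
import timeit


def count_tony(text, sub):
    return sum(text[n:].startswith(sub)
               for n in range(len(text) - len(sub) + 1))


def count_snee(text, sub, start=0):
    count = 0
    while True:
        ix = text.find(sub, start)
        if ix < 0:
            break
        start = ix + 1
        count += 1
    return count


text = "this person assesses your performance"
sub = "sses"
REPEATS = 10000  # arbitrary choice for illustration

for func in (count_tony, count_snee):
    # timeit.timeit returns the total time in seconds for all repeats.
    seconds = timeit.timeit(lambda: func(text, sub), number=REPEATS)
    print("{}: {:0.3f} microseconds per call".format(
        func.__name__, seconds / REPEATS * 1000000))
```

The absolute numbers depend on the machine and Python version, so only the relative difference between the two functions is meaningful.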

Note that text.count() does not count overlapping substrings:

text = "assesses"
sub = "sses"

print(text.count(sub))  # --> 1 ???

For "sses" in "assesses", it gives me 1 instead of 2! What is the secret?

In your first post you did not have the "assesses" and "sses" test.
str.count() does not count overlapping occurrences.
So if you need to count overlapping occurrences, you cannot use str.count().
The help and the docs do mention that it returns non-overlapping occurrences.

>>> help(str.count)
Help on method_descriptor:

    S.count(sub[, start[, end]]) -> int

    Return the number of non-overlapping occurrences of substring sub in
    string S[start:end].  Optional arguments start and end are interpreted
    as in slice notation.
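To make the difference concrete, here is a rough sketch built on the find() loop from earlier in the thread. The helper name count_with_step and its overlapping flag are made up for illustration: jumping past the whole match mirrors the documented non-overlapping behaviour of str.count(), while moving on by one character also catches overlaps.

```python
def count_with_step(text, sub, overlapping):
    """Count occurrences of sub in text (hypothetical helper)."""
    if not sub:
        raise ValueError("sub must not be empty")
    # After a hit, either skip the whole match or move on by one character.
    step = 1 if overlapping else len(sub)
    count = 0
    start = 0
    while True:
        ix = text.find(sub, start)
        if ix < 0:
            break
        count += 1
        start = ix + step
    return count


print(count_with_step("assesses", "sses", overlapping=False))  # -> 1
print(count_with_step("assesses", "sses", overlapping=True))   # -> 2
print("assesses".count("sses"))                                # -> 1
```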

To fix my regex solution so that it counts overlapping occurrences:

>>> import re
>>> text = "assesses"
>>> sub = "sses"
>>> len(re.findall(r'(?={})'.format(sub), text))
2
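As a sketch of why the lookahead helps: a plain pattern consumes the characters it matches, so the search resumes after the match, while (?=...) only checks what follows and consumes nothing, so every starting position is tested. The list names below are illustrative.

```python
import re

text = "assesses"
sub = "sses"

# Plain pattern: the first match consumes "sses", so the second is missed.
consuming = [m.start() for m in re.finditer(re.escape(sub), text)]

# Lookahead: zero-width match, so the scan continues one position later.
lookahead = [m.start()
             for m in re.finditer("(?={})".format(re.escape(sub)), text)]

print(consuming)  # -> [1]
print(lookahead)  # -> [1, 4]
```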

All the answers are really fantastic and give me a lot of information and homework.
I will work through them.
Thanks to the experts.

snippsat's latest re approach is actually quite speedy.

Ok, so here is a new standard lib candidate

import re

def overcount(S, sub, start=0, end=None):
    """overcount(S, sub[, start[, end]]) -> int

    Return the number of overlapping occurrences
    of substring sub in string S[start:end].
    """
    p = r'(?={})'.format(re.escape(sub))
    t = () if end is None else (end,)
    return len(re.compile(p).findall(S, start, *t))

if __name__ == '__main__':
    print(overcount("assesses assesses", "sses", 0, 9)) # 2
commented: Nice +12
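One detail in overcount that is easy to miss is re.escape(). A small sketch, with made-up test data, of what can go wrong without it when the substring contains a regex metacharacter such as ".":

```python
import re

text = "a.b.c"
sub = "."

# Unescaped, "." means "any character", so every position but the end matches.
unescaped = len(re.findall("(?={})".format(sub), text))

# Escaped, only the two literal dots are counted.
escaped = len(re.findall("(?={})".format(re.escape(sub)), text))

print(unescaped)  # -> 5
print(escaped)    # -> 2
```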