I am working on a Python program that would let me scan a text document for a certain phrase. I used the finditer function from Python's re library. The relevant line of code resembled the following:

phrase=regex.finditer(page)

I also had a print statement at the end that read:

print(phrase)

When I did this I got the following results:

<callable_iterator object at 0x02CF8FB0>

This is obviously not the result I want. So I tried replacing finditer with findall:

phrase=regex.findall(page)

For this, the printed result was:

With far more 'r's than displayed here. In the end, I am trying to find a way to display every occurrence of that phrase individually. It would be great if this program could work with anything that involves extracting phrases from text.

*NOTE*

I am a fourteen-year-old novice, not a computer science major, so please be careful with the explanations that you use.

All 3 Replies

finditer returns an iterator of 'match objects', from which you can extract the phrase:

for match in regex.finditer(page):
    phrase = match.group(0)
    print("A phrase was found at position", match.start())
    print(repr(phrase))

But I'm afraid there is an error in your regular expression, because the result
of findall() shows that your regex finds only the letter 'r'.
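
The original regex was not posted, so the cause can only be guessed. One common way to get a list of single letters from findall() is a capturing group: when the pattern contains parentheses, findall() returns only the text captured inside them, not the whole match. A minimal sketch of that behaviour, where the text and both patterns are made up for illustration:

```python
import re

# Made-up sample text; the original page is not shown in the question.
text = "a red bird rested on the red roof"

# Hypothetical pattern with a capturing group around the 'r'
grouped = re.compile(r"(r)ed")
# The same pattern without the group
plain = re.compile(r"red")

# With a group, findall() returns only what the group captured
grouped_hits = grouped.findall(text)
# Without a group, findall() returns the whole match
plain_hits = plain.findall(text)
# group(0) on a match object is always the whole match
full_hits = [m.group(0) for m in grouped.finditer(text)]

print(grouped_hits)  # ['r', 'r']
print(plain_hits)    # ['red', 'red']
print(full_hits)     # ['red', 'red']
```

If your pattern has parentheses in it, this may be worth checking first; if it does not, the problem is somewhere else in the expression.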

If you want to test regular expressions, you can use Kodos http://kodos.sourceforge.net/about.html (but I think it will only run in Python 2).
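
If installing a separate tool is not an option, a few lines of Python can serve as a rough regex tester. This is only a sketch: show_matches is a hypothetical helper name, and the sample sentence and pattern are invented for the example.

```python
import re

def show_matches(pattern, text):
    """Return (start, end, matched_text) for every match of pattern in text."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]

# Invented sample: any lowercase letter followed by "at"
sample = "The cat sat on the mat."
for start, end, found in show_matches(r"[a-z]at", sample):
    print(start, end, repr(found))
# 4 7 'cat'
# 8 11 'sat'
# 19 22 'mat'
```

Seeing the position next to each match makes it easier to spot a pattern that matches too little or too much.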

Thank you so much. I received the results that I wanted. And now all that is left is some editing of the regular expression.

THANK YOU!! :D

I think you have two things going (wr)on(g):

  1. Apparently, your regular expression is matching just 'r'. I base this on the results of your findall attempt.
  2. finditer yields match objects, not just substrings. You need to use the group() method on each match object.

Here is a program that works as you might expect:

import re
import sys

# use this to avoid spelling out the vowels and consonants every time
alphabet = {
  'v': 'aeiou',
  'c':'bcdfghjklmnpqrstvwxyz',
}

# String formatting:
# http://docs.python.org/library/stdtypes.html#string-formatting
p = re.compile("[%(v)s][%(c)s]+[%(v)s]"%alphabet)
print('The pattern is "%s"'%(p.pattern))

# the with statement closes the file for you, like try:...finally: f.close()
# sys.argv[0] is the path of this script, so the program scans its own source
with open(sys.argv[0],'r') as f:
  page = f.read()
  print("First, look at hits using finditer")
  for hit in p.finditer(page):
    print(hit.group())
  print("Now, look at result of findall")
  print(p.findall(page))
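
For the original goal of listing every occurrence of one fixed phrase, something along these lines might be enough. It is a sketch under assumptions: find_phrase is a made-up helper name, the sample text is invented, and re.escape is used so that characters such as '.' or '?' in the phrase are treated literally rather than as regex syntax.

```python
import re

def find_phrase(text, phrase, ignore_case=False):
    """Return (position, matched_text) for each occurrence of a literal phrase."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes every character in the phrase match itself
    pattern = re.compile(re.escape(phrase), flags)
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]

# Invented sample text standing in for the contents of a file
page = "Spam and eggs. More spam? Yes, spam."
for position, found in find_phrase(page, "spam", ignore_case=True):
    print(position, found)
# 0 Spam
# 20 spam
# 31 spam
```

To scan a real document, read the file into a string first (as in the program above) and pass that string in place of the sample text.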