Help with adding text to a list

Question

peste19 0 Newbie Poster

13 Years Ago

I am beginning to learn Python, so please go easy on me. I am trying to create a function that accepts user input and stores it in a list, separating the words and ignoring punctuation. Example:

input = i.am/lost

stores in list as "[i][am][lost]"

this is what i got so far

def text():
	sentence = 0
	while line !="EOF":
		sentence= raw_input()
		line1 = []
		processed_line = ""
		for char in sentence:
			if char.isalpha():
				processed_line = processed_line+char
			else:
				processed_line = processed_line+" "
				
		line1.append(processed_line)	
		line1.split()
		
		print processed_line
		
		print line1
	
text()

My problem is that it's adding everything as one item in the list instead of separating it word by word.

thanks in advance

python



All 10 Replies

woooee 814 Nearly a Posting Maven

13 Years Ago

Your indentation is funky, probably from using tabs instead of spaces. First, put "print processed_line.split()" at the end of your code: split() is a string method, and lists do not have it.
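To make the split() point concrete, here is a small sketch (not from the original post) of what the question's code does versus the suggested fix. It assumes the input was i.am/lost, so the loop has already built the string "i am lost". The variable names follow the question's code, and it uses print() so that it runs on Python 3.

```python
processed_line = "i am lost"      # what the question's loop builds for "i.am/lost"

line1 = []
line1.append(processed_line)      # one item: the whole string
print(line1)

try:
    line1.split()                 # lists have no split() method
except AttributeError as error:
    print("AttributeError:", error)

words = processed_line.split()    # split the string, not the list
print(words)
```

The list ends up with a single element because append() adds its argument as one item, while split() on the string returns a new list with one element per word.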

You can use split() on the original string, but then you still have to iterate letter by letter and test for alpha as you do in your code. To get your code to run correctly, you should append "processed_line" when a non-alpha character is found (under the else) and then set "processed_line" to an empty string, ready to accept the next word. Note that with input like "I am a dog, I am not a cat.", you get a break on the comma and again on the space that follows it, which leads to an empty "processed_line" being appended to the list. So, check that "processed_line" has a positive length before appending. While testing, it is also a good idea to print "processed_line" and "line1" both before and after the append, so you know what is going on.

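Also, the while loop tests "line", but "line" is never defined: the input is stored in "sentence". Python raises a NameError the first time the condition is evaluated, and even if "line" existed it would never change, so the loop could never see "EOF".
while line !="EOF":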
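A minimal sketch of a loop that does stop on "EOF": the condition has to test the variable that the input call actually assigns. The name read_until_eof and its read_line parameter are hypothetical, introduced only so the loop can be tried without a keyboard; in the thread's Python 2 code the reader would be raw_input.

```python
def read_until_eof(read_line, sentinel="EOF"):
    """Collect lines from read_line() until the sentinel is seen.

    read_line is any zero-argument callable that returns one line of text
    (a hypothetical parameter: raw_input in Python 2, input in Python 3).
    """
    lines = []
    while True:
        sentence = read_line()
        if sentence == sentinel:   # test the variable that was just assigned
            break                  # stop before the sentinel is processed
        lines.append(sentence)
    return lines


# Simulated user input instead of a real prompt.
fake_input = iter(["i.am/lost", "i.going.to.work", "EOF", "never read"])
print(read_until_eof(lambda: next(fake_input)))
```

Checking for the sentinel right after reading also avoids treating the word EOF itself as a sentence, which a "while sentence != ..." condition at the top of the loop would do once.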


woooee 814 Nearly a Posting Maven

13 Years Ago

"it will only list i, going, to and skip work because there is no non alpha character after work"

Correct, so after the loop you have to test whether "processed_line" has any length (it may be empty, since the input could end with a period) and, if so, append it to the list. Post your new code.
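The test after the loop can be sketched like this. The function name split_words is hypothetical, and this version returns the list instead of printing it, so treat it as an illustration of the idea rather than the thread's exact code.

```python
def split_words(sentence):
    """Return the runs of letters in sentence as a list of words."""
    words = []
    current = ""
    for char in sentence:
        if char.isalpha():
            current += char
        elif current:            # a non-letter ends a word; skip empty runs
            words.append(current)
            current = ""
    if current:                  # flush the last word when input ends on a letter
        words.append(current)
    return words


print(split_words("i.going.to.work"))
print(split_words("I am a dog, I am not a cat."))
```

Without the final "if current" check, the first call would lose "work", because nothing after it triggers the append inside the loop.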

woooee 814 Nearly a Posting Maven

13 Years Ago

See the comments.

def text():
    sentence = 0
    while sentence != "EOF":
        sentence = raw_input()
        line1 = []
        processed_line = ""
        for char in sentence:  ## use the input string
            ## isalnum() includes isalpha()
            #if char.isalpha() or char.isalnum():
            if char.isalnum():
                processed_line = processed_line + char
            else:
                if len(processed_line):  ## test for empty string
                    line1.append(processed_line)
                processed_line = ""  ## empty, not a space
        if len(processed_line):  ## append the final word
            line1.append(processed_line)
        print processed_line
        print sentence.split()  ## second way to do it
        print line1

text()



TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 1 · 2011-10-14T22:00:10+00:00

Check itertools.groupby and use it with str.isalpha as the grouping (key) function.

peste19 0 Newbie Poster · Answer 2 · 2011-10-14T22:19:03+00:00

i was trying to use .split() but i am implementing it wrong somehow

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 3 · 2011-10-15T00:21:02+00:00

I meant this:

>>> import itertools
>>> input = "i.am/lost"
>>> words = [''.join(word) for letters, word in itertools.groupby(input, lambda x: x.isalpha())]
>>> words
['i', '.', 'am', '/', 'lost']
>>> words = [''.join(word) for letters, word in itertools.groupby(input, lambda x: x.isalpha()) if letters]
>>> words
['i', 'am', 'lost']
>>>
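For anyone new to groupby, a hedged unpacking of the one-liner above: groupby walks the string and yields (key, group) pairs, one pair per run of consecutive characters that share the same key. Here the key is str.isalpha, so runs of letters get True and everything else gets False. The function name words_by_groupby is made up for this illustration.

```python
import itertools


def words_by_groupby(text):
    """Return the runs of letters in text, using itertools.groupby."""
    words = []
    # Each iteration gives the key (True/False) and an iterator over one run.
    for is_alpha, run in itertools.groupby(text, key=str.isalpha):
        if is_alpha:                      # keep letter runs, drop punctuation runs
            words.append("".join(run))
    return words


# Show every run with its key, then only the words.
runs = [(key, "".join(group))
        for key, group in itertools.groupby("i.am/lost", key=str.isalpha)]
print(runs)
print(words_by_groupby("i.am/lost"))
```

Because groupby only merges neighbouring items, the last word needs no special handling: the final run ends when the string ends.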

peste19 0 Newbie Poster · Answer 4 · 2011-10-15T00:56:09+00:00


i did some changes, but there is a flaw with putting the append in the else statement. For example, if i input this

i.going.to.work

it will only list i, going, to and skip work, because there is no non-alpha character after work

and thanks for looking at my thread

@TrustyTony: i really didn't want to get into itertools yet since i am still learning and don't want to get ahead of myself, but thanks for showing that way to me

JoshuaBurleson 23 Posting Whiz · Answer 5 · 2011-10-15T01:30:10+00:00

maybe I'm misunderstanding, but wouldn't something simple like

def make_norm_lis(entry):
	lis=[]
	for letter in entry:
		if letter.isalpha():
			lis.append(letter)
		else:
			lis.append(' ')
	lis=''.join(lis)
	lis=lis.split()
	return lis

work?

>>> st='!@#this@#$is//sparta'
>>> print(make_norm_lis(st))
['this', 'is', 'sparta']
>>> 
>>> work='i.going.to.work'
>>> 
>>> print(make_norm_lis(work))
['i', 'going', 'to', 'work']
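The same idea can be condensed with a generator expression. This is only a sketch of an equivalent formulation, and the name make_norm_list is a hypothetical variant of the function above.

```python
def make_norm_list(entry):
    """Replace every non-letter with a space, then split on whitespace."""
    spaced = "".join(ch if ch.isalpha() else " " for ch in entry)
    return spaced.split()   # split() with no argument drops empty pieces


print(make_norm_list("!@#this@#$is//sparta"))
print(make_norm_list("i.going.to.work"))
```

Calling split() without an argument treats any run of spaces as one separator, which is why consecutive punctuation marks do not produce empty strings in the result.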

peste19 0 Newbie Poster · Answer 6 · 2011-10-15T01:51:38+00:00

i got this

def text():
	sentence = 0
	while sentence !="EOF":
		sentence = raw_input()
		line1 =[]
		processed_line = ""
		for char in line:
			if char.isalpha() or char.isalnum():
				processed_line = processed_line+char

			else:
				line1.append(processed_line)
				processed_line = " "
		print processed_line

		print line1
	
text()

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 7 · 2011-10-15T02:28:18+00:00

My version of yours:

def text():
    # docstring, which comes to tooltip of function
    """ This function processes out non-alphanumerics from user input
        function exits, when user inputs empty inputline == Enter

    """
    # common loop until break loop
    while True:
        # more descriptive name
        words =[]
        # another variable name change, it really does matter, doesn't it
        this_word = ""
        #  prompt it and add punctuation to get out of special casing last word
        for char in raw_input('Your words (Enter to finish): ')+'.':
            if char.isalnum(): # one enough, .isalpha not needed
                this_word += char
            else:
                if this_word:
                    words.append(this_word)
                    this_word = "" # clear only once
        if words:
            # as string
            print ' '.join(words)
        else:
            break
    
text()
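For experimenting without typing at a prompt, the word-splitting part of this version can be pulled into its own function. This is a sketch that assumes the same trick of adding a trailing '.' so the last word is flushed inside the loop; extract_words is a made-up name, and the code uses only constructs that also work on Python 3.

```python
def extract_words(line):
    """Return the alphanumeric runs in line as a list of words."""
    words = []
    this_word = ""
    # The appended '.' guarantees a non-alphanumeric character after the
    # last word, so no separate check is needed once the loop finishes.
    for char in line + ".":
        if char.isalnum():
            this_word += char
        elif this_word:
            words.append(this_word)
            this_word = ""
    return words


print(" ".join(extract_words("i.going.to.work")))
print(extract_words(""))   # an empty line gives an empty list, which ends the loop in text()
```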
