How to handle end-of-line characters

Question

pink_872 0 Newbie Poster

13 Years Ago

I am trying to read from a txt file and count the number of times each word appears. The problem is that it counts the EOL characters as well. I tried using rstrip, but it didn't do anything. So how can I handle these end-of-line characters?
Please help.

Object= open('w.txt','r')
L= Object.read().rstrip()
occurrenences={}

for word in L.split():
    occurrenences[word] = occurrenences.get(word,0)+1
    
for word in occurrenences:
    print(occurrenences[word],word)

Object.close()

python

All 3 Replies

GDICommander 54 Posting Whiz in Training

13 Years Ago

On my machine (Windows), when providing a file that contains "a b c d e", I have this output:

(1, 'a')
(1, 'c')
(1, 'b')
(1, 'e')
(1, 'd')

There is no EOL character in the output, which is expected. So the code did not pick up the EOL character on my machine.

Are you running your script on another OS?
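
A quick way to check this on any OS is to build a small file with known line endings and count its words. The following is only a minimal sketch: the helper name count_words and the sample contents are made up for illustration, and it assumes Python 3.

```python
import os
import tempfile
from collections import Counter

def count_words(path):
    """Return a Counter mapping each whitespace-separated word to its count."""
    with open(path, "r") as handle:
        # split() with no argument treats any run of whitespace,
        # including \r and \n, as a single separator.
        return Counter(handle.read().split())

# Hypothetical sample file; newline="" keeps the \r\n endings exactly as written.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("a b c\r\nd e a\r\n")
    sample_path = tmp.name

counts = count_words(sample_path)
os.remove(sample_path)

for word, n in sorted(counts.items()):
    print(n, word)
```

If any key in counts contained "\r" or "\n", the line endings would be leaking into the words; with a plain split() they should not.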

pink_872 0 Newbie Poster · Answer 1 · 2011-03-21T02:05:52+00:00

I am running on Windows 7. Let's say that I had the following line in my txt file:
My \r name \r is Nana \n

I want my program to skip \n and \r. I don't want them to be counted.
I tried everything, but it still didn't work.

PLEASE HELP :(

griswolf 304 Veteran Poster · Answer 2 · 2011-03-21T02:17:41+00:00

Your code should just work, because split() with no arguments splits on any run of whitespace, including the various newline characters. However, you can try splitting explicitly on every whitespace character:

import re
# ...
for word in re.split(r'\s', L):
    if word:
        occurrences[word] = occurrences.get(word,0)+1
# ...

The extra test is needed because when you specify the split characters, you get empty strings wherever there are adjacent split characters, as you have in your example.
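
To make the difference concrete, here is a small sketch comparing the ways of splitting on the sample line from the question (assuming the \r and \n there are real control characters rather than literal backslashes). One thing to keep in mind is that str.split with an argument treats that argument as one whole separator, not as a set of characters, so a regular expression is used here for per-character splitting.

```python
import re
import string

text = "My \r name \r is Nana \n"

# No argument: any run of whitespace is one separator, and
# leading/trailing whitespace produces no empty strings.
default_words = text.split()

# With an argument, the whole string is a single separator. The exact
# sequence in string.whitespace never occurs in text, so nothing is split.
unsplit = text.split(string.whitespace)

# Splitting on each single whitespace character leaves empty strings
# between adjacent separators, which is why they must be filtered out.
pieces = re.split(r"\s", text)
words = [piece for piece in pieces if piece]

print(default_words)
print(unsplit)
print(pieces)
print(words)
```

In this sketch, default_words and words end up identical, which is why the plain split() in the original script is usually enough.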
