test.txt

Hello,
Basically, I've parsed a website's data into an object. I wrote it to a file and read it back in to write to a dictionary, but when I read it in, it seems to read it about 3 times. So I'm a bit confused as to why, and would like to know which bit of the code is wrong. I'm new to Python and am learning to debug my code, but it's sometimes a bit hard to find the problem, as I think it's all legit.
So please point out where I've gone wrong, thanks. The output file itself looks fine, as it writes it only once.

#!/usr/bin/env python

import HTMLParser

class MyParser(HTMLParser.HTMLParser):
    ##########################################################
    def __init__(self):
        HTMLParser.HTMLParser.__init__(self)
        self.titleFound = False
        return

    ##########################################################
    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self.titleFound = True
        return

    ##########################################################
    def handle_data(self, titleString):
        if self.titleFound == True:
            filename = "test.csv"
            file = open(filename, 'a')
            file.write(titleString)
            file.close()
            file = open(filename)
            print file.read()
            file.close()
        return

    ##########################################################
    def handle_endtag(self, tag):
        if tag == 'td':
            self.titleFound = False
        return

#############End of Class definition

if __name__ == '__main__':
    titleExtractor = MyParser()
    buffer = open('live.html', 'r').read()
    titleExtractor.feed(buffer)

All 4 Replies

Sorry about the double post, but it won't let me re-edit my post.
file = open(filename, 'a')
I know the 'a' is append and should be 'w' for write, but using 'w' only writes the time.
file = open(filename) is actually file = open(filename, 'r')
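
The difference between the two modes can be sketched roughly as follows. This is only an illustration, assuming Python 3; write_chunks, compare_modes and the temporary files are made up for the example and are not part of the script above. Because handle_data reopens the file for every piece of text, 'w' truncates the file on each call and only the last piece survives, while 'a' keeps adding to what is already there.

```python
import os
import tempfile

def write_chunks(path, chunks, mode):
    """Reopen the file once per chunk, the way handle_data does in the script."""
    for chunk in chunks:
        with open(path, mode) as out:
            out.write(chunk)
    with open(path) as src:
        return src.read()

def compare_modes(chunks):
    # Hypothetical helper: run the same chunks through 'w' and 'a'.
    results = {}
    for mode in ("w", "a"):
        fd, path = tempfile.mkstemp(suffix=".csv")
        os.close(fd)
        try:
            results[mode] = write_chunks(path, chunks, mode)
        finally:
            os.remove(path)
    return results

if __name__ == "__main__":
    # The sample chunks are invented; 'w' keeps only the last one.
    print(compare_modes(["name", "price", "time"]))
```

Note that with 'a' the file also keeps whatever an earlier run of the script left in it, so test.csv grows every time the script is run unless it is deleted or truncated first.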

The problem may be that you are using the built-in name "file" and shadowing it. In Python 2, file and open do basically the same thing. So use a name other than "file", check the contents of the output file with a text editor or word processor to see whether the data is really there 3 times, and post back if the problem doesn't disappear.
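
To show what reusing a built-in name does, here is a small sketch. It assumes Python 3, where file no longer exists as a built-in, so list stands in for it; shadowed_call and safe_call are made-up names for the illustration.

```python
def shadowed_call():
    list = [1, 2, 3]  # rebinding the name hides the built-in inside this function
    try:
        return list("abc")  # now tries to call the list object, not the built-in
    except TypeError as exc:
        return "TypeError: " + str(exc)

def safe_call():
    items = [1, 2, 3]  # a different name leaves the built-in untouched
    return list("abc"), items

if __name__ == "__main__":
    print(shadowed_call())
    print(safe_call())
```

Rebinding the name is legal and only causes trouble if the built-in is needed again in the same scope, so it is worth ruling out but may not be the whole story here.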

Also, please post code using code tags, as this preserves the formatting and highlights keywords and built-in names (such as file). This will increase the chance that other forum members will read your post and be willing to help.

Use code tags like this:
[code=python] # Your code goes here

[/code]

Done it. I had to copy and paste the file-reading code onto the bottom of the script.
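
For anyone finding this later: in the original script, handle_data prints the whole file after every append, so the earlier cells appear to be printed again for each new cell, even though the file itself contains them only once. Below is a sketch of the same fix taken one step further, collecting the cell text in memory and writing and reading the file once after feed() has finished. It assumes Python 3 (where the module is html.parser rather than HTMLParser); CellParser and extract_cells are made-up names, not the code from the thread.

```python
import os
import tempfile
from html.parser import HTMLParser

class CellParser(HTMLParser):
    """Collect the text found inside <td> tags."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells.append(data)  # no file I/O while parsing

def extract_cells(html, out_path):
    parser = CellParser()
    parser.feed(html)
    parser.close()
    # Write once, then read back once, after parsing is complete.
    with open(out_path, "w") as out:
        out.write("".join(parser.cells))
    with open(out_path) as src:
        return src.read()

if __name__ == "__main__":
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # The sample markup is invented for the demonstration.
        print(extract_cells("<table><tr><td>a</td><td>b</td></tr></table>", path))
    finally:
        os.remove(path)
```

Writing once with 'w' also avoids the file growing across runs, which the append mode in the original would cause.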
