test.txt

Hello,
Basically, I've parsed a website's data into an object. I wrote it to a file and read it back in so I can write it to a dictionary, but when I read it in, it reads it about 3 times. I'm a bit confused as to why and would like to know which bit of the code is wrong. I'm new to Python and am learning to debug my code, but it's sometimes hard to find the problem, as I think it's all legit.
So please point out where I've gone wrong. Thanks. The output file itself looks fine, as it writes it only once.

#!/usr/bin/env python

import HTMLParser

class MyParser(HTMLParser.HTMLParser):
    ##########################################################
    def __init__(self):
        HTMLParser.HTMLParser.__init__(self)
        # True while the parser is inside a <td> element
        self.titleFound = False
        return

    ##########################################################
    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self.titleFound = True
        return

    ##########################################################
    def handle_data(self, titleString):
        if self.titleFound == True:
            filename = "test.csv"
            # Append this cell's text to the output file
            file = open(filename, 'a')
            file.write(titleString)
            file.close()
            # Read the whole file back and print it
            file = open(filename)
            print file.read()
            file.close()
        return

    ##########################################################
    def handle_endtag(self, tag):
        if tag == 'td':
            self.titleFound = False
        return

#############End of Class definition

if __name__ == '__main__':
    titleExtractor = MyParser()
    buffer = open('live.html', 'r').read()
    titleExtractor.feed(buffer)

Attachments
Flight  From                       Scheduled  Remark
T34712  ABERDEEN                   0800       LANDED 08:00
BE171   SOUTHAMPTON                0820       LANDED 08:07
WOW482  NEWQUAY / BRISTOL          0835       LANDED 08:24
NM322   ISLE OF MAN                0850       LANDED 08:36
LS324   BELFAST INTL               0910       LANDED 08:54
BD404   EDINBURGH                  0925       LANDED 09:03
BD291   GLASGOW                    0930       LANDED 09:09
KL1545  AMSTERDAM                  0940       LANDED 09:17
T34701  SOUTHAMPTON                0945       LANDED 09:33
BD612   BRUSSELS                   0955       LANDED 09:35
LS202   AMSTERDAM                  1010       LANDED 09:45
FR152   DUBLIN                     1025       LANDED 10:29
BD412   HEATHROW                   1055       LANDED 10:47
LS456   PARIS (CHARLES-DE-GAULLE)  1150       LANDED 12:10
LS286   GENEVA                     1245       LANDED 12:32
T34714  ABERDEEN                   1435
BD414   HEATHROW                   1445
KL1549  AMSTERDAM                  1555
BE731   BELFAST CITY               1620
T34705  SOUTHAMPTON                1710
NM328   ISLE OF MAN                1725
BD418   HEATHROW                   1800
WOW486  PLYMOUTH / BRISTOL         1810
T34716  ABERDEEN                   1910
LS232   BARCELONA                  1915
BD406   EDINBURGH                  1925
BD297   GLASGOW                    1930
BD616   BRUSSELS                   1930
FR9078  ALICANTE                   1930
LS328   BELFAST INTL               2000
BE175   SOUTHAMPTON                2020
T34707  SOUTHAMPTON                2025
LS206   AMSTERDAM                  2110
KL1543  AMSTERDAM                  2120
LS348   DUSSELDORF                 2130
BD420   HEATHROW                   2145
FR156   DUBLIN                     2205

Current time: 14:03, 17 Feb 2009
Last updated: 13:59, 17 Feb 2009

Sorry about the double post, but it won't let me re-edit my post.
file = open(filename, 'a')
I know the 'a' is append and it should be 'w' for write, but using 'w' only writes the time.
file = open(filename) is actually file = open(filename, 'r')
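
The 'a' versus 'w' behaviour described above can be reproduced in isolation. This is only a minimal sketch, assuming Python 3; the helper name write_chunks, the sample strings, and the temporary file names are made up for illustration. It reopens the file once per piece of text, the way handle_data does in the posted code. With 'w' every open truncates the file, so only the last piece (the time, in the original page) survives.

```python
import os
import tempfile


def write_chunks(path, chunks, mode):
    """Reopen the file once per chunk (as the posted handle_data does),
    then return the final contents of the file."""
    for chunk in chunks:
        with open(path, mode) as out:
            out.write(chunk)
    with open(path) as src:
        return src.read()


# Hypothetical stand-ins for the text of successive <td> cells.
chunks = ["BD420", "HEATHROW", "2145", "time: 14:03"]
tmpdir = tempfile.mkdtemp()

# 'a' keeps everything written so far and adds to the end.
appended = write_chunks(os.path.join(tmpdir, "append.csv"), chunks, "a")
# 'w' truncates on every open, so only the last chunk is left.
truncated = write_chunks(os.path.join(tmpdir, "write.csv"), chunks, "w")

print(appended)
print(truncated)
```

Opening the file once with 'w' before parsing, and writing to that one open handle, would avoid both the truncation and the need for 'a'.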

The problem may be that you are using the built-in name "file" as a variable and so shadowing it. In Python 2, file and open do basically the same thing. So use a name other than "file", check the contents of the output file with a text editor or word processor to see whether the data is really there 3 times, and post back if the problem doesn't disappear.

Comments
good catch

Also, please post code using code tags, as this preserves the formatting and highlights keywords and built-in names (such as file). It will increase the chance that other forum members read your post and are willing to help.

Use code tags like this:
[code=python] # Your code goes here

[/code]

Done it. I had to copy and paste the code that reads the file back onto the bottom of the script.
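
For anyone finding this later, the change described above might look roughly like the sketch below. In the posted code, every call to handle_data appended one cell and then printed the whole file so far, so earlier cells were printed again on each call even though the file held them only once. This sketch assumes Python 3 (where the module is html.parser rather than HTMLParser); the class name CellParser, the sample HTML, and the comma separator are my own assumptions, not the original poster's code or the real live.html.

```python
from html.parser import HTMLParser
import os
import tempfile


class CellParser(HTMLParser):
    """Collects the text found inside <td> elements."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Only store the text here; no file reading inside the callback.
        if self.in_cell:
            self.cells.append(data.strip())


# Hypothetical stand-in for the contents of live.html.
sample_html = (
    "<table>"
    "<tr><td>BD420</td><td>HEATHROW</td><td>2145</td></tr>"
    "<tr><td>FR156</td><td>DUBLIN</td><td>2205</td></tr>"
    "</table>"
)

parser = CellParser()
parser.feed(sample_html)
parser.close()

# Write once, after parsing has finished.
out_path = os.path.join(tempfile.mkdtemp(), "test.csv")
with open(out_path, "w") as out:
    out.write(",".join(parser.cells))

# Read back once, at the bottom of the script.
with open(out_path) as src:
    contents = src.read()
print(contents)
```

Keeping the cells in a list also leaves them available for building the dictionary mentioned in the first post, without going through the file at all.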
