Using py2exe I've made an exe file which processes a file then creates a new file which details the errors found in the original file.
However, at the moment I have to change the file allocation in the code each time I want to do it on a different file name.

I'm looking to either be able to drag and drop a file onto the exe and treat that as the input, or enable the script to work no matter what the file name is (as long as it ends in .kml).
In an ideal world I'd love to be able to throw a bunch of .kml files into a folder then just double click the exe and have it create a txt file for each .kml called [original file name]-errors.txt.
I can vaguely picture how to do this but I'm completely thrown by the file handling aspect of it. Could anyone please explain how to work with files without knowing their names? Initially struck my mind as maybe *.kml working, didn't naturally, probably a better way of explaining it though.
I'm not very skilled with file handling and programming in general beyond the basics, so a simple explanation would be great.

Help is MUCH appreciated, and thank you in advance.

This simple solution will work for all files in the current directory.

import os

filelist = os.listdir(os.curdir)  # os.curdir ('.') is the current directory; '' is not portable

for files in filelist:
    basename, ext = os.path.splitext(files)
    if ext == '.kml':
        f_output = open('%s-errors.txt' % basename, 'w')
        f_output.write('whatever you want')
        f_output.close()

Cheers and Happy coding

import os
myext='.kml'
for filename in (fn for fn in os.listdir(os.curdir) if fn.endswith(myext)):
    print filename # replace with your activity

Works beautifully! Had to do a bit of awkward shuffling to actually read the file, but it works now!

@tony - thanks for your help, but I opted for Beat Slayer's solution as it looked more simpleton-friendly :)

(can't see an edit button on my post? sorry for double posting)

Actually, after using it I've come across a problem. It seems to read all the files a couple of times, seemingly at random.

With this code:

counter = 0
for file in filelist:
    counter +=1
    
    basename, ext = os.path.splitext(file)
    if ext <> ".kml":
        break
    print file
    print counter

I get this output (there are 3 files in the folder):

[file1]
1
[file1]
1
[file2]
2
[file3]
3
[file2]
2
[file3]
3

Which I found rather bizarre, considering the counter appears to go backwards (it isn't used anywhere else in the code) and doesn't increment at all between the first and second file names printed. That suggests the "counter +=1" line is being skipped while the "print counter" line still runs.
I really can't understand that at all. Getting brainfreeze just thinking about it.

Why do you want to stop the loop when a non-.kml file is found, and not process the .kml files after it?

The preferable way to combine the for loop and the counter is:

for count, file in enumerate(filelist):
    ## count is zero based, add one if necessary
    print count + 1, file

I would suggest doing print(filelist) before the loop to make sure you got the filelist right.
Maybe it would be better to count only the .kml files and not care about the others:

import os

myext = '.kml'
count = 0
for file in os.listdir(os.curdir):
    basename, ext = os.path.splitext(file)
    if ext == myext:
        count+=1
        print count,':',file

I noticed this mistake myself and changed "break" to "continue", but with no improvement.

Tried printing filelist too, and it was correct, just the 3 files, but it also showed me something interesting.
It printed out the filelist more than once, when the "print filelist" was at the start of the code, in no loops whatsoever.
So it looks like the whole script is being repeated somehow; I didn't think that was even possible without a loop or an external script.

You must post the whole script, or find where you have a loop or a recursive call.

The Python module glob is much better for these things. Also note that on Windows glob has the advantage that extensions are not case sensitive ...

# use module glob to get a list of selected files

import glob
import os

# select only .exe files
# or any other file extension you are interested in
directory = "C:/Temp/*.exe"
for path in glob.glob(directory):
    print( path )
    # optional ...
    dirname, filename = os.path.split(path)
    basename, ext = os.path.splitext(filename)
    # test ...
    print( dirname, filename, basename, ext )
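
Applied to the original question, a glob-based version might look roughly like the sketch below. The helper name error_report_paths is made up for illustration, and the report name follows the "[original file name]-errors.txt" pattern from the first post. The extension is compared in lower case explicitly, because glob's own case handling follows the operating system.

```python
import glob
import os


def error_report_paths(directory, ext=".kml"):
    """Map each file in `directory` ending in `ext` to its report path.

    Hypothetical helper, not part of the original script.
    """
    reports = {}
    # List everything, then compare extensions in lower case so the
    # result does not depend on the operating system's case rules.
    for path in sorted(glob.glob(os.path.join(directory, "*"))):
        basename, found_ext = os.path.splitext(os.path.basename(path))
        if os.path.isfile(path) and found_ext.lower() == ext:
            reports[path] = os.path.join(directory, basename + "-errors.txt")
    return reports


if __name__ == "__main__":
    for source, report in error_report_paths(os.curdir).items():
        print("%s -> %s" % (source, report))
```

Each key is a file to read and each value is the text file to write for it, so the checking code only has to loop over the dictionary.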

OK, then let's add lower() to my generator as well, to get the same case-insensitive behaviour.

import os
myext='.kml'
for filename in (fn for fn in os.listdir(os.curdir) if fn.lower().endswith(myext)):
    print filename # replace with your activity

Why use break or continue for this purpose?

Cheers and Happy coding

Cheers for the tip about glob, I'll keep it in mind for the future. For now I've got that part done, but I'm struggling to understand why my program is so glitchy.

Here is the full code, excluding the functions module, which isn't really relevant to the problem I'm having of the files being processed repeatedly.

import os, sys
import functions

os.chdir('C:\Documents and Settings\Chris\My Documents\Chris\GPS work\Check')
if os.path.isdir('C:\Documents and Settings\Chris\My Documents\Chris\GPS work\Check\Errors') == False:
    os.mkdir('C:\Documents and Settings\Chris\My Documents\Chris\GPS work\Check\Errors')
filelist = os.listdir('')
for file in filelist:
    os.chdir('C:\Documents and Settings\Chris\My Documents\Chris\GPS work\Check')
    points = []
    first = True
    CourseName = ''
    errors = 0
    towrite = []
    
    basename, ext = os.path.splitext(file)
    if ext <> ".kml":
        continue
    old = open(file, 'r')
    towrite.append("Checking file " + basename)
    for line in old:        
        if line.find("<name>") <> -1:
            start = line.find("<name>") + 6
            end = line.find("</name>")
            if line[start:end].find(".kml") <> -1:
                continue
            if first == True:
                CourseName = line[start:end]
                first = False
            else:
                points.append(line[start:end])
    if CourseName <> basename:
        towrite.append("Filename does not match course name")
        errors +=1
    towrite, errors = functions.analyse(points, errors, towrite)
    os.chdir('C:\Documents and Settings\Chris\My Documents\Chris\GPS work\Check\Errors')
    if errors <> 0:
        new = open('ERRORS-%s.txt' % basename, 'w')
        for item in towrite:
            new.write(item + "\n")
        new.close()

I know it's probably messy and not as efficient as it could be, but it works for me.. Well, it doesn't actually, but you know what I mean.

The first one works great, here's the textfile output:

Checking file 17951-WindyKnollGolfClubSpringfieldOhioUSA
Filename does not match course name

1 errors found.

The second course doesn't have any errors, so the txt file isn't created, but here's the third, where it all goes wrong:

0 errors found.

0 errors found.
Invalid second qualifier (Line 20)
water n m5
Invalid second qualifier (Line 69)
trees r f 15

2 errors found.

The whole code is being repeated (earlier post) and I'm not sure why or how, and it seems to be adding extra to this file. Desired output would be:

Checking file [filename]

Invalid second qualifier (Line 20)
water n m5
Invalid second qualifier (Line 69)
trees r f 15

2 errors found.

I've checked that all the variables are reset for each file. The problem is that when the second course is repeated, the file is read in as empty.

Would it be worth trying glob to see if there's any improvement? If so, where would I change the code?

@Beat Slayer - so it doesn't waste time executing all the code when it's not the right file anyway.

You have not written the Windows filenames properly: use r'' instead of '' for Windows filenames, or double each \ to \\
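
To see why this matters, here is a small sketch using a made-up path (not the one from the script). In a normal string literal, sequences such as \t and \n are turned into control characters, while a raw string or doubled backslashes keep them as typed.

```python
# Hypothetical path, chosen because it contains \t and \n.
plain = 'C:\temp\new.kml'      # \t becomes a tab, \n becomes a newline
raw = r'C:\temp\new.kml'       # raw string: backslashes kept as typed
doubled = 'C:\\temp\\new.kml'  # escaped backslashes: same text as raw

print(repr(plain))
print(repr(raw))
print(raw == doubled)
```

The paths in the posted script probably only worked because sequences like \D and \C are not escape codes; a folder name starting with t or n would silently break them.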

I tried to refactor your code a little:

import os, sys
import functions

extension = '.kml'
checkdir = r'C:\Documents and Settings\Chris\My Documents\Chris\GPS work\Check'
errordir = os.path.join(checkdir,'Errors')

if not os.path.isdir(errordir):
    os.mkdir(errordir)

filelist = os.listdir(checkdir)

for file in filelist:
    basename, ext = os.path.splitext(file)
    if ext.lower() == ".kml":
        points = towrite = []
        coursename = ''
        errors = 0
           
        towrite.append("Checking file " + basename)

        for line in open(os.path.join(checkdir,file)):
            if "<name>" in line:
                 _,_,info = line.partition("<name>")
                 info,_,_ = info.partition("</name>") ## always in same line?
                
                 if extension not in info: ## skip names that contain the extension
                      if not courseName:
                          coursename = info
                          if coursename <> basename:
                              towrite.append("Filename does not match course name: %s" % coursename)
                              errors +=1
                      else: points.append(info)

        towrite, errors = functions.analyse(points, errors, towrite)

        if errors: ## sometimes towrite but not errors, or is errors == len(towrite) ??
             with open(os.path.join(errordir, 'ERRORS-%s.txt' % basename), 'w') as new:
                 new.write("\n".join(towrite))

I do not know if this works, but this is how I interpreted your code without knowing exactly what it is meant to do.

Yes that seems very similar. Made a few changes, a few good points of improvements that I've adopted. Don't understand the _,_, section so I'll leave that for now.

The problem is the script keeps repeating itself, without apparent cause. For example if I put a "print "hello"" as the first line of the script, it'll print multiple times. It's the main script so nothing is calling it.
Cannot explain how the entire code is being repeated without me telling it to.

Are you importing the code from another module?
Try putting this after the imports:

if __name__ == '__main__':

and indent the rest of your code under it.
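
A minimal sketch of that layout, assuming the work is moved into a main() function; find_kml and main are illustrative names, not taken from the original script:

```python
import os


def find_kml(directory):
    """Hypothetical helper: names of the .kml files in `directory`."""
    return sorted(fn for fn in os.listdir(directory)
                  if fn.lower().endswith('.kml'))


def main():
    # Everything that used to sit at the top level of the script goes here.
    for name in find_kml(os.curdir):
        print("Checking file " + name)


if __name__ == '__main__':
    # True only when this file is run directly,
    # not when another module imports it.
    main()
```

With this layout, importing the file from another module only defines the functions and does not start the processing loop.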

Sorry, that code had some faults, as I do not have proper files to test it with:

I changed the coursename variable name to lowercase, as is normal in Python (I have no files with the correct <name> tags, so the program never entered that condition), but it looks like one camelCase version was left on line 28. I also added a variable for the extension, but the version I posted does not use it on line 15 (that was for testing the code without .kml files).

Partition is a very efficient method for dividing strings, please learn to use it!

str.partition(sep)

Split the string at the first occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return a 3-tuple containing the string itself, followed by two empty strings.
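
As a small illustration of the description above, using a made-up sample line: the name _ is just a conventional throwaway variable for the parts of the tuple that are not needed.

```python
line = '    <name>Hole 1</name>\n'   # made-up sample line

# Keep only what follows the opening tag.
before, sep, after = line.partition('<name>')

# Then keep only what precedes the closing tag; _ discards the rest.
info, _, _ = after.partition('</name>')
print(info)

# When the separator is missing, the whole string comes back first,
# followed by two empty strings.
print('no tags here'.partition('<name>'))
```

Note that the second call is made on the result of the first one, not on the original line.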

Consider also that if you change the towrite list inside functions.analyse, the changes will be visible to the caller, as a list is a mutable data type. You can probably get the count of errors from the length of towrite, so it does not need to be passed around either (len on a list is a cheap, constant-time call).
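
A rough sketch of that idea follows. The analyse function here is only a stand-in with an invented check; the real functions.analyse was never posted.

```python
def analyse(points, towrite):
    """Hypothetical stand-in for functions.analyse: flag blank point names."""
    for number, point in enumerate(points, 1):
        if not point.strip():
            # Appending modifies the caller's list in place.
            towrite.append("Empty point name (item %d)" % number)
    # Nothing is returned: the caller already sees the new messages.


towrite = []
analyse(["tee", " ", "green"], towrite)
print(towrite)
print(len(towrite))
```

Because the same list object is shared between the caller and the function, there is no need to return it or to keep a separate error counter in step with it.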

So my problem is that I can't import the variables into the functions script without re-executing the code.
On its way to finding the variables (the line in the functions code that says 'import check'), it re-executes the main script. How do I stop this?
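
The repetition can be reproduced with two tiny files. The sketch below writes stand-ins for check.py and functions.py into a temporary folder and runs check.py; the file names come from the thread, but their contents are invented for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the main script: top-level code plus "import functions".
CHECK = (
    "print('check.py body running as ' + __name__)\n"
    "import functions\n"
)
# Same script with the top-level code moved under the __main__ guard.
GUARDED = (
    "import functions\n"
    "if __name__ == '__main__':\n"
    "    print('check.py body running as ' + __name__)\n"
)
# Stand-in for functions.py, which imports the main script back.
FUNCTIONS = "import check\n"


def run_check(check_source):
    """Write both files to a temp dir, run check.py, return its output lines."""
    workdir = tempfile.mkdtemp()
    with open(os.path.join(workdir, "check.py"), "w") as handle:
        handle.write(check_source)
    with open(os.path.join(workdir, "functions.py"), "w") as handle:
        handle.write(FUNCTIONS)
    output = subprocess.check_output([sys.executable, "check.py"], cwd=workdir)
    return output.decode().splitlines()


print(run_check(CHECK))    # the body runs twice: as __main__ and as check
print(run_check(GUARDED))  # the body runs once
```

The script started from the command line is loaded under the name __main__, so "import check" does not find it already loaded and executes the file a second time. Removing the import from functions.py and passing values as parameters avoids the cycle altogether.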

Also, the "points = towrite = []" sends my code into an infinite loop.
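
The chained assignment binds both names to one and the same list, which this short sketch shows. If analyse loops over points while appending messages to towrite (an assumption, since that code was not posted), the loop would keep growing the list it is iterating over, which is one plausible cause of the infinite loop.

```python
points = towrite = []          # one list, two names
towrite.append("Checking file demo")
print(points)                  # the message shows up here too
print(points is towrite)       # both names refer to the same object

points, towrite = [], []       # two separate lists
towrite.append("Checking file demo")
print(points)                  # stays empty
```

Writing the two assignments separately, as in the original script, gives each name its own list.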

It depends on the function; basically you should pass the needed variables as parameters. If you change a list or dictionary inside the function, the change is visible in the caller's value, so you do not need to pass in a list and return it at the end. Is the code of functions long? Isn't there one error message per error in towrite, which makes the error count unnecessary? I think calling just functions.analyse(points, towrite) would be enough, either without a return value or returning True when everything is OK and False when errors were found.

Thanks! Turns out I didn't need the import at all, works fine without it.
Cheers for all your help :)
