ihatehippies 11

Tony pointed me in the right direction. My old understanding of default parameters was that they were set each time the function was called. It appears, however, that a default value is created once, when the function is defined, and stored on the function object. If that default is a mutable object and the function changes it, the change persists across calls. Here are some helpful answers:
http://stackoverflow.com/questions/366422/what-is-the-pythonic-way-to-avoid-default-parameters-that-are-empty-lists
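
For reference, a minimal sketch of the behaviour and of the usual `None` sentinel fix. The names `append_shared` and `append_fresh` are made up for illustration, and the code is written for Python 3 (the default-argument rule is the same in 2.7).

```python
def append_shared(item, bucket=[]):
    # The [] above is created once, when the def statement runs,
    # and is stored on the function object (append_shared.__defaults__).
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # None is a sentinel: a new list is built on every call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second)             # both calls returned the same list object
print(append_shared.__defaults__)  # the stored default has grown

print(append_fresh(1), append_fresh(2))  # two independent lists
```

Passing an explicit list to either function bypasses the default entirely, so only calls that rely on the default are affected.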

Now I get to rewrite most of my python programs.

ihatehippies 11

K, those are good convention tips, but why is this happening?

ihatehippies 11 Junior Poster

I stumbled upon this anomaly in one of my programs and can't figure out why it is happening. I made a small test function that exhibits the same behavior. It's like the local variables of my function are being saved after the function exits: the exclude_ids variable keeps growing. The same behavior shows up in IDLE, from the command line, and when compiled. Tested on Windows XP and 8 with Python 2.7.4.

>>> def is_running(exe_name, by_path=False, exclude_ids=[], exclude_this_process=False):
    from os import getpid
    from os.path import basename, abspath
    if exclude_this_process:
        exclude_ids.append(getpid())
    print "exluded ids %s" % exclude_ids
    return "-----------------"

>>> for x in xrange(10):
    is_running("bob.exe", exclude_this_process=True)

exluded ids [5584]
'-----------------'
exluded ids [5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584, 5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584, 5584, 5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584, 5584, 5584, 5584, 5584, 5584]
'-----------------'
exluded ids [5584, 5584, 5584, 5584, 5584, 5584, 5584, 5584, 5584, 5584]
'-----------------'
>>> 

The list just keeps growing....

ihatehippies 11

Thanks for the reply/code. I am considering just making a .pyd file to handle the inner loop. Using C for this function tested about 16x faster than Python (probably even more with your contribution).

ihatehippies 11 Junior Poster

I'm soliciting advice on performance improvements for creating a weak checksum for file segments (used in the rsync algorithm).

Here's what I have so far:

def blockchecksums(instream, blocksize=4096):

    from hashlib import md5
    weakhashes = []
    stronghashes = []

    for chunk in iter(lambda: instream.read(blocksize),""):
        a = b = 0
        l = len(chunk)
        for n, i in enumerate(bytearray(chunk)):  # bytearray yields ints on Python 2 and 3
            a += i
            b += (l - n)*i    
        weakhashes.append((b << 16) | a)
        stronghashes.append(md5(chunk).hexdigest())

    return weakhashes, stronghashes

I haven't had any luck speeding things up using itertools or using c functions (like any() )
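
One possibility, as a sketch only: the `b` term is the sum of the running totals of the bytes, so both terms can be computed with built-ins instead of a Python-level loop. This assumes Python 3 and chunks read in binary mode (iterating `bytes` gives integers); the function names are made up for illustration. Whether it is actually faster would need measuring.

```python
from itertools import accumulate


def weak_checksum_loop(chunk):
    # Same arithmetic as the inner loop in blockchecksums above.
    a = b = 0
    length = len(chunk)
    for n, value in enumerate(chunk):
        a += value
        b += (length - n) * value
    return (b << 16) | a


def weak_checksum_builtin(chunk):
    # a is the plain byte sum; b is the sum of the running totals,
    # which equals sum((length - n) * value) term by term.
    a = sum(chunk)
    b = sum(accumulate(chunk))
    return (b << 16) | a


data = b"hello rsync"
print(weak_checksum_loop(data) == weak_checksum_builtin(data))
```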

ihatehippies 11

Do you have a file named DailyExpenses.py in the same folder? The name is case sensitive.

ihatehippies 11

Maybe I'm missing something, but how are you keeping time? You set clock_start, but I don't see the math to determine how much time has elapsed. I would expect something like

clock_start = time.clock()

time_elapsed = time.clock() - clock_start

if time_elapsed > 2:
    ....(and so on)
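
A runnable version of the same idea, as a sketch. On Python 3.8 and later `time.clock()` no longer exists, so this uses `time.perf_counter()` instead; the `time.sleep` call is only a stand-in for whatever work is being timed.

```python
import time

clock_start = time.perf_counter()
time.sleep(0.05)  # stand-in for the work being timed
time_elapsed = time.perf_counter() - clock_start

if time_elapsed > 2:
    print("more than 2 seconds have passed")
else:
    print("only %.3f seconds so far" % time_elapsed)
```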

ihatehippies 11 Junior Poster

I came across this project that could be really helpful... if I were C++ literate. I know enough C++ to be dangerous and that's about it. Thanks in advance to anyone who wants to take a crack at it.

ps credit to: http://www.codeproject.com/Articles/13839/How-to-Prepare-a-USB-Drive-for-Safe-Removal

ihatehippies 11

if you want to drop everything after the last "." you would add this after line 39

file_name = ".".join(file_name.split(".")[:-1])

Here is a breakdown of that code

file_name = "test.file.txt"

file_name.split(".")
<<< ["test", "file", "txt"]

take a slice of that minus the last item in the list

["test", "file", "txt"][:-1]
<<< ["test", "file"]

then put back the periods that were removed by 'split' in step 1, using 'join'

".".join(["test", "file"])
<<<"test.file"

if you want only the first part of file_name then the code would be

file_name = file_name.split(".")[0]
like you posted above
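
For comparison, a short sketch of two other ways to drop the last extension, `str.rsplit` and `os.path.splitext`, next to the split/join approach above. All three agree on "test.file.txt" but differ on a name with no period at all (the name "README" is just an example).

```python
import os.path

file_name = "test.file.txt"

# Approach from the post: split on every period, drop the last piece, rejoin.
joined = ".".join(file_name.split(".")[:-1])

# Split once, from the right.
rsplit_stem = file_name.rsplit(".", 1)[0]

# Let the standard library separate the extension.
splitext_stem = os.path.splitext(file_name)[0]

print(joined, rsplit_stem, splitext_stem)

# The variants differ when there is no period at all.
no_dot = "README"
no_dot_joined = ".".join(no_dot.split(".")[:-1])   # becomes an empty string
no_dot_rsplit = no_dot.rsplit(".", 1)[0]           # name is kept
no_dot_splitext = os.path.splitext(no_dot)[0]      # name is kept
print(repr(no_dot_joined), repr(no_dot_rsplit), repr(no_dot_splitext))
```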

ihatehippies 11

if you google "monitor changes to folder python" the first link that comes up is http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html
Pretty useful article

ihatehippies 11

fyi it fails when comparing an empty string

>>> lcs_tuple('', 'fail')

Traceback (most recent call last):
  File "<pyshell#41>", line 1, in <module>
    lcs_tuple('', 'fail')
  File "test.py", line 19, in lcs_tuple
    return round(200.0*this[0]/(n1+n2),2),this[1]
UnboundLocalError: local variable 'this' referenced before assignment

ihatehippies 11

..... little similarity? It's copied from your post.

ihatehippies 11

Here's something that I use for similar purposes. (Note: I cannot take credit for the longest_common_sequence function and am too lazy to look up who actually created it, sorry.)

Usage:

>>> find_likeness(what='ardvark', where=['aardvark', 'apple', 'acrobat'], threshold=80.0, case_sensitive=True, return_seq=True, bestonly=False)
[(93.33, 'aardvark', 0, 'ardvark')]

'ardvark' is a 93.33% match with 'aardvark', found at index 0. They have the letters 'ardvark' in common, in sequence.

def find_likeness(what, where, threshold=0, case_sensitive=True, return_seq=False, bestonly=False):
   """ generator object; searches thru list and yields closest
   matches first. Returns (match, otherWord, index, seq)"""
   if not case_sensitive:
      what = what.lower()
      where = [x.lower() for x in where]
   del case_sensitive

   word_len = len(what)

   minus1 = word_len - 1

   result = []

   for index, otherWord in enumerate(where):
     two, match = otherWord, 0
     total = len(otherWord)+word_len
     req = total*threshold*.005
     for n, x in enumerate(what):
         if x in two:
            two = two.replace(x, '', 1)
            match += 1
         elif match + (minus1-n) < req:
            break
     else:
         match, seq = longest_common_sequence(what, otherWord)
         if match >= threshold:
            result.append( (match, otherWord, index, seq) )
##            if match > best:
##                best = match

   result.sort(reverse=1)
   if not return_seq:
      result = [x[:-1] for x in result]

   if bestonly:
      if result:
         return result[0]
   else:
      return result

def longest_common_sequence(one, two):
   len_one, len_two = len(one), len(two)
   longestSequence = {}
   if not len_one+len_two: return 100.0, ''

   [longestSequence.__setitem__((two_index,0), [0,'']) for two_index in xrange(len_two+1)]
   [longestSequence.__setitem__((0,two_index), [0,'']) for two_index in xrange(len_one+1)]

   prev_two=0 ## j-1
   for two_index in xrange(1, len_two+1):
      prev_one = 0 ## i-1
      for one_index in xrange(1, len_one+1):
         if one[prev_one] == two[prev_two]:
            longestSequence[two_index, one_index] ...

ihatehippies 11

seems a bit drastic...

ihatehippies 11

The DatePickerCtrl works really well with what I'm developing. I guess I could recreate a DatePickerCtrl using a CalendarCtrl on a PopupWindow, it just seems unnecessary.

ihatehippies 11

No, that's similar to what I have. While you are choosing the month/year you want, before you click on a date, the event fires. If you move the month forward one, the event fires; if you move the year forward one, the event fires again. I'm trying to catch the moment when the user actually clicks on a day and the popup closes.

ihatehippies 11 Junior Poster

I have a wx.DatePickerCtrl with the dropdown popup window that allows the user to pick a date from the calendar. What I would like my program to do is process an event when the user has clicked on a day in the dropdown calendar. Unfortunately the only native event for this control is EVT_DATE_CHANGED, and that event gets fired every time the user scrolls the month/year while looking for the date of their choice (firing the event many more times than I would like). I can't seem to access the popup window that is created below the DatePickerCtrl to see if the window is shown or to bind events directly to it. It isn't created as a child of the DatePickerCtrl. Basically I'm trying to have the user 1. click the dropdown button, 2. navigate through the popup window calendar, 3. click on a date and have the popup window disappear, 4. process the event.

ihatehippies 11 Junior Poster

I'm trying to create a wx.ListCtrl with a searchable header. I've been looking through the listctrl mixins, but I really don't have the wx expertise needed. I'm thinking I need to paint a textctrl using a dc object, but other than that I'm lost. Any ideas?

ihatehippies 11

With a larger block size copy1 and copy3 are significantly faster than shutil in the large file category. Here's what I got

[CODE=python]
Input file size: 104 MiB; iteration count: 20.0
Function name: shutil_copy; total time: 19.107860699 sec; one iteration: 0.955393034948
Function name: mmap_copy; total time: 10.1346310338 sec; one iteration: 0.506731551691
Function name: copy1; total time: 6.04644839464 sec; one iteration: 0.302322419732
Function name: copy3; total time: 6.07191979681 sec; one iteration: 0.303595989841
Input file size: 1086 MiB; iteration count: 2.0
Function name: shutil_copy; total time: 156.421775246 sec; one iteration: 78.2108876228
[Error 8] Not enough storage is available to process this command
Function name: copy1; total time: 143.615497448 sec; one iteration: 71.8077487238
Function name: copy3; total time: 135.97764655 sec; one iteration: 67.9888232748
Input file size: 0 MiB; iteration count: 500
Function name: shutil_copy; total time: 0.815837531761 sec; one iteration: 0.00163167506352
Function name: mmap_copy; total time: 0.567309099254 sec; one iteration: 0.00113461819851
Function name: copy1; total time: 0.498243360426 sec; one iteration: 0.000996486720852
Function name: copy3; total time: 0.87309633203 sec; one iteration: 0.00174619266406[/CODE]

[CODE=python]
import mmap
import os
import shutil

# uses generator to load segments to memory

def copy1(src, dst):
    def _write(filesrc, filedst):
        filegen = iter(lambda: filesrc.read(100000), "")
        try:
            while True:
                filedst.write(filegen.next())
        except StopIteration:
            pass

    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
        _write(fsrc, fdst)

# loads entire file to memory

def copy2(src, dst):
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
        fdst.write(fsrc.read())

def copy3(src, dst):
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
        for x in iter(lambda: fsrc.read(100000), ""):
            fdst.write(x)

def mmap_copy(infname, ...

ihatehippies 11

I increased the blocksize from read(16384) to read(100000) and it seemed to reduce processing time by about 15-20%

I modified copy1 slightly

[CODE=python]
@bench
def copy1(src, dst, iteration=1000):

    def _write(filesrc, filedst):
        filegen = iter(lambda: filesrc.read(100000), "")
        try:
            while True:
                filedst.write(filegen.next())
        except StopIteration:
            pass

    for x in xrange(iteration):
        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                _write(fsrc, fdst)[/CODE]

Here are the results including the 3 shutil functions

Filesize: 107mb
Iterations: 7 each
System Environment:
win 7, 3gb ram, core 2 duo @ 2.0ghz

benchmark(7)

copy1 uses a generator

copy1 took 2.72599983215 seconds <-------------- SECOND

copy2 loads entire file to memory

copy2 took 21.0590000153 seconds <------------- SLOWEST

copy3 uses an iterator

copy3 took 2.67499995232 seconds <-------------- FASTEST

shutil functions

shutil_copy took 5.40400004387 seconds
shutil_copy2 took 7.40000009537 seconds
shutil_copyfile took 4.68000006676 seconds

copy3 is still the fastest. Shutil is slow for a few reasons. It dynamically recalculates the block size (not a factor with only 7 iterations) and it passes the objects to other functions which consumes more memory. Plus the blocksize is too small for today's computers. The iterator was consistently slightly faster than the generator, but there is probably a more efficient way to write both
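
For anyone trying this on Python 3, a sketch of the copy3 idea with the block size as a parameter, next to `shutil.copyfileobj` called with the same buffer size as its third argument. The function names and the throwaway test file are made up for illustration, and no timings are claimed here.

```python
import os
import shutil
import tempfile


def copy_chunked(src, dst, blocksize=100000):
    # Binary reads return bytes on Python 3, so the sentinel must be b"".
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        for chunk in iter(lambda: fsrc.read(blocksize), b""):
            fdst.write(chunk)


def copy_with_shutil(src, dst, blocksize=100000):
    # shutil.copyfileobj takes the buffer size as its third argument.
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst, blocksize)


# Small self-check on a throwaway file.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "src.bin")
dst_a = os.path.join(workdir, "a.bin")
dst_b = os.path.join(workdir, "b.bin")
with open(src_path, "wb") as f:
    f.write(os.urandom(250000))

copy_chunked(src_path, dst_a)
copy_with_shutil(src_path, dst_b)
print(os.path.getsize(dst_a), os.path.getsize(dst_b))
```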

ihatehippies 11

try raw_input instead of input

ihatehippies 11 Junior Poster

I'm doing some research to determine the most efficient way to copy files. I've got 3 candidate functions:

1

[CODE=python]

# uses generator to load segments to memory

def copy(src, dst, iteration=1000):
    for x in xrange(iteration):
        def _write(filesrc, filedst):
            filegen = iter(lambda: filesrc.read(16384), "")
            try:
                while True:
                    filedst.write(filegen.next())
            except StopIteration:
                pass

        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                _write(fsrc, fdst)[/CODE]
2

[CODE=python]

# loads entire file to memory

def copy2(src, dst, iteration=1000):
    for x in xrange(iteration):
        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                fdst.write(fsrc.read())[/CODE]

3

[CODE=python]
def copy3(src, dst, iteration=1000):
    for x in xrange(iteration):
        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                for x in iter(lambda: fsrc.read(16384), ""):
                    fdst.write(x)
[/CODE]

System Environment:
Win 7 64 bit
3gb ram
Intel Core 2 Duo @ 2.0 GHz

The results

when the file size is 1mb, 1000 iterations each:

copy(SRC, DST)
copy took 5.96600008011 seconds
copy2(SRC,DST)
copy2 took 3.85299992561 seconds
copy3(SRC, DST)
copy3 took 5.35699987411 seconds

The most efficient function is the one that loads the file entirely to memory

when the file size is 107mb 5 iterations each:

copy(SRC, DST, 5)
copy took 3.04099988937 seconds
copy2(SRC,DST, 5)
copy2 took 17.0360000134 seconds
copy3(SRC, DST, 5)
copy3 took 2.2429997921 seconds

Loading the file entirely to memory is now the slowest by far

I thought the results were interesting, if anyone has a more efficient function feel free to contribute
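
The timing lines above have the form "name took N seconds". The harness that produced them is not shown; one hypothetical way to get that kind of output is a small decorator. `bench` and `busy` here are a guess at the shape, not the original code, and the sketch is written for Python 3.

```python
import functools
import time


def bench(func):
    # Hypothetical timing decorator; prints "<name> took <seconds> seconds".
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print("%s took %s seconds" % (func.__name__, elapsed))
        return result
    return wrapper


@bench
def busy(iteration=1000):
    # Stand-in workload; a copy function would go here instead.
    total = 0
    for x in range(iteration):
        total += x
    return total


busy(100000)
```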

ihatehippies 11

What's the question..?

ihatehippies 11

no problem

ihatehippies 11

The code works perfectly for me on Windows 7, Python 2.7.2, wxPython 2.9.3.1.
You can try updating wx or use the wx.CallAfter method which calls whatever you pass to it after the current event has finished processing. Maybe something like

[CODE=python]
def HideFrame(self, evt):
    self.Hide()
    wx.CallAfter(self.Wait)

def Wait(self):
    time.sleep(1)
    self.Show() [/CODE]

ihatehippies 11

Your question isn't very clear. Are you wondering about the print function? That is [URL="http://docs.python.org/release/2.5.2/lib/typesseq-strings.html"]string formatting[/URL].

ihatehippies 11

I read it a bit. It takes 2 arguments besides 'self'
[CODE=python]
def __init__(self, aParent, aBgRsrc):[/CODE]
It's hard to diagnose your issue without actually seeing the code but from my limited vantage I wouldn't rework the pythoncard code, if you have the gui working leave it that way. I'm fairly certain the actual processing methods/functions don't have to be included in the gui class but rather as stand alone functions or in a class created before the pythoncard class is created. You can reference other classes or modules from inside the pythoncard class.

[CODE]
class processing(object):
    def do_work(self):
        print 'working'

process_instance = processing()

class YourPythonCardClass(model.whatever):
    def a_method(self):
        # this is the operative line
        process_instance.do_work()[/CODE]

ihatehippies 11

model.Background is the class that it inherits from. I've never worked with PythonCard so I can't tell you how you are supposed to instantiate the classes, but it takes an additional 2 arguments that you need to pass when you create it. If you look at the source code for the "Background" class you can see what arguments it requires. Otherwise, look online for other PythonCard examples or use staticmethods.

ihatehippies 11

That code you posted won't produce that. The __init__ method you made for Mymain takes 3 arguments:
[CODE=python] def __init__(self, something, somethingelse):[/CODE]

ihatehippies 11

On second thought, is there any reason your methods have to be methods and can't be functions? Why do they need to be inside a class? Are you trying not to pollute the namespace?