Comparing lists and likeness

Question

tillaart36 2 Light Poster

16 Years Ago

Hello all,

I have some difficulty with something I want to do:

I want to 'measure' the likeness of two lists of integers. What I mean is I want for example to compare:

list1 = [0, 0, 1, 1, 0, 1, 0]
list2 = [0, 0, 0, 1, 1, 1, 0]

these two lists. What I need is to measure these lists where if 0 == 0 it gets a score of say +1, where 1 == 1 it gets a score of +3 and if 0 != 1 or 1 != 0 it gets a score of -1.

In this little example I need to compare list1[0] and list2[0]; since they're both 0, it adds +1 to the score. list1[1] and list2[1] are both 0 so again +1, list1[2] and list2[2] gets -1 and list1[3] and list2[3] get +3 and so on.

In my example the score would have to be: +1 +1 -1 + 3 - 1 +3 + 1 = 7. I'm trying to find a loop that works this way but since I'm not that far with python I could use some pointers or starters.

python


sneekula 969 Nearly a Posting Maven

16 Years Ago

One of the ways to do this is to flatten the nested list. Here is an example:

# flatten a nested list

def flatten(q):
    """
    a recursive function that flattens a nested list q
    """
    flat_q = []
    for x in q:
        # may have nested tuples or lists
        if type(x) in (list, tuple):
            flat_q.extend(flatten(x))
        else:
            flat_q.append(x)
    return flat_q

template = [[0, 0, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 0]]

target = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 1, 0]]

flat_template = flatten(template)
flat_target = flatten(target)

# test it
print(flat_template)
print(flat_target)

"""
my test result -->
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
"""

The code you are using to calculate the score is straightforward; simply go ahead and use it on the flattened lists.

Another way to add up the score would be to zip the two lists:

template =  [0, 0, 1, 1, 0, 1, 0]
target =    [0, 0, 0, 1, 1, 1, 0]

print(template)
print(target)

print(list(zip(template, target)))

score = 0
for x in zip(template, target):
    a = x[0] + x[1] + 1
    if a == 2:
        score -= 1
    else:
        score += a

print(score)

"""
my result -->
[0, 0, 1, 1, 0, 1, 0]
[0, 0, 0, 1, 1, 1, 0]
[(0, 0), (0, 0), (1, 0), (1, 1), (0, 1), (1, 1), (0, 0)]
7
"""

Notice that I am using 'print(template)' rather than 'print template'. This way the code works with both Python 2.5 and the new Python 3.0.
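The scoring rule from the question can also be written as an explicit lookup table, which keeps the weights visible instead of hiding them in arithmetic. This is only a minimal sketch for Python 3; the names `SCORES` and `likeness` are made up for illustration and do not appear in the thread.

```python
# Hypothetical names: SCORES and likeness are illustrative, not from the thread.
# Weights taken from the question: 0/0 -> +1, 1/1 -> +3, any mismatch -> -1.
SCORES = {(0, 0): 1, (1, 1): 3, (0, 1): -1, (1, 0): -1}

def likeness(a, b):
    """Sum the pairwise scores of two equally long lists of 0s and 1s."""
    if len(a) != len(b):
        raise ValueError("lists must have the same length")
    return sum(SCORES[pair] for pair in zip(a, b))

template = [0, 0, 1, 1, 0, 1, 0]
target = [0, 0, 0, 1, 1, 1, 0]

print(likeness(template, target))  # 7, the score worked out in the question
```

Changing a weight then only means editing one entry of the table.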

sneekula 969 Nearly a Posting Maven

16 Years Ago

To transpose a 2d array (list of lists) see:
http://www.daniweb.com/forums/showpost.php?p=778261&postcount=161

Stefano Mtangoo 455 Senior Poster

16 Years Ago

I played around a little and came up with this. Note the assumption here is that the two lists are of equal length :)

list1 = [0, 0, 1, 1, 0, 1, 0]
list2 = [0, 0, 0, 1, 1, 1, 0]
score = 0

i = 0
while True:
    # assuming the length of both lists is equal
    if i == len(list1):
        break
    elif list1[i] == list2[i]:
        print "first number is %s and second %s" % (str(list1[i]), str(list2[i]))
        score = score + 1
        print "Score is %d: " % (score,)
        i = i + 1
    else:
        # do anything else like -3 scores
        print "Not Equal!"
        i = i + 1

print "Total Score is %d" % (score,)

jlm699 320 Veteran Poster

16 Years Ago

It should iterate as (x, y)

You just have your x and y mixed up:

>>> def printGrid(w,h):
...     for y in xrange(h):
...         for x in xrange(w):
...             print '%d,%d' % (x,y),
...         print
...     
>>> printGrid(w=7,h=6)
0,0 1,0 2,0 3,0 4,0 5,0 6,0
0,1 1,1 2,1 3,1 4,1 5,1 6,1
0,2 1,2 2,2 3,2 4,2 5,2 6,2
0,3 1,3 2,3 3,3 4,3 5,3 6,3
0,4 1,4 2,4 3,4 4,4 5,4 6,4
0,5 1,5 2,5 3,5 4,5 5,5 6,5
>>>

jlm699 320 Veteran Poster

16 Years Ago

class Grid(object):
    ...
    @property
    def height(self):
        return len(self.grid)
    @property
    def width(self):
        return len(self.grid[0])

Can you explain the @property portion a little bit? What does that do for us?


tillaart36 2 Light Poster · Answer 1 · 2009-01-14T19:36:53+00:00

Hmm stupid me,

I must have made some small mistake in my loops, which is why it went wrong.

I now have this code:

template =  [0, 0, 1, 1, 0, 1, 0]
target =    [0, 0, 0, 1, 1, 1, 0]

print template
print target

score = 0 
for x in range(len(template)):
    if template[x] == target[x]:
        if template[x] == 0:
            score += 1
        if template[x] == 1:
            score += 3
    if template[x] != target[x]:
        score -= 1
print score

and it gives indeed a score of 7. My next step is to compare lists of lists like:

template = [[0, 0, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 0]]

target = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 1, 0]]

I think it's just adding some tweaks to my loop but as I said I'm not good with loops so I guess it will take some good thinking again :-/

tillaart36 2 Light Poster · Answer 2 · 2009-01-14T20:33:39+00:00

Hey!

Thanks for your help :), I will look at it further.

What I do now is something like this:

template1 = [[0, 0, 0, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 0, 0, 1, 1, 0],
             [0, 0, 0, 1, 1, 0],
             [0, 0, 0, 0, 0, 0]]

template2 = [[0, 0, 0, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 0, 1, 1, 0, 0],
             [0, 0, 1, 1, 0, 0],
             [0, 0, 0, 0, 0, 0]]

target =    [[0, 0, 0, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 1, 1, 0, 0, 0],
             [0, 1, 1, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 0, 0, 0, 0, 0]]

score1 = 0 
for x in range(len(template1)):
    for y in range(len(template1[0])):
        if template1[x][y] == target[x][y]:
            if template1[x][y] == 0:
                score1 += 1
            if template1[x][y] == 1:
                score1 += 3
        if template1[x][y] != target[x][y]:
            score1 -= 1
print score1

score2 = 0
for x in range(len(template2)):
    for y in range(len(template2[0])):
        if template2[x][y] == target[x][y]:
            if template2[x][y] == 0:
                score2 += 1
            if template2[x][y] == 1:
                score2 += 3
        if template2[x][y] != target[x][y]:
            score2 -= 1
print score2

This seems to work ok, but maybe it's not as clean as it should be. But I'm doing this as an exercise for my bigger project where I have a grid of width and height filled with cells with values. Therefore I think this works better...Thanks for the help so far!

BTW is there a neat way to 'rotate' these lists? These cells stand for the land use of an area. For now cells with 0 are unassigned cells while cells with 1 have a building. I need to be able to rotate this building and again compare it with the target.

Before rotating 90 degrees clockwise:
0 0 0 0 0
0 1 1 1 0
0 0 1 1 0
0 0 0 0 0

After rotating 90 degrees clockwise:
0 0 0 0
0 0 1 0
0 1 1 0
0 1 1 0
0 0 0 0

I guess in order to do this I have to convert rows into columns and columns into rows? I'll look further to see if I can find some more information on that.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 3 · 2009-01-14T22:50:24+00:00

Transposition plus reversing each line should give you a rotation of the matrix. This question of the 'likeness' of 2 lists is very interesting. I think you should check the Levenshtein distance between 2 strings, which measures the 'likeness' of 2 character strings. You will easily find a python implementation on the web.
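As a small illustration of that remark, here is a sketch for Python 3 of a clockwise rotation built from a transposition followed by reversing each row. The helper names `transpose` and `rotate_clockwise` are only illustrative; the grid is the 4 x 5 example from the question.

```python
def transpose(grid):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*grid)]

def rotate_clockwise(grid):
    """Rotate 90 degrees clockwise: transpose, then reverse each row."""
    return [row[::-1] for row in transpose(grid)]

before = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 1, 1, 0],
          [0, 0, 0, 0, 0]]

after = rotate_clockwise(before)
for row in after:
    print(" ".join(str(v) for v in row))
```

Reversing the order of the rows after transposing, instead of reversing each row, gives the counter-clockwise rotation.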

tillaart36 2 Light Poster · Answer 4 · 2009-01-15T01:32:12+00:00

Thanks evstevemd, I will try it out a bit.

To transpose a 2d array (list of lists) see:
http://www.daniweb.com/forums/showpost.php?p=778261&postcount=161

sneekula, what you do here:

"""
my transpose result -->
[0, 0, 0, 0, 0, 0]
[0, 1, 1, 1, 1, 0]
[0, 0, 0, 1, 1, 0]
[0, 1, 0, 1, 1, 0]
--------------------
[0, 0, 0, 0]
[0, 1, 0, 1]
[0, 1, 0, 0]
[0, 1, 1, 1]
[0, 1, 1, 1]
[0, 0, 0, 0]
"""

is not exactly what I need to do. My template cells shouldn't change place. In your example the array is rotated 90 degrees and also mirrored (or am I wrong?) I need only the rotation part so that:

0 0 0 0
0 0 1 0
0 1 1 0
0 0 0 0

would become this (rotated 90 degrees clockwise):

0 0 0 0
0 1 0 0
0 1 1 0
0 0 0 0

Don't know if this is possible with your hints, but I'll try some fiddling. Thanks Gribouillis also, I will try and get some more info about that!

tillaart36 2 Light Poster · Answer 5 · 2009-01-15T18:28:25+00:00

Hello,

I'm working further on this concept where I compare cells of a template with cells of a target. The part where I need to rotate my template to see if this gives a better match with the target grid I now do with the following function:

def rotate_grid(g):
	rotated = [[g[y][len(g[0])-1-x] for y in range(len(g))] for x in range(len(g[0]))]
	return rotated

Which I saw in this thread:
http://www.programmingforums.org/thread10842.html

I made these kinds of comparisons earlier in an Excel sheet and got the same results, so I assume that so far it does what I want. The way I compare a template and a target needs more refining now. It must for example be possible that the cell size of template and target differ from each other. And the fact that I'm now just comparing cells pairwise should be refined more, I think. Now I will look more into this Levenshtein theory and other concepts that might be available...

But maybe you guys can look over this code and see if it does what I want and to maybe point out some poor coding from my side. And ideas about a better comparing technique are welcome also.

template1 = [[0, 0, 0, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 0, 0, 1, 1, 0],
             [0, 0, 0, 1, 1, 0],
             [0, 0, 0, 0, 0, 0]]

target =    [[0, 0, 0, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 1, 1, 0, 0, 0],
             [0, 1, 1, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 0, 0, 0, 0, 0]]

def rotate_grid(g):
    rotated = [[g[y][len(g[0])-1-x] for y in range(len(g))] for x in range(len(g[0]))]
    return rotated

def compare(temp, tar):
    score = 0
    for x in range(len(temp)):
        for y in range(len(temp[0])):
            if temp[x][y] == tar[x][y]:
                if temp[x][y] == 0:
                    score += 1
                if temp[x][y] == 1:
                    score += 3
            if temp[x][y] != tar[x][y]:
                score -= 1
    print "Comparison score:", score
    # return the score so that the assignments below hold a value
    return score

print ('-'*20)
print "Template 1 rotated by 0 degrees: "
for row in template1:
    print row

compare1 = compare(template1, target)

print ('-'*20)

template1_90 = rotate_grid(template1)
print "Template 1 rotated by 90 degrees: "
for row in template1_90:
    print row

compare2 = compare(template1_90, target)

print ('-'*20)

template1_180 = rotate_grid(template1_90)
print "Template 1 rotated by 180 degrees: "
for row in template1_180:
    print row

compare3 = compare(template1_180, target)

print ('-'*20)

template1_270 = rotate_grid(template1_180)
print "Template 1 rotated by 270 degrees: "
for row in template1_270:
    print row

compare4 = compare(template1_270, target)

And my output:
******************************************************************************
--------------------
Template 1 rotated by 0 degrees:
[0, 0, 0, 0, 0, 0]
[0, 1, 1, 1, 1, 0]
[0, 1, 1, 1, 1, 0]
[0, 0, 0, 1, 1, 0]
[0, 0, 0, 1, 1, 0]
[0, 0, 0, 0, 0, 0]
Comparison score: 36
--------------------
Template 1 rotated by 90 degrees:
[0, 0, 0, 0, 0, 0]
[0, 1, 1, 1, 1, 0]
[0, 1, 1, 1, 1, 0]
[0, 1, 1, 0, 0, 0]
[0, 1, 1, 0, 0, 0]
[0, 0, 0, 0, 0, 0]
Comparison score: 48
--------------------
Template 1 rotated by 180 degrees:
[0, 0, 0, 0, 0, 0]
[0, 1, 1, 0, 0, 0]
[0, 1, 1, 0, 0, 0]
[0, 1, 1, 1, 1, 0]
[0, 1, 1, 1, 1, 0]
[0, 0, 0, 0, 0, 0]
Comparison score: 48
--------------------
Template 1 rotated by 270 degrees:
[0, 0, 0, 0, 0, 0]
[0, 0, 0, 1, 1, 0]
[0, 0, 0, 1, 1, 0]
[0, 1, 1, 1, 1, 0]
[0, 1, 1, 1, 1, 0]
[0, 0, 0, 0, 0, 0]
Comparison score: 36
** Load Time: 0.01 seconds

Thanks for the help so far!

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 6 · 2009-01-15T19:52:14+00:00

Here is a function to print your lists vertically, and a more general rotation function:

def vprint(L):
  "prints a list vertically"
  print("[%s]" % ",\n ".join(repr(x) for x in L))

def rotated(grid, clock=1):
    # rotated(grid, n) --> a new grid rotated n times 90 degrees clockwise
    clock %= 4
    if clock == 0:
        # copy the rows too, so the result is really a new grid
        return [list(x) for x in grid]
    elif clock == 1:
        return [list(x) for x in zip(*reversed(grid))]
    elif clock == 2:
        return [list(reversed(x)) for x in reversed(grid)]
    else:
        # list() is needed on Python 3, where zip returns an iterator
        return [list(x) for x in reversed(list(zip(*grid)))]

for i in range(4):
    vprint(rotated(template1, i))
    print("")

I'll try to understand your comparison algorithm.

tillaart36 2 Light Poster · Answer 7 · 2009-01-15T20:27:14+00:00

I tried to explain my 'problem' or task in an earlier thread:

http://www.daniweb.com/forums/thread166744.html

but I didn't get a lot of response there :) , maybe my explanation there wasn't quite clear enough.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 8 · 2009-01-15T20:27:34+00:00

Here are a few more utility functions. One question is how do you want to use the score? What conclusions do you draw from the comparison?

def entries(grid):
    "iterator over a grid's entries"
    return (x for line in grid for x in line)

def pair_score(x, y):
    "returns the score of a pair of numbers in {0, 1}"
    return x + y + 1 if (x == y) else -1

from itertools import izip
def grid_score(g1, g2):
    "returns the score of 2 grids with the same shape"
    return sum(pair_score(*t) for t in izip(entries(g1), entries(g2)))

print grid_score(target, rotated(template1, 2))

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 9 · 2009-01-16T13:16:22+00:00

I tried another algorithm which is used in information theory to measure similarities between files or other data (DNA sequences, etc). The algorithm is heuristic and works like this. Suppose that you have a string A and a compression algorithm which produces a (shorter) string of length C(A). Then you define a "distance" between 2 strings A and B by d(A,B) = 1 - (C(A) + C(B) - C(A+B)) / max(C(A), C(B)). The idea is that if there are similarities between the 2 strings, the compression algorithm will detect them and produce a shorter compression for the concatenation A+B.
For concrete compression algorithms, this distance is not a true distance in the mathematical sense, but it can still be used to detect similarities. It should be somewhere between 0 and 1, and small if A and B are close.
OK. For your problem, I use the function zlib.compress to compress the data. For your target grid for example, I compress the string

"000000201111020110002011000201111020000002"

I represented the end of a line by a '2' digit. Here is the code (I defined my own array class)

from zlib import compress

class Array(list):
    def __init__(self, *args):
        list.__init__(self, *args)
        for x in (min(self.entries), max(self.entries)):
            assert x in (0, 1)

    @staticmethod
    def fromString(theString):
        lines = (list(x) for x in theString.strip().split())
        return Array(list(int(x) for x in line) for line in lines)

    def rotated(self, n):
        # an array rotated clockwise n times
        n %= 4
        if n == 0:
            L = self
        elif n == 1:
            L = [list(x) for x in zip(*reversed(self))]
        elif n == 2:
            L = [list(reversed(x)) for x in reversed(self)]
        else:
            L = [list(x) for x in reversed(zip(*self))]
        return Array(L)

    @property
    def entries(self):
        return (x for line in self for x in line)

    @property
    def bits(self):
        return "".join("".join(str(x) for x in line) + "2" for line in self)

    @property
    def compressed(self):
        return compress(self.bits)

    def distance(self, other):
        ab = len(compress(self.bits + other.bits))
        a, b = len(self.compressed), len(other.compressed)
        return 1.0 - (0.0 + a + b - ab) / max(a, b)

temp1 = Array.fromString("000000 011110 011110 000110 000110 000000")
target = Array.fromString("000000 011110 011000 011000 011110 000000")

for n in range(4):
    print "%d rotations, distance = %.4f" % (n, target.distance(temp1.rotated(n)))
print target.distance(target)

"""
my output -->
0 rotations, distance = 0.3478
1 rotations, distance = 0.3913
2 rotations, distance = 0.3043
3 rotations, distance = 0.2609
0.0869565217391
"""

We see that the best match is obtained for 3 rotations clockwise.
I don't know if this algorithm will be very efficient, you should check it with a large set of patterns.
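For readers on Python 3, the distance above can be sketched with plain functions. The helper names `bits`, `c`, `distance` and `ncd` are hypothetical. Since C(A) + C(B) - max(C(A), C(B)) equals min(C(A), C(B)), the formula from the post is algebraically the same as (C(A+B) - min(C(A), C(B))) / max(C(A), C(B)), and the sketch computes both forms to show that. No distance values are quoted here because the compressed lengths may vary with the zlib version.

```python
import zlib

def bits(grid):
    """Serialize a 0/1 grid to bytes, closing every row with a '2'."""
    text = "".join("".join(str(v) for v in row) + "2" for row in grid)
    return text.encode("ascii")

def c(data):
    """Length in bytes of the zlib-compressed data."""
    return len(zlib.compress(data))

def distance(a, b):
    """The form used in the post: 1 - (C(A) + C(B) - C(A+B)) / max(C(A), C(B))."""
    ca, cb, cab = c(a), c(b), c(a + b)
    return 1.0 - (ca + cb - cab) / max(ca, cb)

def ncd(a, b):
    """The rearranged form: (C(A+B) - min(C(A), C(B))) / max(C(A), C(B))."""
    ca, cb, cab = c(a), c(b), c(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

target = [[0, 0, 0, 0, 0, 0],
          [0, 1, 1, 1, 1, 0],
          [0, 1, 1, 0, 0, 0],
          [0, 1, 1, 0, 0, 0],
          [0, 1, 1, 1, 1, 0],
          [0, 0, 0, 0, 0, 0]]

temp1 = [[0, 0, 0, 0, 0, 0],
         [0, 1, 1, 1, 1, 0],
         [0, 1, 1, 1, 1, 0],
         [0, 0, 0, 1, 1, 0],
         [0, 0, 0, 1, 1, 0],
         [0, 0, 0, 0, 0, 0]]

a, b = bits(target), bits(temp1)
print(distance(a, b), ncd(a, b))  # both forms agree up to rounding
```

On inputs this short the fixed overhead of the compressor is a large part of every compressed length, so the values are probably only a rough indication.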

tillaart36 2 Light Poster · Answer 10 · 2009-01-22T16:35:28+00:00

I've 'overwritten' your code to get some feeling of what you plan to do.

I understand how you can input strings, and with the fromString() method make an array of these. The rotated() and entries() methods I also understand but I'm having some trouble with reading and understanding the bits(), compressed() and distance() method so I'm planning to go and read more into that right now.

I also have a bit of trouble understanding what exactly you are doing. I have an idea, but I want to understand it more precisely so that I can follow the algorithm and judge its usefulness when I test more patterns.

Maybe when I understand more about the compress and bits algorithm I can understand better what exactly is done here.

However you asked me for the conclusions I would like to draw from comparing different templates with one target. The goal is to implement this code in the program I referred to earlier. The user of the program should be able to import (or design himself) a future plan where buildings are represented as a cluster of cells with the same value. These clusters are the targets here. Besides this program I develop a series of 3d models from which I generate or define templates (by the use of cells etc). What we want is that, from all templates available for that building type, the algorithm works out which template's shape 'fits' best on the shape of the target/cluster. So I need to somehow compare the shape or layout of 1 cluster with n templates and figure out, based on form, what kind of template fits best on this cluster. It doesn't have to be an exact fit but we want a somehow desirable result.

You can think of it as the games where children have all kinds of objects and a box with shapes cut out and where they try to put the objects in the box ;) , or the way objects can be placed in games like command and conquer, they have to fit somewhere to being able to build. Hope this explanation is a bit more clear to see what I'm trying to make.
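The goal described here (one target, n templates, pick the best fit) could be sketched as follows, assuming Python 3, the pairwise score from earlier in the thread and templates that have the same shape as the target after rotation. The names `rotations`, `best_fit` and the dictionary of templates are made up for illustration; the grids are the ones posted above.

```python
def rotations(grid):
    """Yield (quarter_turns, grid) for 0 to 3 clockwise quarter turns."""
    current = [list(row) for row in grid]
    for n in range(4):
        yield n, current
        current = [list(col) for col in zip(*current[::-1])]

def pair_score(x, y):
    """+1 for 0/0, +3 for 1/1, -1 for a mismatch (weights from the question)."""
    if x != y:
        return -1
    return 3 if x == 1 else 1

def grid_score(g1, g2):
    """Sum of the pairwise scores of two grids with the same shape."""
    return sum(pair_score(a, b)
               for r1, r2 in zip(g1, g2) for a, b in zip(r1, r2))

def best_fit(target, templates):
    """Return (score, name, quarter_turns) of the best scoring rotation."""
    best = None
    for name, template in templates.items():
        for n, rotated in rotations(template):
            # skip rotations whose shape differs from the target
            if (len(rotated), len(rotated[0])) != (len(target), len(target[0])):
                continue
            candidate = (grid_score(rotated, target), name, n)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best

template1 = [[0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 0],
             [0, 0, 0, 1, 1, 0], [0, 0, 0, 1, 1, 0], [0, 0, 0, 0, 0, 0]]
template2 = [[0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 0],
             [0, 0, 1, 1, 0, 0], [0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0]]
target = [[0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 0], [0, 1, 1, 0, 0, 0],
          [0, 1, 1, 0, 0, 0], [0, 1, 1, 1, 1, 0], [0, 0, 0, 0, 0, 0]]

result = best_fit(target, {"template1": template1, "template2": template2})
print(result)  # (48, 'template1', 2)
```

Ties are resolved in favour of the first candidate found, which is why the 180 degree rotation is reported rather than the equally scoring 270 degree one.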

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 11 · 2009-01-22T16:50:21+00:00

I found a reference where a compression distance similar to the one I used is described. See http://paginas.fe.up.pt/~%20ssn/2008/Cilibrasi05.pdf. For the formula in the above code, I downloaded a pdf, but it's in French :)

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 12 · 2009-01-22T17:09:38+00:00

I just realized that at the end of the above pdf, there is a link to an existing library for these algorithms, here http://www.complearn.org/. Nothing for python it seems, but if there is a C library it should be wrappable with swig for example.

tillaart36 2 Light Poster · Answer 13 · 2009-01-28T17:01:12+00:00

Ok I went back to my old code and classes and I wanna implement some comparing methods in this code. The goal is to test these different methods so I can see if all do a reasonable job of what I want them to do and to see if I maybe can see what works best for me.

I was impressed with the rotated function you proposed earlier, but I'm having some trouble implementing it in my old code because there I use a different class than your Array class.

def rotated(self, n):
    # rotate an array n times (clockwise)
    n %= 4
    if n == 0:
        L = self
    elif n == 1:
        L = [list(x) for x in zip(*reversed(self))]
    elif n == 2:
        L = [list(reversed(x)) for x in reversed(self)]
    else:
        L = [list(x) for x in reversed(zip(*self))]
    return Array(L)

How should I rewrite this so it works for my Grid class? I specify a grid with width and height and all cell values are 0, then I change some values of them and I want to be able to use the rotated method so I can rotate this grid.

import csv

class Grid(object):
    def __init__(self, width, height):
        self.grid = []
        self.width = width
        self.height = height
        self.length = width * height
        for x in range(self.height):
            col = []
            for y in range(self.width):
                col.append(Cell(x, y, self.grid))
            self.grid.append(col)
    
    def __getitem__(self, (x, y)):
        return self.grid[x][y]
    
    def fillBlock(self, left, right, top, bottom, value):
        for x in range(left, right):
            for y in range(top, bottom):
                self[x, y].value = value
    
    def firstFreeCell(self):
        for y in range(self.height):
            for x in range(self.width):
                if self.grid[x][y].clusterId == -1:
                    return self.grid[x][y]
        return None
    
    def floodFill(self, x, y, clusterId, landUse):
        if (x < 0 or y < 0 or x >= self.width or y >= self.height):
            return
        
        cell = self.grid[x][y]
        cluster = Cluster()
        if (cell.clusterId != -1 or cell.value != landUse):
            return
        
        cell.setClusterId(clusterId)
        cluster.add(cell)
        
        self.floodFill(x-1, y, clusterId, landUse)
        self.floodFill(x+1, y, clusterId, landUse)
        self.floodFill(x, y-1, clusterId, landUse)
        self.floodFill(x, y+1, clusterId, landUse)
    
    def analyze(self):
        freeCell = self.firstFreeCell()
        clusterId = 0
        
        while freeCell != None:
            self.floodFill(freeCell.x, freeCell.y, clusterId, freeCell.value)
            freeCell = self.firstFreeCell()
            clusterId += 1
    
    def printClusterId(self, clusterId):
        for y in range(self.height):
            for x in range(self.width):
                if self.grid[x][y].clusterId == clusterId:
                    print 'ClusterId:', clusterId, '=>', 'Cell-coordinates:', \
                        '(', self.grid[x][y].x, ',', self.grid[x][y].y, ')', \
                        'with a landUse value of:', self.grid[x][y].value
        print "No cells with such clusterId left or the clusterId is not defined yet..."
    
    def load(cls, filename):
        print "Loaded csv file"
        loadGrid = []
        reader = csv.reader(open(filename), delimiter=';')
        for line in reader:
            loadGrid.append(line)
        width = len(loadGrid[0])
        height = len(loadGrid)
        grid = Grid(width, height)
        for x in range(width):
            for y in range(height):
                grid[x, y].value = loadGrid[y][x]
        return grid
    load = classmethod(load)

    def printGrid(self):
        for y in range(self.height):
            for x in range(self.width):
                print self[x, y].value,
            print

    def rotated(self, n):
        n %= 4
        if n == 0:
            L = self
        if n == 1:
            L = [list(x) for x in zip(*reversed(self))]
        return Grid(self.width, self.height)

class Cell(object):
    def __init__(self, x, y, grid):
        self.x = x
        self.y = y
        self.grid = grid
        self.value = 0
        self.clusterId = -1
    
    def setClusterId(self, clusterId):
        self.clusterId = clusterId
    
    def getClusterId(self):
        return self.clusterId

class Cluster(object):
    def __init__(self):
        self.cells = []
    
    def __getitem__(self):
        return self.cells[x][y]

    def add(self, i):
        self.cells.append(i)

my_grid = Grid(6, 6)
my_grid.printGrid()
print ('-'*12)
my_grid.fillBlock(2, 4, 2, 4, 2)
my_grid.printGrid()
print ('-'*12)
my_grid.rotated(0)
my_grid.printGrid()

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 14 · 2009-01-28T21:24:54+00:00

I tried to implement this by defining a method toList which converts your grid into a list of tuples, and a static method fromList which creates a new Grid from a list of tuples. I also exchanged x and y in your method printGrid. I think there is a little ambiguity between rows and columns in your code. Try to work with rectangular grids instead of square grids. However, here is a possible version

import csv

class Grid(object):
    def __init__(self, width, height):
        self.grid = []
        self.width = width
        self.height = height
        self.length = width * height
        for x in range(self.height):
            col = []
            for y in range(self.width):
                col.append(Cell(x, y, self.grid))
            self.grid.append(col)
    
    def __getitem__(self, (x, y)):
        return self.grid[x][y]
    
    def fillBlock(self, left, right, top, bottom, value):
        for x in range(left, right):
            for y in range(top, bottom):
                self[x, y].value = value
    
    def firstFreeCell(self):
        for y in range(self.height):
            for x in range(self.width):
                if self.grid[x][y].clusterId == -1:
                    return self.grid[x][y]
        return None
    
    def floodFill(self, x, y, clusterId, landUse):
        if (x < 0 or y < 0 or x >= self.width or y >= self.height):
            return
        
        cell = self.grid[x][y]
        cluster = Cluster()
        if (cell.clusterId != -1 or cell.value != landUse):
            return
        
        cell.setClusterId(clusterId)
        cluster.add(cell)
        
        self.floodFill(x-1, y, clusterId, landUse)
        self.floodFill(x+1, y, clusterId, landUse)
        self.floodFill(x, y-1, clusterId, landUse)
        self.floodFill(x, y+1, clusterId, landUse)
    
    def analyze(self):
        freeCell = self.firstFreeCell()
        clusterId = 0
        
        while freeCell != None:
            self.floodFill(freeCell.x, freeCell.y, clusterId, freeCell.value)
            freeCell = self.firstFreeCell()
            clusterId += 1
    
    def printClusterId(self, clusterId):
        for y in range(self.height):
            for x in range(self.width):
                if self.grid[x][y].clusterId == clusterId:
                    print 'ClusterId:', clusterId, '=>', 'Cell-coordinates:', \
                        '(', self.grid[x][y].x, ',', self.grid[x][y].y, ')', \
                        'with a landUse value of:', self.grid[x][y].value
        print "No cells with such clusterId left or the clusterId is not defined yet..."
    
    def load(cls, filename):
        print "Loaded csv file"
        loadGrid = []
        reader = csv.reader(open(filename), delimiter=';')
        for line in reader:
            loadGrid.append(line)
        width = len(loadGrid[0])
        height = len(loadGrid)
        grid = Grid(width, height)
        for x in range(width):
            for y in range(height):
                grid[x, y].value = loadGrid[y][x]
        return grid
    load = classmethod(load)

    def printGrid(self):
        for x in range(self.height):
            for y in range(self.width):
                print self[x, y].value,
            print

    def rotated(self, n):
        L = self.toList()
        n %= 4
        if n == 0:
            pass
        elif n == 1:
            L = [list(x) for x in zip(*reversed(L))]
        elif n == 2:
            L = [list(reversed(x)) for x in reversed(L)]
        else:
            L = [list(x) for x in reversed(zip(*L))]
        return Grid.fromList(L)

    def toList(self):
        return [tuple(cell.value for cell in row) for row in self.grid]

    @staticmethod
    def fromList(L):
        height, width = len(L), len(L[0])
        self = Grid(width, height)
        for x in range(height):
            for y in range(width):
                 self[(x,y)].value = L[x][y]
        return self

class Cell(object):
    def __init__(self, x, y, grid):
        self.x = x
        self.y = y
        self.grid = grid
        self.value = 0
        self.clusterId = -1
    
    def setClusterId(self, clusterId):
        self.clusterId = clusterId
    
    def getClusterId(self):
        return self.clusterId

class Cluster(object):
    def __init__(self):
        self.cells = []
    
    def __getitem__(self):
        return self.cells[x][y]

    def add(self, i):
        self.cells.append(i)

my_grid = Grid(6, 7)
my_grid.printGrid()
print ('-'*12)
my_grid.fillBlock(2, 4, 2, 4, 2)
print
my_grid.printGrid()
print ('-'*12)
my_grid.rotated(0)
my_grid.printGrid()
print
my_grid[(2,4)].value = 3
print(my_grid.toList())
Grid.fromList(my_grid.toList()).printGrid()
for i in range(4):
    print
    my_grid.rotated(i).printGrid()

tillaart36 2 Light Poster · Answer 15 · 2009-01-29T17:14:50+00:00

Hi, thanks for the help. I'm not sure I could have come up with that, but as is usually the case, once I see the code I understand it.

Yes the printGrid isn't working perfectly I noticed this earlier too. When I define a grid with width = 7 and height = 6 the printGrid should iterate over these coordinates and for each coordinate print the value.

I tried messing around with the loop but I couldn't get it to work yet. It should iterate as (x, y)

0,0 to 6,0
0,1 to 6,1
0,2 to 6,2
0,3 to 6,3
0,4 to 6,4
0,5 to 6,5

but every way I change the loop I can't get this to work. This needs some more thinking :) . Anyways thanks again for the help!

tillaart36 2 Light Poster · Answer 16 · 2009-02-03T18:07:58+00:00

You just have your x and y mixed up:

>>> def printGrid(w,h):
...     for y in xrange(h):
...         for x in xrange(w):
...             print '%d,%d' % (x,y),
...         print
...     
>>> printGrid(w=7,h=6)
0,0 1,0 2,0 3,0 4,0 5,0 6,0
0,1 1,1 2,1 3,1 4,1 5,1 6,1
0,2 1,2 2,2 3,2 4,2 5,2 6,2
0,3 1,3 2,3 3,3 4,3 5,3 6,3
0,4 1,4 2,4 3,4 4,4 5,4 6,4
0,5 1,5 2,5 3,5 4,5 5,5 6,5
>>>

Back from a few days of vacation. Yes, this loop will work, but when I want to print the cell values with the printGrid method it won't. It has to do with the __getitem__ method, and it raises an IndexError: list index out of range as soon as the width and height of the grid aren't the same anymore.

...
    def printGrid(self):
        for y in range(self.height):
            for x in range(self.width):
                print self.grid[x][y].value,
            print
...
my_grid = Grid(10, 11)
my_grid.printGrid()

gives an error as soon as it gets to self.grid[0][10].value. So when the y value gets larger than the highest x value in this loop, it gives this error.
Is it wrong to use the __getitem__ method for loops with different lengths? I tried to read more about the special __getitem__ method but I get too much technical talk instead of a clear answer.

Thanks in advance

tillaart36 2 Light Poster · Answer 17 · 2009-02-03T20:20:03+00:00

Hmm,

the method works when I code it like:

def printGrid(self):
    for y in range(len(self.grid)):
        for x in range(len(self.grid[0])):
            print self.grid[x][y].value,
        print

If anyone can explain to me why it works when I use len(self.grid) for self.height and len(self.grid[0]) for self.width, I would like to hear it. I'm kind of confused...

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 18 · 2009-02-03T20:54:32+00:00

I think you should follow the usual convention in Python. When you have a list of lists like this

L = [
  [1, 2],
  [3, 4],
  [5, 6],
  [7, 8],
  [9, 10],
]

then len(L) == 5 and this is also the height of your double array. L[0] is the list [1, 2] which is of length 2, and this is also the width of your double array. If you want to traverse the grid, you can write a loop

for x in range(len(L)):
  for y in range(len(L[0])):
    print L[x][y],
  print

I think it's a bad idea to have fields height and width in your Grid class because you duplicate information. A better way to do it is to define properties like this

class Grid(object):
    ...
    @property
    def height(self):
        return len(self.grid)
    @property
    def width(self):
        return len(self.grid[0])

You can then write my_grid.height in methods as if you had a field height in the Grid object.
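To make the row and column convention concrete, here is a minimal sketch in Python 3. It assumes the grid is stored as a list of rows, so a cell is reached as `grid[y][x]`. The thread does not show the original `Grid` constructor, so the `Cell` class, the constructor and the `rows_as_text` method below are stand-ins, not the poster's code.

```python
class Cell(object):
    """Hypothetical cell type holding a single value."""
    def __init__(self, value=0):
        self.value = value


class Grid(object):
    def __init__(self, width, height):
        # assumption: a list of `height` rows, each holding `width` cells
        self.grid = [[Cell() for x in range(width)] for y in range(height)]

    @property
    def height(self):
        return len(self.grid)       # number of rows

    @property
    def width(self):
        return len(self.grid[0])    # number of cells in one row

    def rows_as_text(self):
        # row index first, then column index: self.grid[y][x]
        return [" ".join(str(self.grid[y][x].value) for x in range(self.width))
                for y in range(self.height)]


my_grid = Grid(10, 11)
print(my_grid.width, my_grid.height)
for line in my_grid.rows_as_text():
    print(line)
```

Under this assumed layout, indexing as `self.grid[x][y]` only stays in range while width and height are equal, which would be consistent with the IndexError reported for a 10 by 11 grid.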

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 19 · 2009-02-04T13:06:36+00:00

In object-oriented programming, a property is a 'dynamic attribute'. It looks like a normal attribute when you use it, but it's actually computed each time you access its value. Here is an example of a class with 2 properties

class Square(object):
    def __init__(self, side):
        self.side = side
    @property
    def area(self):
        return self.side ** 2
    @property
    def perimeter(self):
        return 4 * self.side

sq = Square(3)
print(sq.side) # should print 3
print(sq.area) # should print 9
print(sq.perimeter) # should print 12

tillaart36 2 Light Poster · Answer 20 · 2009-02-04T20:30:51+00:00

I've added a levenshtein function to my code to see whether it gives results similar to those of my old compare method.

It very much seems so, and I think the results of both are 1:1 with different templates / rotations. But for now I need to compare a target and template in their string representation with this function:

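def levenshtein(s1, s2):
	# d_curr holds one row of the edit-distance table
	d_curr = range(len(s2) + 1)
	for i, c1 in enumerate(s1):
		# first column: cost of deleting the first i + 1 characters of s1
		d_prev, d_curr = d_curr, [i + 1]
		for j, c2 in enumerate(s2):
			d_curr.append(min(d_prev[j] + (c1 != c2),  # substitution or match
				d_prev[j + 1] + 1,  # deletion
				d_curr[j] + 1))  # insertion
	print d_curr[len(s2)]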

Earlier it was mentioned how to convert a string to an array, but now I need the opposite: a function that converts the array to a string, so that this string can be used in the levenshtein function. I know you can do it with a function like this:

def toString(list):
	return "".join(["%s" % el for el in list])

but how do I turn an array of arrays into a string? If such a function lets me turn the earlier forms of target and template into strings, I can compare templates (csv files) much more easily (so far I have tried it with some handwritten strings, but it would be faster if I could use my old csv templates). I'm looking at your fromString() function and thinking about how I could do the opposite, but I'm not quite sure yet :) It is also giving me some trouble because in your example it was in its own array class and I want to implement it for my grid class.

EDIT:
I now do something like what I wanted with this code:

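def levenshtein(s1, s2):
	# flatten the nested lists and turn them into strings first
	flat_s1 = flatten(s1)
	flat_s2 = flatten(s2)
	string_s1 = toString(flat_s1)
	string_s2 = toString(flat_s2)
	print string_s1
	print string_s2
	# compare the strings, not the original nested lists
	d_curr = range(len(string_s2) + 1)
	for i, c1 in enumerate(string_s1):
		# first column: cost of deleting the first i + 1 characters
		d_prev, d_curr = d_curr, [i + 1]
		for j, c2 in enumerate(string_s2):
			d_curr.append(min(d_prev[j] + (c1 != c2),
				d_prev[j + 1] + 1,
				d_curr[j] + 1))
	print d_curr[len(string_s2)]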

def flatten(q):
	flat_q = []
	for x in q:
		if type(x) in (list, tuple):
			flat_q.extend(flatten(x))
		else:
			flat_q.append(x)
	return flat_q

def toString(list):
	return "".join(["%s" % el for el in list])

target = [[0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 1, 1, 0, 0, 0],
[0, 1, 1, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]]

template1a = [[0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 0, 0, 0]]

template1d = [[0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]]

template4b = [[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]]

template8c = [[0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 0],
[0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0]]

levenshtein(target, template1a)
levenshtein(target, template1d)
levenshtein(target, template4b)
levenshtein(target, template8c)

Is this a good way of doing it? I first flatten the list as proposed earlier in this thread. This flattened list I convert to a string and this string I put in the levenshtein function. It would be nice though if I could still see in the string where a line breaks.
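One way to keep the row boundaries visible is to join each row into a string first and then join the rows with a separator. The sketch below is written for Python 3 and rests on a few assumptions that are not from the thread: the names `grid_to_string` and `levenshtein_distance` are made up, the separator defaults to a newline, and the distance is returned rather than printed so that it can be compared across templates. The two small grids are toy data, not the templates above.

```python
def grid_to_string(rows, sep="\n"):
    """Join each row into a string of digits, then join the rows with sep."""
    return sep.join("".join(str(el) for el in row) for row in rows)


def levenshtein_distance(s1, s2):
    """Row-by-row Levenshtein distance between two sequences."""
    d_curr = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1):
        # first column: cost of deleting the first i + 1 items of s1
        d_prev, d_curr = d_curr, [i + 1]
        for j, c2 in enumerate(s2):
            d_curr.append(min(d_prev[j] + (c1 != c2),  # substitution or match
                              d_prev[j + 1] + 1,       # deletion
                              d_curr[j] + 1))          # insertion
    return d_curr[len(s2)]


# toy grids for illustration only
target = [[0, 1, 1],
          [0, 1, 0]]
template = [[0, 1, 1],
            [1, 1, 0]]

target_text = grid_to_string(target)
template_text = grid_to_string(template)
print(target_text)
print(template_text)
print(levenshtein_distance(target_text, template_text))
```

The separators are ordinary characters as far as the distance is concerned. When both grids have the same shape they line up and match, so they should not add to the score.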

tillaart36 2 Light Poster · Answer 21 · 2009-03-17T15:51:11+00:00

Ok,

so in the meantime I've worked some more on this script and its functionality. I now need to be able to use files from lists which I define in another script.

# example walking a directory/folder
# create a list of all files with a given extension in
# a given directory and all its subdirectories (full path name)

import os

def walk_dir(root_dir, extension):
    """
    Walks the specified directory root and all its subdirectories
    and returns a list containing all files with extension ext
    """
    file_list = []
    #dir_list = []
    towalk = [root_dir]
    while towalk:
        root_dir = towalk.pop()
        for path in os.listdir(root_dir):
            path = os.path.join(root_dir, path).lower()
            if os.path.isfile(path) and path.endswith(extension):
                file_list.append(path)
            elif os.path.isdir(path):
                #dir_list.append(path)
                towalk.append(path)
    return file_list


# use the root directory of your choice
path = os.getcwd()
living_types = [name for name in os.listdir(path) if os.path.isdir(os.path.join(path, name)) ]
root_dir = r"G:\Afstuderen\Library\1"
extension = '.csv'
csv_list_1 = walk_dir(root_dir, extension)
csv_list_1.sort()
extension = '.max'
max_list_1 = walk_dir(root_dir, extension)
max_list_1.sort()
root_dir = r"G:\Afstuderen\Library\2"
extension = '.csv'
csv_list_2 = walk_dir(root_dir, extension)
csv_list_2.sort()
extension = '.max'
max_list_2 = walk_dir(root_dir, extension)
max_list_2.sort()
root_dir = r"G:\Afstuderen\Library\3"
extension = '.csv'
csv_list_3 = walk_dir(root_dir, extension)
csv_list_3.sort()
extension = '.max'
max_list_3 = walk_dir(root_dir, extension)
max_list_3.sort()
root_dir = r"G:\Afstuderen\Library\4"
extension = '.csv'
csv_list_4 = walk_dir(root_dir, extension)
csv_list_4.sort()
extension = '.max'
max_list_4 = walk_dir(root_dir, extension)
max_list_4.sort()

This is a file where I make lists of csv files and max files in subdirectories, and a list of the subdirectories. What do I need to do so that I can use these lists in another file? I did something like this:

import os, glob
import csv
import sys

sys.path.append('G:\Afstuderen\Library')
import data_structure_6

print csv_list_1

But it returns the error:

Traceback (most recent call last):
File "<string>", line 11, in ?
File "niks.py", line 8, in ?
print csv_list_1
NameError: name 'csv_list_1' is not defined

Am I missing something here? Thanks in advance for the help...
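As a side note, the four near-identical blocks in the listing script could be collapsed into a loop that fills a dictionary keyed by subdirectory name. The sketch below, for Python 3, uses the standard library's `os.walk` instead of the hand-written stack. The function names `find_files` and `collect_library`, the `library_root` argument and the dictionary layout are illustrative choices, not part of the original script. Unlike `walk_dir`, it lower-cases only the file name for the extension test and leaves the returned paths unchanged.

```python
import os


def find_files(root_dir, extension):
    """Return a sorted list of full paths below root_dir ending in extension."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def collect_library(library_root, extensions=(".csv", ".max")):
    """Map each immediate subdirectory name to {extension: [paths]}."""
    library = {}
    for name in sorted(os.listdir(library_root)):
        sub_dir = os.path.join(library_root, name)
        if os.path.isdir(sub_dir):
            library[name] = {ext: find_files(sub_dir, ext) for ext in extensions}
    return library
```

With the directory from the post, `collect_library(r"G:\Afstuderen\Library")` would then give `library["1"][".csv"]` in place of `csv_list_1`.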

tillaart36 2 Light Poster · Answer 22 · 2009-03-17T16:11:02+00:00

Oops found it,

I had to tell Python: from data_structure_6 import *

I thought that when importing a file you could already use all the lists defined in it, but I guess I was wrong :P

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 23 · 2009-03-17T16:25:40+00:00

from data_structure_6 import csv_list_1

is considered better (import only the symbols that you need if possible). The import * may import many names which you don't need.
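The difference between the import styles can be seen in a small self-contained sketch for Python 3. It writes a throwaway module to a temporary directory so that it can run anywhere. The module name `demo_lists` and its contents are made up for the illustration and stand in for `data_structure_6`.

```python
import os
import sys
import tempfile

# create a throwaway module (hypothetical stand-in for data_structure_6)
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_lists.py"), "w") as f:
    f.write("csv_list_1 = ['a.csv', 'b.csv']\n")
    f.write("max_list_1 = ['a.max']\n")

# make the directory importable, as the post does with sys.path.append
sys.path.append(module_dir)

# 1) plain import: the names stay inside the module's namespace
import demo_lists
print(demo_lists.csv_list_1)

# 2) import only the name that is needed
from demo_lists import csv_list_1
print(csv_list_1)

# max_list_1 was not imported by name, so it is only reachable
# through the module object
print(demo_lists.max_list_1)
```

After a plain `import data_structure_6`, the list from the question is reachable as `data_structure_6.csv_list_1`, which is why the bare name `csv_list_1` raised a NameError.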

tillaart36 2 Light Poster · Answer 24 · 2009-03-17T16:36:53+00:00

Ok I understand, thanks :D
