Hello!

I would like to sort (by number) a file of data that looks like below:

ABGH SDFDS 123
SDFS sDF 12
...

Can somebody help me?

Thanks

4 contributors, 7 replies, 8 views. Discussion span: 8 years. Last post by bvdet.

Maybe what I wrote wasn't clear. I have a file that contains three columns: two with names and a third with numbers. I just want to sort the lines by the numbers.

You can split each line and build a new list with the number as the first part of each record, then use the built-in sort method, or convert the data to a list of lists as below, whichever is easier in this case.

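import operator

## data is assumed to hold the lines read from the file, e.g. from readlines()
data_as_lists = [ line.strip().split() for line in data ]     ## convert to a list of lists
data_as_lists.sort(key=operator.itemgetter(-1))  ## sort on last element (compared as a string)
for el in data_as_lists :            ## print each item in the list
   print el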

Maybe what I wrote wasn't clear. I have a file that contains three columns: two with names and a third with numbers. I just want to sort the lines by the numbers.

Yes, but as wooee says (though I don't really understand his code), you need to split into a list before you can sort it. I'd personally split it at '\n' and then again at ' ', so you have a list of lists, and can sort the data according to column 1, 2, or 3 based on array indexes. :)

Actually, I think wooee is doing exactly that. Read more here: http://wiki.python.org/moin/HowTo/Sorting
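
A minimal sketch of that split-twice idea, in case it helps. It is written so it also runs on Python 3 (the other code in this thread is Python 2). The sample text and the helper name sort_by_column are made up for illustration; in practice the text would come from reading the file.

```python
# Hypothetical sample text standing in for the contents of the file.
text = "ABGH SDFDS 123\nSDFS sDF 12\nDEF SDFDS 2\n"

# Split at '\n' to get lines, then split each line on whitespace,
# which gives a list of lists (one inner list per row).
rows = [line.split() for line in text.split("\n") if line.strip()]

def sort_by_column(rows, index, numeric=False):
    """Return a new list of rows ordered by the column at `index`.

    With numeric=True the column is converted to int before comparing,
    so "2" sorts before "12".
    """
    if numeric:
        return sorted(rows, key=lambda row: int(row[index]))
    return sorted(rows, key=lambda row: row[index])

by_number = sort_by_column(rows, 2, numeric=True)
for row in by_number:
    print(" ".join(row))
```

Passing index 0 or 1 without numeric=True sorts on one of the name columns instead, compared as plain strings.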

I guess the above example can appear obfuscated to some, so here is another sort, written out by hand as a selection sort. It compares the number as an integer, not a string.

##   simulate reading data from file
data = [ "ABGH SDFDS 123\n", "SDFS sDF 12\n", "DEF SDFDS 124\n",
         "ABGH SDFDS 2\n", "SDFS sDF 10\n", "DEF SDFDS 100\n",
         "DEF SDFDS 1\n"]

stop_k = len(data)
stop_j = stop_k - 1
for j in range(0, stop_j):
   rec=data[j]
   substrs = rec.split()
   lowest = int(substrs[2])  ## lowest number so far
   print "starting rec is", lowest, substrs
   ctr = j  ## contains the element number of the smallest value
   for k in range(j+1, stop_k):
      rec=data[k]  ## check the next rec until the end
      substrs = rec.split()
      this_test = int(substrs[2])
      print "   comparing", lowest, this_test
      ## data sorted as an integer
      if this_test < lowest:
         ctr=k  ## store lowest element's number and swap after all recs have been checked
         lowest = this_test ## this will now be compared to whatever recs are left
   ## swap the values once after all comparisons have been made
   if j != ctr:
      data[j], data[ctr] = data[ctr], data[j]

print "\n", data

And for completeness, here is a version that creates a second list of lists, with the int as the first element of each, so you can use Python's built-in sort.

##   simulate reading data from file
data = [ "ABGH SDFDS 123\n", "SDFS sDF 12\n", "DEF SDFDS 124\n",
         "ABGH SDFDS 2\n", "SDFS sDF 10\n", "DEF SDFDS 100\n",
         "DEF SDFDS 1\n"]

sort_list = []
for rec in data:
   substrs = rec.split()
   sort_list.append( [int(substrs[2]), rec] )

print "sort_list original =", sort_list
sort_list.sort()
print "\nsort_list sorted =", sort_list
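
The sorted list above still carries the integer in front of each record. A short sketch of the remaining step, dropping that integer again to get the original lines back in numeric order, might look like this. It is written so it also runs on Python 3, uses a shortened sample list, and the names decorated and sorted_data are only illustrative.

```python
# Shortened sample data, in the same format as the records above.
data = ["ABGH SDFDS 123\n", "SDFS sDF 12\n", "DEF SDFDS 2\n"]

# Decorate: pair every record with its number converted to int.
decorated = [(int(rec.split()[2]), rec) for rec in data]

# Sort: tuples compare on the int first, then on the record text.
decorated.sort()

# Undecorate: keep only the original records, now in numeric order.
sorted_data = [rec for _, rec in decorated]

for rec in sorted_data:
    print(rec.rstrip("\n"))
```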

Here is another way, also using Python's list sort method.

data = [ "ABGH SDFDS 123\n", "SDFS sDF 12\n", "DEF SDFDS 124\n",
         "ABGH SDFDS 2\n", "SDFS sDF 10\n", "DEF SDFDS 100\n",
         "DEF SDFDS 1\n", "XYZ DXDER 0\n", "SDFS aDF 12\n"]

def sort_on_col(a, b):
    # function to sort three column data (2,1,0)
    # column 2 is integer
    x = cmp(int(a.split()[2]), int(b.split()[2]))
    if not x:
        y = cmp(a.split()[0], b.split()[0])
        if not y:
            return cmp(a.split()[1], b.split()[1])
        return y
    return x

data.sort(sort_on_col)

for item in data:
    print item,

>>> XYZ DXDER 0
DEF SDFDS 1
ABGH SDFDS 2
SDFS sDF 10
SDFS aDF 12
SDFS sDF 12
DEF SDFDS 100
ABGH SDFDS 123
DEF SDFDS 124
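
Note that cmp() and passing a comparison function to sort() only work in Python 2; both were removed in Python 3. A rough equivalent using a key function, which runs on either version, could look like the sketch below. The name sort_key is just illustrative. It returns a tuple, so records are ordered by column 2 as an integer, then by column 0, then by column 1, the same priority as sort_on_col above.

```python
data = ["ABGH SDFDS 123\n", "SDFS sDF 12\n", "DEF SDFDS 124\n",
        "ABGH SDFDS 2\n", "SDFS sDF 10\n", "DEF SDFDS 100\n",
        "DEF SDFDS 1\n", "XYZ DXDER 0\n", "SDFS aDF 12\n"]

def sort_key(line):
    # Build a tuple key: (number as int, first name, second name).
    cols = line.split()
    return (int(cols[2]), cols[0], cols[1])

result = sorted(data, key=sort_key)

for item in result:
    print(item.rstrip("\n"))
```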