Hi everyone, I'm new here so I hope I've put this in the right place.

I'm looking for a little help with a python program I've written. I'm trying to create a timeline from SQLite databases. I take information from a calls database, then an sms database, write those to a csv file which I then want to sort based on the date column.

What the program should do:

  • take data from an SQLite table
  • print this data to a csv file
  • read the data from that csv file again
  • sort the data on the unix timestamp column
  • print the newly sorted data to a new csv file

The problem: if I try to print out the data to the new csv file and split each line using a for loop, the sorted function seems to have truncated a line so there are not as many variables in the line as I have told the program there are. I have no idea why it's doing this. I've double and triple checked that the original csv file is correct. Any help would be greatly appreciated

#!/usr/bin/env python

import csv,sys,operator,os, sqlite3, time

f = open('unsorted.csv','w')
g = open('timeline.csv','w')
h = open('blah.txt','w')


con = sqlite3.connect("/mnt/analysis/data/data/com.android.providers.contacts/databases/contacts2.db") 
c = con.cursor()
c.execute('select date,number,_id,type from calls')

for row in c:
	date,number,callid,calltype = row

	#convert date to human readable format
	realdate = time.ctime(date/1000)

	#interpret call type
	realtype = "none"
	calltype = str(calltype)
	if calltype == "1":
		realtype = "Call received"
	if calltype == "2":
		realtype = "Call made"
	if calltype == "3":
		realtype = "Missed call"
	
	f.write(str(date) + "," + realdate + "," + str(callid) + "," + str(realtype) + "," + str(number) + "\n")

con = sqlite3.connect("/mnt/analysis/data/data/com.android.providers.telephony/databases/mmssms.db") 
c = con.cursor()
c.execute('select date,address,_id,type from sms')

for row in c:
	date,address,callid,calltype = row

	realdate = time.ctime(date/1000)
	realtype = ""
	calltype = str(calltype)
	if calltype == "1":
		realtype = "Text received"
	if calltype == "2":
		realtype = "Text sent"
	if calltype == "3":
		realtype = "Draft text"
	

	f.write(str(date) + "," + realdate + ","  + str(callid) + "," + realtype + "," +  str(address) + "\n")

	
data = csv.reader(open('unsorted.csv'),delimiter = ',')

#itemgetter is 0 because the unix timestamp is in the 1st column
sortedlist = sorted(data,key=operator.itemgetter(0),reverse=False)

#take the info line by line from sortedlist and separate into variables so they can be printed to a csv
g.write("Unix timestamp,Date,ID,Type, Misc\n")
for item in sortedlist:
	unix,date,iden,calltype,misc = item
	g.write(unix + "," + date + "," + iden + "," + calltype + "," + misc + "\n")
	
#just print the sortedlist without splitting lines into variables	
for item in sortedlist:
	h.write(str(item))
	h.write("\n")

I get the following error:
unix,date,iden,calltype,misc = item
ValueError: need more than 2 values to unpack

From what I understand, this means that I've told it to split item into 5 variables but there's only 2 being supplied

Here's the end of the unsorted csv file, so you can see where the call data ends and the sms starts: (I've put X's instead of the real phone numbers to print it here)
1302971105851,Sat Apr 16 17:25:05 2011,3318,Call received,XXXXXXXX
1302991884760,Sat Apr 16 23:11:24 2011,3319,Call received,XXXXXXX
1303018741819,Sun Apr 17 06:39:01 2011,3320,Call made,XXXXXXXX
1104770564325,Mon Jan 3 16:42:44 2005,1,Text received,From3
1104770734612,Mon Jan 3 16:45:34 2005,2,Draft text,None
1312544050806,Fri Aug 5 12:34:10 2011,3,Text received,XXXXX

And now the sorted list:

You can see that the 2nd line is truncated, which is why I'm getting the ValueError, because that line only contains 2 variables, not the 5 I expect it to contain. Also, the third sms message doesn't appear in the sorted list for some reason.....

Sorry this has been such a long post but I really need help with this so want to make it really clear what the problem is

Thanks for any help!

I have no problem with your sample info with simple Python way without unnecessary complications:

data = """
1302971105851,Sat Apr 16 17:25:05 2011,3318,Call received,XXXXXXXX
1302991884760,Sat Apr 16 23:11:24 2011,3319,Call received,XXXXXXX
1303018741819,Sun Apr 17 06:39:01 2011,3320,Call made,XXXXXXXX
1104770564325,Mon Jan 3 16:42:44 2005,1,Text received,From3
1104770734612,Mon Jan 3 16:45:34 2005,2,Draft text,None
1312544050806,Fri Aug 5 12:34:10 2011,3,Text received,XXXXX""".splitlines()

data = sorted(d.split(',') for d in data if ',' in d)
for d in data:
    assert len(d) == 5
    print('%14s%26s%5s%16s%10s' % tuple(d))

""" Output:
 1104770564325   Mon Jan 3 16:42:44 2005    1   Text received     From3
 1104770734612   Mon Jan 3 16:45:34 2005    2      Draft text      None
 1302971105851  Sat Apr 16 17:25:05 2011 3318   Call received  XXXXXXXX
 1302991884760  Sat Apr 16 23:11:24 2011 3319   Call received   XXXXXXX
 1303018741819  Sun Apr 17 06:39:01 2011 3320       Call made  XXXXXXXX
 1312544050806   Fri Aug 5 12:34:10 2011    3   Text received     XXXXX
"""

Edited 5 Years Ago by pyTony: n/a

Thanks for the reply. I think I understand the loop but how do I actually fill the "data" variable? I'm reading my data from a csv file, not hard coding it like your example. So I tried:

data = csv.reader(open('unsorted.csv'),delimiter = ',')
data.splitlines()

but I get an error saying csv.reader object has no splitlines attribute. I get a similar error if I try

data = open('unsorted.csv')
data.splitlines()

If I leave out the data.splitlines(), I don't get an error but nothing is printed to the terminal.

Thanks

Did you read the documentation?

http://docs.python.org/library/csv.html#csv.reader

Return a reader object which will iterate over lines in the given csvfile
...
Each row read from the csv file is returned as a list of strings. No automatic data type conversion is performed.

So, there you go. list objects have no method splitlines, that's why you're getting an error.

For methods you can use, read
http://docs.python.org/library/csv.html#reader-objects

Reading documentation properly is a skill that's very important.

Edited 5 Years Ago by Enalicho: n/a

Hi, I did read through the documentation and I understand that I get the error because I can't call that method on a csv reader object. What I don't understand is what alternative I can use.

I've tried to merge pyTony's code with mine. So I have:

data = csv.reader(open('unsorted.csv',delimiter = ',')
data = sorted(d.split9',') for d in data if ',' in d)
for d in data:
         assert len(d) == 5
         print('%14s%26s%5s%16s%10s' % tuple(d))

When I try this, nothing gets printed. If I print 'data' before the for loop, I get a printout of [] (guessing this is just an empty list). So my problem I'm assuming is the way I'm reading in the data from the csv but I don't understand how to read it properly.

data = csv.reader(open('unsorted.csv'),delimiter = ',')

What do you think this does? I'll give you a clue, it doesn't do what you want it to. Read the csv reader documentation _again_, especially the part I quoted.

Try this to see what you get -

data = csv.reader(open('unsorted.csv','rb'),delimiter = ',')
for row in data:
    print row

Hopefully that will help clear things up for you.

This article has been dead for over six months. Start a new discussion instead.