I have to get three pieces of information from the files I have:
1. The name of the owner of the file, which appears after the tag
<foaf:name>
2. The ID of the owner of the file, which is embedded in the filename, e.g.,
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D12
belongs to user 12. The pattern Fu%3D always appears before the ID in the
filename and nowhere else in the filename.
3. The IDs of the people known by the owner of the file; this may be more than one
person. The pattern foaf.php?u= can be used to find these IDs.
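
For piece 2, here is a minimal sketch of pulling the ID out of a file name. It assumes the name always ends in .txt, as the entries in dir2.txt do. The function name owner_id_from_filename is made up for illustration, and the code is written for Python 3 (the print call also works on Python 2).

```python
def owner_id_from_filename(filename, marker="Fu%3D", suffix=".txt"):
    """Return the text that follows the marker in the file name, or '' if absent."""
    name = filename.strip()          # drop a trailing newline, if any
    found = name.find(marker)
    if found == -1:
        return ""
    owner_id = name[found + len(marker):]
    if owner_id.endswith(suffix):    # assumption: names end in .txt
        owner_id = owner_id[:-len(suffix)]
    return owner_id

print(owner_id_from_filename(
    "http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D12.txt"))
```

This only looks at the name string, so the file itself does not need to be opened to get the owner ID.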

Here is the dir2.txt file:
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D12.txt
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D4.txt
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D374.txt
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D103.txt
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D57.txt
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D98.txt

Here is an example of what is in one of these files:

<?xml version="1.0" encoding="iso-8859-1" ?>
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/">
<foaf:name>Donna</foaf:name>
<foaf:nick>Donna</foaf:nick>
<foaf:knows>
   <foaf:Person rdf:about="http://talk.ie/vbulletin/foaf.php?u=21#person">
      <foaf:nick>Kath</foaf:nick>
   </foaf:Person>
  </foaf:knows>
   <foaf:Person rdf:about="http://talk.ie/vbulletin/foaf.php?u=3673#person">
      <foaf:nick>Mick</foaf:nick>
   </foaf:Person>
  </foaf:knows>
</foaf:Person>

This code takes the user IDs and stores them in an output file. I need to edit it so that it extracts the three pieces of information needed:

def add_userid(filename):
   currfile = open(filename)

   searchterm="foaf.php?u="
   length=len(searchterm)

   userid = ''
   for line in currfile:
      found = line.find(searchterm)
      if found != -1:
         position = found + length
         i = position
         while i < len(line) and line[i] != '#':
            userid = userid + line[i]
            i += 1
         print "ID of user is", userid
         writetofile("output.txt", userid)

def writetofile(filename, userid):
   currfile = open(filename, 'a+')

   currfile.write(userid)
   currfile.write('\n') # to save each ID on a line

   currfile.close()

def readfiles(filename):
   filelist = open(filename)
   for line in filelist:
      files = line[:-1]
      print files
      add_userid(files)

>>> readfiles('dir2.txt')
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D12.txt
ID of user is 21
ID of user is 213673
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D4.txt
ID of user is 98
ID of user is 98194
ID of user is 98194265
ID of user is 98194265343
ID of user is 98194265343393
ID of user is 98194265343393585
ID of user is 98194265343393585851
ID of user is 981942653433935858511026
ID of user is 9819426534339358585110261163
ID of user is 98194265343393585851102611631172
ID of user is 981942653433935858511026116311721353
ID of user is 9819426534339358585110261163117213531955
ID of user is 98194265343393585851102611631172135319552160
ID of user is 981942653433935858511026116311721353195521602300
ID of user is 9819426534339358585110261163117213531955216023002563
ID of user is 98194265343393585851102611631172135319552160230025633091
ID of user is 981942653433935858511026116311721353195521602300256330913116
ID of user is 9819426534339358585110261163117213531955216023002563309131163289
ID of user is 98194265343393585851102611631172135319552160230025633091311632894091
ID of user is 981942653433935858511026116311721353195521602300256330913116328940915013
ID of user is 9819426534339358585110261163117213531955216023002563309131163289409150135419
ID of user is 98194265343393585851102611631172135319552160230025633091311632894091501354196202
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D374.txt
ID of user is 4
ID of user is 435
ID of user is 43544
ID of user is 4354448
ID of user is 435444852
ID of user is 43544485254
ID of user is 4354448525473
ID of user is 435444852547398
ID of user is 435444852547398108
ID of user is 435444852547398108109
ID of user is 435444852547398108109111
ID of user is 435444852547398108109111136
ID of user is 435444852547398108109111136156
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D103.txt
ID of user is 17954
ID of user is 179544
ID of user is 17954498
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D57.txt
ID of user is 4
ID of user is 41724
ID of user is 4172498
http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D98.txt
ID of user is 422
ID of user is 42259856

It gives the first ID correctly but then appends the next one it finds to it, e.g.
ID of user is 422
ID of user is 42259856
When it should be
ID of user is 422
ID of user is 59856

All 3 Replies

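Inside the for loop, you need to reset userid with userid = "" at the start of each iteration (a del userid on its own would not be enough, because the loop appends to userid straight afterwards). I would do it right after line 8 of your code, i.e. as the first statement inside the loop.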
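
A minimal sketch of that fix, assuming the same search term and the same '#' terminator as the posted code. The helper name known_ids is made up here, and it returns a list instead of writing to output.txt so the effect is easy to check. The sample lines are taken from the example file in the question. Written for Python 3.

```python
SEARCHTERM = "foaf.php?u="

def known_ids(lines):
    """Collect the ID that follows SEARCHTERM on each matching line."""
    ids = []
    for line in lines:
        userid = ""  # reset on every pass so earlier IDs are not carried over
        found = line.find(SEARCHTERM)
        if found != -1:
            i = found + len(SEARCHTERM)
            # copy characters up to (not including) the '#'
            while i < len(line) and line[i] != "#":
                userid = userid + line[i]
                i += 1
            ids.append(userid)
    return ids

sample = [
    '   <foaf:Person rdf:about="http://talk.ie/vbulletin/foaf.php?u=21#person">',
    '      <foaf:nick>Kath</foaf:nick>',
    '   <foaf:Person rdf:about="http://talk.ie/vbulletin/foaf.php?u=3673#person">',
]
print(known_ids(sample))
```

With the reset in place, each entry holds one ID rather than the running concatenation shown in the question.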

This code takes the owner's name, the owner's ID, and the IDs of the people the owner knows:

def add_userid(filename):
   currfile = open(filename)
   searchterm="foaf.php?u="
   length=len(searchterm)

   userid = ''
   for line in currfile:
      userid = 'Knows user with ID:'
      found = line.find(searchterm)
      if found != -1:
         position = found + length
         i = position
         while i < len(line) and line[i] != '#':
            userid = userid + line[i]
            i += 1
         print userid
         writetofile("output.txt", userid)

def writetofile(filename, userid):
   currfile = open(filename, 'a+')
   currfile.write(userid)
   currfile.write('\n') # to save each ID on a line
   currfile.close()

def readfiles(filename):
   filelist = open(filename)
   for line in filelist:
      files = line[:-1]
      print files
      add_ownerid(filename)
      add_ownername(files)
      add_userid(files)

def add_ownerid(filename):
   currfile = open(filename)
   searchterm="Fu%3D"
   length=len(searchterm)
   
   ownerid = ''
   for line in currfile:
      ownerid = 'OwnerID is:'
      found = line.find(searchterm)
      if found != -1:
         position = found + length
         i = position
         while i < len(line) and line[i] != '.txt':
            ownerid = ownerid + line[i]
            i += 1
            ownerid = ownerid.replace('.txt','') #gets rid of .txt
         print ownerid
         writetofile("output.txt", ownerid)

def add_ownername(filename):
   currfile = open(filename)
   searchterm="<foaf:name>"
   length=len(searchterm)
   
   ownername = ''
   for line in currfile:
      ownername = 'Owner name is:'
      found = line.find(searchterm)
      if found != -1:
         position = found + length
         i = position
         while i < len(line) and line[i] != '<':
            ownername = ownername + line[i]
            i += 1
         print ownername
         writetofile("output.txt", ownername)

This is what it sends to the output.txt:

OwnerID is:12

OwnerID is:4

OwnerID is:374

OwnerID is:103

OwnerID is:57

OwnerID is:98

Owner name is:Donna
Knows user with ID:21
Knows user with ID:3673
OwnerID is:12

OwnerID is:4

OwnerID is:374

OwnerID is:103

OwnerID is:57

OwnerID is:98

Owner name is:Gerard
Knows user with ID:98
Knows user with ID:194
Knows user with ID:265
Knows user with ID:343
Knows user with ID:393
Knows user with ID:585
Knows user with ID:851
Knows user with ID:1026
Knows user with ID:1163
Knows user with ID:1172
Knows user with ID:1353
Knows user with ID:1955
Knows user with ID:2160
Knows user with ID:2300
Knows user with ID:2563
Knows user with ID:3091
Knows user with ID:3116
Knows user with ID:3289
Knows user with ID:4091
Knows user with ID:5013
Knows user with ID:5419
Knows user with ID:6202
OwnerID is:12

OwnerID is:4

OwnerID is:374

OwnerID is:103

OwnerID is:57

OwnerID is:98

Owner name is:rob
Knows user with ID:4
Knows user with ID:35
Knows user with ID:44
Knows user with ID:48
Knows user with ID:52
Knows user with ID:54
Knows user with ID:73
Knows user with ID:98
Knows user with ID:108
Knows user with ID:109
Knows user with ID:111
Knows user with ID:136
Knows user with ID:156
OwnerID is:12

OwnerID is:4

OwnerID is:374

OwnerID is:103

OwnerID is:57

OwnerID is:98

Owner name is:Ann
Knows user with ID:17954
Knows user with ID:4
Knows user with ID:98
OwnerID is:12

OwnerID is:4

OwnerID is:374

OwnerID is:103

OwnerID is:57

OwnerID is:98

Owner name is:rob
Knows user with ID:4
Knows user with ID:1724
Knows user with ID:98
OwnerID is:12

OwnerID is:4

OwnerID is:374

OwnerID is:103

OwnerID is:57

OwnerID is:98

Owner name is:David
Knows user with ID:422
Knows user with ID:59856

But I want my output to be, just for example:
OwnerID is:12
Owner name is: Donna
Knows user with ID:21
Knows user with ID:3673

OwnerID is:98
Owner name is: David
Knows user with ID:422
Knows user with ID:59856

etc

How can I do this? The code all runs, but it's just not doing what I want.
Thanks

You want to return the string found from each of the functions instead of writing it to a file. Then, once the pieces for one file have been found, write them all to the file together. So readfiles() would be similar to the following, using your existing structure (write_data() is a function you would still have to write).

def readfiles(filename):
   filelist = open(filename)
   write_owner = ""
   write_name = ""
   for line in filelist:
      files = line[:-1]
      print files
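      ## the owner ID is in the file name itself, so add_ownerid() should
      ## search the string files rather than open dir2.txt (filename) again
      owner = add_ownerid(files)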
      if len(owner):      ## owner was found in the line
         write_owner = owner
      name = add_ownername(files)
      if len(name):
         write_name = name
      user = add_userid(files)
      ##  assumes that user id is the cut off point
      if len(user) and len(write_owner):
         write_data(write_owner, write_name, user)
         write_owner = ""
         write_name = ""
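
To make the idea concrete, here is one possible sketch of the "return, then write together" approach. It is not the poster's code: find_after, owner_id, parse_foaf and format_record are hypothetical helper names, and it assumes the owner ID is taken from the file name and that the first <foaf:name> in a file is the owner's. Written for Python 3.

```python
def find_after(line, start, stop):
    """Return the text between start and the next stop, or '' if start is absent."""
    found = line.find(start)
    if found == -1:
        return ""
    rest = line[found + len(start):]
    return rest.split(stop, 1)[0].strip()

def owner_id(filename):
    # The owner ID sits in the file name, after "Fu%3D" and before ".txt".
    return find_after(filename, "Fu%3D", ".txt")

def parse_foaf(lines):
    """Return (owner_name, list_of_known_ids) for the lines of one file."""
    name = ""
    known = []
    for line in lines:
        if not name:  # assumption: the first <foaf:name> is the owner's
            name = find_after(line, "<foaf:name>", "<")
        userid = find_after(line, "foaf.php?u=", "#")
        if userid:
            known.append(userid)
    return name, known

def format_record(filename, lines):
    """Build one output block: owner ID, owner name, then the known IDs."""
    name, known = parse_foaf(lines)
    out = ["OwnerID is:" + owner_id(filename), "Owner name is:" + name]
    out.extend("Knows user with ID:" + k for k in known)
    return "\n".join(out) + "\n\n"

def readfiles(listfile, outfile="output.txt"):
    """Write one block per file named in listfile (e.g. dir2.txt)."""
    with open(listfile) as filelist, open(outfile, "a") as out:
        for line in filelist:
            filename = line.strip()
            if not filename:
                continue
            with open(filename) as currfile:
                out.write(format_record(filename, currfile))

# Demo on lines taken from the example file in the question.
example = [
    "<foaf:name>Donna</foaf:name>",
    '<foaf:Person rdf:about="http://talk.ie/vbulletin/foaf.php?u=21#person">',
    '<foaf:Person rdf:about="http://talk.ie/vbulletin/foaf.php?u=3673#person">',
]
print(format_record(
    "http%3A%2F%2Ftalk.ie%2Fvbulletin%2Ffoaf.php%3Fu%3D12.txt", example))
```

Because each file's block is built as one string and written in a single call, the owner ID, owner name and known IDs for a file stay together in output.txt.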