I've made a program that scans a plain text document for the string "SA001:" and, if it's not there, adds it at the top. The problem comes when saving the document: it adds some things to the file that I don't know how to get rid of (to see what I mean, point the program at any txt file, tell it to modify it, and then look at the output). Also, I'd like it to print a carriage return to the txt file, which I don't know how to do...

Here's the code:

import pickle
import random

saFlag = False  #<--- flag

saLessFile = raw_input("File to scan? ")
#opens the selected file
textf = open(saLessFile, 'r')
str1 = textf.read()
textf.close()

#makes every word in file a list
wordlist = str1.split(None)

for i in range(0, len(wordlist)):
  print wordlist[i],

print "\n\nScan loop follows:"

for i in range(0, len(wordlist)):
  if (wordlist[i].lower() == "sa001:"):
    print "SA001 found:"
    saFlag = True
    for j in range(i, (i+7)):
      print wordlist[j],
    print "..."

if saFlag:
  print "No altering needed\n"

else:
  print "SA not found turn document into the following?"
  wordlist.insert(0, "SA001: This is a test, blah blah blah blah blah!\n")

  #turns the list into a string with a space inbetween each word
  joinList = " ".join(wordlist)
  print joinList

  #asks user if they are sure they wish to modify
  sure = raw_input("Modify?(y or n): ")

  if sure == "y" or sure == "Y":
    #saves the new string over the old file's words, essentially modifying the contents
    file = open(saLessFile, "w")
    pickle.dump(joinList, file)
    file.close()
    print "modified..."
  elif sure == "n" or sure == "N":
    quit()

All 3 Replies

Your problem is the use of the pickle module. You're using it to dump plain text data to a file, and that is not the purpose of pickling. Read up on what pickling actually is:
http://docs.python.org/lib/module-pickle.html
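
To see where the extra characters come from, here is a minimal sketch comparing pickled output with the plain text. It is written for Python 3 (the code in this thread is Python 2), and the sample string is made up for illustration.

```python
import pickle

text = "SA001: This is a test"

# pickle serializes the Python object itself, so the result carries
# protocol markers and length information around the characters.
pickled = pickle.dumps(text)

# Encoding the string gives just the characters and nothing else.
plain = text.encode("ascii")

print(repr(pickled))
print(repr(plain))

# The pickled form is only meant to be read back with pickle.loads.
restored = pickle.loads(pickled)
```

Comparing the two printed values shows the additional bytes that end up in the file when pickle.dump is used in place of write.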

If you replace

pickle.dump(joinList, file)

with...

file.write(joinList)

That will work to an extent, except that you lose all newline characters from the original text: split(None) discards them, and the words are rejoined with single spaces.
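
A small sketch of that newline loss, using a made-up two-line string (written for Python 3, but the split and join behave the same way in Python 2):

```python
original = "first line\nsecond line\n"

# split(None) breaks on every run of whitespace, newlines included.
words = original.split(None)

# Joining with single spaces cannot put the newlines back.
rejoined = " ".join(words)

print(repr(rejoined))
```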


I would also recommend using the re module to search the file for the text. Or at least keep a copy of the original text before you split it, and then concatenate the 'SA001:' line with the original file content before writing it all out to the file.
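
One way that suggestion might look, as a rough sketch rather than a finished program: the function name ensure_marker and the sample strings are hypothetical, and it is written for Python 3. It searches the unsplit text, so the original newlines survive.

```python
import re

def ensure_marker(text, marker="SA001:"):
    """Return text unchanged if marker occurs in any case, else prepend it."""
    # re.escape keeps the marker literal; re.IGNORECASE matches "sa001:" too.
    if re.search(re.escape(marker), text, re.IGNORECASE):
        return text
    # Prepend the marker on its own line; '\n' ends that line.
    return marker + "\n" + text

missing = "line one\nline two\n"
present = "intro\nsa001: already here\n"

print(repr(ensure_marker(missing)))
print(repr(ensure_marker(present)))
```

The returned string can then be written back with file.write, with no splitting or joining involved.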


edit: '\n' is the newline character.

You can also read all of the file into memory, process it once to search for the string, and then process it a second time to write it to a file if necessary.

filename = raw_input("File to scan? ")
data = open(filename, "r").readlines()
found = False
for rec in data:
    ## compare in lower case so "sa001:" and "SA001:" both match
    if "sa001:" in rec.lower():
        print "SA001 found"
        found = True

## file only changes if string is not found
if not found:
    fp = open(filename + ".2", "w")   ## writes a copy, original is untouched
    fp.write("SA001:\n")              ## add this line
    for rec in data:                  ## original data
        fp.write(rec)
    fp.close()

With the re module the whole project would be as simple as this...

import re
mFilename = raw_input("File to scan? ")
mFile = open(mFilename).read()

print "Searching File..."
# the re.findall function
#   first parameter: string/regular expression/pattern to search for
#   second parameter: string to search in
#   third is optional: I passed in re.IGNORECASE to tell the module to ignore the case
if not re.findall('SA001:', mFile, re.IGNORECASE):
    # 'SA001:' is not in the file
    print "Text not found..."
    sure = raw_input("Modify?(y or n): ")
    if sure == "y" or sure == "Y":
        open(mFilename, 'w').write("SA001:\n" + mFile)
        print "File Modified..."
else:
    # 'SA001:' is already in the file
    print "No Altering needed."