I've made a program that scans a plain text document for the string "SA001:" and, if it's not there, adds it at the top. The problem comes when saving the document: it adds some things to the file that I don't know how to get rid of (to see what I mean, point the program at any txt file, tell it to modify it, and then look at the output). Also, I'd like it to print a carriage return to the txt file, which I don't know how to do...

Here's the code:

import pickle
import random

saFlag = False  #<--- flag

saLessFile = raw_input("File to scan? ")
#opens the selected file
textf = open(saLessFile, 'r')
str1 = textf.read()

#makes every word in file a list
wordlist = str1.split(None)

for i in range(0, len(wordlist)):
  print wordlist[i],

print "\n\nScan loop follows:"

for i in range(0, len(wordlist)):
  if (wordlist[i].lower() == "sa001:"):
    print "SA001 found:"
    saFlag = True
    for j in range(i, min(i+7, len(wordlist))):
      print wordlist[j],
    print "..."

if saFlag:
  print "No altering needed\n"
else:
  print "SA not found turn document into the following?"
  wordlist.insert(0, "SA001: This is a test, blah blah blah blah blah!\n")

  #turns the list into a string with a space inbetween each word
  joinList = " ".join(wordlist)
  print joinList

  #asks user if they are sure they wish to modify
  sure = raw_input("Modify?(y or n): ")

  if sure == "y" or sure == "Y":
    #saves the new string over the old file's words, essentially modifying the contents
    file = open(saLessFile, "w")
    pickle.dump(joinList, file)
    print "modified..."
  elif sure == "n" or sure == "N":

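Your problem is the use of the pickle module. You're using it to dump plain text data to a file, and that is not the purpose of pickling. Pickle serializes Python objects so they can be loaded back later, which is why extra characters show up in the file. Read up on what pickling actually is.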
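
To see where the extra characters come from, here is a small sketch of what pickle produces for a plain string. The variable names are made up for illustration, and protocol 0 is passed explicitly only because it is the text-based format that Python 2 uses by default.

```python
import pickle

text = "SA001: This is a test"

# pickle.dumps returns the same serialized form that pickle.dump writes to a file
pickled = pickle.dumps(text, 0)

# the serialized form wraps the text in extra type and terminator markers
has_extra_markers = len(pickled) > len(text)

# pickled data is meant to be read back with pickle.loads, not opened as plain text
restored = pickle.loads(pickled)
```

Printing `pickled` next to `text` should make the added markers visible.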

If you replace

pickle.dump(joinList, file)

with a plain write of the same string, it will work to an extent, except that you lose all the newline characters from the original text.
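
A minimal sketch of the plain-text alternative, assuming an ordinary `write` call takes the place of the pickle line. The helper name `save_text` and the scratch file are hypothetical. It also shows why the newlines go missing: `split` discards all whitespace, and the join puts back single spaces only.

```python
import os
import tempfile

def save_text(path, text):
    # write the string exactly as it is, with no pickle markers around it
    with open(path, "w") as out:
        out.write(text)

original = "first line\nsecond line\n"

# split(None) throws away every newline; " ".join restores single spaces only
joined = " ".join(original.split(None))

# hypothetical scratch file, used only for this demonstration
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
save_text(demo_path, joined)

with open(demo_path) as src:
    saved = src.read()
os.remove(demo_path)
```

The file now holds readable text, but both lines of `original` end up on one line.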

I would also recommend using the re module to search the file for the text. Or at least keep a copy of the original text before you split it, and then concatenate the 'sa001:' line with the original file content before writing it all out to the file.
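
The second suggestion (keep the original text and concatenate) might look something like this. It is only a sketch: `add_tag_if_missing` is a made-up helper name, and the case-insensitive check uses `lower()` rather than the re module.

```python
def add_tag_if_missing(text, tag="SA001:"):
    # case-insensitive check against the untouched original text
    if tag.lower() in text.lower():
        return text
    # put the tag on its own line in front; the original keeps all its newlines
    return tag + "\n" + text

untagged = "line one\nline two\n"
tagged = "intro\nsa001: already here\n"

result_untagged = add_tag_if_missing(untagged)
result_tagged = add_tag_if_missing(tagged)
```

Because the original string is never split, nothing about its layout changes apart from the new first line.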

edit: '\n' is the newline character.
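
As a small illustration of writing line breaks, the sketch below ends each line with "\n". The list contents and the scratch file are invented for the example. When a file is opened in text mode, "\n" is translated to the platform's own line ending on write, so there should be no need to write a carriage return by hand.

```python
import os
import tempfile

lines = ["SA001: This is a test", "rest of the document"]

# hypothetical scratch file, used only for this demonstration
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(demo_path, "w") as out:
    for line in lines:
        # "\n" marks the end of each line in the output file
        out.write(line + "\n")

with open(demo_path) as src:
    read_back = src.read().splitlines()
os.remove(demo_path)
```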


You can also read all of the file into memory, process it once to search for the string, and then process it a second time to write it to a file if necessary.

data = open(filename, "r").readlines()
found = 0
for rec in data:
   if ("sa001:" in rec) or ("SA001:" in rec):
      print "SA001 found"
      found = 1
## file only changes if string is not found
if not found:
   fp = open(filename+".2", "w")
   fp.write( "SA001:\n")          ## add this line
   for rec in data:                       ## original data
      fp.write(rec)
   fp.close()

With the re module the whole project would be as simple as this...

import re
mFilename = raw_input("File to scan? ")
mFile= open(mFilename).read()

print "Searching File..."
#the re.findall function
    #first parameter : string/regular expression/pattern to search for
    #second parameter: string to search in
    #third is optional: I passed in re.IGNORECASE to tell the module to ignore the case
if not re.findall('SA001:', mFile, re.IGNORECASE):
    #'SA001:' is not in the file
    print "Text not found..."
    sure = raw_input("Modify?(y or n): ")
    if sure == "y" or sure == "Y":
        #assumption: prepend the tag to the untouched original text and write it back
        open(mFilename, "w").write("SA001:\n" + mFile)
        print "File Modified..."
else:
    #'SA001:' is already in the file
    print "No Altering needed."