I have a script that takes a bunch of different .txt files and plugs what's in each file into a master "templates.txt" list. It basically replaces the [BRACKETED] tags in the master template list with entries from the other data files. Here's my code:

import re, sys

class Template:
        def __init__(self, text):
                self.text =  text
                self.count = 0
                self.tags = []
                self.sentences = set()
        def __str__(self):
                return "%s:%s:%s" % (self.count, ";".join(self.tags), self.text)

FOLDER = "/projects/Python/"

with open (FOLDER+"templates.txt") as myfile:
    templates = [Template(e.strip()) for e in myfile]

tagdict = {}
for (i, template) in enumerate(templates):
        tags = re.findall (r'\[[^\]]+\]', template.text)
        template.count = len(tags)
        template.tags = tags

        for tag in tags:
                tag = tag[1:-1]
                with open (FOLDER+tag+".txt") as tagfile:
                        tagdict[tag] = [e.strip() for e in tagfile]

for template in templates:
        lengths = []
        for tag in template.tags:
                l = len(tagdict[tag[1:-1]])
                if l > 0: lengths.append(l)
        mintag = min(lengths)

        for i in range(mintag):
                sentence = template.text
                for tag in template.tags:
                        sentence = sentence.replace(tag, tagdict[tag[1:-1]].pop(0))
                template.sentences.add(sentence)

        for sentence in sorted(template.sentences):
                print sentence

A line in templates.txt looks like this:
[NAME] lives on [STREET].

HOTEL.txt:
Best Western
Holiday Inn

RESTAURANT.txt:
Denny's
Applebee's
Black Angus

Right now, it would only give me two outputs:
Best Western and Denny's.
Holiday Inn and Applebee's.

because it only goes as far as the shortest list and stops there. How can I make it use the longest of the lists (in this case, RESTAURANT), and have HOTEL loop back to its start once it runs out? I tried using maxtag = max(lengths) instead of min(lengths), but it doesn't seem to work. Can anyone help?
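
A likely reason the max(lengths) attempt fails: the inner loop calls pop(0) on every tag's list once per pass, so the shorter list is already empty while the longer one still has entries, and popping from an empty list raises IndexError. A minimal sketch of that behaviour, using made-up stand-ins for the lists loaded from the HOTEL and RESTAURANT files:

```python
# Hypothetical stand-ins for the lists loaded from HOTEL.txt and RESTAURANT.txt.
hotels = ["Best Western", "Holiday Inn"]
restaurants = ["Denny's", "Applebee's", "Black Angus"]

produced = []
error = None
try:
    # Same pop(0) pattern as the question's loop, but driven by the longest list.
    for _ in range(max(len(hotels), len(restaurants))):
        produced.append("%s and %s." % (hotels.pop(0), restaurants.pop(0)))
except IndexError as exc:
    # The third pass finds `hotels` already empty.
    error = exc

print(produced)
print("stopped with: %s" % error)
```

Note also that pop(0) empties the shared lists in tagdict, so a second template using the same tag would find nothing left.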

The problem is more one of viewpoint, not coding. You want to get away from the word processing mentality. There is no such thing as lines in a document in a folder (unless the programmer first creates them). We have data (bytes) in two files that we can use or order in any way we want. You want to use the longest file as the primary file and couple that with the records in the shorter file, as far as they will go anyway. Eliminating all of the other code in your program, which is difficult to decipher because there is no explanation of anything anywhere, we could use something like the following.

## simulate reading a file with readlines()
hotel_txt_list = [ "Best Western\n",
                   "Holiday Inn" ]

res_txt_list = [ "Denny's\n",
                 "Applebee's\n",
                 "Black Angus" ]

## find the longest list and use it as the primary list
first_list = []
second_list = []
if len(hotel_txt_list) > len(res_txt_list):
    first_list = hotel_txt_list  ## this is a reference, not a copy
    second_list = res_txt_list
else:
    first_list = res_txt_list
    second_list = hotel_txt_list

##  loop through all of the records of the longest list and match
##  them to the records of the shorter list
stop_2 = len(second_list)
for ctr, rec in enumerate(first_list):
    if ctr < stop_2:   ## matching record found in 2nd list
        print "%-20s" % (second_list[ctr].strip()),
    else:              ## 2nd list is exhausted
        print "No matching record  ",
    print rec.strip()
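
If the goal is the one from the question, reusing the shorter list from its start rather than printing a placeholder, one option is to index each list modulo its own length. The sketch below is only an assumption about how that could look, not the original poster's code: `fill_template` and the sample data are hypothetical, and the tag pattern is the one from the question.

```python
import re

TAG_PATTERN = re.compile(r'\[[^\]]+\]')  # same pattern as in the question

def fill_template(text, tagdict):
    """Return one sentence per entry of the longest tag list.

    Shorter lists wrap around to their first entry (index modulo length).
    `tagdict` maps a tag name without brackets to its list of values.
    """
    tags = TAG_PATTERN.findall(text)
    lengths = [len(tagdict[tag[1:-1]]) for tag in tags]
    if not lengths or min(lengths) == 0:
        return []  # no tags, or a tag with no values: nothing to fill in
    sentences = []
    for i in range(max(lengths)):
        sentence = text
        for tag in tags:
            values = tagdict[tag[1:-1]]
            sentence = sentence.replace(tag, values[i % len(values)])
        sentences.append(sentence)
    return sentences

# Hypothetical sample data standing in for HOTEL.txt and RESTAURANT.txt.
sample = {
    "HOTEL": ["Best Western", "Holiday Inn"],
    "RESTAURANT": ["Denny's", "Applebee's", "Black Angus"],
}
for line in fill_template("[HOTEL] and [RESTAURANT].", sample):
    print(line)
```

Because nothing is popped, the lists in the dictionary stay intact, so several templates can share the same tag file. Returning an empty list when a tag has no values is a choice made for this sketch; raising an error would be just as reasonable.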