Comparing somewhat irregular data:

Question

dnamgyel

15 Years Ago

Input1:
[འབྲུག་གི་རང་ལུགས་འཆམ་།] [དུས་རབས་བརྒྱད་པའི་ནང་] [གཏེར་འཆམ་དང་] [པད་གླིང་གིང་གསུམ་] [ལ་སོགས་པའི་འཆམ་གྱི་རིགས་ཚུ་ད་ལྟོ་བར་ན་ཡང་] [ཡོངས་གྲགས་སྦེ་] [རྐྱབ་སྲོལ་ཡོདཔ་ཨིན་།] ->actual file (requiring special font support so I have modified the Inputs)
[AB'C'DEF'GH'I'] [JKL'MN'O'|] and so on ..........[ ......|]

Input2:
འབྲུག་ གི་ རང་ལུགས་ འཆམ་ ། གཏེར་འཆམ་ དང་ པད་གླིང་གིང་ གསུམ་ ལ་སོགས་ པའི་ འཆམ་ གྱི་ རིགས་ ཚུ་ ད་ལྟོ་ བར་ན་ ཡང་ ཡོངས་གྲགས་ སྦེ་ རྐྱབ་སྲོལ་ ཡོདཔ་ཨིན་ ། ->actual file

AB'C' DEF' GH'I'| JKL' MN'O'| ...............and so on
I have two inputs in the form of lists that are somewhat irregular. The first input is a list of bracketed phrases. Its content is the same as the second input's, except that the word boundaries are missing. To get output in the same bracketed format as Input1 but with the word boundaries from Input2, I need to compare the two inputs. I tried doing this:

next_count=0
for p in item1: #list item from Input1
   for q in p:
     t=q.count("'")  # count the ' marks (the original quoting was a syntax error)
     fout.write(str(t))
     for k in item2:#list item from Input2
       next_count=next_count+t
       fout.write('[')
       for v in k:
           for v in range(0, next_count):
               text1_in.append(v)
               fout.write(text1_in)
               fout.write(']')

My idea was to count the syllables by counting the occurrences of ' in each item of item1 (i.e., inside each [ ] phrase) and then use that count to count off the same number of ' in the second file. For example, the first phrase [AB'C'DEF'GH'I'] has five ', so while looping through Input2 I want to count off the same number of ' and place the brackets there to get [AB'C' DEF' GH'I'|]. For the second phrase, [JKL'MN'O'|], the count for Input2 must be cumulative (the previous count plus the new count), since Input2 is always counted from the beginning of the file.
The result should then be [AB'C' DEF' GH'I'|] [JKL' MN'O'|]. Since I have only just started using Python, my code is obviously wrong. What am I doing wrong? Please give me some hints on how to approach this problem. This is the only way I could come up with to compare these two files.

At least give me some hints.

Thanks.
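
The counting idea described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a tested solution for the real files: the names regroup and parse_phrases are hypothetical, ' stands in for the syllable mark, Input2 is assumed to be split on whitespace, and a token with no ' at all (such as a standalone |) is assumed to belong to the phrase before it. Instead of recounting Input2 from the start for every phrase, the sketch keeps a running position, which has the same effect as the cumulative count.

```python
import re


def parse_phrases(text):
    """Return the contents of every [ ... ] group in Input1 (hypothetical helper)."""
    return re.findall(r"\[([^\]]*)\]", text)


def regroup(phrases, words, marker="'"):
    """Group the Input2 words so they follow the phrase boundaries of Input1.

    phrases: phrase strings from Input1, without brackets
    words:   whitespace-separated tokens from Input2
    marker:  the syllable mark that is counted (assumed to be ')
    """
    out = []
    pos = 0  # running position in words, shared by all phrases
    for phrase in phrases:
        target = phrase.count(marker)  # syllable marks in this phrase
        seen = 0
        taken = []
        # Take words until the same number of marks has been seen.
        while pos < len(words) and seen < target:
            seen += words[pos].count(marker)
            taken.append(words[pos])
            pos += 1
        # Assumption: mark-free tokens (e.g. a lone "|") close the phrase.
        while pos < len(words) and words[pos].count(marker) == 0:
            taken.append(words[pos])
            pos += 1
        if seen != target:
            # A word straddles the phrase boundary, or Input2 ran out.
            raise ValueError("inputs do not line up at phrase: " + phrase)
        out.append("[" + " ".join(taken) + "]")
    return " ".join(out)


input1 = "[AB'C'DEF'GH'I'] [JKL'MN'O'|]"
input2 = "AB'C' DEF' GH'I'| JKL' MN'O'|"
result = regroup(parse_phrases(input1), input2.split())
print(result)
```

With the modified sample inputs this prints [AB'C' DEF' GH'I'|] [JKL' MN'O'|]. For the actual files the marker would presumably be the Tibetan tsheg (U+0F0B) passed as marker="\u0f0b", but that is an assumption about the data. The ValueError check is there so that two inputs whose text does not match fail loudly instead of producing misaligned brackets.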

python



dnamgyel · Answer 1 · 2010-02-02T18:53:00+00:00

Sorry...here I go again.
