String manipulation

Question

jmark13 0 Newbie Poster

12 Years Ago

Hello again, So, I'm trying to work out a problem where there are 6 symbols in a string (A, B, 0, 1, [, ]). The brackets always have to face each other in the string but they can be nested. There are some rules, most of which I have down, but when I get to the part that requires the brackets, I get confused as to how python might handle them efficiently. The main rules are that 'A' cannot be adjacent to '0' or another 'A', 'B' cannot be adjacent to '1' or another 'B' and this is true through the brackets as well. 'A' is "stronger" than '0' and 'B' is "stronger" than '1'... So, a string like 'AA001B001A' becomes 'AB001A' This is easily achieved with the replace method. However, it gets more complicated when the brackets are introduced and this is where I need help.

Also, the symbols inside the brackets are treated like rotating stacks, so, if A[001], then A[100], and if B[A[100]B0] then B[A[100]B] as the 'A' is considered adjacent to the '0'

so how can str='A[A[001]BB1A10]11BA10' become str=A[1A[100]BA]BA10 ?? or rather, how can any string in the first form become a well formed string given those rules? It seems a bit complicated for just the replace method.

Here is the code I have so far:

str='A[A[001]BB1A10]11BA10'

while '0A' in str:
    str=str.replace('0A', 'A')

while 'A0' in str:
    str=str.replace('A0', 'A')

while '1B' in str:
    str=str.replace('1B', 'B')

while 'B1' in str:
    str=str.replace('B1', 'B')

while 'BB' in str:
    str=str.replace('BB', 'B')

while 'AA' in str:
    str=str.replace('AA', 'A')

TIA!

-jmi

python

4 Contributors
13 Replies
137 Views
1 Day Discussion Span
Latest Post 12 Years Ago Latest Post by TrustyTony

All 13 Replies

Gribouillis 1,391 Programming Explorer

12 Years Ago

To start with, here is a function wich transforms your input data into a python list, which should be easier to handle than a string

# stringgame.py

import re
word_re = re.compile(r"[AB01]+")


def sub_func(match):
    """helper for parse()"""
    return "'%s'," % match.group(0)

def parse(data):
    """transform an input string into a python list"""
    x = word_re.sub(sub_func, data)
    x = x.replace("]", "],")
    return eval("[%s]" % x)

if __name__ == "__main__":
    test_data = 'A[A[001]BB1A10]11BA10'
    the_list = parse(test_data)
    print the_list
    
""" my output -->
['A', ['A', ['001'], 'BB1A10'], '11BA10']
"""

Edited 12 Years Ago by Gribouillis because: n/a

TrustyTony 888 pyMod

12 Years Ago

Grib has fine idea, but you should check that first value does not inadvertedly become neighbour of last value (maybe it is allready, I do not catch your rules so well)

Here more incomplete parsing with itertools.groupby, leaving list one level:

import itertools
def is_bracket(c):
     return c in '[]'

s = 'A[A[001]BB1A10]11BA10'
processed = [''.join(gr) for c, gr in itertools.groupby(s, is_bracket)]
print('%s -> %s' % (s, processed))

Edited 12 Years Ago by TrustyTony because: n/a

woooee 814 Nearly a Posting Maven

12 Years Ago

and if B[A[100]B0] then B[A[100]B] as the 'A' is considered adjacent to the '0'

That doesn't make sense to me at least. How is the final "0" adjacent to "A" even when taking the brackets into consideration.

so how can str='A[A[001]BB1A10]11BA10' become str=A[1A[100]BA]BA10

Where does the first "1" come from in the second string? Just to clear things up, if you have "A[0]" does that convert into just "A" or remain the same because of the brackets. And if the brackets don't matter, with "A[100]" do you test for "A0" before or after the rotation inside the brackets. Finding the previous and next letters, and testing for combinations, is fairly straight forward even when taking the brackets into consideration, but one has to be sure of the rules before that can be done.

Edited 12 Years Ago by woooee because: n/a

TrustyTony 888 pyMod

12 Years Ago

For me by hand and considering rotation and effect through brackets the example should transform like this:

str='A[A[001]BB1A10]11BA10'
-> 'A[A[]BA10]1BA10'
-> 'A[0A[]BA1]1BA10'
-> 'A[A[]BA1]1BA10'
-> 'A[[]BA1]1BA10'
-> 'A[A1[]B]1BA10'
-> 'A[1[]B]1BA10'
-> 'A[1[]B]BA10'

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 1 · 2011-06-26T03:14:13+00:00

For me by hand and considering rotation and effect through brackets the example should transform like this:
str='A[A[001]BB1A10]11BA10'
-> 'A[A[]BA10]1BA10'
-> 'A[0A[]BA1]1BA10'
-> 'A[A[]BA1]1BA10'
-> 'A[[]BA1]1BA10'
-> 'A[A1[]B]1BA10'
-> 'A[1[]B]1BA10'
-> 'A[1[]B]BA10'

Nothing in the rules indicates that [001] becomes []. I think the first phase is to replace the pattern "0*(A0*)+" by "A" and similarly, "1*(B1*)+" by "B". This can be done before anything else.

TrustyTony 888 pyMod Team Colleague Featured Poster · Answer 2 · 2011-06-26T03:22:38+00:00

A takes out A & 0 also through brackets, B takes out B & 1 also through brackets, things inside bracket can be rotated -> A[anything]B becomes A[]B, so I understood.

Of course from last form B]B would stil become B] or ]B, by these rules or through rotation B takes out first 1.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 3 · 2011-06-26T03:27:41+00:00

A takes out A & 0 also through brackets, B takes out B & 1 also through brackets, things inside bracket can be rotated -> A[anything]B becomes A[]B, so I understood.
Of course from last form B]B would stil become B] or ]B, by these rules or through rotation B takes out first 1.

The OP should explain the rules more precisely. I'm not sure that you apply the correct rules since the OP says that 'A[A[001]BB1A10]11BA10' becomes 'A[1A[100]BA]BA10'.

By the way, in mathematics, a set of symbols with such kind of rules is called a formal system. Formal systems play a key role in logic.

woooee 814 Nearly a Posting Maven · Answer 4 · 2011-06-26T03:40:24+00:00

For me by hand and considering rotation and effect through brackets the example should transform like this:
str='A[A[001]BB1A10]11BA10'
-> 'A[A[]BA10]1BA10'
-> 'A[0A[]BA1]1BA10'
-> 'A[A[]BA1]1BA10'
-> 'A[[]BA1]1BA10'
-> 'A[A1[]B]1BA10'
-> 'A[1[]B]1BA10'
-> 'A[1[]B]BA10'

You could further reduce the last line, depending on the bracket rules, as the first "1" is contiguous to "B", ignoring the brackets, so you get
A[[]BA10]BA10
It also depends on when you rotate in the process as
A[001]B
could remain as is if rotated first to A[100]B, or changed completely if the removals are done first. It's all muddy to my feeble brain, so if anyone can post a clear set of rules and their heirarchy it would be helpful.

jmark13 0 Newbie Poster · Answer 5 · 2011-06-26T07:39:04+00:00

Thanks for the great interest in helping me here. To clarify, the rules:

1. 1 cannot be adjacent to B
2. 0 cannot be adjacent to A
3. A cannot be adjacent to another A
4. B cannot be adjacent to another B
5. inside the Brackets represent a repeating pattern, so the ends ARE adjacent to each other (to answer wooee) e.g. A[010]=A[100], B[011]0B=B[110]B, but I'm not sure now how or when to make the decision to rotate form the string. I think the goal is the shortest possible string with these rules. So perhaps when no length changes, as is the case with A[010]=A[100], the rule is unnecessary. I think it's good to know the choice is there in case it is helpful like in B[011]0B=B[110]B, but how does the program know which string is shortest without checking every rotation inside the brackets? Also, the answer is inherently non-deterministic, since it's a context sensitive grammar, so don't worry if there are two or more right answers of the same shortest length. Again, the goal is the shortest version of the string.

My goal *IS* a formalization of this procedure, hopefully with a tractable algorithm, so sorry in advance that I can't share it due to it's lack of existence. I can try to explain it informally as best as I can. Please continue to ask questions if you still have them for me.

A[0] DOES convert to just A, and actually [0] also converts to A. Likewise, [1] converts to B; [A] converts to A, and converts to B.

Unfortunately all the codes presented until now have wrong outputs. Here is the procedure of the test string by hand:

str-> A[A[001]BB1A10]11BA10

A[A[001]BB1A10]11BA10 (check for "0A", no change)

A[A[001]BB1A10]BA10 (check for "1B")

A[A[001]BB1A10]BA10 (check for "A0", no change)
A[A[001]BBA10]BA10 (check for "B1")
A[A[001]BBA10]BA10 (check for "AA", no change)
A[A[001]BA10]BA10 (check for "BB")

now proceed with brackets:

A[A00[100]BA10]BA10 (with the inner most bracket, [001]=00[100])
A[A[100]BA10]BA10 (check for adjacencies)

Ok, at this point I'm noticing something interesting because there are several choices that will lead to several forms, all of which are "acceptable." There is also a small error in my final test string.

first notice BA10 repeats, thus we could just lopp off the end to get:
A[A[100]BA10]

I'm also just now noticing that we must keep the 'A[A' because you cant rotate over a bracket substring. Thus, my example has an error and the final string in the example should read
"A[A[100]BA10]"

I know, this is hard. Please forgive me, I'm a musician and this is my hobby!

Also note: brackets should never be incomplete or empty, 'A[[]BA10]BA10' is NOT a well formed word in this grammar.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 6 · 2011-06-26T10:47:03+00:00

Thanks for the great interest in helping me here. To clarify, the rules:
1. 1 cannot be adjacent to B
2. 0 cannot be adjacent to A
3. A cannot be adjacent to another A
4. B cannot be adjacent to another B
5. inside the Brackets represent a repeating pattern, so the ends ARE adjacent to each other (to answer wooee) e.g. A[010]=A[100], B[011]0B=B[110]B, but I'm not sure now how or when to make the decision to rotate form the string. I think the goal is the shortest possible string with these rules. So perhaps when no length changes, as is the case with A[010]=A[100], the rule is unnecessary. I think it's good to know the choice is there in case it is helpful like in B[011]0B=B[110]B, but how does the program know which string is shortest without checking every rotation inside the brackets? Also, the answer is inherently non-deterministic, since it's a context sensitive grammar, so don't worry if there are two or more right answers of the same shortest length. Again, the goal is the shortest version of the string.
My goal *IS* a formalization of this procedure, hopefully with a tractable algorithm, so sorry in advance that I can't share it due to it's lack of existence. I can try to explain it informally as best as I can. Please continue to ask questions if you still have them for me.
A[0] DOES convert to just A, and actually [0] also converts to A. Likewise, [1] converts to B; [A] converts to A, and converts to B.
Unfortunately all the codes presented until now have wrong outputs. Here is the procedure of the test string by hand:
str-> A[A[001]BB1A10]11BA10
A[A[001]BB1A10]11BA10 (check for "0A", no change)
A[A[001]BB1A10]BA10 (check for "1B")
A[A[001]BB1A10]BA10 (check for "A0", no change)
A[A[001]BBA10]BA10 (check for "B1")
A[A[001]BBA10]BA10 (check for "AA", no change)
A[A[001]BA10]BA10 (check for "BB")
now proceed with brackets:
A[A00[100]BA10]BA10 (with the inner most bracket, [001]=00[100])
A[A[100]BA10]BA10 (check for adjacencies)
Ok, at this point I'm noticing something interesting because there are several choices that will lead to several forms, all of which are "acceptable." There is also a small error in my final test string.
first notice BA10 repeats, thus we could just lopp off the end to get:
A[A[100]BA10]
I'm also just now noticing that we must keep the 'A[A' because you cant rotate over a bracket substring. Thus, my example has an error and the final string in the example should read
"A[A[100]BA10]"
I know, this is hard. Please forgive me, I'm a musician and this is my hobby!
Also note: brackets should never be incomplete or empty, 'A[[]BA10]BA10' is NOT a well formed word in this grammar.

I understand rules 1,2,3,4 where there is no mention of the brackets, but rule 5 is very unclear

"Inside the brackets represent a repeating pattern." <--- what does that mean ?
"B[011]0B=B[110]B" <-- why ?
"A[010]=A[100]" <-- is it also equal to A[001] ? Can you make any rotation inside the bracket ?
"A[A00[100]BA10]BA10 (with the inner most bracket, [001]=00[100])" <-- why ? I don't understand this.
"A[0] DOES convert to just A" <-- This is not implied by the previous rules, is it a new rule ? What are all the rules ?
"[0] also converts to A" <-- same thing

jmark13 0 Newbie Poster · Answer 7 · 2011-06-27T01:30:47+00:00

"Inside the brackets represent a repeating pattern." <--- what does that mean ?

Think of a rotating stack...

"A[010]=A[100]" <-- is it also equal to A[001] ?

Yes

Can you make any rotation inside the bracket ?

I'm not sure, truthfully. I don't think so... but maybe. There seem to be restrictions in special circumstances, and everytime you DO make a rotation, you have to pop some values, e.g. [001]!=[100], [001]==00[100]1. It is only when adjacent to A or B when the rotation becomes fluid... e.g. A[001]B==A[100]B.

What are all the rules ?

Well, I have been working on a paper, for a very long time now, that has been reworked many many times and is now, hopefully entering a final draft stage that formalizes the model that will answer the "why" questions you have. If I told you what it was, you'd probably laugh at me, so I will refrain until it is complete. I may post a draft on arXiv.org sometime in the next two or three weeks, but only if the proof I'm offering is organized well enough where I'm confident others will take it seriously. It sounds dumb, but ALL the rules are in my mind, I have no one to talk to about this stuff, so I'm completely self taught outside of forums such as this and reddit, which means, even if I could tell someone who understands this stuff the proof VERBALLY with a pen and paper, I have a difficult time writing it down so it LOOKS NEAT via LaTeX. I live in Massachusetts with colleges abound, I've tried to meet with about 6 different professors, (none of whom give the time of day to 'independent researchers' who are getting their degree in a different field.) I was only able to get one meeting with a friend of a friend who happened to be a model theorist that recently graduated. My ideas were in 'pre-proof' hypothesis stage then and he DID help me model my work through that meeting, he moved to Hawaii, though and that was about 2 years ago now. I do now, seem to have a method for a solid proof of at least ONE important math problem. It's just that I ALSO want a method using this model for a different math problem... so close... can taste...

Math, is not something we do because we love it or like it, or even want to make money, it is more of an unhealthy obsession like smoking, drugs or sex addiction.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 8 · 2011-06-27T02:33:37+00:00

Math, is not something we do because we love it or like it, or even want to make money, it is more of an unhealthy obsession like smoking, drugs or sex addiction.

I woudn't agree about math being unhealthy. Math is a part of science. Science is about discovering and studying the laws of nature. Math is addictive like any other exciting activity.

You can't automate a system with a program if you can't describe precisely the rules, so we'll wait for your set of rules.

If you're having a difficult time writing down your ideas with latex, I suggest that you turn to texmacs which is easy, wysiwyg, and produces the same quality. See http://texmacs.org/ .

TrustyTony 888 pyMod Team Colleague Featured Poster · Answer 9 · 2011-06-27T02:39:56+00:00

Maybe then you could do worse than confide to mathematician Dr Gribouillis, if he is interested and you are in too unsure condition to formulate clear rules. If you give enough details he/we surely could help you to formulate formal grammar rules. Otherwise we are too much in dark how the duplication happens during rotate and not knowing the purpose of all does not also help things.

You would need to give us massive amount of example cases to catch what this rotation with duplication out of brackets works, to understand for example why you duplicate (the number is not reduced inside the brackets) 1 one to right and 2 0s to left, while shifting 1 two places to left inside the brackets. Looks somehow that you are poping value to other side brackets and duplicating and pushing it in to do the rotation....

[001} -> [00]1 -> [100]1

But you have those two zeroes also.

String manipulation

Recommended Answers Collapse Answers

All 13 Replies

Recommended Answers