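I have a file named test.txt. I read it and print its contents like this:

``````
file = open("test.txt", "r")
obj = file.read()
file.close()
print(obj)
``````

The contents (obj) look like this:

``````
a=a

b=b

c=c

d=e
e=d

e=f
f=e

f=g
g=h
``````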

All I want to do with this obj is to build a regular expression such that:

1. If the left side matches the right side, the line should collapse to a single symbol, i.e., a=a should become a.

2. d=e and e=d mean the same thing, so one of them must be removed. The same goes for e=f and f=e.

3. Notice the newlines. Some lines are followed by \n, some by \n\n and some by \n\n\n. Each run should be reduced to a single \n.

The output should be

``````
a
b
c
d=e
e=f
f=g
g=h
``````

## All 4 Replies

Let's start easy:

``````
lines = []
with open('test.txt', 'r') as f:
    for x in f:
        if x.strip():  # skip empty lines
            lines.append(x.strip())
for line in lines:
    print(line)
``````

This just eliminates the blank lines, then prints out the remainder. Of course you will want to do some more work. You will probably want to do something like `lhs,rhs = line.split('=')` at some point.
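To make the `split('=')` idea a bit more concrete, here is a rough sketch of how the three conditions could be handled without a regular expression. The function name `simplify_lines` and the `seen` set are my own, not something from the thread. It assumes Python 3 and that every non-empty line contains exactly one `=`. Note that it drops a mirrored pair wherever it appears in the file, not only when the two lines are adjacent.

```python
def simplify_lines(lines):
    """Apply the three conditions from the question to an iterable of lines."""
    seen = set()    # unordered pairs that were already emitted
    result = []
    for raw in lines:
        line = raw.strip()
        if not line:            # condition 3: blank lines disappear
            continue
        lhs, rhs = line.split("=")
        key = frozenset((lhs, rhs))
        if key in seen:         # condition 2: d=e and e=d count as the same
            continue
        seen.add(key)
        # condition 1: a=a collapses to a
        result.append(lhs if lhs == rhs else line)
    return result


sample = "a=a\n\nb=b\n\nc=c\n\nd=e\ne=d\n\ne=f\nf=e\n\nf=g\ng=h\n"
print("\n".join(simplify_lines(sample.splitlines())))
```

With a real file, you would pass the open file object instead of `sample.splitlines()`.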

Thank you very much. That was very helpful. How am I going to achieve the 1st and 2nd conditions? I am still trying, but I couldn't figure out a regular expression...

You can write pseudo code to build the regular expression. You want to match this:

``````
pattern:
    either:
        symbol1 equal symbol2
        newline
        symbol2 equal symbol1
    or:
        symbol3 equal symbol3
    or:
        symbol4 equal symbol5
    newlines (0 or more)
``````

Each of these elements has an equivalent regex pattern:

``````
symbol1          ->  (?P<symbol1>[a-z])
symbol2          ->  (?P<symbol2>[a-z])
repeated symbol1 ->  (?P=symbol1)
repeated symbol2 ->  (?P=symbol2)
equal            ->  [=]
newline          ->  \n
zero or more     ->  *
``````

This should give you hints to build the regular expression.
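For anyone reading later, here is one way the hints above might be put together. Treat it as a sketch rather than the intended answer. The names `PATTERN`, `_rewrite` and `simplify` are made up for illustration. It assumes Python 3, single lowercase-letter symbols (the `[a-z]` from the hints), and that mirrored lines such as d=e and e=d sit next to each other. The "same symbol on both sides" alternative is tried first, so that a line like a=a is never mistaken for half of a mirrored pair.

```python
import re

PATTERN = re.compile(
    # symbol3 equal symbol3, e.g. a=a
    r"(?P<symbol3>[a-z])=(?P=symbol3)\n*"
    # symbol1 equal symbol2, newline(s), symbol2 equal symbol1, e.g. d=e / e=d
    r"|(?P<symbol1>[a-z])=(?P<symbol2>[a-z])\n+(?P=symbol2)=(?P=symbol1)\n*"
    # symbol4 equal symbol5, any other line
    r"|(?P<symbol4>[a-z])=(?P<symbol5>[a-z])\n*"
)


def _rewrite(match):
    """Build the replacement text for whichever alternative matched."""
    if match.group("symbol3"):
        return match.group("symbol3") + "\n"
    if match.group("symbol1"):
        return "{}={}\n".format(match.group("symbol1"), match.group("symbol2"))
    return "{}={}\n".format(match.group("symbol4"), match.group("symbol5"))


def simplify(text):
    """Collapse a=a, drop mirrored pairs and squeeze runs of newlines."""
    return PATTERN.sub(_rewrite, text)


sample = "a=a\n\nb=b\n\nc=c\n\nd=e\ne=d\n\ne=f\nf=e\n\nf=g\ng=h"
print(simplify(sample), end="")
```

Every match is replaced by a single line ending in one `\n`, which takes care of the newline condition at the same time. Text that does not match any alternative is left untouched by `re.sub`.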

commented: Thank you very much!! I think I am nearing the answer. +0

Good advice. It is also good that nobody gave a ready-made solution, as the OP must solve the problem with a regular expression (RE).

So I am free to post two of my non-RE solutions:

``````
# Solution 1: compare each pair with the previous one, keeping file order
with open("test.txt", "r") as datasource:
    c, d = '', ''  # previous pair
    for ab in datasource:
        if '=' in ab:
            a, b = ab.rstrip().split('=')
            if a == b:
                print(a)
            elif (a, b) != (d, c):  # skip the mirror of the previous line
                print('='.join((a, b)))
            c, d = a, b

print(60 * '-')
# Solution 2: sort each pair so d=e and e=d become identical, then use a set
with open("test.txt", "r") as source:
    datasource = (sorted(d.rstrip().split('='))
                  for d in source if '=' in d)
    print('\n'.join(sorted(set(a if a == b else a + '=' + b
                               for a, b in datasource))))
``````
commented: Thank you very much. That solved the problem. +0