Comparing domain names in two text files and printing them to a third one

Question

srinu_1 0 Newbie Poster

11 Years Ago

hi,

please help me to solve the error for:

text1.txt:
line1 <data>
line2 <items>
line3 <match name="item1" rhs="domain.com"></match>
line4 <match name="item2" rhs="domainn.com"></match>
line5 <match name="item2" rhs="1010data.com"></match>
line6 </items>
line7 </data>

text2.txt:
line1 djshjsdf
line2 sdfngjfg

Check whether domain.com, domainn.com and 1010data.com appear in text2.txt; if they do not, write them to a third text file (text3.txt).

import re
with open('C:\\Users\\Desktop\\m\\test1.txt', 'r') as f_in:
    with open('C:\\Users\\Desktop\\m\\test_compare.txt', 'r') as f_compare:
        with open('C:\\Users\\Desktop\\m\\result.txt', 'w') as f_out:
            d1 = f_in.read()
            d2 = f_compare.read()
            for match in re.finditer(r'rhs="(.*)"', d1):
                    if match not in d2:                         
                            f_out.write('{}\n'.format(match.group(1)))

Running the above code throws an error:

Traceback (most recent call last):
  File "C:\Python27\comparetwofiles\src\compare\notepadtest.py", line 11, in <module>
    if match not in d2:                         
TypeError: 'in <string>' requires string as left operand, not _sre.SRE_Match

Can anyone help me fix the above error?

python regex

Edited 11 Years Ago by srinu_1 because: code changed


All 2 Replies

rrashkin 41 Junior Poster in Training

11 Years Ago

I'm not sure how you're supposed to use re.finditer, but not this way. It yields match objects, not strings, so a match cannot be tested with "in" against the string d2. I suggest you use findall instead. If I do this:

lst1 = re.findall(r'rhs="(.*)"', d1)

I get this:

['domain.com', 'domainn.com', '1010data.com']
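To make the difference concrete, here is a small sketch that runs both functions on sample text modelled on text1.txt and text2.txt from the question. The in-memory strings and the variable names are only illustrative, and the final check is a plain substring test against the whole second text, as in the original code.

```python
import re

# Sample data modelled on text1.txt / text2.txt from the question (assumption:
# the real files look like this).
d1 = '''<match name="item1" rhs="domain.com"></match>
<match name="item2" rhs="domainn.com"></match>
<match name="item2" rhs="1010data.com"></match>'''
d2 = 'djshjsdf\nsdfngjfg\n'

# finditer yields match objects; call .group(1) to get the captured string.
from_finditer = [m.group(1) for m in re.finditer(r'rhs="(.*)"', d1)]

# findall returns the captured strings directly.
from_findall = re.findall(r'rhs="(.*)"', d1)

# Substring test: keep every domain that does not occur anywhere in d2.
missing = [domain for domain in from_findall if domain not in d2]
print(missing)
```

So the smallest change to the original loop would be to test match.group(1) rather than match itself.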


snippsat 661 Master Poster · Answer 1 · 2013-12-31T11:20:04+00:00

This was your first post:
http://www.daniweb.com/software-development/python/threads/470301/problem-in-copying-domain-name-from-one-notepad-to-another-using-regex
You have posted more info about the task, but it is still not as clear as it should be.
You can try this:

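import re

# Open both input files and the output file in a single with statement.
with open('1.txt') as f1, open('2.txt') as f2, open('result.txt', 'w') as f_out:
    # Collect the rhs="..." values from each file as lists of strings.
    domains1 = re.findall(r'rhs="(.*)"', f1.read())
    domains2 = re.findall(r'rhs="(.*)"', f2.read())
    # Keep the values that occur in the first file but not in the second.
    diff = [i for i in domains1 if i not in domains2]
    #print diff
    f_out.write(','.join(diff))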
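Note that the snippet above assumes both files contain rhs="..." attributes. If the second file is plain text with one entry per line, which is only a guess based on the sample text2.txt in the question, a line-based lookup may be closer to what was asked. A minimal sketch, where the helper name missing_domains and the in-memory sample strings are hypothetical:

```python
import re

def missing_domains(xml_text, plain_text):
    """Return rhs values from xml_text that are not a line of plain_text."""
    # [^"]* stops at the first closing quote, so it also copes with
    # attributes that follow rhs on the same line.
    domains = re.findall(r'rhs="([^"]*)"', xml_text)
    # Assumption: plain_text holds one entry per line; compare whole lines.
    known = set(line.strip() for line in plain_text.splitlines())
    return [d for d in domains if d not in known]

# Illustrative sample data, not the asker's real files.
text1 = '''<match name="item1" rhs="domain.com"></match>
<match name="item2" rhs="domainn.com"></match>
<match name="item2" rhs="1010data.com"></match>'''
text2 = 'djshjsdf\n1010data.com\n'

result = missing_domains(text1, text2)
print('\n'.join(result))
```

Writing '\n'.join(result) to the output file instead of printing it would give one domain per line, as the code in the question did.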
