snippsat 661 Master Poster

Look at this post

EDWIN_4 commented: thanks +0
snippsat 661 Master Poster

either

I guess ungalcrys left Python a long time ago.
This was his only post, over 4 years ago; look at the dates.

snippsat 661 Master Poster

If you really wanted it in list format you could do this to it:

Yes, I agree if that's really what BingityBongity wants,
but a plain list with no connection between values does not seem right.
Just use dict.items(); then key and value from the dict are together in a tuple.

>>> d = {'this': 3, 'that': 2, 'and': 1}
>>> d.items()
[('this', 3), ('and', 1), ('that', 2)]

Just using the dict from collections.Counter is the best way.
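
A minimal sketch of the Counter approach, written for Python 3 and assuming the goal is to count words in a text (the sample text is made up for illustration):

```python
from collections import Counter

# Hypothetical sample text, not from the original thread.
text = 'this and that this that this'
counts = Counter(text.split())

# Counter is a dict subclass, so items() still pairs key and value.
pairs = sorted(counts.items())
print(pairs)
# most_common() orders the pairs from highest to lowest count.
print(counts.most_common(1))
```

Because key and count stay together, there is no need to flatten the result into a plain list.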

snippsat 661 Master Poster

Nice, but it shortens the list if the length is not a multiple of 3.

Yes, that's right; another solution is to use map().

>>> mylist = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> mylist_1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> map(None, *[iter(mylist)]*3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)]
>>> map(None, *[iter(mylist_1)]*3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, None)]

So here None is used as the fill value.
If this is not desirable, then the other solutions in this post are fine.
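
Note that map(None, ...) only works in Python 2. A rough Python 3 equivalent, assuming the same padding behavior is wanted, is itertools.zip_longest:

```python
from itertools import zip_longest

mylist_1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Same trick as above: three references to one shared iterator.
groups = list(zip_longest(*[iter(mylist_1)] * 3, fillvalue=None))
print(groups)
```

The fillvalue argument can be changed if None is not a good placeholder for the data.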

snippsat 661 Master Poster
>>> zip(*[iter(mylist)]*3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)]
snippsat 661 Master Poster

Setuptools can be found here.
The correct addition to the Path environment variable is ;C:\python33\;C:\python33\scripts\;
Check that it is correct by looking at Path.

snippsat 661 Master Poster

Use PyUserInput
Install Pywin32 and Pyhook
Install pip
Then do pip install PyUserInput

from pykeyboard import PyKeyboard

k = PyKeyboard()    
# pressing a key
k.press_key('H')
# which you then follow with a release of the key
k.release_key('H')
Gribouillis commented: good idea +14
entropicII commented: This helpful so far but I have no idea how to use pip which probably sounds pretty stupid so please can you just tell me how to use pip +0
snippsat 661 Master Poster
In [34]:    
codeList = '''\
<ScanCode>
    <Code>Superman 12345</Code>
    <User>Clark kent</User>
    <DateTime></DateTime>
    <Comment/>
    <LineNumber></LineNumber>
    <SeqNumber></SeqNumber>
</ScanCode>'''

In [35]:    
import xml.etree.ElementTree as ET

In [37]:    
xml = ET.fromstring(codeList)

In [41]:    
xml.find('User').text
Out[41]:
'Clark kent'

In [42]:    
xml.find('Code').text
Out[42]:
'Superman 12345'
snippsat 661 Master Poster

the print codelist prints out the cancodes/scancode/code section to show that it has found something.

Is it in "xml" format?
Post a sample.

If I iterate over codeList (for el in codeList:)
like you do, I cannot use find() anymore.

In [25]:

for el in xml:
    print type(el)
    print el.attrib['key']
    print el.text        

<class 'xml.etree.ElementTree.Element'>
value
text
snippsat 661 Master Poster

Why a lambda expression?
It can make code less readable.

All you should need to do is import string and apply in to your string's first character, and string.digits, and you should be able to use that as the lambda expression.

I agree, but there is no need to import string.
All string methods are always available.
import string is only needed to make the alphabet.

Just a test, since the topic is lambda expressions.

>>> print((lambda x: x[0].isdigit())('1hello'))
True
>>> print((lambda x: x[0].isdigit())('hello'))
False

>>> print((lambda x: x.startswith(('1','2','3')))('2hello'))
True
>>> print((lambda x: x.startswith(('1','2','3')))('3hello'))
True
>>> print((lambda x: x.startswith(('1','2','3')))('4hello'))
False

>>> import re
>>> print((lambda x: re.match(r'\d+', x))('987hello'))
<_sre.SRE_Match object at 0x02B01D40>
>>> print((lambda x: re.match(r'\d+', x))('hello'))
None

chophouse, first make an ordinary function, because that is easier.
Then if that does not do it for you, look into lambda expressions or functools.partial.
Or the more pythonic way, as PyTony suggests.

Just one more, here with a generator (yield),
which makes the code memory efficient (only 1 line in memory).

def foo():
    with open('start_digit.txt') as f:
        for line in f:
            if line[0].isdigit():
                yield line.strip()

for line in foo():
    print line
Gribouillis commented: +1 for function foo() +14
snippsat 661 Master Poster
codeList = root.findall(".//ScanCodes/ScanCode/Code")
print codeList

What does "print codeList" output?
So here as a demo I use find() to find "value" and "text".

In [1]:
xml = '''\
    <foo>
       <bar key="value">text</bar>
    </foo>'''

In [2]:    
import xml.etree.ElementTree as ET

In [3]:    
xml = ET.fromstring(xml)

In [4]:    
xml.find('./bar').attrib['key']   
Out[4]:
'value'

In [5]:
xml.find('./bar').text    
Out[5]:
'text'
snippsat 661 Master Poster

That 100-line fileBreak function is really not good at all.
You should split it up; functions should be around 10-15 lines.

Do not try to do too much in a single function.
Do a couple of tasks and have a clear return value.
This makes code much easier to read and test.

snippsat 661 Master Poster

it does for the files but i jsut got told they arent files but subfolders that have the numbers in the names which makes this so much more confussing for me.

Use os.walk(); it recursively scans all folders and subfolders.
Example.

import os
import re

search_pattern = r'\d'
target_path = os.path.abspath(".") #current folder
for path, dirs, files in os.walk(target_path):
    for folder_name in dirs:
        if re.findall(search_pattern, folder_name):
            print folder_name # Folder with numbers
            print(os.path.join(path, folder_name)) # Full path
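
The same walk can be wrapped in a small function that returns the matches instead of printing them. This is only a sketch for Python 3; the function name folders_with_digits and the demo folder names are made up:

```python
import os
import re
import tempfile

def folders_with_digits(target_path):
    """Return full paths of all sub-folders whose name contains a digit."""
    found = []
    for path, dirs, files in os.walk(target_path):
        for folder_name in dirs:
            if re.search(r'\d', folder_name):
                found.append(os.path.join(path, folder_name))
    return found

# Small demo in a throwaway folder tree (names are made up).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, 'data2014', 'run7'))
os.makedirs(os.path.join(demo_root, 'notes'))
result = sorted(os.path.basename(p) for p in folders_with_digits(demo_root))
print(result)
```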
snippsat 661 Master Poster

Yes, it is called a default argument.

def say(msg, arg='something'):
    print 'first argument <{}> default argument <{}>'.format(msg, arg)

>>> say('hello')
first argument <hello> default argument <something>
>>> say('hello', 'world')
first argument <hello> default argument <world>

Many arguments.

def foo(*arg):
    return sum(arg)

>>> foo(1,2,3,4,5,6,7,8,9,10)
55  

Keyword arguments.

def bar(**kwargs):
    for key, value in kwargs.items():
        print '{}: {}'.format(key, value)

>>> bar(first_name='Clark kent', last_name='Superman')
first_name: Clark kent
last_name: Superman
snippsat 661 Master Poster

Re upload your code with working indentation.

timetraveller1992 commented: good call. esp since it's python! +0
snippsat 661 Master Poster

You have converted all input to integers, so the answer (ans) is also an integer.
We can look at some improvements.

num1 = int(raw_input("Enter an integer"))
num2 = int(raw_input("Enter a second integer"))
num3 = int(raw_input("Enter a third integer"))
print "The sum of the 3 integers entered is: {}".format(sum((num1, num2, num3)))

So this version uses the built-in sum() and string formatting.

>>> a = 1, 2, 3
>>> a
(1, 2, 3)
>>> sum(a)
6
>>> print '{} {}'.format('my score is' , sum(a))
my score is 6

If you think about it, it's really the same step repeated 3 times.

result = []
for times in range(3):
    result.append(int(raw_input("Enter an integer: ")))
print "The sum of the 3 integers entered is: {}".format(sum(result))
ddanbe commented: Super! +15
snippsat 661 Master Poster

Don't use list and dict as variable names; these names are already used by Python's built-in types.
To clean it up.

my_file = open("words.txt")
lst = [] #Outside the loop
for word in my_file:
    word = word.strip()
    lst.append(word)
my_file.close()

#searching for the word in list
search_word = "fox"
if search_word in lst:
    print 'Word found: <{}>\nDictionary {}'.format(search_word, lst)
else:
    print "You typed a word that doesn't exist"

'''Output-->
Word found: <fox>
Dictionary ['bear', 'wolf', 'fox']
'''

Don't place all code in one big function.
Do some exercises on functions, and try to keep each function small, doing a specific task and returning the result.
This makes code easier to read and test.

def read_contend(dictionary):
    '''Read content and return a list'''
    with open(dictionary) as f_obj:
        word_lst = [word.strip() for word in f_obj]
        return word_lst

def search_contend(search_word, word_lst):
    '''Search for a word in the input content'''
    if search_word in word_lst:
        return search_word
    return "You typed a word that doesn't exist"

if __name__ == '__main__':
    dictionary = 'words.txt'
    search_word = "fox"
    word_lst = read_contend(dictionary)
    print search_contend(search_word, word_lst)
snippsat 661 Master Poster

Slicing it out can be another option.

with open('06 Rooster.mp3', 'rb') as f:
    f.seek(-128,2)
    tag_content = f.read(128)
    title = tag_content[3:33]
    print title # Rooster
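
The slice works because an ID3v1 tag is a fixed 128-byte block at the end of the file: "TAG" (3 bytes), title (30), artist (30), album (30), year (4), comment (30) and genre (1). A minimal Python 3 sketch that splits out the text fields; the helper name parse_id3v1 and the hand-built demo tag are assumptions for illustration, not read from a real file:

```python
def parse_id3v1(tag_content):
    """Split a 128-byte ID3v1 block into its fixed-width text fields."""
    if len(tag_content) != 128 or tag_content[:3] != b'TAG':
        return None  # no ID3v1 tag present

    def text(raw):
        # Fields are padded with NUL bytes or spaces.
        return raw.split(b'\x00', 1)[0].decode('latin-1').strip()

    return {
        'title': text(tag_content[3:33]),
        'artist': text(tag_content[33:63]),
        'album': text(tag_content[63:93]),
        'year': text(tag_content[93:97]),
    }

# Synthetic tag built by hand for the demo.
fake_tag = (b'TAG' + b'Rooster'.ljust(30, b'\x00')
            + b'Alice In Chains'.ljust(30, b'\x00')
            + b'Dirt'.ljust(30, b'\x00') + b'1992'
            + b'\x00' * 30 + b'\x00')
info = parse_id3v1(fake_tag)
print(info['title'], '-', info['artist'])
```

Files without an ID3v1 tag (or with only ID3v2) return None here, which is one reason a library is usually the safer choice.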

You do know that there are several libraries that can do this(Mutagen,eyeD3...)?

import eyed3

audiofile = eyed3.load("06 Rooster.mp3")
print audiofile.tag.title
print audiofile.tag.artist
print audiofile.info.bit_rate_str

'''Output-->
Rooster
Alice In Chains
~195 kb/s
'''

Your other post.
http://www.daniweb.com/software-development/python/threads/479773/mp3-meta-data

snippsat 661 Master Poster

Take a look at Click, from the creator of Flask, Armin Ronacher.

Gribouillis commented: Very interesting! +14
snippsat 661 Master Poster

Here is a version of iglob that can take multiple file extensions.

#iter_glob
from glob import iglob
from itertools import chain
import os, sys

def iter_glob(path=None, *args):
    '''An iglob version that returns a fully lazily evaluated iterator
     and can handle multiple file extensions.
    #--- Usage ---#
    from iter_glob import iter_glob
    for filename in iter_glob(folder_path, '*.py', '*.txt'):
        print filename
    '''
    try:
        os.chdir(path)
    except Exception as error:
        print 'Wrong folder path {},try again'\
        .format(str(error).split(':',1)[-1].strip())
        sys.exit()
    return chain.from_iterable(iglob(pattern) for pattern in args)

Test:

>>> from iter_glob import  iter_glob
>>> for filename in iter_glob(r'C:\temp', '*.py', '*.txt'):
...     print filename
...     
micc.py
module9.py
tid_delta_csv.py
pass.txt
VRayLog.txt

>>> for filename in iter_glob(r'C:\temp999', '*.py', '*.txt'):
...     print filename
...     
Wrong folder path 'C:\\temp999',try again
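
One side effect of the version above is that os.chdir() changes the working directory for the rest of the program. A possible variant for Python 3 that joins the folder onto each pattern instead; the name iter_glob_in is made up, and it yields full paths rather than bare file names:

```python
import os
import tempfile
from glob import iglob
from itertools import chain

def iter_glob_in(path, *patterns):
    """Lazily yield matches for several patterns without calling os.chdir()."""
    if not os.path.isdir(path):
        raise ValueError('Wrong folder path {!r}'.format(path))
    return chain.from_iterable(
        iglob(os.path.join(path, pattern)) for pattern in patterns)

# Demo in a throwaway folder (file names are made up).
demo_dir = tempfile.mkdtemp()
for name in ('a.py', 'b.txt', 'c.jpg'):
    open(os.path.join(demo_dir, name), 'w').close()

names = sorted(os.path.basename(f)
               for f in iter_glob_in(demo_dir, '*.py', '*.txt'))
print(names)
```

Raising an exception instead of calling sys.exit() leaves the decision about what to do with a bad path to the caller.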
snippsat 661 Master Poster

This is a no-op :)

Yes i can agree with that :)

#multi_glob.py
from glob import glob
import os

def multi_glob(path=None, *args):
    '''glob that can handle multiple file extensions option.
    #--- Usage ---#
    from multi_glob import multi_glob
    multi_glob(folder_path, '*.py', '*.txt')
    '''
    try:
        os.chdir(path)
    except OSError:
        return 'Wrong folder path,try again'
    f_lst = []
    for files in args:
        f_lst.extend(glob(files))
    return f_lst

Test:

>>> from multi_glob import multi_glob
>>> print multi_glob(r'C:\temp', '*.py', '*.txt')
['micc.py', 'module9.py', 'tid_delta_csv.py', 'pass.txt', 'VRayLog.txt']
>>> print multi_glob(r'C:\temp999', '*.py', '*.txt')
Wrong folder path,try again
snippsat 661 Master Poster

I gave it a try for fun.

#multi_glob.py
from glob import glob
import os

def multi_glob(path=None, *args):
    '''glob that can handle multiple file extensions option.
    #--- Usage ---#
    from multi_glob import multi_glob
    multi_glob(folder_path, '*.py', '*.txt')
    '''
    try:
        os.chdir(path)
    except OSError:
        os.chdir(os.getcwd())
        print 'Wrong path using script(.py) folder path'
    f_lst = []
    for files in args:
        f_lst.extend(glob(files))
    return f_lst

Test:

>>> from multi_glob import multi_glob
>>> print multi_glob(r'C:\temp999', '*.py', '*.txt')
Wrong path using script(.py) folder path
['02252014-162200.709116.txt', 'saladsfilecost.txt']

>>> print multi_glob(r'C:\temp', '*.py', '*.txt')
['micc.py', 'module9.py', 'tid_delta_csv.py', 'pass.txt', 'VRayLog.txt']

>>> help(multi_glob)
Help on function multi_glob in module __main__:

multi_glob(path=None, *args)
    glob that can handle multiple file extensions option.
    #--- Usage ---#
    from multi_glob import multi_glob
    multi_glob(folder_path, '*.py', '*.txt')
snippsat 661 Master Poster

I was thinking that if a <title> has no corresponding <pos>, delete that title, but I don't know how to do that. Can anyone suggest a solution?

Some hints: use fetchNextSiblings(); if it returns an empty list, then decompose() that title tag.

xml = '''\
<page>
<title>dasher</title>
<pos>red</pos>
</page>
<page>
<title>dancer</title>
<pos>red</pos>
<pos>blue</pos>
</page>
<page>
<title>coconut</title>
</page>
<page>
<title>rudolph</title>
<pos>red</pos>
<pos>brown</pos>
<pos>red</pos>
</page>'''


from bs4 import BeautifulSoup

soup = BeautifulSoup(xml)
title = soup.find_all('title')

for index, item in enumerate(title):
    print index, item  
'''
0 <title>dasher</title>
1 <title>dancer</title>
2 <title>coconut</title>
3 <title>rudolph</title>
'''

for index, item in enumerate(title):
    print index, item.fetchNextSiblings() 
'''
0 [<pos>red</pos>]
1 [<pos>red</pos>, <pos>blue</pos>]
2 []
3 [<pos>red</pos>, <pos>brown</pos>, <pos>red</pos>]
'''

for index, item in enumerate(title):
    if item.fetchNextSiblings() == []:
        print item.decompose()
'''None'''

for index, item in enumerate(title):
    print index, item  
'''
0 <title>dasher</title>
1 <title>dancer</title>
2 <None></None>
3 <title>rudolph</title>
'''
snippsat 661 Master Poster
import os
import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
for element in root.iter('Path'):
    print os.path.basename(element.text)

'''Output-->
Riched32.dll
napinsp.dll
test.exe
'''

Fixed so "exe" is not in the output.

import os
import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
for element in root.iter('Path'):
    file_name = os.path.basename(element.text)
    #jpg just an example that you can have more values
    if not file_name.endswith(('.exe', '.jpg')): 
        print file_name

'''Output-->
Riched32.dll
napinsp.dll
'''
snippsat 661 Master Poster

You do know that print element.text was just an example?
A simple test and you should be able to figure out very basic stuff like this.
You just use os.path.basename(element.text)

import os
import xml.etree.ElementTree as ET

f_out = open('my_file.txt', 'w')
tree = ET.parse("test.xml")
root = tree.getroot()
for element in root.iter('Path'):
    f_out.write('{}\n'.format(os.path.basename(element.text)))
f_out.close()
snippsat 661 Master Poster

Some hints, using ElementTree, a parser in the standard library.
Most of the time I use BeautifulSoup or lxml for parsing.

import os
import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
for element in root.iter('Path'):
    print element.text
    print os.path.basename(element.text)

'''Output-->
C:\Windows\system32\Riched32.dll
Riched32.dll
C:\Windows\system32\napinsp.dll
napinsp.dll
'''

But not duplicated .dll words

Use set()
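
A small sketch of the set() idea that keeps the first-seen order. The paths are made-up stand-ins for the element.text values, and ntpath is used so the Windows-style paths split the same way on any platform:

```python
import ntpath  # Windows path rules on any platform

# Hypothetical paths standing in for the element.text values.
paths = [r'C:\Windows\system32\Riched32.dll',
         r'C:\Windows\system32\napinsp.dll',
         r'C:\Windows\SysWOW64\Riched32.dll']

seen = set()
unique_names = []
for p in paths:
    name = ntpath.basename(p)
    if name not in seen:      # set membership test is fast
        seen.add(name)
        unique_names.append(name)
print(unique_names)
```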

and my .csv file

Try to do something yourself, and post code if you get stuck.

snippsat 661 Master Poster

url is set to None, and then you get this error message.

>>> url = 'http://www.google.com'
>>> url[0:4]
'http'
>>> url = None
>>> url[0:4]
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'NoneType' object has no attribute '__getitem__'

A tip is to just use print on url to see what you get.
So if there are no more urls it will return None.
You can catch the error and pass, then it will try to go further.

>>> url = None
>>> try:
...     url[0:4]
... except TypeError:
...     pass
snippsat 661 Master Poster

Here is a different way, with a small change to the good answer given by pyTony.
Using next() to skip the first line (header).

def files_to_one(new_file, *files):
    with open(new_file, 'w') as fout:
        for f in files:
            with open(f) as fin:
                next(fin)  # skip the header line
                for line in fin:
                    # each line already ends with a newline
                    fout.write(line)
snippsat 661 Master Poster

.I have not come across the curly brackets yet on this line of code

The new string formatting came out in Python 2.6.
Gribouillis has much about it here.
A quick look at old and new.

>>> old = 'My name is %s' % 'foshan'
>>> old
'My name is foshan'
>>> 
>>> new = 'My name is {}'.format('foshan')
>>> new
'My name is foshan'
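
A few more forms of the new style, as a short sketch (the values are made up): fields can be picked by position or by name, and a format spec goes after the colon.

```python
name, age = 'foshan', 30  # made-up values

# Positional index, keyword field, and a format spec after the colon.
by_index = '{0} is {1}, {0} again'.format(name, age)
by_name = 'My name is {name}'.format(name=name)
padded = '{:>8.2f}|'.format(3.14159)

print(by_index)
print(by_name)
print(padded)
```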

I think the Python docs are usually OK, but for the new string formatting they are not so good.
Here are a couple of good links.
http://ebeab.com/2012/10/10/python-string-format/
http://www.cs.cmu.edu/~nschneid/pythonathan/py-format-string-ref.pdf

snippsat 661 Master Poster

You can do print(t[0:3] in s); read my post again.

You are checking if the list ['16', '24', '30'] is in s, and that is False.

And see how I made it True.

This should equal:One True for the 40 match the rest would be False:

31 40 45 57 58 - 01
01 13 21 22 40 - 31

You have to make a rule for this,
because there are 4 elements that are in both lists.

>>> s = '31 40 45 57 58 - 01'.split()
>>> t = '01 13 21 22 40 - 31'.split()
>>> [item for item in t if item in s]
['01', '40', '-', '31']

>>> #Or with set() as Grib posted
>>> ys = set(s)
>>> yt = set(t)
>>> print(ys & yt)
set(['31', '01', '-', '40'])

you can see the first number is 01 this 01 has no bearing on the sixth 01 on the first line

I guess the same goes for 31; then you can take away the last 2 elements.

>>> s = '31 40 45 57 58 - 01'.split()[:-2]
>>> s
['31', '40', '45', '57', '58']
>>> t = '01 13 21 22 40 - 31'.split()[:-2]
>>> t
['01', '13', '21', '22', '40']
>>> [item for item in t if item in s]
['40']

>>> ys = set(s)
>>> yt = set(t)
>>> print(ys & yt)
set(['40'])


snippsat 661 Master Poster

Some hint.

>>> s = ['06', '15', '26', '34', '36', '-', '16']
>>> t = ['16', '24', '30', '34', '43', '-', '20']
>>> s[0]
'06'
>>> t[0]
'16'
>>> s[0] == t[0]
False
>>> s[3] == t[3]
True
>>> s[5] == t[5]
True

But I dont know why it came out <False> when <16> is in <s>

>>> t[0:3]
['16', '24', '30']
>>> t[0:3] in s
False

>>> #To make it True
>>> s = [['16', '24', '30'],'06', '15', '26', '34', '36', '-', '16']
>>> t[0:3] in s
True

You are checking if list ['16', '24', '30'] is in s and that is False.

Iterate over both lists and compare.

>>> s = ['06', '15', '26', '34', '36', '-', '16']
>>> t = ['16', '24', '30', '34', '43', '-', '20']
>>> for x,y in zip(s, t):
...     print x == y
...

False
False
False
True
False
True
False

Or make it a list, using a list comprehension.

>>> [x == y for x,y in zip(s, t)]
[False, False, False, True, False, True, False]
snippsat 661 Master Poster

. The given md5 hash isn't inside <strong> tags. Its simple plain text

Even if it's plain text, it's inside HTML on a website.
So a parser is the right tool; of course it is possible to get one value like this with regex.
If you need help, you have to post where it is, or some lines where the value is present.
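
For a single value like this, a regex sketch could look as follows. The page fragment is made up, and the pattern only assumes that an MD5 digest is written as 32 hexadecimal characters:

```python
import re

# Made-up page fragment; the hash is the well-known MD5 of an empty string.
html = '<p>File: setup.zip<br>MD5: d41d8cd98f00b204e9800998ecf8427e</p>'

# An MD5 digest is written as exactly 32 hexadecimal characters.
match = re.search(r'\b[0-9a-fA-F]{32}\b', html)
md5_value = match.group(0) if match else None
print(md5_value)
```

If the page holds several hashes, the surrounding tags matter again and a parser becomes the better tool.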

Here are some example for web scripting in python:

This has nothing to do with web-scraping.

snippsat 661 Master Poster

And is there any other/better way to scrape particular data ?

There is a better way; regex on HTML is not the best way.
Read bobince's funny and good answer here.

Python has two star parsers, Beautiful Soup and lxml.
A couple of examples parsing a <strong> tag.

from bs4 import BeautifulSoup

html = """\
<html>
<head>
   <title>html page</title>
</head>
<body>
  <strong>Hello world</strong>
</body>
</html>
"""
soup = BeautifulSoup(html)
tag = soup.find('strong')
print tag.text #--> Hello world

With lxml, let's say a website has a <strong> tag containing the text Hello world.

from lxml.html import parse

page = parse('http://www.website.com').getroot()
print page.xpath('//strong')[0].text  #--> Hello world
snippsat 661 Master Poster

Sorry, but the code is not good at all.
Now you have a while True loop starting at line 14 with a lot of code in it.
You have to use functions or classes when code gets longer.

Look at the code in the first part here.
See how the functions get_name(), show_name() and process_name() are short and give a hint of what they do; this makes code easy to test and read.

A single function should pretty much always be less than 15 lines, and even that is pushing it.
In your code you have all code in global space, which makes it hard to read and test.

snippsat 661 Master Poster

if we want to convert 'xyz' to uppercase, why dont we enter the argument inside the function

Because upper() is a method that belongs to the string class and only works on strings.
The built-in functions work on many classes/objects.
Let's see how len() works on a string and a list.

>>> s = 'hello'
>>> len(s)
5

>>> lst = [1,2,3]
>>> len(lst)
3

Both string and list have methods that belong only to the string or list class.
Look at append() for list.

>>> lst = [1,2,3]
>>> lst.append(4)
>>> lst
[1, 2, 3, 4]

append() works only for list, so there is no point in making it a built-in function.

>>> s = 'hello'
>>> s.append('world')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AttributeError: 'str' object has no attribute 'append'

Use dir(str) or dir(list) to see the methods that belong to the string and list classes.
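
A quick sketch of that check, filtering out the names that start with an underscore:

```python
# Public methods only; names starting with an underscore are filtered out.
str_methods = [m for m in dir(str) if not m.startswith('_')]
list_methods = [m for m in dir(list) if not m.startswith('_')]

print('upper' in str_methods, 'upper' in list_methods)
print('append' in list_methods, 'append' in str_methods)
```

Each line prints True False, which shows that upper() belongs only to strings and append() only to lists.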

snippsat 661 Master Poster

Good idea by slate to update the object's namespace.
You can also just work with a dictionary if you need the result variable2 + variable3, which is 30.
Here is a slightly compressed version.

>>> s = 'variable1=5&varible2=3&variable3=27&varible4=1'
>>> d = dict((k[0], k[1]) for k in [i.split('=') for i in s.split('&')])
>>> int(d['varible2']) + int(d['variable3'])
30
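
Since the string has the same shape as a URL query string, the standard library can also do the splitting. A sketch for Python 3 (in Python 2 the function lives in the urlparse module):

```python
from urllib.parse import parse_qsl

s = 'variable1=5&varible2=3&variable3=27&varible4=1'

# parse_qsl() returns a list of (key, value) tuples.
d = dict(parse_qsl(s))
total = int(d['varible2']) + int(d['variable3'])
print(total)
```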
snippsat 661 Master Poster

That way is a little unpythonic, rrashkin.

d = {'One': 10, 'Seven': 22, 'Six': 0, 'Three': 35, 'Two': 0, 'Four': 45, 'Five': 34, 'Eight': 0}
for k,v in d.items():
    if v == 0:
        del d[k]

print d
#--> {'Five': 34, 'Four': 45, 'One': 10, 'Seven': 22, 'Three': 35}

For both dictionaries and lists it's not so common to use del.
Here I create a new dictionary using a dict comprehension; as you see, no del.

>>> d = {'One': 10, 'Seven': 22, 'Six': 0, 'Three': 35, 'Two': 0, 'Four': 45, 'Five': 34, 'Eight': 0}
>>> {k:v for k,v in d.items() if v != 0}
{'Five': 34, 'Four': 45, 'One': 10, 'Seven': 22, 'Three': 35}

Just to show the same with a list.

>>> lst = [1, 2, 0, 4, 0, 3, 0, 2, 0, 2]
>>> [i for i in lst if i != 0]
[1, 2, 4, 3, 2, 2]
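
One more reason to avoid del inside the loop: in Python 3, d.items() is a view, and deleting while iterating over it raises "RuntimeError: dictionary changed size during iteration". If del is really wanted there, a sketch with a snapshot of the items (shortened sample data):

```python
d = {'Seven': 22, 'Six': 0, 'Three': 35, 'Two': 0}

# list() takes a snapshot, so deleting from d inside the loop is safe.
for k, v in list(d.items()):
    if v == 0:
        del d[k]
print(d)
```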
snippsat 661 Master Poster

cx_Freeze works with Python 2 and 3.
gui2exe puts all the installers into one GUI interface.

As mentioned by Gribouillis, on Linux it's often no problem to just give away .py files.

My project wx_nrk is for downloading everything on Norwegian TV.
I chose to compile it to an exe for Windows users, and for Linux users I just made a read_me file with the steps to use it.
I could have made a compiled version for Linux, but it's easier for Linux users because Python is pre-installed.

snippsat 661 Master Poster

One with Requests.
It is so popular that it may end up in the standard library in the future, maybe in Python 3.4.

When a file is as big as here (over 400 MB), it is a good choice not to load it all into memory.
So here it downloads in chunks of 4096 bytes.

import requests

url = "http://www.cs.cmu.edu/~enron/enron_mail_20110402.tgz"
r = requests.get(url, stream=True)
file_name = url.split('/')[-1]
file_size = r.headers["Content-Length"]

with open(file_name, "wb") as f_out:
    for block in r.iter_content(4096):
        if not block:
            break
        f_out.write(block)
    print "Downloading: {} Bytes: {}".format(file_name, file_size)
snippsat 661 Master Poster

How to make a list index out of range error.

>>> lst = [1,2,3]
>>> lst[2]
3
>>> lst[3]
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
IndexError: list index out of range

So it should be easy to understand if you look at the list (print or repr).
Then you see that the index you try to use is outside the list's range.

This can mean that meta.getheaders("Content-Length") is returning an empty list,
which might happen if something went wrong in the urlopen call.
Then if you index it with [0], the empty list gives back a list index out of range error.

snippsat 661 Master Poster

Your first post.
http://www.daniweb.com/software-development/python/threads/470301/problem-in-copying-domain-name-from-one-notepad-to-another-using-regex
You have posted more info about the task, but it is still not as clear as it should be.
You can try this.

import re

with open('1.txt') as f1,open('2.txt') as f2,open('result.txt', 'w') as f_out:
    f1 = re.findall(r'rhs="(.*)"', f1.read())
    f2 = re.findall(r'rhs="(.*)"', f2.read())
    diff = [i for i in f1 if i not in f2]
    #print diff
    f_out.write(','.join(diff))
snippsat 661 Master Poster

It is better to use re.finditer when iterating over text and saving to a file.
It looks like this.

import re

data = '''\
line2 jdsbvbsf
line3 <match name="item1" rhs="domain.com"></match>
line4 <match name="item2" rhs="domainn.com"></match>
line5 <match name="item2" rhs="1010data.com"></match>'''

with open('result.txt', 'w') as f_out:
    for match in re.finditer(r'rhs="(.*)"', data):
        f_out.write('{}\n'.format(match.group(1)))

'''Output-->
domain.com
domainn.com
1010data.com
'''
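
One caveat, shown as a sketch on a made-up line: (.*) is greedy, so if another quoted attribute follows rhs on the same line it grabs too much. The non-greedy (.*?) stops at the first closing quote.

```python
import re

# Made-up line where another attribute follows rhs on the same line.
line = '<match rhs="domain.com" name="item1"></match>'

greedy = re.findall(r'rhs="(.*)"', line)
non_greedy = re.findall(r'rhs="(.*?)"', line)
print(greedy)
print(non_greedy)
```

With the sample data above, where rhs is the last attribute, both patterns give the same result.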
snippsat 661 Master Poster

but absolutelly no doubt that building web on PHP will be faster/easier,

That is of course a lot of bullshit to someone who uses Python.
I use Python for web development and don't like messy PHP.
PHP: a fractal of bad design

I wrote a little about Python and the web here.
http://www.daniweb.com/software-development/python/threads/468399/learning-python

snippsat 661 Master Poster

The Problem with Integer Division
Not much else to say if you read that post.
One tip is to use from __future__ import division for Python 2.x.
Python 3.x has made a change to integer division.

#Python 2.7
>>> 10 / 3
3
>>> from __future__ import division
>>> 10 / 3
3.3333333333333335

--

#Python 3.3
>>> 10 / 3
3.3333333333333335
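
When the old integer result is wanted, the floor division operator // gives it in both Python 2 and 3. A short sketch, run here under Python 3:

```python
# Python 3: / is true division, // is floor division.
true_div = 10 / 3
floor_div = 10 // 3
negative_floor = -10 // 3   # floors toward negative infinity
quotient, remainder = divmod(10, 3)

print(true_div, floor_div, negative_floor, quotient, remainder)
```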
Gribouillis commented: good link +14
snippsat 661 Master Poster

find_all()

The find_all() method scans the entire document looking for results,

snippsat 661 Master Poster

if there are more number of parameters, out of which we need to print version and type only

can you please help with the code to get out this

You have to post an example of how the other parameters look.
There are many ways to do this; if the data is unformatted or changing, regex can be an option.

import re

data = '''\
abc : 123
version: 7.0.0.9
type : NAS
ggg.2 : 4444
version: 8
type : FOO
hg.1234: test'''

result = re.findall(r"version.*|type.*", data)
print result
print zip(*[iter(result)]*2)
print [tuple(x.strip() for x in y.split(":")) for y in result] 

"""Output-->
['version: 7.0.0.9', 'type : NAS', 'version: 8', 'type : FOO']
[('version: 7.0.0.9', 'type : NAS'), ('version: 8', 'type : FOO')]
[('version', '7.0.0.9'), ('type', 'NAS'), ('version', '8'), ('type', 'FOO')]
"""
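
If one record per version/type pair is more useful, the two steps above can be combined into a list of dicts. A sketch for Python 3 on a shortened copy of the data, assuming version and type always come in pairs:

```python
import re

data = '''\
version: 7.0.0.9
type : NAS
version: 8
type : FOO'''

result = re.findall(r"version.*|type.*", data)
pairs = [tuple(x.strip() for x in y.split(":", 1)) for y in result]
# Group every two (key, value) pairs into one dict per record.
records = [dict(group) for group in zip(*[iter(pairs)] * 2)]
print(records)
```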
snippsat 661 Master Poster

Give an example or link of what you are trying to extract/parse.
As mentioned by krystosan it's XML data, and there are good tools for this in Python.
There is also a library that is only for parsing RSS, Universal Feed Parser.
I like both BeautifulSoup and lxml.
A quick demo with BeautifulSoup.

from bs4 import BeautifulSoup

rss = '''\
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Python</title>
<link>http://www.reddit.com/r/Python/</link>
<description>
news about the dynamic, interpreted, interactive, object-oriented, extensible programming language Python
</description>'''

soup = BeautifulSoup(rss)
title_tag = soup.find('title')
description_tag = soup.find('description')
print title_tag.text
print description_tag.text

"""Output-->
Python

news about the dynamic, interpreted, interactive, object-oriented, extensible programming language Python
"""
snippsat 661 Master Poster

This odd/even stuff is fun; here is one with the ternary operator.
I like the clean look of this one.

>>> from collections import Counter
>>> from random import sample
>>> Counter("Odd" if i % 2 else "Even" for i in sample(range(0, 1000), 100))
Counter({'Even': 56, 'Odd': 44})