Hello all,

Suppose I have a string, which I retrieved using a regex, like the following:

string_feature = '''<div class="BVRRLabel BVRRRatingNormalLabel">Customer Rating</div><div class="BVRRLabel BVRRRatingNormalLabel">Value for Price</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Picture Quality</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Ease of Use</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Features</div>'''

What regular expression would produce a list like the following?

list_feature = ['Customer Rating','Value for Price','Picture Quality','Ease of Use','Features']

All 2 Replies

Regex and HTML are not best friends.
Read the best answer out there (by bobince):
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

A parser is the right tool for this; here is an example with BeautifulSoup.

from BeautifulSoup import BeautifulSoup

html = """
<div class="BVRRLabel BVRRRatingNormalLabel">Customer Rating</div><div class="BVRRLabel BVRRRatingNormalLabel">Value for Price</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Picture Quality</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Ease of Use</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Features</div>"""

soup = BeautifulSoup(html)
# Collect every <div> and print the text of each one
tags = soup.findAll('div')
print [tag.text for tag in tags]

"""Output-->
[u'Customer Rating', u'Value for Price', u'Picture Quality', u'Ease of Use', u'Features']
"""

It's possible to get what you want with regex too, but as I posted above, there are better tools.
For fun, here is the regex solution.

import re

html = '''\
<div class="BVRRLabel BVRRRatingNormalLabel">Customer Rating</div><div class="BVRRLabel BVRRRatingNormalLabel">Value for Price</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Picture Quality</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Ease of Use</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Features</div>'''

# Capture text between '>' and '<' that starts with a word character
find_text = re.findall(r'>(\w.+?)<', html)
print find_text
#--> ['Customer Rating', 'Value for Price', 'Picture Quality', 'Ease of Use', 'Features']
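
The pattern above grabs any text that sits between a '>' and a '<', whatever tag it belongs to. If you stay with regex, one way to make it a bit less fragile is to anchor the match on the class name from the question. A minimal sketch, assuming Python 3 and assuming every label is a div whose class attribute contains BVRRLabel with no nested tags inside; the name label_re is just for illustration.

```python
import re

html = '''\
<div class="BVRRLabel BVRRRatingNormalLabel">Customer Rating</div><div class="BVRRLabel BVRRRatingNormalLabel">Value for Price</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Picture Quality</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Ease of Use</div>
<div class="BVRRLabel BVRRRatingNormalLabel">Features</div>'''

# Assumption: labels live in <div> tags whose class list includes BVRRLabel.
# The group captures the text up to the closing </div>, without surrounding whitespace.
label_re = re.compile(r'<div class="[^"]*\bBVRRLabel\b[^"]*">\s*([^<]+?)\s*</div>')

list_feature = label_re.findall(html)
print(list_feature)
```

This still breaks if the attribute order or quoting changes, which is exactly why a parser is the safer choice.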