
Regex problem

Hello, I have a problem with a regex. I want to extract the tables from a page on another site. The regex is re.compile(r'<table.*?>(.*?)</table>'), but once it matches the tables I don't know how to return only the contents of the tables.

import re

def remove_tables(html):
    # html = the page source
    tbl = re.compile(r'<table.*?>(.*?)</table>')
    return tbl.sub('', html)  # returns html without <table>...</table>

# How do I return only the contents of the <table> tags, without the other tags?


Thanks.

Krstevski

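I suggest using findall, which returns a list containing the captured group (the contents of each table) for every match:

the_list = tbl.findall(html)

There is also finditer, which returns an iterator of match objects.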
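To make the difference between the two calls concrete, here is a minimal sketch. The sample `html` string is made up for illustration, since the thread never shows the real page. The `re.DOTALL` and `re.IGNORECASE` flags are also an addition to the pattern from the question: without `re.DOTALL`, `.` does not match newlines, so a table spread over several lines would probably not be found.

```python
import re

# Hypothetical sample page (an assumption; the real page is not shown in the thread).
html = """<p>intro</p>
<table class="a"><tr><td>1</td></tr></table>
<p>middle</p>
<table>
<tr><td>2</td></tr>
</table>"""

# Same pattern as in the question, plus two flags that are NOT in the original:
# re.DOTALL lets '.' match newlines, re.IGNORECASE also accepts <TABLE>.
tbl = re.compile(r'<table.*?>(.*?)</table>', re.DOTALL | re.IGNORECASE)

# findall returns a list with the text of group 1 (the table contents) per match.
the_list = tbl.findall(html)

# finditer yields match objects, so the whole tag (group 0) is available too.
whole_tables = [m.group(0) for m in tbl.finditer(html)]

for inner in the_list:
    print(repr(inner))
```

Note that this approach is fragile for nested tables: the non-greedy group stops at the first `</table>`, so an inner table would cut the outer one short.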

Gribouillis

Thanks man, it works :)

Krstevski

This question has already been solved
