Hello, I have a problem with a regex. I want to get the tables from another site. The regex is re.compile(r'<table.*?>(.*?)</table>'), but once I match the tables I don't know how to return their contents.

import re

# html = the page source (this fragment sits inside a function)
tbl = re.compile(r'<table.*?>(.*?)</table>')
return tbl.sub('', html)  # returns html with every <table>...</table> removed
# How do I return only the contents of the <table> tags, without the rest of the page?

Thanks.

I suggest findall, which returns a list of the strings captured by the group:

the_list = tbl.findall(html)

There is also finditer, which returns an iterator of match objects.
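To make the difference between findall and finditer concrete, here is a minimal sketch. The sample html string is made up for illustration, and the re.DOTALL flag is an addition that is not in the original pattern; it is assumed here so that '.' also matches the newlines inside a multi-line table.

```python
import re

# Hypothetical sample page; a real page would come from an HTTP request.
html = """<html><body>
<p>intro</p>
<table class="a"><tr><td>1</td></tr></table>
<p>between</p>
<table>
<tr><td>2</td></tr>
</table>
</body></html>"""

# Same pattern as in the question. re.DOTALL is an assumption added here
# so that '.' also matches newlines inside multi-line tables.
tbl = re.compile(r'<table.*?>(.*?)</table>', re.DOTALL)

# findall returns a list of the text captured by the single group,
# i.e. only what sits between <table ...> and </table>.
the_list = tbl.findall(html)

# finditer yields match objects: group(0) is the whole <table>...</table>,
# group(1) is only the inner content.
whole_tables = [m.group(0) for m in tbl.finditer(html)]
inner_parts = [m.group(1) for m in tbl.finditer(html)]
```

Note that a pattern like this does not handle nested tables; for pages with nesting, an HTML parser is usually the more robust choice.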

Thanks man, it works :)
