Parsing HTML with lxml loses many lines. Why?

luofeiyu -3 Newbie Poster

12 Years Ago

Here is my code:

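import urllib
import lxml.html

# Python 2 code: urllib.urlopen lives in urllib.request in Python 3.
equitydown = "http://sc.hkex.com.hk/gb/www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/eisdeqty_c.htm"
file = urllib.urlopen(equitydown).read()
root = lxml.html.document_fromstring(file)

# Select every tr_normal row that has an <img> anywhere beneath it, then remove it.
rdata = root.xpath('//tr[@class="tr_normal" and (.//img)]')
for data in rdata:
    data.getparent().remove(data)

# Serialize the modified tree and save it to disk.
root1 = lxml.html.tostring(root)
my = open('c:\\hk1.html', 'w')
my.write(root1)
my.close()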

When I open c:\hk1.html and compare it with the original page
http://sc.hkex.com.hk/gb/www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/eisdeqty_c.htm

there is a problem: many lines that appear in the original page, such as
06830 华众控股 2,000 #
06838 盈利时 2,000 #
06868 天福 1,000 #
06880 豪特保健 2,000 #
06883 新濠博亚娱乐 300 #

cannot be found in c:\hk1.html. Why?
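
One thing that might be worth checking (only a guess, since the real markup of the page is not shown here): `.//img` in the XPath matches an `<img>` at any depth, so a `tr_normal` row that merely wraps a nested table is removed together with every row inside it. The sketch below uses a made-up `SAMPLE` document and a hypothetical helper `surviving_codes` to compare that expression with a narrower one, `td/img`, which only matches rows whose own cells hold the image.

```python
import lxml.html

# Made-up miniature of the page (an assumption, not the real markup):
# an outer layout row wraps a nested table, and only one inner row has an <img>.
SAMPLE = """
<html><body>
<table>
  <tr class="tr_normal"><td>
    <table>
      <tr class="tr_normal"><td>06830</td><td>2,000</td></tr>
      <tr class="tr_normal"><td>06838</td><td><img src="x.gif"></td></tr>
      <tr class="tr_normal"><td>06868</td><td>1,000</td></tr>
    </table>
  </td></tr>
</table>
</body></html>
"""


def surviving_codes(xpath_expr):
    """Remove the rows matched by xpath_expr; return the stock codes still in the tree."""
    root = lxml.html.document_fromstring(SAMPLE)
    for row in root.xpath(xpath_expr):
        parent = row.getparent()
        if parent is not None:
            parent.remove(row)
    # Collect the first cell of each remaining row, skipping whitespace-only cells.
    cells = root.xpath('//tr[@class="tr_normal"]/td[1]')
    return [td.text.strip() for td in cells if td.text and td.text.strip()]


# Descendant match: the outer wrapper row also qualifies, so everything goes with it.
print(surviving_codes('//tr[@class="tr_normal" and (.//img)]'))

# Child-cell match: only the row whose own <td> contains the image is removed.
print(surviving_codes('//tr[@class="tr_normal" and td/img]'))
```

If the real page nests tables like this, or if its unclosed tags make the parser nest rows this way, printing `len(rdata)` and the number of `tr` descendants of each matched row before removing it should show whether a single match is taking many rows with it.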

xml
