DaniWeb
About 429 results for beautifulsoup - Page 1
Re: Anybody know how to speed up beautifulsoup?
Programming
Software Development
14 Years Ago
by vegaseat
BeautifulSoup is a third-party module for Python 2 that lets you parse even badly coded HTML. What do you want to do with it?
Re: Python - writing to file
Programming
Software Development
14 Years Ago
by snippsat
… as unicode strings. In order to convert BeautifulSoup's unicode strings to human-readable strings, you have to …
Re: Installing module when multiple versions of Python are available
Programming
Software Development
12 Years Ago
by vegaseat
BeautifulSoup requires the sgmllib module, which has been removed in Python 3.
Re: Ignoring Comments When Parsing XML?
Programming
Software Development
13 Years Ago
by snippsat
BeautifulSoup is a famous Python HTML/XML parser: http://www.crummy.com/software/BeautifulSoup/ BeautifulSoup is only one file, BeautifulSoup.py. Built-in parsers like minidom and ElementTree should work. If not, two of the best are BeautifulSoup and lxml: http://codespeak.net/lxml/
Re: New python 3 modules
Programming
Software Development
14 Years Ago
by vegaseat
BeautifulSoup works fine with Python 3.0 if you copy BeautifulSoup.py (version 3.0.7a or lower) and sgmllib.py (typically found in C:\Python25\Lib) to a separate directory and convert both programs with 2to3.py. This rather obvious approach was overlooked by the BeautifulSoup folks.
Re: Working with html files
Programming
Software Development
15 Years Ago
by Gribouillis
BeautifulSoup has the functions find and findAll, plus related functions, which should help you. Try to learn how to use them.
Re: Parsing HTML with Python
Programming
Software Development
13 Years Ago
by ultimatebuster
BeautifulSoup and urllib2
Re: Question and Answer APIs
Programming
Software Development
13 Years Ago
by Tech B
BeautifulSoup or even regex could lighten the load. I think there is even an HTML parser in the standard lib.
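The standard-library parser mentioned above can be sketched as follows. This is a minimal illustration for Python 3, where the module is html.parser (the posts on this page target Python 2). The class name LinkCollector, the helper collect_links, and the sample markup are hypothetical names introduced only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute
        # names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(html):
    """Return all href values found in <a> tags of the given markup."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup, for illustration only.
sample = '<p><a href="/one">One</a> and <A HREF="/two">Two</a></p>'
print(collect_links(sample))
```

This approach needs no third-party install, but it only reports events (start tag, end tag, text); building a navigable tree, as BeautifulSoup does, is left to the caller.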
Re: New python 3 modules
Programming
Software Development
11 Years Ago
by Gribouillis
BeautifulSoup 4.1.3 has been out since August 20, 2012. It is compatible with Python 2.6+ and Python 3!
BeautifulSoup does not retrieve all 'a' tags.
Programming
Software Development
12 Years Ago
by Huakalero
…the full code: from urllib2 import urlopen from BeautifulSoup import BeautifulSoup import re class cuapi(): def __init__(self):…;_blank\"'), self.cureHTML())] self.soup = BeautifulSoup(html, convertEntities=BeautifulSoup.HTML_ENTITIES, markupMassage=myMassage) return self.soup def …
Re: BeautifulSoup to extract multiple TD tags within TR
Programming
Software Development
12 Years Ago
by sys73r
…suggestion plus some more data to test [CODE]from BeautifulSoup import BeautifulSoup html = '''\ <tr id="index_table_12345"… "build/bdist.macosx-10.7-intel/egg/BeautifulSoup.py", line 601, in __getitem__ KeyError: 0… some tests [CODE]>>> soup = BeautifulSoup(html) >>> tag = soup.findAll('td') …
BeautifulSoup and accented words
Programming
Software Development
12 Years Ago
by Huakalero
… weird characters. Code: [CODE]from urllib2 import urlopen from BeautifulSoup import BeautifulSoup page = urlopen("http://www.cinepolis.com/_CARTELERA/cartelera.aspx…?ic=2") html = page.read() soup = BeautifulSoup(html) complejos = soup.findAll('span',{'class':'TitulosBlanco'}) compList = [] for …
BeautifulSoup to extract multiple TD tags within TR
Programming
Software Development
12 Years Ago
by sys73r
… = urllib2.urlopen('http://www.NotAvalidURL.com').read() soup = BeautifulSoup(data) table = soup("tr", {'class' : 'index_table_in' }) print table[…
Re: BeautifulSoup to extract multiple TD tags within TR
Programming
Software Development
12 Years Ago
by snippsat
… 1, string 2.... just iterate over the content. [CODE]from BeautifulSoup import BeautifulSoup html = '''\ <tr id="index_table_12345" class="…; <!--td></td--></tr>''' soup = BeautifulSoup(html) tag = soup.findAll('td') #all "td" tag…
Re: BeautifulSoup to extract multiple TD tags within TR
Programming
Software Development
12 Years Ago
by snippsat
[CODE]from BeautifulSoup import BeautifulSoup html = '''\ <tr id="index_table_12345" class="index_table_in&…; <!--td></td--></tr>''' soup = BeautifulSoup(html) tag = soup.findAll('a') #all "a" tag…
Re: BeautifulSoup to extract multiple TD tags within TR
Programming
Software Development
12 Years Ago
by sys73r
… got it working: [CODE]import urllib2 from BeautifulSoup import BeautifulSoup data = urllib2.urlopen('http://').read() soup = BeautifulSoup(data) tag = soup.findAll('a') #all…
Re: BeautifulSoup and accented words
Programming
Software Development
12 Years Ago
by Gribouillis
Didn't you forget the argument convertEntities=BeautifulSoup.HTML_ENTITIES in BeautifulSoup()?
Re: BeautifulSoup to extract multiple TD tags within TR
Programming
Software Development
12 Years Ago
by sys73r
…; <!--td></td--></tr>''' soup = BeautifulSoup(html) tag = soup.findAll('a') #all "a" tag…
problem parsing webpage using BeautifulSoup
Programming
Software Development
12 Years Ago
by hemant_rajput
…from it, but the problem is that BeautifulSoup, while parsing, is not returning the whole content of…: [CODE] from urllib2 import urlopen from BeautifulSoup import BeautifulSoup #reading the webpage source webpage = urlopen('… webpage content into variable named soup using beautifulsoup soup = BeautifulSoup(''.join(webpage)) print soup #finding all …
Re: problem parsing webpage using BeautifulSoup
Programming
Software Development
12 Years Ago
by snippsat
…drop to simulate JavaScript and use regex (because BeautifulSoup can't find stuff in JavaScript) [CODE]…from urllib2 import urlopen from BeautifulSoup import BeautifulSoup import re webpage = urlopen('http://www.…santabanta.com/photos/aalesha/10066001.htm') soup = BeautifulSoup(webpage) #print soup bac_img = re.search(r&…
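The idea in this reply, falling back to a regular expression when the value lives inside JavaScript rather than in a tag, might look roughly like the sketch below. The pattern in the original post is truncated, so this is not that code: the page source, the variable name bigImage, the URL, and the helper find_script_value are all hypothetical, chosen only to show the technique. It is written for Python 3 and uses only the standard library.

```python
import re

# Hypothetical page source: the image URL exists only inside a <script>
# block, so tag-based searches such as findAll('img') would not see it.
html = '''
<html><body>
<script type="text/javascript">
var bigImage = "http://example.com/images/photo_large.jpg";
</script>
</body></html>
'''

# Hypothetical pattern: capture the quoted value assigned to bigImage.
pattern = re.compile(r'var\s+bigImage\s*=\s*"([^"]+)"')


def find_script_value(source):
    """Return the captured string, or None if the pattern is absent."""
    match = pattern.search(source)
    return match.group(1) if match else None


print(find_script_value(html))
```

A pattern like this is tied to the exact way the page writes its script, so it is likely to need adjusting whenever the site changes its markup.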
Re: problem parsing webpage using BeautifulSoup
Programming
Software Development
12 Years Ago
by hemant_rajput
…drop to simulate JavaScript and use regex (because BeautifulSoup can't find stuff in JavaScript) [CODE]…from urllib2 import urlopen from BeautifulSoup import BeautifulSoup import re webpage = urlopen('http://www.…santabanta.com/photos/aalesha/10066001.htm') soup = BeautifulSoup(webpage) #print soup bac_img = re.search(r&…
Help with Navigating BeautifulSoup Tree
Programming
Software Development
13 Years Ago
by kshw
…and return it? Thanks [CODE]import re import urllib2 from BeautifulSoup import BeautifulSoup, NavigableString html = ['<html><head>&…NavigableString): print str(current) Text += str(current) return Text soup = BeautifulSoup(''.join(html)) Page_Text = ParseContent(soup) print "Text after function…
Help with Python Threading Library with BeautifulSoup.
Programming
Software Development
9 Years Ago
by John A.
… Queue import threading import urllib2 import time from BeautifulSoup import BeautifulSoup hosts = ["http://waoanime.tv"] queue…from queue chunk = self.out_queue.get() soup = BeautifulSoup(chunk) #parse the chunk for line in soup.findAll…
HTML Scraper: Urllib2 / BeautifulSoup / Regex Help
Programming
Software Development
15 Years Ago
by katamole
… I have ironed out the following problems): [code]from BeautifulSoup import BeautifulSoup import urllib2 import re #get source code of page (function…/find?s=" + searchstring print url source = fetchsource(url) soup = BeautifulSoup(source) filmlink = soup.find('a', href=re.compile("title…
Re: HTML Scraper: Urllib2 / BeautifulSoup / Regex Help
Programming
Software Development
15 Years Ago
by Gribouillis
… is coming through ok. [ICODE]rating_source = fetchsource(pagelink) soup = BeautifulSoup(rating_source) ratingregexp = re.compile(r"^[^/]*/10$") rating_element = soup…= fetchsource("http://www.imdb.com/title/tt0071853/") soup = BeautifulSoup(source) ratingregexp = re.compile(r"^[^/]*/10$") rating_element = …
newb: BeautifulSoup
Programming
Software Development
16 Years Ago
by jobs
I am trying to use BeautifulSoup: soup = BeautifulSoup(page) td_tags = soup.findAll('td') i=0 for td in …
Re: HTML Scraper: Urllib2 / BeautifulSoup / Regex Help
Programming
Software Development
15 Years Ago
by Gribouillis
…&q=" + searchstring print url source = fetchsource(url) soup = BeautifulSoup(source) filmlink = soup.find('a', href=re.compile(r"…
Re: HTML Scraper: Urllib2 / BeautifulSoup / Regex Help
Programming
Software Development
15 Years Ago
by katamole
… source is coming through ok. [ICODE]rating_source = fetchsource(pagelink) soup = BeautifulSoup(rating_source) ratingregexp = re.compile(r"^[^/]*/10$") rating_element = soup…
Re: newb: BeautifulSoup
Programming
Software Development
16 Years Ago
by jobs
[code="Python"] soup = BeautifulSoup(page) td_tags = soup.findAll('td') i=0 for td in …
Re: Anybody know how to speed up beautifulsoup?
Programming
Software Development
14 Years Ago
by gunbuster363
… is an example (Python2 code) ... [code]import urllib from BeautifulSoup import BeautifulSoup, SoupStrainer html = urllib.urlopen("http://python.org").read… = SoupStrainer('a') # create a list a_tags = [tag for tag in BeautifulSoup(html, parseOnlyThese=a_tag)] # show all the a_tag lines for line…
© 2024 DaniWeb® LLC