'beautiful-soup' Forum Topics | DaniWeb

8 Topics

	Topic Title
	how to get SRC of img tag | 8 Years Ago | Hello. This is my code: from bs4 import BeautifulSoup; import urllib2; url = urllib2.urlopen('http://www.website_address.com'); soup = BeautifulSoup(url); images = soup.find_all('img'). Now how can I get the "src" of img tags? | Software Development | beautiful-soup img python src tags | 0 2 17K
	filtering normal readable text from junk | 9 Years Ago | I'm using BeautifulSoup to grab text from HTML files. But it's not perfect: for example, it seems to keep CSS and JavaScript code that was added haphazardly. My overall goal is to make a list of words and their frequency to compare and contrast HTML files to categorize them. Dealing … | Software Development | beautiful-soup python | 0 3 277
	Returning only tags with certain siblings (Beautiful Soup) | 10 Years Ago | Hi everyone, I'm trying to extract text from between tags but only in certain conditions. <title> and <pos> are both children of <page>, but neither one is nested inside the other (i.e., they're siblings). Each <page> always has one <title> and zero to 5 <pos> sections. What I need to … | Software Development | beautiful-soup nested-loop python scraping | 0 3 2K
	Help with Python Threading Library with BeautifulSoup. | 10 Years Ago | Hey guys, I'm trying to get all links on a website using BeautifulSoup, Queue, Threading, and urllib2. I am specifically looking for links that lead to other pages of the same site. It runs for a few seconds, going through about 3 URLs before giving me the error: Traceback (most … | Software Development | beautiful-soup python urllib2 | 0 8 729
	Parse an HTML doc | 11 Years Ago | Hi, I have an HTML page in one variable. I need to build a method that will extract a tag's content (def extract_tag(self, tag_name)). For example, given webpage: <div id="mw-page-base" class="noprint"></div> <div id="mw-head-base" class="noprint"></div> <!-- content --> <div id="content" class="mw-body"> <a id="top"></a> <div id="mw-js-message" style="display:none;"></div> <!-- sitenotice --> <div id="siteNotice"><!-- centralNotice … | Software Development | beautiful-soup grep parse python regex tags | 0 1 249
	problem parsing webpage using BeautifulSoup | 12 Years Ago | Hi, I've used the BeautifulSoup module to parse the site and grab the img tag from it, but the problem is that BeautifulSoup, while parsing, is not returning the whole content of the given URL. The truncated content contains the image location I want to download: [CODE] from urllib2 import urlopen … | Software Development | beautiful-soup images parse python | 0 4 447
	BeautifulSoup does not retrieve all 'a' tags. | 12 Years Ago | In line 72 of the code I do a findAll to retrieve all 'a' tags that have a 'horariosCarteleraUnderline' class and that have an href URL that contains `?ic=[code]&` where code is a common code used to identify the movie start time. It should retrieve all movie times, but … | Software Development | beautiful-soup html-parser python | 0 1 219
	BeautifulSoup and accented words | 12 Years Ago | Hi, I am using Beautiful Soup to get data from a webpage. With help I was able to get a list of cities with correct accents. Now I am trying to get a list of movie theaters in a selected city, but these come with no accents and with weird characters instead. … | Software Development | accents beautiful-soup character python spanish | 0 2 858
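
For the "how to get SRC of img tag" topic listed above, a minimal sketch of one way to read the attribute. The inline HTML is a hypothetical stand-in for the downloaded page, and the "html.parser" backend is an assumption chosen so no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the page fetched in the question.
html = """
<html><body>
  <img src="/static/logo.png" alt="logo">
  <img alt="no source attribute">
  <img src="https://example.com/photo.jpg">
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Tag.get() returns None when the attribute is missing, whereas
# img["src"] would raise KeyError for the second tag above.
sources = [img.get("src") for img in soup.find_all("img")]

# Keep only the tags that actually carry a src attribute.
sources = [src for src in sources if src is not None]
print(sources)
```

Using `get` rather than indexing is a design choice for pages where some images lack a `src`; if every image is known to have one, `img["src"]` works as well.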
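
For the "filtering normal readable text from junk" topic listed above, one common approach is to drop `<script>` and `<style>` elements before extracting text, then count words. This is a sketch under assumptions: the sample HTML and the simple word pattern are hypothetical, and real pages may need further cleanup.

```python
import re
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical sample page with embedded CSS and JavaScript.
html = """
<html><head>
  <style>body { color: red; }</style>
  <script>var tracker = "junk";</script>
</head><body>
  <h1>Soup recipes</h1>
  <p>Tomato soup and onion soup.</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Remove elements whose contents are code rather than readable text.
for tag in soup.find_all(["script", "style"]):
    tag.decompose()

# Join the remaining text nodes with spaces so words do not run together.
text = soup.get_text(separator=" ")

# Very simple tokenizer (an assumption): lowercase runs of letters/apostrophes.
words = re.findall(r"[a-z']+", text.lower())
word_counts = Counter(words)
print(word_counts.most_common(3))
```

The resulting `Counter` objects can then be compared across files, which matches the stated goal of categorizing HTML files by word frequency.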
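
For the "Returning only tags with certain siblings" topic listed above, the exact condition is cut off in the excerpt. The sketch below assumes one possible reading: return the `<title>` text only for `<page>` elements that have at least one `<pos>` sibling. The sample markup is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical data: each <page> has one <title> and zero or more <pos> tags.
markup = """
<page><title>run</title><pos>noun</pos><pos>verb</pos></page>
<page><title>the</title></page>
<page><title>quick</title><pos>adjective</pos></page>
"""

soup = BeautifulSoup(markup, "html.parser")

results = {}
for page in soup.find_all("page"):
    # recursive=False restricts the search to direct children of <page>,
    # i.e. the siblings of <title>.
    pos_tags = page.find_all("pos", recursive=False)
    if pos_tags:  # assumed condition: keep pages with at least one <pos>
        title = page.find("title", recursive=False)
        results[title.get_text()] = [pos.get_text() for pos in pos_tags]

print(results)
```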
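
For the "Help with Python Threading Library with BeautifulSoup" topic listed above, the traceback is truncated, so this sketch does not address the error itself. It only illustrates the stated goal of keeping links that lead to other pages of the same site, using Python 3's `urllib.parse` (the original post uses Python 2's urllib2). The function name `same_site_links` and the sample URLs are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def same_site_links(page_url, hrefs):
    """Return absolute http(s) URLs from hrefs that share page_url's host."""
    site = urlparse(page_url).netloc
    links = []
    for href in hrefs:
        # Resolve relative links such as "/about" against the current page.
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == site:
            links.append(absolute)
    return links


# Hypothetical href values, as they might be collected from <a> tags.
hrefs = [
    "/about",
    "contact.html",
    "https://other.example.org/",
    "mailto:me@example.com",
    "http://example.com/news",
]
links = same_site_links("http://example.com/index.html", hrefs)
print(links)
```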
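
For the "BeautifulSoup does not retrieve all 'a' tags" topic listed above, the described filter (a given class plus an href containing `?ic=[code]&`) can be sketched with `find_all`. The sample links are hypothetical, and the pattern assumes the code consists of word characters; the cause of the missing tags in the original thread is not known from the excerpt.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical links; "&amp;" in the source is decoded to "&" by the parser.
html = """
<a class="horariosCarteleraUnderline" href="/compra?ic=12345&amp;f=1">18:30</a>
<a class="horariosCarteleraUnderline" href="/compra?f=1">no code</a>
<a class="other" href="/compra?ic=67890&amp;f=1">wrong class</a>
<a class="horariosCarteleraUnderline extra" href="/compra?ic=24680&amp;f=2">21:00</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Assumed shape of the code: one or more word characters between "?ic=" and "&".
href_pattern = re.compile(r"\?ic=\w+&")

# class_ matches tags that have this class among their classes;
# a compiled regex for href is applied as a search over the attribute value.
matches = soup.find_all("a", class_="horariosCarteleraUnderline", href=href_pattern)
times = [a.get_text() for a in matches]
print(times)
```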
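
For the "BeautifulSoup and accented words" topic listed above, "weird characters" in place of accents are often a sign that bytes were decoded with the wrong character encoding. Whether that was the cause in the original thread is an assumption. The sketch below reproduces the effect with the standard library only, using a made-up theater name.

```python
# UTF-8 bytes, as a server might send them (the name is illustrative).
raw = "Cinépolis Mérida".encode("utf-8")

# Decoding with the wrong codec turns each accented letter into two characters.
garbled = raw.decode("latin-1")

# Decoding with the codec the bytes were written in restores the accents.
correct = raw.decode("utf-8")

print(garbled)
print(correct)
```

When passing raw bytes to Beautiful Soup, the constructor's `from_encoding` argument can be used to state the encoding explicitly if automatic detection guesses wrong.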

The End.