Beautiful Soup and Python Error
Programming Software Development by jacob501
I am trying to use Beautiful Soup to scrape a website, Locationary.…urllib2.urlopen('http://www.locationary.com/').read() soup = BeautifulSoup(page) print soup.prettify() [/code] However when I add…/Raleigh/Noodles_%26_Company-p1022884996.jsp').read() soup = BeautifulSoup(page) print soup.prettify() [/code] With the above code…

Beautiful Soup default parser
Programming Software Development by rwe0
… using Beautiful Soup 4, python 3.x on a project just to learn it. 1. soup = BeautifulSoup(s ) # use default parser 2. soup = BeautifulSoup…

HTML table parse problem (beautiful soup)
Programming Software Development by njparton
…problem entering the website and downloading the html, it's beautiful soup that's tripping me up. My python code and …LoginButton')).read() #Beautiful Soup bit, helps with deciphering webpage encoding, which otherwise isn't obvious soup = BeautifulSoup(''.join(page)) soup = soup.prettify() soup.findAll('table') except…

Re: HTML table parse problem (beautiful soup)
Programming Software Development by njparton
…'t work out how to do this next step using beautiful soup. For example the first row I'm interested in contains…

webscraping with beautiful soup (extracting images)
Programming Software Development by bkjfdghiuds
… import re def cleanHtml(i): i = str(i) # Convert the Beautiful Soup Tag to a string bS = BeautifulSoup(i) # Pass the string… to Beautiful Soup to strip out html # Find all of the text between…

Search particular text in HTML using beautiful soup and python
Programming Software Development by Aung Myat
… I just started learning Python and Beautiful Soup.
I am developing a script to …BeautifulSoup from BeautifulSoup import NavigableString def Check_Tester(): soup = BeautifulSoup(urllib2.urlopen("http://compat.sing…amp;ORDERBY=sporder").read()) Key_Word = soup.findAll('td',text='Running') def main(): Check_Tester()…

Re: Search particular text in HTML using beautiful soup and python
Programming Software Development by tomstratton
I'm not familiar with Beautiful Soup but I assume that soup (in your code) is a text string. If I'm …

Re: Returning only tags with certain siblings (Beautiful Soup)
Programming Software Development by Afroula
… AttributeError telling me that there was no next sibling (despite soup.prettify showing clearly that <title> and <pos… your suggestion, though. I have a lot to learn about Beautiful Soup!

Re: webscraping with beautiful soup (extracting images)
Programming Software Development by snippsat
…/index.php?/topic/95817-2-for-1-pics/page__st__500') soup = BeautifulSoup(url) links = soup.findAll('img', src=True) for link in links…

Re: Beautiful Soup and Python Error
Programming Software Development by Gribouillis
I get a better result with [code=python] page = urllib2.urlopen('http://www.locationary.com/place/en/US/North_Carolina/Raleigh/Noodles_&_Company-p1022884996.jsp').read() [/code] (I replaced %26 with &)

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
[QUOTE=Gribouillis;1712666]I get a better result with [code=python] page = urllib2.urlopen('http://www.locationary.com/place/en/US/North_Carolina/Raleigh/Noodles_&_Company-p1022884996.jsp').read() [/code] (I replaced %26 with &)[/QUOTE] Oh well... that's what my code looks like already. Daniweb just changed it a little... putting it on …

Re: Beautiful Soup and Python Error
Programming Software Development by Gribouillis
[QUOTE=jacob501;1712668]Oh well... that's what my code looks like already.
Daniweb just changed it a little... putting it on one line doesn't change anything for me... I still get the weird result ("&lsaquo (DOT)) Or do you mean that that link worked for you and you got the HTML from it?[/QUOTE] I mean did you replace the %26 in the url by…

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
[QUOTE=Gribouillis;1712669]I mean did you replace the %26 in the url by & ?[/QUOTE] Oh wow!!! Thank you so much! I looked through your code for differences at first but barely missed this. Thanks!! It works now.

Re: Beautiful Soup and Python Error
Programming Software Development by Gribouillis
hehe

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
[QUOTE=Gribouillis;1712672]hehe[/QUOTE] :)

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
Crap... it's not working anymore.

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
Why did it only work once??

Re: Beautiful Soup and Python Error
Programming Software Development by Gribouillis
[QUOTE=jacob501;1712678]Why did it only work once??[/QUOTE] I don't know why it worked only once. Obviously the content has a special encoding. I did this [code=python] from urllib import urlretrieve urlretrieve('http://www.locationary.com/place/en/US/North_Carolina/Raleigh/Noodles_%26_Company-p1022884996.jsp', 'myfile.jsp') [/code] Then when…

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
[QUOTE=Gribouillis;1712704]I don't know why it worked only once. Obviously the content has a special encoding. I did this [code=python] from urllib import urlretrieve urlretrieve('http://www.locationary.com/place/en/US/North_Carolina/Raleigh/Noodles_%26_Company-p1022884996.jsp', 'myfile.jsp') [/code] Then when I cat myfile.jsp in a terminal it…

Re: Beautiful Soup and Python Error
Programming Software Development by Gribouillis
[QUOTE=jacob501;1712709]Sorry.
I'm kind of new to all this programming stuff. What is a BOM and how will it help?[/QUOTE] The BOM is the first 2 bytes of the file. It's used to detect encoding (see Wikipedia). In our case, I found \x1f\x8b, and Google tells me that this marks files compressed with gzip. Indeed my Linux system detects a compressed …

Re: Beautiful Soup and Python Error
Programming Software Development by jacob501
Oh. Okay. I ran it a few times to check and it worked! Thanks! Now I know what a BOM is too!

Re: Beautiful Soup and Python Error
Programming Software Development by Gribouillis
[QUOTE=jacob501;1712721]Oh. Okay. I ran it a few times to check and it worked! Thanks! Now I know what a BOM is too![/QUOTE] You can also uncompress it without using a temporary file like this [code=python] from urllib2 import urlopen from gzip import GzipFile from cStringIO import StringIO fobj = urlopen('http://www.locationary.com/place/en/US/…

Re: Beautiful Soup default parser
Programming Software Development by Gribouillis
Startpage led to [this blog entry](https://medium.com/p/f2fa442daf99). Perhaps you have lxml on one OS and not on the other.

Re: Beautiful Soup default parser
Programming Software Development by rwe0
Thank you. Specifying 'html.parser' explicitly made it work. Yes, I had installed lxml on my Linux system and had no idea the default had been switched.
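
For readers of the "Beautiful Soup and Python Error" thread above, here is a minimal sketch of the gzip check it describes: look at the first two bytes of the downloaded content and decompress in memory when they are \x1f\x8b. Strictly speaking, those bytes are gzip's magic number rather than a byte order mark. The sketch is written for Python 3 (the thread's own snippets are Python 2), it is not the code from the truncated post, and the names GZIP_MAGIC and maybe_gunzip are hypothetical. The sample bytes stand in for a real response body, so nothing is fetched from the network.

```python
import gzip
import io

# The two-byte magic number that starts every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(raw: bytes) -> bytes:
    """Return raw unchanged unless it starts with the gzip magic number.

    Hypothetical helper for illustration only.
    """
    if raw[:2] == GZIP_MAGIC:
        # Decompress in memory, so no temporary file is needed.
        with gzip.GzipFile(fileobj=io.BytesIO(raw)) as fobj:
            return fobj.read()
    return raw


# Stand-in for a response body; a real script would read it from urlopen().
html = b"<html><body>Noodles &amp; Company</body></html>"
compressed = gzip.compress(html)

print(compressed[:2] == GZIP_MAGIC)        # the compressed body carries the magic bytes
print(maybe_gunzip(compressed) == html)    # compressed input is restored
print(maybe_gunzip(html) == html)          # plain input passes through untouched
```

The returned bytes can then be handed to BeautifulSoup as usual. Checking the magic bytes, instead of always decompressing, keeps the helper safe to call when the server sometimes sends uncompressed pages.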
web scraping with beautiful soup
Programming Software Development by Geethu_2
…urlopen(urls[0]).read() except: print urls[0] soup=BeautifulSoup(htmltext) urls.pop(0) print len(urls) …print link + " -not found" # soup = BeautifulSoup(page) # text_ang= str(soup.get_text().replace('\n',' ').replace('\t',' ').replace('\r',' …

Returning only tags with certain siblings (Beautiful Soup)
Programming Software Development by Afroula
…('sourcedata.xml') fixed = open('ftemp.txt','w') soup = BeautifulSoup(file, "lxml") divTag = soup.find_all("page") for tag in…

Re: Returning only tags with certain siblings (Beautiful Soup)
Programming Software Development by snippsat
…> </page>''' from bs4 import BeautifulSoup soup = BeautifulSoup(xml) title = soup.find_all('title') for index, item in enumerate(title): print…

Re: HTML table parse problem (beautiful soup)
Programming Software Development by sneekula
So, what are your errors?

Re: webscraping with beautiful soup (extracting images)
Programming Software Development by TrustyTony
> titleString = '<span rel='lightbox'><img src='(.*)' alt='Posted Image' class='bbc_img' />' Use double quotes to include the single quotes inside the string (I fixed it when moving your code here) But listIterator is not defined, what is it? Maybe # Print out the results to screen for t in findPatTitle: print…

Re: Search particular text in HTML using beautiful soup and python
Programming Software Development by TrustyTony
Yes, that's true: Beautiful Soup can take regular expressions. I have not used it myself, but there seems to be quite a lot of documentation and examples around. Where did you learn to use it? Maybe there is still something to learn, for example from [url]http://www.crummy.com/software/BeautifulSoup/download/2.x/documentation.html[/url]