| ||
| Extract URL How can I extract URLs from a web page, keeping only the URLs that belong to one specific site, for example "www.abc.com/32432/file.zip"? It should search for abc.com, and the extension can be zip, rar, or 001. Any help? |
| ||
| Re: Extract URL In Firefox on Windows you would put http://www.abc.com into the address bar and, once the site has loaded, press CTRL-U to bring up the source, then CTRL-C/CTRL-V whatever URLs you want. :twisted: I think the concept you're looking for is a website scraper. There are a lot of different options for doing this, from regular expressions to XPath, which is one of my personal favorites. Come back with some conceptual code and I'll be more than happy to help you work through it. |
| ||
| Re: Extract URL If you want to extract the URLs from the page, I have an existing script that extracts not only the links to other pages but also the links to pictures and other media. My script is as follows: function getlinks($url) { It may be badly written, but it does the job. So I shall see if I can do it with a preg_match function. ======================= Edit: I have now written a function that extracts the links more efficiently, as follows: <? And, as you can see, the function returns an array of the links. |
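Since the original question asks for links from one site with particular extensions, here is a minimal sketch of the regular-expression approach with that filter applied. It is not the poster's getlinks() code; the function name extract_site_links, its parameters, and the sample HTML are assumptions made up for illustration. It also assumes the links are absolute URLs that include a scheme such as http://.

```php
<?php
// Hypothetical helper, not the function from the post above.
// Returns the unique links in $html that point at $host (or one of its
// subdomains) and whose path ends in one of the given extensions.
function extract_site_links(string $html, string $host, array $extensions): array
{
    $links = [];

    // Capture the value of every href="..." or href='...' attribute.
    if (preg_match_all('/href\s*=\s*(["\'])(.*?)\1/i', $html, $matches)) {
        foreach ($matches[2] as $url) {
            $parts = parse_url($url);
            if ($parts === false || empty($parts['host']) || empty($parts['path'])) {
                continue; // relative or malformed URL: skipped in this sketch
            }

            $urlHost = strtolower($parts['host']);
            $suffix  = '.' . $host;
            // Accept abc.com itself and subdomains such as www.abc.com.
            $hostOk = $urlHost === $host
                || substr($urlHost, -strlen($suffix)) === $suffix;

            $ext = strtolower(pathinfo($parts['path'], PATHINFO_EXTENSION));

            if ($hostOk && in_array($ext, $extensions, true)) {
                $links[] = $url;
            }
        }
    }

    return array_values(array_unique($links));
}

// Sample input invented for the example.
$html = '<a href="http://www.abc.com/32432/file.zip">a</a>'
      . '<a href="http://www.abc.com/about.html">b</a>'
      . "<a href='http://other.example/file.rar'>c</a>"
      . '<a href="http://abc.com/parts/archive.001">d</a>';

print_r(extract_site_links($html, 'abc.com', ['zip', 'rar', '001']));
```

With the sample input, only the .zip and .001 links on abc.com should come back. To run it against a live page you could pass in the result of file_get_contents() for the page URL, assuming allow_url_fopen is enabled on your server.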
| ||
| Re: Extract URL Alright, since you responded with a great example of how to do it with regular expressions, I guess I can provide an XPath example using the DOM, as I mentioned previously. <?php http://images.google.com/imghp?hl=en&tab=wi The only thing to be aware of here is URLs that are relative rather than full paths. You would need to put some logic in place to add the domain back to them if it's not there already. |
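As a rough illustration of the XPath idea and of the relative-URL caveat, here is a minimal sketch using PHP's DOMDocument and DOMXPath. It is not the code from the post; the names extract_links_xpath and make_absolute, the base URL, and the sample HTML are all assumptions for the example. The URL resolution is deliberately simplified: it ignores ports, "../" segments, and non-HTTP schemes such as mailto:.

```php
<?php
// Hypothetical helper: collect every anchor href via XPath and
// turn relative links into absolute ones using $baseUrl.
function extract_links_xpath(string $html, string $baseUrl): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];

    // //a/@href selects the href attribute of every <a> element.
    foreach ($xpath->query('//a/@href') as $attr) {
        $links[] = make_absolute($attr->nodeValue, $baseUrl);
    }

    return $links;
}

// Hypothetical, simplified resolver: adds the scheme and domain back
// onto links that do not already carry them.
function make_absolute(string $href, string $baseUrl): string
{
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href; // already a full URL
    }

    $base = parse_url($baseUrl);
    $root = $base['scheme'] . '://' . $base['host'];

    if (strpos($href, '//') === 0) {
        return $base['scheme'] . ':' . $href; // protocol-relative link
    }
    if (strpos($href, '/') === 0) {
        return $root . $href; // relative to the site root
    }

    // Otherwise relative to the directory of the base page.
    $path = $base['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $dir . $href;
}

// Sample input invented for the example.
$html = '<a href="http://www.abc.com/32432/file.zip">full</a>'
      . '<a href="/files/b.rar">root</a>'
      . '<a href="c.001">relative</a>';

print_r(extract_links_xpath($html, 'http://www.abc.com/downloads/index.html'));
```

The same XPath query can be widened to other media, for example //img/@src for pictures. For anything beyond a quick script, a full resolver that follows the rules in RFC 3986 would be safer than this shortcut.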
©2003 - 2009 DaniWeb® LLC