I'm not sure how crazy this sounds, but bear with me. I have a long list of links, each labeled with a person's name. Each link opens a page of information about that person, and the email address appears after the other details. I need to make (or just find) something that will open each link, find the email, and copy it. This sounds kinda ludicrous to me, but I'm not as well informed on web scraping as you guys. The ultimate goal is to get each email into the corresponding cell in Excel (Mac, 2016). I have no idea how to go about this and have found zero solutions thus far. Any ideas? Thanks in advance.

Edit: I should also mention that the list of links we're using was put in by hand. There's no "master list" besides the one I made myself, but that one doesn't have links. I could make one with links, but the script/program would still need to go through each page and find the email.

Reply from diafol:

One approach: you could use DOMDocument or SimpleXML with or without XPath to get the DOM nodes (links) and then cut them to shreds with a preg_match. If you had posted the page, we could have had a look at the DOM, to see if it was a regular pattern. BTW - I'm suggesting PHP functions here, but it could be done with any server-side language.

//EDIT - just a quick search: http://stackoverflow.com/a/4423796/4629068

Loads of examples via Google. Not sure what you were searching for.
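The approach suggested above (parse each page, then pull the email out with a regular expression) can be sketched roughly as follows. This is a minimal sketch in Python rather than PHP, using only the standard library, since the pages themselves were not posted. The names `EmailFinder`, `extract_emails`, `fetch_page`, and `emails_to_csv` are hypothetical, the email pattern is deliberately simple, and the assumption is that each page contains the address either as plain text or in a `mailto:` link.

```python
import csv
import io
import re
from html.parser import HTMLParser
from urllib.request import urlopen

# Deliberately simple pattern (an assumption); it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailFinder(HTMLParser):
    """Collects email addresses from mailto: links and from text content."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def _add(self, text):
        # Keep addresses in page order, skipping repeats.
        for match in EMAIL_RE.findall(text):
            if match not in self.emails:
                self.emails.append(match)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.lower().startswith("mailto:"):
                    self._add(value)

    def handle_data(self, data):
        self._add(data)


def extract_emails(html):
    """Return the addresses found in one page's HTML, in order of appearance."""
    finder = EmailFinder()
    finder.feed(html)
    finder.close()
    return finder.emails


def fetch_page(url):
    """Download one page and decode it to text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def emails_to_csv(links, fetch=fetch_page):
    """links is an iterable of (name, url) pairs; returns CSV text.

    Only the first address found on each page is kept; the email
    column is left blank when a page has none.
    """
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "url", "email"])
    for name, url in links:
        found = extract_emails(fetch(url))
        writer.writerow([name, url, found[0] if found else ""])
    return out.getvalue()
```

Saving the returned text to a `.csv` file should give something Excel can open, with one row per person. If the pages require a login or build their content with JavaScript, a plain download like `fetch_page` will probably not see the email, and the pattern would need adjusting once the real page layout is known.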

"If you had posted the page, we could have had a look at the DOM, to see if it was a regular pattern."

See, I would, but I believe the actual information isn't information I'm allowed to publish. Essentially, I'm trying to make my own for security reasons. Looking into the PHP functions you suggested. Thanks, mate.
