
Parse through text to get domain names/URLs

I am trying to extract domain names/URLs from the title and description of eBay listings, but the text varies a lot.

Note that the domain names/URLs are limited to *.com, *.net, *.info, etc.

For instance, I'd have to parse through text such as:

IPAD SCREEN LENS.COM 3 4 5 Letter LLLL CCC Domain Name [have to get IPADSCREENLENS.COM here]

Argyle Socks.info - Website DOMAIN NAME For Sale! [have to get ARGYLESOCKS.INFO here]

Buyer Work.com Website DOMAIN NAME For Sale NR! [have to get BUYERWORK.COM here]
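One possible approach for titles like the three above, sketched here in Python for brevity (the same regular expression should port to PHP's preg_match): assume the domain always starts at the beginning of the title, capture everything up to the first recognised ".tld", and drop the spaces from that part only. The TLDS tuple and the domain_from_title name are my own placeholders, and the "starts at the beginning" rule is an assumption that only holds if all titles look like the samples.

```python
import re

# Assumed list of TLDs to look for; extend it to match the real listings.
TLDS = ("com", "net", "info", "org")

# Lazily capture everything from the start of the title up to the first ".tld".
_PATTERN = re.compile(r"^\s*(.+?)\.(" + "|".join(TLDS) + r")\b", re.IGNORECASE)


def domain_from_title(title):
    """Return the domain guessed from a listing title, or None if no TLD is found."""
    match = _PATTERN.match(title)
    if match is None:
        return None
    # Remove spaces and any other characters that cannot appear in a domain label.
    name = re.sub(r"[^A-Za-z0-9-]", "", match.group(1))
    if not name:
        return None
    return (name + "." + match.group(2)).upper()


if __name__ == "__main__":
    for sample in (
        "IPAD SCREEN LENS.COM 3 4 5 Letter LLLL CCC Domain Name",
        "Argyle Socks.info - Website DOMAIN NAME For Sale!",
        "Buyer Work.com Website DOMAIN NAME For Sale NR!",
    ):
        print(domain_from_title(sample))
```

This would break on a title such as "For Sale: Buyer Work.com", where extra words come before the domain, so treat it as a starting point rather than a complete solution.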

So far, I have been able to extract some "possible" URLs, listed like so:

Buyer Work.com Website DOMAIN NAME For Sale NR!

URLs in title: Array
(
[0] => WORK.COM
)
URLs in description: Array
(
[0] => BUYERWORK.COM
[1] => T0.GSTATIC.COM
[2] => DESERTPROPERTYSPECIALIST.COM
[3] => BUYER.JP
[4] => AUCTIONINSIGHTS.INFO
[5] => UYERWORK.COM
[7] => APPRAISE.PH
[14] => O.GI
[17] => AUCTIONLINK.TO
[19] => INDEX.HTML
)


Houses View.com Website DOMAIN NAME For Sale NR!
URLs in title: Array
(
[0] => VIEW.COM
)
URLs in description: Array
(
[0] => HOUSESVIEW.COM
[1] => ARTOFTHESTATE.CO.UK
[2] => LONDON.JP
[3] => AUCTIONINSIGHTS.INFO
[5] => APPRAISE.PH
[12] => O.GI
[15] => AUCTIONLINK.TO
[17] => INDEX.HTML
)
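Since the wanted domains are limited to a few TLDs, the description candidates above could first be filtered against a whitelist, which drops noise such as INDEX.HTML, O.GI, and BUYER.JP. This is a minimal sketch in Python; ALLOWED_TLDS and keep_allowed are hypothetical names, and the whitelist contains only the three TLDs named in the post.

```python
# Assumed whitelist, taken from the TLDs mentioned in the question.
ALLOWED_TLDS = {"COM", "NET", "INFO"}


def keep_allowed(candidates, allowed=ALLOWED_TLDS):
    """Keep candidates whose last label is an allowed TLD, preserving order."""
    kept = []
    for candidate in candidates:
        upper = candidate.upper()
        # Split on the last dot: "T0.GSTATIC.COM" -> ("T0.GSTATIC", ".", "COM").
        name, _, tld = upper.rpartition(".")
        if name and tld in allowed and upper not in kept:
            kept.append(upper)
    return kept


if __name__ == "__main__":
    description_urls = [
        "BUYERWORK.COM", "T0.GSTATIC.COM", "DESERTPROPERTYSPECIALIST.COM",
        "BUYER.JP", "AUCTIONINSIGHTS.INFO", "UYERWORK.COM", "APPRAISE.PH",
        "O.GI", "AUCTIONLINK.TO", "INDEX.HTML",
    ]
    print(keep_allowed(description_urls))
```

The filter does not remove partial matches such as UYERWORK.COM or unrelated hosts such as T0.GSTATIC.COM; those still need to be compared against the title.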

I was thinking of simply removing all the spaces, but I think that would complicate things even more.

Any suggestions would be really appreciated.

Regards,
John

jrhitokiri

Maybe this helps: to remove the spaces before and after the text, you can use the trim() function.

codewall

The trim() function only strips whitespace from the beginning and end of a string, not the single spaces between words. Anyway, removing every space would complicate things a bit more, and I may end up with more erratic output,

e.g.
Houses View.com Website DOMAIN NAME For Sale NR!
becomes
HousesView.comWebsiteDOMAINNAMEForSaleNR!

Buyer Work.com Website DOMAIN NAME For Sale NR!
becomes
BuyerWork.comWebsiteDOMAINNAMEForSaleNR!

etc etc..
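The squashed strings above may still be useful for checking, even if they are awkward to parse directly. As a rough sketch (in Python, with the hypothetical helper name best_match), each candidate already found in the description can be tested against the space-free title, keeping the longest one that occurs in it. This assumes the real domain appears both in the title and in the description, as it does in the two samples.

```python
import re


def best_match(title, candidates):
    """Return the longest candidate found inside the title with whitespace removed."""
    # "Buyer Work.com Website ..." -> "BUYERWORK.COMWEBSITE..."
    squashed = re.sub(r"\s+", "", title).upper()
    hits = [c.upper() for c in candidates if c.upper() in squashed]
    # Prefer the longest hit so BUYERWORK.COM wins over UYERWORK.COM.
    return max(hits, key=len) if hits else None


if __name__ == "__main__":
    title = "Buyer Work.com Website DOMAIN NAME For Sale NR!"
    description_urls = [
        "BUYERWORK.COM", "T0.GSTATIC.COM", "DESERTPROPERTYSPECIALIST.COM",
        "BUYER.JP", "AUCTIONINSIGHTS.INFO", "UYERWORK.COM", "INDEX.HTML",
    ]
    print(best_match(title, description_urls))
```

Choosing the longest hit is a design guess: it resolves the BUYERWORK.COM versus UYERWORK.COM case, but it would pick the wrong value if a longer unrelated candidate happened to appear inside the squashed title.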

Thanks for the reply, though. It assures me that someone is reading this thread. :>

Regards,

John

jrhitokiri
