I am trying to extract domain names/URLs from the title and description of eBay listings, but the text varies a lot.

Note that the URLs/domain names are limited to *.com, *.net, *.info, etc.

For instance, I'd have to parse through text such as:

IPAD SCREEN LENS.COM 3 4 5 Letter LLLL CCC Domain Name [have to get IPADSCREENLENS.COM here]

Argyle Socks.info - Website DOMAIN NAME For Sale! [have to get ARGYLESOCKS.INFO]

Buyer Work.com Website DOMAIN NAME For Sale NR! [have to get BUYERWORK.COM]
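
In all three sample titles the domain is the first thing in the title, so one possible starting point is a regex anchored at the start of the string. The sketch below rests on that assumption, which may not hold for every listing. The function name domainFromTitleStart and the default TLD list are made up for illustration.

```php
<?php
// Hypothetical helper: ASSUMES the domain is the first thing in the title
// and that only the TLDs passed in are of interest.
function domainFromTitleStart(string $title, array $tlds = ['com', 'net', 'info']): ?string
{
    $tldPattern = implode('|', array_map('preg_quote', $tlds));

    // Lazily capture letters, digits, hyphens and spaces from the start of
    // the title up to the first ".tld" that ends on a word boundary.
    $pattern = '/^\s*([a-z0-9][a-z0-9 -]*?)\.(' . $tldPattern . ')\b/i';
    if (!preg_match($pattern, $title, $m)) {
        return null; // no domain found at the start of the title
    }

    // Remove the spaces inside the captured name and normalise the case.
    return strtoupper(str_replace(' ', '', $m[1]) . '.' . $m[2]);
}
```

A title that starts with other words before the domain would pull those words into the result, so this is only useful where the "domain first" assumption holds.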

Besides this, I was able to get some "possible" URLs listed like so:

Buyer Work.com   Website DOMAIN NAME For Sale NR!

URLs in title: Array
(
    [0] => WORK.COM
)
URLs in description: Array
(
    [0] => BUYERWORK.COM
    [1] => T0.GSTATIC.COM
    [2] => DESERTPROPERTYSPECIALIST.COM
    [3] => BUYER.JP
    [4] => AUCTIONINSIGHTS.INFO
    [5] => UYERWORK.COM
    [7] => APPRAISE.PH
    [14] => O.GI
    [17] => AUCTIONLINK.TO
    [19] => INDEX.HTML
)


Houses View.com Website DOMAIN NAME For Sale NR!
URLs in title: Array
(
    [0] => VIEW.COM
)
URLs in description: Array
(
    [0] => HOUSESVIEW.COM
    [1] => ARTOFTHESTATE.CO.UK
    [2] => LONDON.JP
    [3] => AUCTIONINSIGHTS.INFO
    [5] => APPRAISE.PH
    [12] => O.GI
    [15] => AUCTIONLINK.TO
    [17] => INDEX.HTML
)

I was thinking of plainly removing the spaces, but I think that would complicate it more.
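
A possible alternative to removing every space is to glue together only the words that come right before the ".com" word in the title, then keep the longest result that also shows up in the description list. The sketch below assumes the description URLs are already collected in an uppercase array like the output above. The names titleCandidates and pickDomain are hypothetical, not part of any library.

```php
<?php
// Hypothetical helper: for each title word that looks like "name.tld",
// returns that word plus every variant built by prepending the words
// directly before it, one at a time.
function titleCandidates(string $title, array $tlds = ['com', 'net', 'info']): array
{
    $tldPattern = implode('|', array_map('preg_quote', $tlds));
    $words = preg_split('/\s+/', trim($title));
    $candidates = [];

    foreach ($words as $i => $word) {
        if (!preg_match('/^[a-z0-9-]+\.(?:' . $tldPattern . ')$/i', $word)) {
            continue; // not a "name.tld" word
        }
        $candidate = strtoupper($word);
        $candidates[] = $candidate;

        // Walk backwards while the preceding words are plain name characters.
        for ($j = $i - 1; $j >= 0 && preg_match('/^[a-z0-9-]+$/i', $words[$j]); $j--) {
            $candidate = strtoupper($words[$j]) . $candidate;
            $candidates[] = $candidate;
        }
    }
    return $candidates;
}

// Picks the longest title candidate that also appears in the description
// list (ASSUMED to be uppercase); returns null when nothing matches.
function pickDomain(string $title, array $descriptionUrls): ?string
{
    $best = null;
    foreach (titleCandidates($title) as $candidate) {
        if (in_array($candidate, $descriptionUrls, true)
            && ($best === null || strlen($candidate) > strlen($best))) {
            $best = $candidate;
        }
    }
    return $best;
}
```

For the "Buyer Work.com" listing above, the title yields WORK.COM and BUYERWORK.COM as candidates, and only BUYERWORK.COM appears in the description list, so that is the one returned. Stray entries such as UYERWORK.COM are ignored because they cannot be built from whole title words.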

Any suggestions would be really appreciated.

Regards,
John

Maybe this helps:
to remove the spaces before and after the text, you can use the trim() function.

The trim() function only strips whitespace from the beginning and end of a string, not the single spaces between words. Anyway, removing all the spaces would complicate things a bit more, and I may end up with more erratic output.

e.g.
Houses View.com Website DOMAIN NAME For Sale NR!
becomes
HousesView.comWebsiteDOMAINNAMEForSaleNR!

Buyer Work.com Website DOMAIN NAME For Sale NR!
becomes
BuyerWork.comWebsiteDOMAINNAMEForSaleNR!

and so on.
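
The difference between the two operations can be checked with a short snippet. The leading and trailing spaces in $title are added here purely for illustration and are not part of the real listing.

```php
<?php
// Illustrative input: the padding spaces at both ends are an assumption.
$title = '  Houses View.com Website DOMAIN NAME For Sale NR!  ';

// trim() strips whitespace only from the two ends of the string.
$trimmed = trim($title);

// str_replace() removes every space, including those between words.
$squashed = str_replace(' ', '', $title);
```

Here $trimmed still has the single spaces between words, while $squashed is the run-together string shown above.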

Thanks for the reply, though. It assures me that someone is reading this thread. :>

Regards,

John
