Hi everyone,

I've built an email marketing tool and have found that when some users edit content in the FCK editor, they add extra attributes to their links, e.g. target, class, style, etc.

I currently want the application to process only the href="" value, with the output containing just the URL, so I can store it in my database and build a tracking URL.

I am using regular expressions to edit the content transparently to the user; however, the result still contains the additional tag attributes.

Below is my code:

<cfsavecontent variable="arguments.pageContent">
<p><a target="_self" id="testID" name="myTest" lang="EU" accesskey="1" tabindex="1" title="test title" type="my content" charset="UTF-8" class="testClassName" style="testStyle sheet" href="http://www.facebook.com"><img width="1047" height="583" alt="" src="/siteMediaFiles/CELEBRITY%20COLLECTION%20HOME%20PAGE.jpg" /></a>&nbsp;<a onclick="window.open(this.href,'','resizable=yes,location=yes,menubar=yes,scrollbars=no,status=no,toolbar=no,fullscreen=no,dependent=no,status'); return false" href="http://www.google.com"><img width="793" height="481" alt="" src="/siteMediaFiles/Artisan-Riad-EMAIL.jpg" /></a><a target="_blank" href="http://www.gmail.com"><img width="289" height="609" alt="" src="/siteMediaFiles/BANNER.jpg" /></a></p>
</cfsavecontent>

<cfset matches = reMatch("<[aA].*?>.*?</[aA]>",arguments.PageContent)>

<cfset myLinks = arrayNew(1)>
<cfloop index="a" array="#matches#">
    <cfset myURL = rereplace(a, '">.*?</a>',"","all")>
    <cfset myURL = rereplace(myURL, '<a.*href="',"","all")>
    <cfset arrayAppend(myLinks,myURL)>
</cfloop>
<cfdump var="#myURL#">

<cfdump var="#myLinks#">

All 4 Replies

Not sure I'm following. Do you want to remove all the attributes in the anchor tag (<a>), except the href attribute/value? Or do you want to remove whatever is inside the anchor tag (tags, text, etc.)? Confused a bit, but when I see consecutive REMatches/REReplaces, I can usually reduce it to just 1 REMatch using 1 regexp.

... or do you want to extract the href URL for tracking purposes? Your goal wasn't clear to me either.

Hi

Yes, I would like to extract the URL for tracking purposes.
When I paste in my own HTML tags, I don't have any other attributes before the href attribute.

Example:

<a href="http://www.google.com">Google</a>

But when my users use HTML from other applications like Photoshop, or from the text editor within the CMS, it generates tags like

<a target="_parent" onclick="somefunction()" href="http://www.google.com">google</a>

So what I would like to do is strip out all attributes and values between <a and href="" to ensure that I only get the required string.

Any ideas?

OK, but do you want to store just the URL, i.e. http://www.somesite.com *only*, or the whole anchor string, i.e.

<a href="http://www.somesite.com"><img width="1047" height="583" alt="" src="/siteMediaFiles/CELEBRITY%20COLLECTION%20HOME%20PAGE.jpg" /></a>
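
If it is the URL only, one possible approach is to match just the opening anchor tag and capture the href value wherever it sits among the other attributes. The following is a minimal sketch rather than a tested drop-in. The variable names html, openTags and tag are made up for illustration, and it assumes that href values are double-quoted and that no attribute value contains a > character.

```cfml
<!--- Sample input, based on the tags posted above (illustrative only) --->
<cfset html = '<a target="_parent" onclick="somefunction()" href="http://www.google.com">google</a> <a href="http://www.gmail.com" class="x">mail</a>'>

<!--- Find only the opening <a ...> tags that carry a double-quoted href.
      The optional group ([^>]*\s)? skips any attributes placed before href. --->
<cfset openTags = reMatchNoCase('<a\s+([^>]*\s)?href="[^"]*"[^>]*>', html)>

<cfset myLinks = arrayNew(1)>
<cfloop index="tag" array="#openTags#">
    <!--- Replace the whole tag with the second captured group, i.e. the href value --->
    <cfset arrayAppend(myLinks, reReplaceNoCase(tag, '^<a\s+([^>]*\s)?href="([^"]*)"[^>]*>$', '\2'))>
</cfloop>

<cfdump var="#myLinks#">
```

With the sample input, myLinks should end up holding only the two URLs, regardless of whether href comes first, last, or in the middle of the tag. Requiring whitespace immediately before href is a deliberate choice so that an attribute such as data-href is not picked up by mistake. Single-quoted or unquoted href values would need a broader pattern.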