Hi Everyone,

I am currently working on a site that will have lots of European and Middle Eastern town names that include accented and other special characters. I have never had to deal with these characters before, and I foolishly assumed that setting everything to utf-8 would take care of it all for me. I have been pulling my hair out over this for a few hours now, and I am hoping that someone here can help me out.

I am calling a web service from GeoNames.org with grid coordinates as arguments. The result is returned in JSON format, which I then run through the DeserializeJSON function to pull out the value that I am interested in. However, when I output that value, any special characters are replaced with odd characters like the square root symbol.

I have checked the raw JSON string and found that the characters all look intact, but somewhere along the way they are getting mangled. The characters are also not displayed correctly if I do a cfdump of the response.


Here's an example of a WS request: http://ws.geonames.org/findNearbyPlaceNameJSON?lat=45.421&lng=0.8382&lang=en

and here's a trimmed down version of my page:

<cfprocessingdirective pageEncoding="utf-8"> 
			
	<cfhttp charset="utf-8" url="http://ws.geonames.org/findNearbyPlaceNameJSON?lat=#Left(getLatLon.lat,6)#&lng=#Left(getLatLon.lon,6)#&lang=en" result="response" />
			
	<cfset response = DeSerializeJSON(response.fileContent) />
			
	<cfdump var="#response#">			
			
	<cfset townName = response.geonames[1].name />


	<cfcontent type="text/html; charset=utf-8">
		<cfoutput>#ToString(townName)#</cfoutput>
	</cfcontent>


I realise that I have gone a bit OTT with the utf-8's but it has driven me to it :)

Any help would be much appreciated.
Thanks,
Paulo.

All 6 Replies

Hi Paulo!

Any luck with your problem? I am facing the same issue when I try to translate text with the Google Translate API. The JSON response I get back has all of its special characters messed up. I am not sure whether the API or deserializeJSON() is responsible for the mess, but I suspect it is deserializeJSON().

I don't know about your code, but Paulo's example shows a java.io.ByteArrayOutputStream as the fileContent.

<cfdump var="#response#">

A possible fix is to explicitly convert it to a UTF-8 string first, then deserialize.

...
<!--- response is a java.io.ByteArrayOutputStream for some reason?
      convert it to UTF8 string --->
<cfset utf8Response = response.fileContent.toString("UTF8")>
<cfset response = DeSerializeJSON(utf8Response) />
<cfset townName = response.geonames[1].name />
...

... or

...
<cfset response = DeSerializeJSON(ToString(response.fileContent.toByteArray(), "UTF8")) />
<cfset townName = response.geonames[1].name />
<cfcontent type="text/html; charset=utf-8"/>
<cfoutput>#townName#</cfoutput><br />
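
To show why the explicit conversion matters, here is a minimal sketch in plain Java (ColdFusion runs on the JVM, and the fix above is calling Java methods on the stream). It decodes the same UTF-8 bytes twice: once with the right charset and once with the wrong one. The class name, the sample town name, and the choice of ISO-8859-1 as the "wrong" charset are all assumptions for illustration. I don't know which default charset the server actually used; a square root symbol would be consistent with Mac OS Roman, but that is only a guess.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Decode the buffered response bytes with an explicitly chosen charset,
    // the same idea as ToString(fileContent.toByteArray(), "UTF8") in CFML.
    static String decode(ByteArrayOutputStream body, Charset charset) {
        return new String(body.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Hypothetical sample value standing in for an accented town name
        // (written with a \u escape so the source file encoding is irrelevant).
        String town = "P\u00e9rigueux";

        // Simulate a response body that arrives as raw UTF-8 bytes.
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] utf8 = town.getBytes(StandardCharsets.UTF_8);
        body.write(utf8, 0, utf8.length);

        String right = decode(body, StandardCharsets.UTF_8);
        String wrong = decode(body, StandardCharsets.ISO_8859_1);

        // The UTF-8 decode round-trips; the wrong charset turns the
        // two-byte sequence for the accented letter into two characters.
        System.out.println("UTF-8 decode matches: " + right.equals(town));
        System.out.println("ISO-8859-1 decode matches: " + wrong.equals(town));
        System.out.println("Extra characters after wrong decode: "
                + (wrong.length() - town.length()));
    }
}
```

The bytes themselves are never damaged. Only the step that turns bytes into a string can go wrong, which is why naming the charset at that step fixes the output.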

Try using utf-16.

Huh?

@aargh - YOU'RE THE MAN! It works great now! Thank you!

You're welcome. I don't know why the charset="utf-8" isn't taking effect, but at least there is a workaround!
