
We already have built-in trim methods, but trimming doesn't get rid of unwanted extra spaces inside the string. This is where the Normalize method comes into play. It trims left, it trims right, but most importantly it also trims on the inside; one could say it trims "inside-out". In effect it treats the string much as a browser does when it collapses runs of whitespace while rendering HTML.
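To make the difference from plain trimming concrete, here is a minimal sketch. The helper name `collapseWhitespace` and the sample string are made up for illustration; the match-and-join idea is the one this post is about.

```javascript
// Hypothetical helper (name chosen for this example only):
// collapse every run of whitespace into one space and drop leading/trailing runs.
function collapseWhitespace(text) {
  // match() returns null when there is no non-whitespace run, hence the fallback.
  var words = String(text).match(/\S+/g) || [];
  return words.join(' ');
}

var sample = '  too   many \t spaces \n here  ';

// trim() only touches the two ends; the inner runs survive.
console.log(JSON.stringify(sample.trim()));

// The collapse handles the ends and the inside in one pass.
console.log(JSON.stringify(collapseWhitespace(sample))); // "too many spaces here"
```

Note that `\S` follows JavaScript's definition of whitespace, so tabs and line breaks are collapsed along with ordinary spaces.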

I had a hard time deciding what to name the method, but I finally did.

String.Normalize();

Using it is straightforward and simple:

[string_var].Normalize();

I've also taken care to make it usable in a manner like:

"".Normalize("   this   string   contains to many   spaces   ");

//to return:

>> "this string contains to many spaces"//not anymore.

That might prove useful on rare occasions, and most probably never. Still, it doesn't hurt to have it at hand, even though using it like:

"   this   string   contains to many   spaces   ".Normalize();

as with other existing methods would also be possible, but I don't consider it as clean and as readable as the previous form.

When working with strings, though, speed is always an issue.
So I wrote a test and ran it a few times. It turns out that 'blazing' is not an overstatement.

To make a browser 'sweat' and sidestep some pseudo-optimization cheats, I took a string of 1024 bytes [1 KB] x 100 000 iterations = roughly 100 MB worth of data processed, and the results were, well, very satisfactory (~3 seconds). Opening, that is rendering, a 100 MB plain-text page locally would most probably take more time, or at least as much.

The test string is a 'worst-case scenario', highly atomized: every 2-letter "word" is separated by 2 white-space characters.
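A string of that shape can also be generated instead of typed by hand. This is only a sketch: the function name `buildWorstCase` and the alphabet are assumptions for the example, and it does not reproduce the exact string used in the test page.

```javascript
// Hypothetical generator: two-letter "words" separated by two spaces,
// with a total length that does not exceed targetLength.
function buildWorstCase(targetLength) {
  var alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
  var parts = [];
  var length = 0; // length the joined string would have, plus a trailing separator
  for (var i = 0; length + 2 <= targetLength; i++) {
    var ch = alphabet.charAt(i % alphabet.length);
    parts.push(ch + ch);
    length += 4; // two letters plus a two-space separator
  }
  return parts.join('  ');
}

var worstCase = buildWorstCase(1024);
console.log(worstCase.length);              // 1022
console.log(worstCase.match(/\S+/g).length); // 256
```

With n words the joined string is 4n - 2 characters long, which is why a 1024-character target yields 256 words and 1022 characters.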

The code, which as it turns out could also serve as a >>real-world<< browser benchmark (it is provided below), takes only the bare algorithm of the method presented and adds some extra optimization code that, in my experience, makes this lengthy string iteration as fast as possible.

The loop used is my fastest.
For the regexp pattern, the constructor is used (which in modern browsers still provides a small improvement, although a barely noticeable one).
The result assignment line is enclosed in (), where the improvement is very noticeable, etc.
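The shape of that measurement loop can be sketched outside a browser as well. The function name `timeNormalize` and its return object are made up for this example, and the sketch makes no attempt to reproduce the timings reported in this post; it only shows the structure: a pattern built once, a counting-down `while` loop, and a wall-clock difference.

```javascript
// Hypothetical timing helper, not the test page's code.
function timeNormalize(text, iterations) {
  var re = new RegExp('\\S+', 'g'); // constructed once, outside the loop
  var result;
  var start = Date.now();
  while (iterations-- > 0) {
    // The fallback array guards against match() returning null.
    result = (text.match(re) || []).join(' ');
  }
  var seconds = (Date.now() - start) / 1000;
  return { seconds: seconds, result: result };
}

var run = timeNormalize('aa  bb  cc', 100000);
console.log('parsed in: ' + run.seconds + ' seconds, result: ' + run.result);
```

`Date.now()` has millisecond resolution at best, so very short runs will report 0; a large iteration count, as in the test page, keeps the measurement meaningful.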

The code is below; the results taken from my machine:

First click time:
Op 3.112 seconds
IE 3.136 seconds
Sa 3.316 seconds
Fx 3.402 seconds
Ch 4.801 seconds
[all latest release browser versions]

(your actual results will differ depending on your hardware)

P.S.:
The second click changes the string and the results, but that's not very important because the second click works on an already normalized string.

The Test Page:

<!doctype html>
<html>
<head>
	<title>String Normalize: 100MB worth data</title>
	<style>
		#cnt { word-wrap: break-word }
	</style>
	<script>

	function go(){
		var cnt = document.getElementById('cnt');
		var s = cnt.innerHTML; // test string taken from the <pre> element

		// pattern built once, outside the loop
		var re = new RegExp("\\S+","gi");

		var c, endT, iter=100000;
		var start = new Date();
		while(iter--){ //the actual workplace
			(c = s.match(re).join(' '));
		}
		endT = new Date();

		// show the elapsed time followed by the normalized string
		return cnt.innerHTML=
			"parsed in: "+
				((endT.valueOf()-start.valueOf())/1000)+
			' seconds!'+'<br>'+c.fontcolor('red');
	}

	// any click on the page runs the test
	onclick=function(){go()}
	</script>
</head>
<body>
<p>click: test/result...</p>
<pre id='cnt'>oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  qq  rr  ss  tt  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  99  oo  pp  uu  vv  xx  yy  zz  00  11  22  33  44  55  66  77  88  00</pre>

</body>
</html>

All suggestions and remarks are welcome.
Have fun.

String.prototype.Normalize=
/*b.b Troy III p.a.e*/
// ||[] keeps empty or whitespace-only input from throwing (match returns null)
function(x){return((x||this).match(/\S+/gi)||[]).join(' ')}
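One detail worth keeping in mind with this technique: `String.prototype.match` returns `null` when the string contains no non-whitespace characters, so an unguarded `.join()` would throw on empty or whitespace-only input. The sketch below shows a guarded variant attached under a hypothetical name, `collapseSpaces`, chosen here so it cannot clash with the method above; using `Object.defineProperty` to keep it non-enumerable is my assumption about a reasonable way to extend a built-in, not something this thread prescribes.

```javascript
// Hypothetical method name; illustration only.
Object.defineProperty(String.prototype, 'collapseSpaces', {
  value: function (other) {
    // Use the argument when one is given, otherwise the string itself.
    var source = String(other || this);
    // Fall back to an empty array when match() finds nothing.
    return (source.match(/\S+/g) || []).join(' ');
  },
  writable: true,
  configurable: true,
  enumerable: false // stays out of for...in loops
});

console.log('  a   b  '.collapseSpaces());    // a b
console.log(''.collapseSpaces('  c   d  '));  // c d
console.log('   '.collapseSpaces() === '');   // true
```

Both call forms discussed earlier work here, and whitespace-only input yields an empty string instead of a TypeError.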
Last Post by Troy III

I had to change my mind: I get inconsistent results with this [problem]! Would you mind checking whether you are able to prototype the String object in your Firefox?
'Cause I'm having some problems with mine.


Nope, the info was correct!
Sorry I had to delete it:
>>"You can't prototype the String and other built-in objects in Firefox!"<<
Not anymore!
I wonder though: is Firefox missing its old NN4.7 days, and the reputation[?!]
