Sep 28th, 2009

Re: Dehasher script malfunctioning

Quote originally posted by cwarn23:
I just discovered that when using ascii_base() on the crc32 hash of 9 and 0 they both end up with blank strings which is a bit of a bug. For now I might try and refine my compression function. I can't really do much to edit your function because it has so many elements I haven't seen before.
You'll get strings that look blank in some cases. However, if you check the string length, you'll notice that the string still contains characters. Not all characters in the ASCII table are visible, but the bytes are still in the string.
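
A minimal sketch of this point, with the byte values 0 and 7 chosen only as examples of non-printing characters:

```php
<?php
// Build a two-byte string from control characters that most terminals
// and browsers do not render.
$s = chr(0) . chr(7);

// The string looks blank when echoed, but it still holds two bytes.
echo strlen($s), "\n";   // 2

// bin2hex() shows the raw bytes as hexadecimal, which makes them visible.
echo bin2hex($s), "\n";  // 0007
```

Checking strlen() or bin2hex() is usually enough to tell an empty string from one made of invisible bytes.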

However, I've noticed that the function does not work for very large integers, because PHP cannot do exact arithmetic on integers beyond its native integer range.
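
A small sketch of what goes wrong, assuming a 64-bit PHP build for the second check (on a 32-bit build the same thing happens at a much smaller value):

```php
<?php
// Going past the largest native integer silently converts the value to a float.
$big = PHP_INT_MAX + 1;
var_dump(is_float($big));   // bool(true)

// Assumption: 64-bit build. A float this large cannot represent every
// whole number, so adding 1 leaves the value unchanged.
var_dump($big + 1 == $big);
```

Once the value is a float, the remainder and quotient used for the conversion are no longer exact, which is why an arbitrary-precision library is needed.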

There are some workarounds for this in the comments on:
http://www.php.net/manual/en/function.base-convert.php

If you have bcmath enabled, you can rely on it to do the arithmetic correctly.

Below is the function modified to use BCMath.

    // Fall back to native arithmetic when the BCMath extension is missing.
    // The fallback is only exact for numbers that fit in a native integer.
    if (!function_exists('bcdiv')) {
        //echo "No BC Math\n";
        function bcdiv($dividend, $divisor) {
            $quotient = floor($dividend / $divisor);
            return $quotient;
        }
        function bcmod($dividend, $modulo) {
            $remainder = $dividend % $modulo;
            return $remainder;
        }
    } else {
        //echo "Using BC Math\n";
    }

    /**
     * Convert decimal to a base of at most 255, made up of ASCII chars
     *
     * @param Int $num
     * @param Int $base (2-255)
     * @return ASCII String
     */
    function base255($num, $base = 255) {
        if ($num < 0) $num = -$num;
        $ret = array();
        // use >= so that a value equal to the base is split into two digits
        while ($num >= $base) {
            $rem = bcmod($num, $base);
            $num = bcdiv($num, $base);
            $ret[] = chr((int) $rem);
        }
        $ret[] = chr((int) $num);
        return implode('', array_reverse($ret));
    }

I renamed it to base255 so it makes more sense. It should now give you correct values if you have bcmath.
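
One way to check values like these is a round trip. The sketch below is only an illustration: encode_bytes() and decode_bytes() are hypothetical names, and it uses native integers instead of BCMath, so it is only exact up to PHP_INT_MAX.

```php
<?php
// Hypothetical helper: encode a non-negative native integer as base-N bytes.
// Uses intdiv() and %, so it is only exact for values up to PHP_INT_MAX.
function encode_bytes(int $num, int $base = 255): string {
    $out = '';
    do {
        $out = chr($num % $base) . $out; // prepend the least significant digit
        $num = intdiv($num, $base);
    } while ($num > 0);
    return $out;
}

// Hypothetical inverse: read the bytes back into an integer.
function decode_bytes(string $bytes, int $base = 255): int {
    $num = 0;
    foreach (str_split($bytes) as $byte) {
        $num = $num * $base + ord($byte);
    }
    return $num;
}

$encoded = encode_bytes(1000000);
echo strlen($encoded), "\n";        // 3, since 255^2 <= 1000000 < 255^3
echo decode_bytes($encoded), "\n";  // 1000000
```

If decoding does not give back the original number, the encoder has lost information somewhere.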

I just profiled the function. It doesn't seem to use much memory at all, only around 260 KB at most. I tested both with and without BCMath. Are you sure it isn't something else?
Last edited by digital-ether; Oct 7th, 2009 at 12:19 pm.
Posted by digital-ether (Moderator)
Sep 28th, 2009

Quote originally posted by cwarn23:
I can't really do much to edit your function because it has so many elements I haven't seen before.
It is actually very basic.

The only odd operations used are:

% - modulo, or remainder
chr() - returns the character represented by a number in the ASCII table
floor() - rounds a float down to the nearest whole number

The modulo returns the remainder after dividing.

eg: 5%2 = 1
ie: 5/2 = 2 remainder 1

chr(97) = a
The letter a is represented by the number 97 in ASCII.

Something like:

$chr = array(97=>'a', 98=>'b' ... 255);
so chr(97) = $chr[97];

And floor just removes everything after the decimal point (for positive numbers).

eg: 5/2 = 2.5
floor(2.5) = 2
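
The three operations can be tried directly. This is just a quick sketch; ord() is not used in the function and is added here only to show the reverse lookup of chr():

```php
<?php
// % gives the remainder of an integer division.
echo 5 % 2, "\n";     // 1

// chr() maps a byte value to its character; ord() is the reverse lookup.
echo chr(97), "\n";   // a
echo ord('a'), "\n";  // 97

// floor() rounds down and returns a float, so cast when an int is needed.
var_dump(floor(5 / 2));        // float(2)
var_dump((int) floor(5 / 2));  // int(2)
```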


Here is the function with comments:

    /**
     * Convert decimal to a base of at most 255, made up of ASCII chars
     *
     * @param Int $num
     * @param Int $base (2-255)
     * @return ASCII String
     */
    function base255($num, $base = 255) {
        // remove the negative sign by multiplying by -1 if $num is negative
        if ($num < 0) $num = -$num;
        // an array to hold the digits of the new number
        $ret = array();
        // while the number is at least as large as our base, we just keep dividing it by the base
        while ($num >= $base) {
            // get the remainder after dividing by the base
            $rem = bcmod($num, $base);
            // divide by the base to move up one unit
            $num = bcdiv($num, $base);
            // the remainders of each division make up the new number
            // we save the character the remainder represents in ASCII so we only have to save one character, instead of the number
            $ret[] = chr((int) $rem);
        }
        // since the number is less than the base, it is the remainder itself
        $ret[] = chr((int) $num);
        // we reverse the order of chars, since we started calculating remainders from the smallest unit
        return implode('', array_reverse($ret));
    }

I think it's simplest to look at it when converting base 10 to base 10.

123 would be:

123/10 = 12 R 3
12/10 = 1 R 2
1
---------------
1 R2 R3 or 123

In order to do the first line with PHP:
123/10 = 12 R 3
We need to do:

    $number = floor(123/10); // 12
    $remainder = 123%10; // 3
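
Repeating those two steps until the number is smaller than the base gives all the digits. A minimal sketch follows; to_digits() is a hypothetical helper that returns plain numbers instead of characters, and it uses intdiv(), which needs PHP 7 or later:

```php
<?php
// Hypothetical helper: list the digits of $num in the given base,
// most significant digit first, by repeated division.
function to_digits(int $num, int $base): array {
    $digits = array();
    while ($num >= $base) {
        array_unshift($digits, $num % $base); // the remainder is the next digit
        $num = intdiv($num, $base);           // move up one unit
    }
    array_unshift($digits, $num);             // what is left is the leading digit
    return $digits;
}

echo implode(' ', to_digits(123, 10)), "\n"; // 1 2 3
echo implode(' ', to_digits(123, 2)), "\n";  // 1 1 1 1 0 1 1
```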

I hope that helps.
Posted by digital-ether (Moderator)
Sep 28th, 2009

I managed to make a better function which doesn't have the gap symbols and is as follows:
    function compress_string($string) {
        $str=array();
        // map each hex digit to a number from 1 to 16
        $charconvert=array('a'=>1,'b'=>2,'c'=>3,'d'=>4,'e'=>5,'f'=>6,'1'=>7,'2'=>8,'3'=>9,'4'=>10,'5'=>11,'6'=>12,'7'=>13,'8'=>14,'9'=>15,'0'=>16);
        // split the input into pairs of characters
        $arr=str_split($string,2);
        while (!empty($arr)) {
            for ($i=0;isset($arr[$i]);$i++) {
                $char=str_split($arr[$i],1);
                unset($arr[$i]);
                // combine the pair into one value, skipping the first 32 ASCII codes
                $v=($charconvert[$char[0]]*$charconvert[$char[1]])+32;
                if ($v<256) {
                    $str[]=chr($v);
                } else {
                    $str[]=chr($charconvert[$char[0]]);
                    $arr[$i]=$char[1];
                    $arr=implode('',$arr);
                }
            }
        }
        return implode('',$str);
    }
The function above compresses the string to half its size and skips the first 32 characters of the ASCII table, which are useless to me. I will try this function for a few days and see how it works; hopefully this will be the one.
Posted by cwarn23 (Featured Poster)
Sep 28th, 2009

Edit to my previous post:
I discovered my function had a few memory leaks and fixed it, ending up with the following:
    function compress_string($string) {
        $str=array();
        $charconvert=array('a'=>1,'b'=>2,'c'=>3,'d'=>4,'e'=>5,'f'=>6,'1'=>7,
            '2'=>8,'3'=>9,'4'=>10,'5'=>11,'6'=>12,'7'=>13,'8'=>14,'9'=>15,'0'=>16);
        $arr=str_split($string,2);
        while (!empty($arr[0]) || $arr[0]===0) {
            for ($i=0;isset($arr[$i]);$i++) {
                $char=str_split($arr[$i],1);
                $arr[$i]='';
                $v=($charconvert[$char[0]]*$charconvert[$char[1]])+32;
                if ($v<256) {
                    $str[]=chr($v);
                } else {
                    $str[]=$char[0];
                    $arr[$i]=$char[1];
                    $arrs=implode('',$arr);
                    unset($arr);
                    $arr=str_split($arrs,2);
                    unset($arrs);
                }
                unset($v);
            }
        }
        unset($arr,$char,$charconvert);
        return implode('',$str);
    }
Posted by cwarn23 (Featured Poster)
Sep 28th, 2009

Quote originally posted by cwarn23:
I discovered my function had a few memory leaks and fixed it ...
Multiplication is commutative (a*b gives the same result as b*a), so you'll get numerous collisions.

    $v=($charconvert[$char[0]]*$charconvert[$char[1]])+32;

eg:

    compress_string('42'); // p
    compress_string('24'); // p
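
To see how widespread this is, here is a rough sketch that groups all 256 possible pairs by the value the formula produces. pair_value() is a hypothetical helper that repeats only the multiplication step, not the whole compress_string() function:

```php
<?php
// The digit table from the compress_string() posts above.
$charconvert = array('a'=>1,'b'=>2,'c'=>3,'d'=>4,'e'=>5,'f'=>6,'1'=>7,'2'=>8,
    '3'=>9,'4'=>10,'5'=>11,'6'=>12,'7'=>13,'8'=>14,'9'=>15,'0'=>16);

// Hypothetical helper: the value compress_string() computes for one pair.
function pair_value(string $pair, array $table): int {
    return $table[$pair[0]] * $table[$pair[1]] + 32;
}

// Group all 256 possible pairs by the value they produce.
$groups = array();
foreach (array_keys($charconvert) as $first) {
    foreach (array_keys($charconvert) as $second) {
        $pair = $first . $second;
        $groups[pair_value($pair, $charconvert)][] = $pair;
    }
}

echo pair_value('42', $charconvert), "\n"; // 112, and chr(112) is "p"
echo pair_value('24', $charconvert), "\n"; // 112
echo implode(' ', $groups[44]), "\n";      // every pair whose product is 12
echo count($groups), " distinct values for 256 pairs\n";
```

Swapped pairs are not the only problem: any two pairs with the same product collide, for example 'bf' (2*6) and 'cd' (3*4).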

I'm not sure what you're after, so the alternatives I've given are generalizations.
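
If the goal is only to halve the length of a hexadecimal hash without collisions, one option is to pack each pair of hex digits into one byte. This is an assumption about the intent, not something from the posts above. The sketch uses PHP's built-in hex2bin() and bin2hex() (PHP 5.4 or later); note that the packed string can contain any byte value, including the first 32 ASCII codes, so it needs a binary-safe column.

```php
<?php
$hash = sha1('hello');       // 40 hexadecimal characters
$packed = hex2bin($hash);    // 20 raw bytes, one per pair of hex digits

echo strlen($hash), "\n";    // 40
echo strlen($packed), "\n";  // 20

// The mapping is reversible, so two different hashes never pack to the same bytes.
var_dump(bin2hex($packed) === $hash);       // bool(true)
var_dump(hex2bin('42') === hex2bin('24'));  // bool(false)
```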
Posted by digital-ether (Moderator)
Sep 28th, 2009

Thanks for pointing that out, but I should be able to code a reader that can filter out the incorrect matches. So, as you pointed out, the string "aecd" would have the same result as "eadc", but "aecd" wouldn't match "aced". Also, the normal workaround, if there were enough symbols, is the following line.

    $v=($charconvert[$char[0]]*$charconvert[$char[1]])+32-$tmp;

However, I have another workaround: when pulling the data from the database, rehash the original data and check whether it matches what was requested. An example is as follows:
    $_GET['q']=trim($_GET['q']);
    function compress_string($string) {
        $str=array();
        $charconvert=array('a'=>1,'b'=>2,'c'=>3,'d'=>4,'e'=>5,'f'=>6,'1'=>7,'2'=>8,'3'=>9,'4'=>10,'5'=>11,'6'=>12,'7'=>13,'8'=>14,'9'=>15,'0'=>16);
        $arr=str_split($string,2);
        while (!empty($arr[0]) || $arr[0]===0) {
            for ($i=0;isset($arr[$i]);$i++) {
                $char=str_split($arr[$i],1);
                $arr[$i]='';
                if (empty($charconvert[$char[1]])) {
                    $tmp=1;
                } else {
                    $tmp=$charconvert[$char[1]];
                }

                $v=($charconvert[$char[0]]*$tmp)+32;
                if ($v<256) {
                    $str[]=chr($v);
                } else {
                    $str[]=$char[0];
                    $arr[$i]=$char[1];
                    $arrs=implode('',$arr);
                    unset($arr);
                    $arr=str_split($arrs,2);
                    unset($arrs);
                }
                unset($v,$tmp);
            }
        }
        unset($arr,$char,$charconvert);
        return implode('',$str);
    }
    // NOTE: $hash is not defined anywhere in this snippet; it is assumed to be
    // set earlier to a trusted column / algorithm name.
    // The mysql_* functions used here were removed in PHP 7.
    if ($_GET['hash']>0) {
        $r=mysql_query('SELECT `id` FROM `hash` WHERE `'.$hash.'`="'.mysql_real_escape_string(compress_string($_GET['q'])).'"');
    } else {
        $r=mysql_query('SELECT `id` FROM `hash` WHERE `sha1`="'.mysql_real_escape_string(compress_string(substr($_GET['q'],0,4).hash('crc32',$_GET['q']).hash('crc32b',$_GET['q']))).'"');
    }
    if (mysql_num_rows($r)==0) {
        echo '<table border=0 cellpadding=3 cellspacing=0 bgcolor="#D0D0D0"><tr bgcolor="#D0D0D0"><td bgcolor="#D0D0D0"><b>No Results found for '.htmlentities($_GET['q'],ENT_QUOTES).'</b></td></tr></table>'."\r\n";
    } else {
        echo '<table border=1 cellpadding=2 cellspacing=0 style="border-top:1px; border-top-color:#FFFFFF"><tr bgcolor="#D0D0D0" style="font-family:arial; font-weight:bolder; border-top:1px; border-top-color:#FFFFFF"><td bgcolor="#D0D0D0">Translation</td><td bgcolor="#D0D0D0">SHA1</td><td bgcolor="#D0D0D0">Crc32</td><td bgcolor="#D0D0D0">Crc32b</td></tr>'."\r\n";
        while ($data=mysql_fetch_assoc($r)) {
            // only show rows whose real hash matches the request
            if ($_GET['q']==hash($hash,$data['id'])) {
                // escape the stored text before printing it into the page
                echo '<tr><td bgcolor="#D0FFFF"><textarea style="width:'.((strlen($data['id'])*10)).'px; height:16px; overflow-y:hidden;" scrolling=no>'.htmlentities($data['id'],ENT_QUOTES).'</textarea></td><td>'.hash('sha1',$data['id']).'</td><td>'.hash('crc32',$data['id']).'</td><td>'.hash('crc32b',$data['id'])."</td></tr>\r\n";
            }
        }
        echo "</table>\r\n";
    }
Also, I did a test and for some reason my script does not always suffer from that bug, or at least not in my test. But I did alter the function to the following:
    function compress_string($string) {
        $str=array();
        $charconvert=array('a'=>1,'b'=>2,'c'=>3,'d'=>4,'e'=>5,'f'=>6,'1'=>7,'2'=>8,'3'=>9,'4'=>10,'5'=>11,'6'=>12,'7'=>13,'8'=>14,'9'=>15,'0'=>16);
        $arr=str_split($string,2);
        while (!empty($arr[0]) || $arr[0]===0) {
            for ($i=0;isset($arr[$i]);$i++) {
                $char=str_split($arr[$i],1);
                $arr[$i]='';
                if (empty($charconvert[$char[1]])) {
                    $tmp=1;
                } else {
                    $tmp=$charconvert[$char[1]];
                }

                $v=($charconvert[$char[0]]*$tmp)+32;
                if ($v<256) {
                    $str[]=chr($v);
                } else {
                    $str[]=$char[0];
                    $arr[$i]=$char[1];
                    $arrs=implode('',$arr);
                    unset($arr);
                    $arr=str_split($arrs,2);
                    unset($arrs);
                }
                unset($v,$tmp);
            }
        }
        unset($arr,$char,$charconvert);
        return implode('',$str);
    }
So I guess I'm just lucky that the bug doesn't happen on mine all the time, but I will still add that second validator. I also calculated that in 13 days I can fill 30 GB of dehashing data, calculating to at least 5 digits. So the script works for my needs even with its bug of reversed characters giving the same match. Because MySQL can still filter the results from millions of rows down to a few dozen, it should still do the job.
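
The second validator can be sketched without a database. This is only an illustration: filter_by_rehash() is a hypothetical name, and the $candidates array stands in for the rows MySQL returns for one compressed key.

```php
<?php
// Hypothetical helper: keep only the candidate plaintexts whose real hash
// matches the requested one. $candidates stands in for rows fetched from
// the lookup table; no database is used in this sketch.
function filter_by_rehash(array $candidates, string $algo, string $wanted): array {
    $matches = array();
    foreach ($candidates as $plain) {
        if (hash($algo, $plain) === strtolower($wanted)) {
            $matches[] = $plain;
        }
    }
    return $matches;
}

$wanted = sha1('42');
// '24' stands in for a row that was returned for the same compressed key
// but whose real hash is different.
print_r(filter_by_rehash(array('24', '42'), 'sha1', $wanted));
```

Only '42' survives the check, so false positives from the lossy key never reach the output.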
Posted by cwarn23 (Featured Poster)
Sep 29th, 2009

Quote originally posted by cwarn23:
So I guess I'm just lucky that the bug doesn't happen on mine all the time, but I will still add that second validator. I also calculated that in 13 days I can fill 30 GB of dehashing data, calculating to at least 5 digits. So the script works for my needs even with its bug of reversed characters giving the same match. Because MySQL can still filter the results from millions of rows down to a few dozen, it should still do the job.
The reason you don't get duplicates is that SHA1 hashes are, for practical purposes, unique. So even with the redundancy introduced by the function, the compressed values still rarely collide with each other.

It is much the same as cutting the SHA1 in half and keeping the first half. You will have a low probability of collisions.

So a function like:

    function compress($sha) {
        $parts = str_split($sha, 20);
        return $parts[0];
    }

would achieve the same.
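
To put a rough number on "low probability", here is a sketch using the usual birthday-bound approximation. It assumes the truncated hashes behave like uniformly random 80-bit values (20 hex characters); collision_probability() is a hypothetical helper, and the row counts are only examples.

```php
<?php
// Birthday-bound approximation: chance that at least two of $n random
// values of $bits bits coincide, p ~ 1 - exp(-n(n-1) / 2^(bits+1)).
function collision_probability(float $n, int $bits): float {
    $x = $n * ($n - 1) / pow(2.0, $bits + 1);
    return -expm1(-$x); // 1 - exp(-x), computed accurately for tiny x
}

// Half of a SHA1 is 20 hex characters, i.e. 80 bits.
printf("%.3e\n", collision_probability(1e6, 80)); // one million stored hashes
printf("%.3e\n", collision_probability(1e9, 80)); // one billion stored hashes
```

Even at a billion rows the estimate stays well below one in a million, which fits the observation that duplicates do not show up in practice.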
Posted by digital-ether (Moderator)
