
My script merges 18 files and returns every number that occurs at least 13 times in the merged data. I timed the script, and array_count_values() is so slow that it accounts for 80% of the 2.35-second run time. The files are large, about 200,000 numbers each, so the merged array holds well over 2 million entries.

Any ideas how I can drop the array_count_values() call, or write it in a better way, and still get back all numbers that occur >= 13 times in the merged array?

Note: I shortened the code to show only 3 of the 18 files being merged.

for ($b = 0; $b < 1; $b++)
{
    echo $b."\n";

    for ($a = 0; $a < 10; $a++)
    {
        for ($i = 0; $i < 30; $i++) //30
        {
            // $folder and $round are set earlier in the full script (snippet shortened)
            $linespreset = file_get_contents("/users/history/".$folder."/".$round."/masterspeedrandom_randompick_less13_".$b."_".$a."_".$i.".txt");

            $holdpreset = explode(" ", $linespreset);
            $holdpreset = array_map("trim", $holdpreset);

            $print1 = file_get_contents('/users/'.$a.'/masterspeed_round3_xxx_'.$holdpreset[0].'.txt');
            $print2 = file_get_contents('/users/'.$a.'/masterspeed_round3_xxx_'.$holdpreset[1].'.txt');
            $print3 = file_get_contents('/users/'.$a.'/masterspeed_round3_xxx_'.$holdpreset[2].'.txt');

            // replace spaces with underscores
            $healthy = " ";
            $yummy   = "_";
            $print1 = strtr($print1, $healthy, $yummy);
            $print2 = strtr($print2, $healthy, $yummy);
            $print3 = strtr($print3, $healthy, $yummy);

            // merge the file contents ($print4 ... $print18 in the full 18-file version)
            $resultround = $print1."\r\n".$print2."\r\n".$print3;

            // tokenize the merged string into an array of numbers
            $somearray = str_word_count($resultround, 1, '1234567890:@&_');

            // count occurrences -- this is the slow part
            $frequency = array_count_values($somearray);

            // keep only numbers that occur at least 13 times
            $result = array_filter($frequency, function ($x) { return $x >= 13; });

            //fwrite to print out $result array with numbers that occur >=13 times in the merged array

            unset($somearray);

        } //END OF I
    } //END OF A
} //END OF B
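
To show the bottleneck in isolation, this is roughly what the counting step boils down to, timed on its own with random placeholder data instead of the real files:

// Stripped-down sketch of just the count + filter step, with placeholder data.
$somearray = array();
for ($n = 0; $n < 2000000; $n++) {
    $somearray[] = (string) mt_rand(1, 100000);
}

$start = microtime(true);
$frequency = array_count_values($somearray);
$result = array_filter($frequency, function ($x) { return $x >= 13; });
echo "count + filter: ".(microtime(true) - $start)." sec, ".count($result)." numbers kept\n";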

So 2.35 seconds? Is this on an SSD or HDD?
I've found the payback from moving to an SSD and adding more RAM to be well worth it. I don't see anything in the code that would make a big difference.


Hi,

In addition, have you tried SplFixedArray? It should be faster than standard arrays. Also, if you want to open files from the script, then use fopen() instead of file_get_contents(): the latter loads the entire file into memory before processing starts, while the former reads in chunks and lets execution start immediately.

See: http://php.net/manual/en/class.splfixedarray.php
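
For example, a rough sketch of the streaming idea, counting as each line is read instead of building one huge merged array (the file names are placeholders based on the snippet above):

// Read each file line by line with fopen()/fgets() and update counts as we go.
$counts = array();
$files = array(
    '/users/0/masterspeed_round3_xxx_1.txt',
    '/users/0/masterspeed_round3_xxx_2.txt',
    '/users/0/masterspeed_round3_xxx_3.txt',
);

foreach ($files as $path) {
    $fh = fopen($path, 'r');
    if ($fh === false) {
        continue; // skip unreadable files
    }
    while (($line = fgets($fh)) !== false) {
        // tokens are numbers, split out the same way as in the original post
        foreach (str_word_count($line, 1, '1234567890:@&_') as $token) {
            if (isset($counts[$token])) {
                $counts[$token]++;
            } else {
                $counts[$token] = 1;
            }
        }
    }
    fclose($fh);
}

// keep only numbers seen at least 13 times
$result = array_filter($counts, function ($x) { return $x >= 13; });

Note that SplFixedArray would only help the counting side if the numbers fall in a known, bounded range, since it is indexed by consecutive integers rather than arbitrary keys.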
