0

I have a function that I intended to be able to remove duplicate rows in an excel worksheet. I pass it $objPHPExcel which has gone through the following function

    function checkExtension($inputFileName) {
            if (pathinfo($inputFileName,PATHINFO_EXTENSION) == "csv") {
                    $inputFileType = 'CSV';
                    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
                    return $objPHPExcel = $objReader->load($inputFileName);
            } else {
                    return $objPHPExcel = PHPExcel_IOFactory::load($inputFileName);
            }
    }

The removeDuplicates function is

    function removeDuplicates($objPHPExcel) {
                $worksheet = $objPHPExcel->getActiveSheet();
                foreach ($worksheet->getRowIterator() as $row) {
                    $rowIndex = $row->getRowIndex(); 
                    foreach ($worksheet->getRowIterator($rowIndex + 1) as $testRow) {
                        if ($testRow == $row) {
                                $worksheet->removeRow($rowIndex);
                        }
                    }
                }

                $newFileName = 'imports/canx.csv';
                $objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
                $objWriter->save($newFileName);

                return $objPHPExcel = checkExtension($newFileName);
    }

The problem is that canx.csv is the same as the orignal file. i.e no rows are getting removed (I've purposefully put in the duplicate rows into the original).

What am I doing wrong here?

Thanks for any assistance given.

4
Contributors
5
Replies
49
Views
4 Years
Discussion Span
Last Post by Tinnin
1

The problem is that canx.csv is the same as the orignal file. i.e no rows are getting removed (I've purposefully put in the duplicate rows into the original).

I don't know what you are trying to do? Did you post this question the in the PHPExcel forum?

Take a look at these examples:

http://phpexcel.codeplex.com/discussions/80845

http://phpexcel.codeplex.com/discussions/76379

You might need this (a preg_match function to match the duplicate and remove it):

$duplicate[] = NULL;
$cvs = 'csv';

if (preg_match($worksheet,$newFileName)){
$inputFileType=$cvs;
$valve = $sheet->getCellByColumnAndRow($z, $row)->getValue();
}

if ($z == $inputFileType && preg_match($worksheet,$valve)){
$invaild[]=$row;
}
elseif ($z == $inputFileType && preg_match($worksheet,$valve)){{
$duplicate[]=$valve;
$duplicate[]=$row;
}

It's not tested you might have to debug some errors to make it work.

Hopefully you will get an idea how it works.

Edited by LastMitch: grammer

0

Thanks guys. The links helped.

I've altered my approach.

        function removeDuplicates($inputFileName, $objPHPExcel) {
                $worksheet = $objPHPExcel->getActiveSheet();
                $urn = array();

                foreach ($worksheet->getRowIterator() as $row) {
                    $rowIndex = $row->getRowIndex();
                    $cellValue = $worksheet->getCell('A'.$rowIndex)->getValue();
                    array_push($urn, $cellValue);       
                }

                $numberOfURNs = count($urn);

                foreach ($worksheet->getRowIterator() as $row) {
                    $rowIndex = $row->getRowIndex();
                    $cellValue = $worksheet->getCell('A'.$rowIndex)->getValue();
                    for ($i = $rowIndex; $i != $numberOfURNs; $i++) {
                        if ($urn[$i] == $cellValue) {
                            $worksheet->removeRow($rowIndex);
                            $objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
                            $objWriter->save($inputFileName);
                            global $row;
                            $row = $worksheet->getRowIterator($rowIndex);
                            break;  
                        }
                    }
                }

                return $objPHPExcel = checkExtension($inputFileName);
    }

It removes the rows fine but once I remove a row I assume it is shifting the other rows up meaning the rowIterator is then in the wrong place. How do I set the iterator right? I've tried

$row = $worksheet->getRowIterator($rowIndex);

but it isn't working.

Edited by Tinnin

1

I suggest getting the total number of rows (if possible), and then work your way backwards through the rows. Anything that will shift will not affect you.

1

Thanks Pritaeas. I'll have to invest in some of that common sense :)

    function removeDuplicates($inputFileName, $objPHPExcel) {
                $worksheet = $objPHPExcel->getActiveSheet();
                $urn = array();

                foreach ($worksheet->getRowIterator() as $row) {
                    $rowIndex = $row->getRowIndex();
                    $cellValue = $worksheet->getCell('A'.$rowIndex)->getValue();
                    array_push($urn, $cellValue);       
                }

                $numberOfURNs = count($urn);

                for ($rowIndex = $numberOfURNs; $rowIndex != 1; $rowIndex--) {
                    $cellValue = $worksheet->getCell('A'.$rowIndex)->getValue();
                    for ($i = $rowIndex - 2; $i != 0; $i--) {
                        if ($urn[$i] == $cellValue) {
                            $worksheet->removeRow($rowIndex);
                            $objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');
                            $objWriter->save($inputFileName);
                            break;  
                        }
                    }
                }

                return $objPHPExcel = checkExtension($inputFileName);
    }

works!

This question has already been answered. Start a new discussion instead.
Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.