954,568 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

Remove Duplicate while importing csv to mysql using php

Hi

i am having 3rd party csv file. i am having php uploading script to upload the datas into my table.

while uploading datas into my mysql Table mean while it has to remove duplicate entries rows from CSV file.

Please help me

rpv_sen
Junior Poster
178 posts since Mar 2011
Reputation Points: 18
Solved Threads: 16
 

i had done this task a month ago.I was given was to import csv file to mysql DB.

Step 1

Easily and good approach is to save the csv file as xml doc as option is present to save it as xml doc.

Step 2

Then import the xml doc to mysql.

Here is code below

index1.php

<html>
<head>


<style type="text/css">
body{
font-family:"Lucida Grande", "Lucida Sans Unicode", Verdana, Arial, Helvetica, sans-serif;
font-size:12px;
}
p, h1, form, button{border:0; margin:0; padding:0;}
.spacer{clear:both; height:1px;}
/* ----------- My Form ----------- */
.myform{
margin:0 auto;
width:400px;
padding:14px;
}

/* ----------- stylized ----------- */
#stylized{
border:solid 2px #B77DDF2;
background:#ebf4bf;
}
#stylized h1 {
font-size:14px;
font-weight:bold;
margin-bottom:8px;
}
#stylized p{
font-size:11px;
color:#666666;
margin-bottom:20px;
border-bottom:solid 1px #b7ddf2;
padding-bottom:10px;
}
#stylized label{
display:block;
font-weight:bold;
text-align:right;
width:140px;
float:left;
}
#stylized .small{
color:#666666;
display:block;
font-size:11px;
font-weight:normal;
text-align:right;
width:140px;
}
#stylized input{
float:left;
font-size:12px;
padding:4px 2px;
border:solid 1px #aacfe4;
width:200px;
margin:2px 0 20px 10px;
}
#stylized button{
clear:both;
margin-left:150px;
width:125px;
height:31px;
background:#666666 url(img/button.png) no-repeat;
text-align:center;
line-height:31px;
color:#FFFFFF;
font-size:11px;
font-weight:bold;
}
</style>

</head>
<body>


<div id="stylized" class="myform">
<form enctype="multipart/form-data" 
  action="import2.php" method="post">
  <input type="hidden" name="MAX_FILE_SIZE" value="2000000" />
<h1>Sign-up form</h1>
<p></p>



<label>File
<span class="small">Brows CSV file</span>
</label>
<input type="file" name="file" id="file" />

<button type="submit">Upload</button>
<div class="spacer"></div>

</form>
</div>

</body>
</html>


Here is import file

import2.php

<?php
 // error_reporting(0);

  $data = array();

function add_person( $Name, $EmailAddress, $Company, $City, $Country, $Status ) //change here
{
 global $data;

 
mysql_connect("localhost","root","") or die(mysql_error()); // Change here
mysql_select_db("n");    //Change here


 $data []= array(
   'Name' => $Name,          //   <---////////
   'EmailAddress' => $EmailAddress,        //    
   'Company' => $Company,                 // Changes here
   'City' => $City,                      //  
   'Country' => $Country,               //  
   'Status' => $Status     // <----//////  
 );
}

if ( $_FILES['file']['tmp_name'] )
{
 $dom = DOMDocument::load( $_FILES['file']['tmp_name'] );
 $rows = $dom->getElementsByTagName( 'Row' );
 $first_row = true;
 foreach ($rows as $row)
 {
   if ( !$first_row )
   {
      $Name = "";         //   <---////////
  $EmailAddress = "";                   // 
  $Company = "";                       //  
  $City = "";                         //   Changes here
  $Country = "";                     //  
  $Status = "";         // <----//////

     $index = 1;
     $cells = $row->getElementsByTagName( 'Cell' );
     foreach( $cells as $cell )
     {
       $ind = $cell->getAttribute( 'Index' );
       if ( $ind != null ) $index = $ind;

       if ( $index == 1 ) $Name = $cell->nodeValue;
  if ( $index == 2 ) $EmailAddress = $cell->nodeValue;
  if ( $index == 3 ) $Company = $cell->nodeValue;
  if ( $index == 4 ) $City = $cell->nodeValue;
  if ( $index == 5 ) $Country = $cell->nodeValue;
   if ( $index ==6 ) $Status = $cell->nodeValue;
  
       $index += 1;
     }
     add_person( $Name, $EmailAddress, $Company, $City, $Country, $Status );
   }
   $first_row = false;
 }
}
?>
<html>
<body>
These records have been added to the database:
<table>
 
<?php foreach( $data as $row ) { ?>
 <tr>
  <td><?php ( $name1 = $row['Name'] ); //echo $name1 ?></td>
  <td><?php ( $name2 = $row['EmailAddress'] ); //echo $name2?></td>
  <td><?php ($name3 =  $row['Company'] ); //echo $name3?></td>
  <td><?php ($name4 = $row['City'] ); //echo $name4?></td>     // Changes here
  <td><?php ($name5 = $row['Country'] );  //echo $name5?></td>
  <td><?php ($name6 =  $row['Status'] ); //echo $name6?></td>           
  </tr>
  
  
<?php 
$sql = mysql_query("INSERT INTO tblclients (Name,EmailAddress,Company,City,Country,Status) VALUES ('$name1','$name2','$name3','$name4','$name5','$name6')"); // Changes here
} ?>
</table>

</body>
</html>


Hope this help

extemer
Junior Poster
Banned
188 posts since Apr 2010
Reputation Points: -7
Solved Threads: 10
 

Hi

Thanks for your post and suggestion.

I will catch you after testing in my machine

rpv_sen
Junior Poster
178 posts since Mar 2011
Reputation Points: 18
Solved Threads: 16
 

HI

I have tried it, but it is not getting insert into DB

rpv_sen
Junior Poster
178 posts since Mar 2011
Reputation Points: 18
Solved Threads: 16
 

have you check your csv before convert it to xml fields name because fields name should be same as field name in the DB.if the field name are different csv file than it not catches the field name located in DB and insert data into it

extemer
Junior Poster
Banned
188 posts since Apr 2010
Reputation Points: -7
Solved Threads: 10
 

I didn't quite understand your duplicate entries problem:

1. Duplicate entries in the csv file itself

OR

2. Records in DB are duplicateed in the CSV being uploaded

Which one?

diafol
Rhod Gilbert Fan (ardav)
Moderator
7,792 posts since Oct 2006
Reputation Points: 1,170
Solved Threads: 1,080
 

The CSV file has a duplicate entries.

rpv_sen
Junior Poster
178 posts since Mar 2011
Reputation Points: 18
Solved Threads: 16
 

mean it has dulipcate field name in csv can you post the csv here

extemer
Junior Poster
Banned
188 posts since Apr 2010
Reputation Points: -7
Solved Threads: 10
 

Read in the CSV and explode it on "\n" to get an array of records, say in $lines.

Then use array_unique() to return all unique records into a new array.

loop using foreach...
Explode the array again on "," to get a multidimensional array.

Your data is now available as $array[$row][$column]

Both have a base starting on 0, with $array[0][0], being the first item.

Personally I'd build up a VALUES clause in the loop to use in a single SQL query as opposed to querying at every stage iteration in the loop.

diafol
Rhod Gilbert Fan (ardav)
Moderator
7,792 posts since Oct 2006
Reputation Points: 1,170
Solved Threads: 1,080
 

Hi At present i am using the below script, But i am unable to remove the duplicate values.

<?php
    //database connect info here  
      
	include('config.php');
	  
    //check for file upload  
    if(isset($_FILES['csv_file']) && is_uploaded_file($_FILES['csv_file']['tmp_name'])){  
      
        //upload directory  
        $upload_dir = "csv_dir/";  
      
        //create file name  
        $file_path = $upload_dir . $_FILES['csv_file']['name'];  
      
        //move uploaded file to upload dir  
        if (!move_uploaded_file($_FILES['csv_file']['tmp_name'], $file_path)) {  
      
            //error moving upload file  
            echo "Error moving file upload";  
      
        }  
      
        //open the csv file for reading  
        $handle = fopen($file_path, 'r');  
      
        //turn off autocommit and delete the product table  
        mysql_query("SET AUTOCOMMIT=0");  
        mysql_query("BEGIN");  
      
        while (($data = fgetcsv($handle, 1000, ',')) !== FALSE) {  
      
            //Access field data in $data array ex.   
      
            //Use data to insert into db  
            $sql = sprintf("INSERT into essl_att(essl_date,emp_code,emp_name,company,department,category,designation,grade,team,shift,in_time,out_time,duration,late_by,early_by,status,punch_records,overtime) values('$data[0]','$data[1]','$data[2]','$data[3]','$data[4]','$data[5]','$data[6]','$data[7]','$data[8]','$data[9]','$data[10]','$data[11]','$data[12]','$data[13]','$data[14]','$data[15]','$data[16]','$data[17]')");  
            mysql_query($sql) or (mysql_query("ROLLBACK") and die(mysql_error() . " - $sql"));  
        }  
      
	  	//echo $sql;
        //commit the data to the database  
        mysql_query("COMMIT");  
        mysql_query("SET AUTOCOMMIT=1");  
      
        //delete csv file  
        unlink($file_path);  
    }  
?>
<form action="<?php echo $_SERVER["PHP_SELF"];?>" method="post" enctype="multipart/form-data">  
    <input type="file" name="csv_file">  
    <input type="submit" name="csv_submit" value="Upload CSV File">  
</form>
rpv_sen
Junior Poster
178 posts since Mar 2011
Reputation Points: 18
Solved Threads: 16
 

Try this:

function mres($myarray){ 
    return array_map("mysql_real_escape_string",$myarray); 
} 

$str = file_get_contents($filename); //the uploaded file
$lines = explode("\r\n",$str);
//$entries = array_shift($lines);//if you want to delete the first row (header info??)

$unique = array_unique($lines);//rename $lines to $entries if used the commented line above
foreach($unique as $row){
  $bits = explode(",",$row);
  $cleanbits = mres($bits);
  $value_bits[] = "('{$cleanbits[0]}','{$cleanbits[1]}','{$cleanbits[6]}','{$cleanbits[2]}')";
// build up value list as you require from array items, leaving out quotes if an integer, etc.
}
$value_string = implode(",",$value_bits);

$q = mysql_query("INSERT into essl_att(essl_date,emp_code,emp_name,company,department,category,designation,grade,team,shift,in_time,out_time,duration,late_by,early_by,status,punch_records,overtime) values($value_string)");


This could be done with fewer lines by reusing variables perhaps or combining functions, but I thought that this would be clearer.
The array_unique strips duplicate lines within the file.
It has the advantage of only running ONE query right at the end.
NOT TESTED!

//EDIT

OK, tested - works for me.

diafol
Rhod Gilbert Fan (ardav)
Moderator
7,792 posts since Oct 2006
Reputation Points: 1,170
Solved Threads: 1,080
 

I've made some relevant changes to your code to try to extract unique records.

<?php
    //database connect info here  
      
	include('config.php');
	  
    //check for file upload  
    if(isset($_FILES['csv_file']) && is_uploaded_file($_FILES['csv_file']['tmp_name'])){  
      
        //upload directory  
        $upload_dir = "csv_dir/";  
      
        //create file name  
        $file_path = $upload_dir . $_FILES['csv_file']['name'];  
      
        //move uploaded file to upload dir  
        if (!move_uploaded_file($_FILES['csv_file']['tmp_name'], $file_path)) {  
      
            //error moving upload file  
            echo "Error moving file upload";  
			// added die, I would change to exit or something, if this doesn't work, your file handle code will blow up below this.
		    die("Error moving file upload");
        }  else {
			//echo "success contine to process file";  (I actually tested this code)
		}
      
        //open the csv file for reading  
        $handle = fopen($file_path, 'r');  		
		// sanitize you data. first read your whole file into an array since you used $data I'll use $lines as ardav suggested
		$lines = array();
		while(!feof($handle)) {
			// push each line of the file into your array
			array_push($lines,fgets($handle));
		}
		// lets see if we have anything in $lines  should have one for every row in your file...
		echo "line count = " . count($lines) . "";
		// php function to only have unique values in your array
		$lines = array_unique($lines);
		echo "lines have been uniqued, new line count = " . count($lines) . "";
      
        //turn off autocommit and delete the product table  
        //mysql_query("SET AUTOCOMMIT=0");  
        //mysql_query("BEGIN");  
		// we have already read the whole file its contents are now held in $lines, but we must process further...
		// so since we a dealing with an array lets do a foreach statement.
        //while (($data = fgetcsv($handle, 1000, ',')) !== FALSE) {  
		// again back to ardav's point about using foreach, it is such a great way to work with arrays.
		foreach($lines as $line) {
			// eplode each line on the commas
			$data = explode(",", $line);      
            //Access field data in $data array ex.         
            //Use data to insert into db  
            $sql = sprintf("INSERT into essl_att(essl_date,emp_code,emp_name,company,department,category,designation,grade,team,shift,in_time,out_time,duration,late_by,early_by,status,punch_records,overtime) values('$data[0]','$data[1]','$data[2]','$data[3]','$data[4]','$data[5]','$data[6]','$data[7]','$data[8]','$data[9]','$data[10]','$data[11]','$data[12]','$data[13]','$data[14]','$data[15]','$data[16]','$data[17]')");  
			//echo "sql = " . $sql . "";
            mysql_query($sql) or (mysql_query("ROLLBACK") and die(mysql_error() . " - $sql"));  
        }  
      
	  	echo $sql;
        //commit the data to the database  
        mysql_query("COMMIT");  
        mysql_query("SET AUTOCOMMIT=1");  
      
        delete csv file  
        unlink($file_path);  
    }  
?>
<form action="<?php echo $_SERVER["PHP_SELF"];?>" method="post" enctype="multipart/form-data">  
    <input type="file" name="csv_file">  
    <input type="submit" name="csv_submit" value="Upload CSV File">  
</form>

I also tested the code and it should do what you want.

ddymacek
Posting Whiz
317 posts since Jun 2010
Reputation Points: 36
Solved Threads: 64
 

sorry ardav, I didn't see 'page 2' comment before posting. I used his original file and altered... anyway.

ddymacek
Posting Whiz
317 posts since Jun 2010
Reputation Points: 36
Solved Threads: 64
 

hi

Thanks for your post

rpv_sen
Junior Poster
178 posts since Mar 2011
Reputation Points: 18
Solved Threads: 16
 

Did either of us solve your problem?

diafol
Rhod Gilbert Fan (ardav)
Moderator
7,792 posts since Oct 2006
Reputation Points: 1,170
Solved Threads: 1,080
 

This article has been dead for over three months

Post: Markdown Syntax: Formatting Help
You
View similar articles that have also been tagged: