Add respective changes after comparing two CSV files

Question

Jack_44 0 Newbie Poster

6 Years Ago

I'm trying to compare two CSV files, mark the differences, and write the result to an output file. However, my code seems to read only the last line of sample1.csv and sample2.csv, as you can see below:

Sample1.csv
Planet,Account,Name,Station,City
Earth,1234,Pete,Nebula,Phoenix
Earth,1234,Pete,Nebula,Phoenix
Earth,1234,Pete,Nebula,Phoenix

Sample2.csv
Planet,Account,Name,Station,City
Earth,1234,Pete,Nebula,Wakanda
Earth,1234,Pete,Nebula,Montgomery
Earth,1234,Pete,Nebula,Carlo

Current Output
History,Planet,Account,Name,Station,City
Changed,Earth,1234,Pete,Nebula,Carlo

Expected Output
History,Planet,Account,Name,Station,City
Changed,Earth,1234,Pete,Nebula,Wakanda
Changed,Earth,1234,Pete,Nebula,Montgomery
Changed,Earth,1234,Pete,Nebula,Carlo

Here is the code I have:

import csv

with open('old.csv', newline='') as f_old:
    csv_old = csv.reader(f_old, delimiter=',')
    header = next(csv_old)
    old_data = {row[0]: row for row in csv_old}

with open('new.csv', newline='') as f_new:
    csv_new = csv.reader(f_new, delimiter=',')
    header = next(csv_new)
    new_data = {row[0]: row for row in csv_new}

set_new_data = set(new_data)
set_old_data = set(old_data)

added = [['Added'] + new_data[v] for v in set_new_data - set_old_data]
deleted = [['Deleted'] + old_data[v] for v in set_old_data - set_new_data]
in_both = set_old_data & set_new_data
changed = [['Changed'] + new_data[v] for v in in_both if old_data[v] != new_data[v]]
print(changed)

with open('difference.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output, delimiter=',')
    csv_output.writerow(['History'] + header)
    csv_output.writerows(sorted(added + deleted + changed, key=lambda x: x[1:]))

Does anyone know how to get the expected output? Any help is appreciated. Thanks!

python

rproffitt 2,701 https://5calls.org

6 Years Ago

I think I'd read prior discussions on CSV comparisons in Python. I like the solution at https://stackoverflow.com/questions/38996033/python-compare-two-csv-files-and-print-out-differences since it covers how to handle one of many situations and would be easy to extend to what you want to do.

rproffitt 2,701 https://5calls.org

6 Years Ago

I think your specification needs a lot of work. "I'm trying to compare 2 different CSV files, mark those differences respectively, then produce it as an output." Your example looks more like a merge than a compare.

Jack_44 0 Newbie Poster · Answer 1 · 2018-07-20T19:56:17+00:00

Originally, the idea for the code came from that thread. That code works perfectly for the comparison, but it is not enough when I have to add a new column and mark the respective change in every row of the output, so I extended it. However, the code I have right now only works if the values in the first column of the sample files are all different; as soon as it sees rows with the same value in the first column, the code breaks.
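A minimal sketch of why identical first-column values cause this, using the rows from the question. `io.StringIO` stands in for the file on disk, and the names `NEW_CSV` and `by_first_column` are made up for illustration. A dict can hold only one value per key, so keying on `row[0]` keeps just the last row for each distinct first-column value.

```python
import csv
import io

# Sample rows from the question; io.StringIO stands in for new.csv.
NEW_CSV = """Planet,Account,Name,Station,City
Earth,1234,Pete,Nebula,Wakanda
Earth,1234,Pete,Nebula,Montgomery
Earth,1234,Pete,Nebula,Carlo
"""

reader = csv.reader(io.StringIO(NEW_CSV))
header = next(reader)
rows = list(reader)

# Every row has the key 'Earth', so each assignment in the
# comprehension overwrites the entry stored by the previous row.
by_first_column = {row[0]: row for row in rows}

print(len(rows))                  # number of data rows read
print(len(by_first_column))       # number of entries that survive
print(by_first_column['Earth'])   # the row that was stored last
```

With this data the dict ends up with a single entry, the row ending in `Carlo`, which matches the single line in the "Current Output" above.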

Jack_44 0 Newbie Poster · Answer 2 · 2018-07-20T20:15:45+00:00

I think you should run the code and see that it actually compares, as I stated before. The program compares two different files and prints the differences, except when rows have identical values in the first column. What I am trying to do is handle that case. Do you understand my question?
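One way to handle that case, sketched under a loud assumption: rows are paired by position, so line N of the old file is compared with line N of the new file. The thread never states the pairing rule, so this may not be what is wanted. The helper names `read_rows` and `diff_by_position` are hypothetical, and `io.StringIO` stands in for the real files.

```python
import csv
import io
from itertools import zip_longest

OLD_CSV = """Planet,Account,Name,Station,City
Earth,1234,Pete,Nebula,Phoenix
Earth,1234,Pete,Nebula,Phoenix
Earth,1234,Pete,Nebula,Phoenix
"""

NEW_CSV = """Planet,Account,Name,Station,City
Earth,1234,Pete,Nebula,Wakanda
Earth,1234,Pete,Nebula,Montgomery
Earth,1234,Pete,Nebula,Carlo
"""


def read_rows(text):
    """Return (header, data_rows) parsed from CSV text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, list(reader)


def diff_by_position(old_rows, new_rows):
    """Compare rows pairwise by position (assumption: same order in both files)."""
    result = []
    for old, new in zip_longest(old_rows, new_rows):
        if old is None:            # new file has extra rows at the end
            result.append(['Added'] + new)
        elif new is None:          # old file has extra rows at the end
            result.append(['Deleted'] + old)
        elif old != new:           # same position, different content
            result.append(['Changed'] + new)
    return result


header, old_rows = read_rows(OLD_CSV)
_, new_rows = read_rows(NEW_CSV)
diff = diff_by_position(old_rows, new_rows)

# Write to an in-memory buffer; swap in open('difference.csv', 'w', newline='')
# to write a real file.
out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
writer.writerow(['History'] + header)
writer.writerows(diff)
print(out.getvalue())
```

Because no dict is involved, duplicate first-column values no longer collapse. The trade-off is that an inserted or removed row in the middle of a file shifts every later pair, so this only fits data where the row order is stable. If a column (or combination of columns) uniquely identifies a row, keying on that instead would be more robust.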

rproffitt 2,701 https://5calls.org Moderator · Answer 3 · 2018-07-20T22:24:28+00:00

Reverse engineering? I'll bow out now. Some demand such work. I don't mind a challenge, but if members don't take the time to write down what they need and instead want others to reverse engineer it by reading their code, well, let's see who will do that.

I'll think about this for a bit. But as presented, the spec looks off.

Ramij 0 Newbie Poster · Answer 4 · 2018-07-31T11:58:31+00:00

Thanks..