Problem Description

I tried to download (leech) my favourite site, which contains essays on history, with depth 4. For some reason, perhaps a problem with my download tool, I realized I had not downloaded all the files. I had with me a list of the files I had downloaded (list1) and a list of the files that were present on the site (list2).

I only wanted to download the difference. Some may comment that a better site ripper would solve this problem. I agree, but the problem is generic: I have 2 lists and I want to find the delta, that is, the entries in list2 that are missing from list1.

I am quite comfortable with scripting, and the immediate rescue seemed to be using sort, diff....

But then I thought, let me try Python. Wow, I could not have imagined shorter code!

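#!/usr/bin/env python

import sys

# Open list1 (files already downloaded) and read it into set1.
# The built-in set type replaces the deprecated sets module.
with open(sys.argv[1], 'r') as f:
    set1 = set(line.strip() for line in f if line.strip())

# Open list2 (files present on the site) and read it into set2
with open(sys.argv[2], 'r') as f:
    set2 = set(line.strip() for line in f if line.strip())

# Find delta: entries on the site that were not downloaded
delta = set2 - set1

# Dump delta, one entry per line
with open('new_dwnl.txt', 'w') as f:
    for name in sorted(delta):
        f.write(name + '\n')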
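
The heart of the script is a single set difference. Here is a minimal sketch of that step on its own, using in-memory lists instead of files. The function name find_delta and the file names are made up for illustration; they are not from my actual lists.

```python
def find_delta(downloaded, available):
    """Return the entries of `available` that are missing from `downloaded`, sorted."""
    return sorted(set(available) - set(downloaded))


# Hypothetical file names, for illustration only
list1 = ["essay1.html", "essay2.html"]                  # what I downloaded
list2 = ["essay1.html", "essay2.html",
         "essay3.html", "essay4.html"]                  # what is on the site

missing = find_delta(list1, list2)
print(missing)
```

The difference is one-directional: anything in list1 that is not in list2 is ignored, which is what I want here since I only care about files still left to download.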