Hello everyone,

I have an issue with finding duplicate records in a text file.

I need to read the file, look for any duplicate data/keys in it, and write all of the duplicate records to another file.

How can I do that in a loop? :-/

Any help? :)

Thanks!

Re: reading duplicate records in text file 80 80

That really depends on what you are classifying as a duplicate record. What is your primary key? Is it composite? Can you provide an example of the data?

Re: reading duplicate records in text file 80 80

I smell homework here... :)

Okay, so start us off. You have a text file. Can you open it and read lines from it, in a loop? Go ahead and do that, just dumping the lines straight to the screen.

That'll make a good first step.

Re: reading duplicate records in text file 80 80

heheheh... ;)

Actually, my issue is how to get the duplicate data out of the text file (.txt) and display it.

All I know is how to display the data without the duplicates.

I need to get the duplicate data and put it in a file.

Here's my code.
I have a Scanner, an ArrayList and a TreeSet.
In the Scanner's while loop I add each line to the ArrayList:

while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    arrayList.add(line);
}

// add the whole ArrayList to the TreeSet, which drops the duplicates
set.addAll(arrayList);

// iterate over the set to print the data without the duplicates
// (assuming set is declared as a TreeSet<String>)
Iterator<String> it = set.iterator();
while (it.hasNext()) {
    System.out.println("print data w/o duplicates " + it.next());
}

How can I display only the duplicates?

Thanks! :)

Re: reading duplicate records in text file 80 80

Again what IS a duplicate record?

Take this data for example:
Id  Name            Title      Salary
1   Rodgers, Frank  Developer  55,000
2   Smith, Joe      Developer  55,000
3   Rodgers, Frank  Team Lead  55,000

Are records 2 & 3 duplicate b/c they share the same salary as 1?
Is 3 a duplicate of 1 b/c they share the same name?
Is 2 a duplicate of 1 b/c they share the same title?
Are none of them duplicates b/c they have different id numbers?

Solving your homework depends on what a duplicate record actually is---simply saying you need duplicate records is too abstract to code; you need parameters that define the duplication.
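
To make that concrete, here is a minimal sketch where the definition of "duplicate" is passed in as a key function. The class name KeyedDuplicates, the method findByKey, and the tab-separated record layout in the usage note are all assumptions for illustration, not anything from the original post.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper: the caller decides what counts as the key.
class KeyedDuplicates {
    // Returns every record whose key was already seen on an earlier record.
    static List<String> findByKey(List<String> records, Function<String, String> keyOf) {
        Set<String> seenKeys = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String record : records) {
            // Set.add() returns false when the key is already present
            if (!seenKeys.add(keyOf.apply(record))) {
                duplicates.add(record);
            }
        }
        return duplicates;
    }
}
```

If the fields were tab-separated (an assumption), a key of r -> r.split("\t")[3] would treat salary as the key and flag records 2 and 3, while r -> r (the whole line) would flag nothing.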

Re: reading duplicate records in text file 80 80

i need to read the file and look for any duplicates data/keys in the text file and write

Can you clarify this? Are you trying to eliminate duplicate lines of data, or do you need to parse the lines into data and keys? That would be an added step.

In any case, if you want to display the duplicates only, maybe you should check each item against the rest of the list when you're reading it into the list. If it's already in the list, do whatever you need to do with it - write it to a file, skip adding it to the list, put it in another list, paint it blue and ship it to Waukegan, whatever you like.
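
A rough sketch of that idea, assuming a "record" is simply a whole line and the lines are already in memory. The names DuplicateScan and findDuplicates are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: checks each line against the lines read so far.
class DuplicateScan {
    // Returns every line that repeats an earlier line, in input order.
    static List<String> findDuplicates(List<String> lines) {
        List<String> seen = new ArrayList<>();
        List<String> duplicates = new ArrayList<>();
        for (String line : lines) {
            if (seen.contains(line)) {
                duplicates.add(line);   // already in the list: it's a duplicate
            } else {
                seen.add(line);         // first time this line has shown up
            }
        }
        return duplicates;
    }
}
```

Note that contains() on an ArrayList walks the whole list, so this is fine for small files but gets slow as the file grows.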

Re: reading duplicate records in text file 80 80

Actually, I'm not trying to eliminate the duplicate lines; all I need is to put all the duplicate lines into another file.

For instance, my file "data.txt" contains duplicate lines:

AAAAA
BBBBB
BBBBB
CCCCC

I need to copy the [BBBBB] lines to another file.

How should I do that in a loop?

Thanks

Re: reading duplicate records in text file 80 80

Maybe you should check each item against the rest of the list when you're reading it into the list. If it's already in the list, do whatever you need to do with it - write it to a file, skip adding it to the list, put it in another list, paint it blue and ship it to Waukegan, whatever you like.

Re: reading duplicate records in text file 80 80

Or, if you want a second loop, after you've read everything into the list, you pretty much have to check each item against each other item. Now you're talking about the generic problem of finding duplicated items in a list.
The easiest thing to do is to sort the list and go through it: is this item like the one after it? If so, put it in a second list. Write the second list to the file.

Re: reading duplicate records in text file 80 80

I don't know how to display the duplicate lines in a loop, because whenever I loop over the ArrayList it just displays every line in 'data.txt':

// this displays every line, even the ones that are not duplicates
for (int i = 0; i < arrayList1.size(); i++) {
    System.out.println(arrayList1.get(i));
}

Thanks

Re: reading duplicate records in text file 80 80

I just need to display the duplicate lines. How should I do that? :)

Re: reading duplicate records in text file 80 80

Yes, that just goes through and for each item in the ArrayList, it prints it.

That's not what you want, though. For each item in the list, you want to check if it's a duplicate of some other item in the list.

Suppose you have a list of Strings:

blueberry
tangerine
apricot
kiwi
durian

and I give you another String:
gorgonzola

How do you check whether it's in the list?

Re: reading duplicate records in text file 80 80

Hello again,

Thanks for replying to this thread. I've already figured out how to do it. :)

Re: reading duplicate records in text file 80 80

Each time you read an element from the txt file, you start a loop and compare it to all the elements read until that point.

You probably also have to check that the place where you dump the duplicates is itself free of duplicates. If your data.txt is like this:

AAA
BBB
BBB
BBB

Your duplicates file will probably look like this:
BBB
BBB

...
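
If each duplicated line should show up only once in the duplicates file, one option is to collect the duplicates in a set as well. This is only a sketch under that assumption; UniqueDuplicates and duplicatedLines are hypothetical names.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reports each duplicated line a single time.
class UniqueDuplicates {
    static Set<String> duplicatedLines(List<String> lines) {
        Set<String> seen = new HashSet<>();
        // LinkedHashSet keeps the order in which duplicates were first found
        Set<String> duplicated = new LinkedHashSet<>();
        for (String line : lines) {
            if (!seen.add(line)) {      // add() returns false if already present
                duplicated.add(line);   // adding it again later has no effect
            }
        }
        return duplicated;
    }
}
```

With the AAA / BBB / BBB / BBB input above, the returned set holds BBB once instead of twice.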

While writing this in the quick reply window, I realized the thread has 2 pages and saw it's already solved.

Re: reading duplicate records in text file 80 80

You can (probably) use some kind of Set, which doesn't allow duplicates, and test for membership with someSet#contains.

Re: reading duplicate records in text file 80 80

Use a Hashtable to store records as they come in. Looking up whether an element is in a Hashtable is O(1) on average. If hashtable.containsKey(record) returns true, print it to the file :)
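
Something along these lines, as a sketch only. It swaps in a HashSet for the Hashtable (same average O(1) lookup, and only keys are needed here), and the class name DuplicateFileWriter, the method copyDuplicates, and the file names in the usage note are assumptions.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: streams the input once and writes repeats as it goes.
class DuplicateFileWriter {
    // Copies every repeated line of input to output; returns how many were written.
    static int copyDuplicates(Path input, Path output) throws IOException {
        Set<String> seen = new HashSet<>();
        int written = 0;
        try (BufferedReader reader = Files.newBufferedReader(input);
             BufferedWriter writer = Files.newBufferedWriter(output)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (seen.contains(line)) {
                    writer.write(line);     // seen before: copy it to the duplicates file
                    writer.newLine();
                    written++;
                } else {
                    seen.add(line);         // first occurrence: just remember it
                }
            }
        }
        return written;
    }
}
```

A call might look like DuplicateFileWriter.copyDuplicates(Paths.get("data.txt"), Paths.get("duplicates.txt")), where duplicates.txt is a made-up output name.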

Re: reading duplicate records in text file 80 80

I've already figured it out, thanks again! :)

Re: reading duplicate records in text file 80 80

Assuming we have already declared arrayList and finalFile, and you've added the lines to an ArrayList<String> (which I think you can do):

Sort them, then loop through the list with a for loop and compare the value at i with the value at i + 1.

Collections.sort(arrayList);   // needs import java.util.Collections
// stop one short of the end so get(i + 1) stays in range
for (int i = 0; i < arrayList.size() - 1; i++) {
    // use equals(), since == on Strings (or boxed numbers) compares references
    if (arrayList.get(i).equals(arrayList.get(i + 1))) {
        // add the value at i to finalFile, or print it
        System.out.println(arrayList.get(i));
    }
}

Re: reading duplicate records in text file 80 80

r0n, do you remember that Disney cartoon of the Sorcerer's Apprentice? The one where Mickey Mouse conjures up an endless stream of animated brooms?

I don't know why that came to mind, but maybe you should mark this thread as "closed".

Re: reading duplicate records in text file 80 80

Oh, I forgot to close it. Thanks anyway :)
