
Deleting Same Tuples

Hi, I'm new to this site and this is my first post. I have a table that contains some tuples, and I didn't set a primary key. By mistake I entered the same values for two tuples. I want to delete just one of them without copying the table to a new table or using the DISTINCT keyword. How is that possible? My professor told me there is a way and asked me to find it. Can anyone help me? Thanks in advance.

karan_21584
Light Poster
32 posts since Mar 2007
Reputation Points: 10
Solved Threads: 0
 

You could use the rowid; it's probably the simplest solution. For info, the rowid uniquely identifies a row within a table. It is a pseudocolumn that Oracle maintains for all tables. Outside of situations where no primary or unique key exists, it shouldn't be relied upon too heavily though.

I'll let you figure out how to apply it to this problem.

davidcairns
Junior Poster
114 posts since Feb 2007
Reputation Points: 12
Solved Threads: 8
 

Thanks, Mr. Davidcairns. Could you show how to apply it to this problem, please? Thanks in advance.

karan_21584
Light Poster
32 posts since Mar 2007
Reputation Points: 10
Solved Threads: 0
 

If you add the rowid pseudocolumn to your initial SELECT statement, you will see that it is different for the two rows. The WHERE clause of your DELETE statement would then reference the rowid of the record you want to delete.
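To make that concrete, here is a minimal sketch of the idea. It uses Python's sqlite3 module only because SQLite also exposes a rowid column and the script can be run anywhere; it is a stand-in, not Oracle, where ROWID values look different but the SELECT and DELETE have the same shape. The table student and its columns are made up for illustration.

```python
import sqlite3

# Hypothetical table with no primary key and one accidental duplicate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("karan", 80), ("karan", 80), ("david", 75)],
)

# Selecting rowid next to the data tells the two identical tuples apart.
dupes = conn.execute(
    "SELECT rowid, name, marks FROM student WHERE name = ? AND marks = ?",
    ("karan", 80),
).fetchall()
print(dupes)

# Delete exactly one of them by naming its rowid in the WHERE clause.
victim = dupes[0][0]
conn.execute("DELETE FROM student WHERE rowid = ?", (victim,))

remaining = conn.execute(
    "SELECT name, marks FROM student ORDER BY name"
).fetchall()
print(remaining)
```

Which of the two rowids you pick does not matter here, since the rows are identical in every other column.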

Memento
Light Poster
30 posts since Jan 2007
Reputation Points: 12
Solved Threads: 0
 

For small sample sizes that is the best way, but for a more generic and scalable approach I would go with the following:

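DELETE
FROM tablename
WHERE rowid = (SELECT t.rowid
FROM tablename t,
-- one rowid to keep for each key value
(SELECT keyval, min(rowid) as keeprowid
FROM tablename
GROUP BY keyval) k
-- outer join from t to k: rows of t that are not keepers get a NULL keeprowid
WHERE t.rowid = k.keeprowid(+)
AND k.keeprowid IS NULL
AND t.rowid = tablename.rowid)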

Note that I used = rather than IN, as the performance difference can be huge. This will work well on almost any table, but will probably work better in this case if a non-unique index is applied to the keyval field. The keyval may actually be more than one field; it could, if you wish, be every field in the table.
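For anyone who wants to try the keep-the-lowest-rowid idea without an Oracle instance, here is a rough sketch in Python's sqlite3. Treat it as an approximation: SQLite has no (+) operator, so the outer join is written as LEFT JOIN, and its optimizer says nothing about how Oracle will perform. The table orders, its columns, and the index name are hypothetical; the key is three columns to show that keyval can be more than one field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ann", "pen", 2),
        ("ann", "pen", 2),
        ("ann", "pen", 2),
        ("bob", "ink", 1),
        ("bob", "ink", 1),
        ("cat", "pad", 5),
    ],
)
# Non-unique index on the columns that define a duplicate.
conn.execute("CREATE INDEX orders_key ON orders (customer, item, qty)")

# Keep the lowest rowid in each group and delete every other row.
conn.execute(
    """
    DELETE FROM orders
    WHERE rowid = (
        SELECT o.rowid
        FROM orders o
        LEFT JOIN (
            SELECT MIN(rowid) AS keeprowid
            FROM orders
            GROUP BY customer, item, qty
        ) k ON o.rowid = k.keeprowid
        WHERE k.keeprowid IS NULL      -- o is not one of the keepers
          AND o.rowid = orders.rowid   -- correlate with the row being tested
    )
    """
)

survivors = conn.execute(
    "SELECT rowid, customer, item, qty FROM orders ORDER BY rowid"
).fetchall()
print(survivors)
```

One row per distinct (customer, item, qty) combination is left, and it is the one that was inserted first.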

davidcairns
Junior Poster
114 posts since Feb 2007
Reputation Points: 12
Solved Threads: 8
 

That is pretty slick... I'm going to have to save that.

Memento
Light Poster
30 posts since Jan 2007
Reputation Points: 12
Solved Threads: 0
 

Another approach would be:

delete from TABLE_NAME T where T.rowid not in 
   (select max(R.rowid) 
    from TABLE_NAME R 
    where R.common_column = T.common_column);


Here common_column is the criterion for filtering out duplicates.

~s.o.s~
Failure as a human
Administrator
11,938 posts since Jun 2006
Reputation Points: 3,281
Solved Threads: 734
 

Technically though, if 3 copies of a single row existed you would only delete 1 with that script, which is why I used the 1 to keep instead of the 1 to delete.

Also, in large tables that will be much slower.

davidcairns
Junior Poster
114 posts since Feb 2007
Reputation Points: 12
Solved Threads: 8
 
> Technically though if 3 copies of a single row existed you would only delete 1 with that script, hence why I used the 1 to keep instead of 1 to delete


Nope. Take a look at it once again. I delete the tuples that don't satisfy the criterion of having the max ROWID. Whenever duplicates exist, only one row will have the max rowid among all the duplicates. Hence the query. Try running it and you will know.
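If you don't have Oracle handy, a quick way to check the three-copies case is a small script like the one below. It is only a sketch using Python's sqlite3, which has its own rowid column; the table emp and its data are invented for the test, and the statement is the same correlated NOT IN delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (common_column TEXT)")
# Three copies of 'a', two of 'b', one of 'c'.
conn.executemany(
    "INSERT INTO emp VALUES (?)",
    [("a",), ("a",), ("a",), ("b",), ("b",), ("c",)],
)

# Delete every row whose rowid is not the highest among its duplicates.
cur = conn.execute(
    """
    DELETE FROM emp
    WHERE rowid NOT IN (
        SELECT MAX(r.rowid)
        FROM emp r
        WHERE r.common_column = emp.common_column
    )
    """
)
deleted = cur.rowcount

counts = conn.execute(
    "SELECT common_column, COUNT(*) FROM emp "
    "GROUP BY common_column ORDER BY common_column"
).fetchall()
print(deleted, counts)
```

Both extra copies of 'a' go, not just one. One thing to watch: rows where common_column is NULL never match the equality test in the subquery, so this statement leaves them alone.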

> Also in large tables that will be much slower

Correlated queries are generally faster than other types of queries. On what basis can you say that the query would be slow?

~s.o.s~
Failure as a human
Administrator
11,938 posts since Jun 2006
Reputation Points: 3,281
Solved Threads: 734
 

Fair enough, should read more carefully :)

In my experience with large databases, providing an exact match is far more efficient than using IN or EXISTS clauses. This may be improved with adequate use of indexes, and the latest versions of Oracle may have made further improvements, but in essence it does hold. The larger the table, the more pronounced this issue becomes. I have run this kind of thing over tables with row counts in excess of 20 million in production.
Also, joining using queries in the FROM clause is far more efficient in Oracle, at least for versions 7, 8 and 9 (I have very limited experience working with 10).
Essentially, limiting all queries to only those rows that actually need to be returned is efficient for any database.

davidcairns
Junior Poster
114 posts since Feb 2007
Reputation Points: 12
Solved Threads: 8
 
