Deleting Same Tuples

Question

karan_21584 0 Light Poster

18 Years Ago

Hi, I'm new to this site and this is my first post. I have a table that contains some tuples, and I didn't set a primary key. By mistake I entered the same values for two tuples. I want to delete just one of them without copying the table to a new table or using the DISTINCT keyword. How is that possible? My professor told me there is a way and asked me to find it. Can anyone help me clear up this doubt? Thanks in advance.

first-post oracle

4 Contributors
9 Replies
110 Views
2 Weeks Discussion Span
Latest Post 18 Years Ago by davidcairns

All 9 Replies

davidcairns 1 Junior Poster

18 Years Ago

You could use the rowid; it's probably the simplest solution. For info, the rowid uniquely identifies a row within a table. It is a pseudocolumn that Oracle maintains for all tables. It shouldn't be relied upon too heavily, though, except in situations where no primary or unique key exists.

I'll let you figure out how to apply it to this problem.

Memento commented: Right on the money +1

~s.o.s~ 2,560 Failure as a human

18 Years Ago

Another approach would be:

delete from TABLE_NAME T where T.rowid not in
   (select max(R.rowid)
    from TABLE_NAME R
    where R.common_column = T.common_column);

Here common_column is the column (or set of columns) whose values define a duplicate.

~s.o.s~ 2,560 Failure as a human

18 Years Ago

> Technically though if 3 copies of a single row existed you would only delete 1 with that script, hence why I used the 1 to keep instead of 1 to delete

Nope. Take a look at it once again. I delete the tuples that don't have the max ROWID. Whenever duplicates exist, exactly one row has the max ROWID among them, so every other copy is deleted. Hence the query. Try running it and you will see.
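The three-copy case can be checked with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for Oracle: SQLite's integer rowid is not an Oracle ROWID, but it also gives every row of an ordinary table a distinct identifier. The table people and its columns name and city are hypothetical.

```python
import sqlite3


def make_triplicate_table():
    """Hypothetical keyless table: three identical rows plus one distinct row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, city TEXT)")  # no primary key
    conn.executemany(
        "INSERT INTO people (name, city) VALUES (?, ?)",
        [("Karan", "Chennai")] * 3 + [("David", "Perth")],
    )
    return conn


def dedupe_keep_max_rowid(conn):
    """Delete every row that does not hold the highest rowid in its duplicate group."""
    cur = conn.execute(
        """
        DELETE FROM people
        WHERE people.rowid NOT IN (
            SELECT MAX(r.rowid)          -- the single survivor of each group
            FROM people r
            WHERE r.name = people.name   -- correlated on the duplicate criteria
              AND r.city = people.city
        )
        """
    )
    return cur.rowcount  # number of rows removed
```

With three copies of one row, two are removed and one remains; the distinct row is untouched because it holds the max rowid of its own group.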

> Also in large tables that will be much slower

Correlated queries are generally faster than other types of queries. On what basis do you say that the query would be slow?


karan_21584 0 Light Poster · Answer 1 · 2007-03-26T11:06:34+00:00

Thanks, Mr. Davidcairns. Could you show how to apply it to this problem, please? Thanks in advance.

Memento 2 Light Poster · Answer 2 · 2007-03-27T08:36:49+00:00

If you add the rowid pseudocolumn to your initial select statement, you will see that it is different for the two rows. The where clause of your delete statement can then reference the rowid of the record you want to delete.
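A minimal sketch of that two-step idea, assuming SQLite (via Python's standard sqlite3 module) in place of Oracle. SQLite's rowid is an integer rather than an Oracle ROWID, but it plays the same role here. The table people, its columns, and the helper names are hypothetical.

```python
import sqlite3


def make_demo_table():
    """Hypothetical keyless table holding one accidental duplicate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, city TEXT)")  # no primary key
    conn.executemany(
        "INSERT INTO people (name, city) VALUES (?, ?)",
        [("Karan", "Chennai"), ("Karan", "Chennai"), ("David", "Perth")],
    )
    return conn


def delete_one_copy(conn, name, city):
    """Delete a single copy of a duplicated row, addressed by its rowid."""
    # Step 1: select rowid alongside the data; the copies differ only there.
    rowids = [
        row[0]
        for row in conn.execute(
            "SELECT rowid FROM people WHERE name = ? AND city = ? ORDER BY rowid",
            (name, city),
        )
    ]
    if len(rowids) < 2:
        return 0  # nothing is duplicated, so leave the table alone
    # Step 2: the rowid pins down exactly one row, so only that copy goes.
    cur = conn.execute("DELETE FROM people WHERE rowid = ?", (rowids[-1],))
    return cur.rowcount
```

Calling delete_one_copy(make_demo_table(), "Karan", "Chennai") removes one of the two identical rows and leaves the other in place.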

davidcairns 1 Junior Poster · Answer 3 · 2007-03-27T14:28:29+00:00

For small sample sizes that is the best way, but for a more generic and scalable approach I would go with the following:

DELETE
FROM tablename
WHERE rowid = (SELECT t.rowid
               FROM tablename t,
                    (SELECT keyval, min(rowid) AS keeprowid
                     FROM tablename
                     GROUP BY keyval) k
               WHERE t.rowid = k.keeprowid(+)   -- outer join: keep every row of t
                 AND k.keeprowid IS NULL        -- only rows that are not the one to keep
                 AND t.rowid = tablename.rowid);

Note that I used = rather than IN, as the performance difference can be huge. This will work well on almost any table, but will probably work better in this case if a non-unique index is applied to the keyval field. The keyval may actually be more than one field; it could, if you wish, be every field in the table.
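The same keep-the-min-rowid idea can be sketched outside Oracle. This version assumes SQLite through Python's standard sqlite3 module, which does not accept Oracle's (+) operator, so the outer join is written as an ANSI LEFT JOIN instead. The table people, the key columns name and city (standing in for keyval), and the index name are hypothetical.

```python
import sqlite3


def make_table_with_duplicates():
    """Hypothetical keyless table with a triplicate, a duplicate and a single row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO people (name, city) VALUES (?, ?)",
        [("Karan", "Chennai")] * 3 + [("David", "Perth")] * 2 + [("Memento", "Austin")],
    )
    # Non-unique index on the key columns, as the post suggests for larger tables.
    conn.execute("CREATE INDEX people_key_idx ON people (name, city)")
    return conn


def dedupe_keep_min_rowid(conn):
    """Delete each row whose rowid is not the minimum rowid of its key group."""
    cur = conn.execute(
        """
        DELETE FROM people
        WHERE rowid = (
            SELECT t.rowid
            FROM people t
            LEFT JOIN (SELECT name, city, MIN(rowid) AS keeprowid
                       FROM people
                       GROUP BY name, city) k
                   ON t.rowid = k.keeprowid
            WHERE k.keeprowid IS NULL        -- t is not a row to keep
              AND t.rowid = people.rowid     -- tie back to the row being tested
        )
        """
    )
    return cur.rowcount
```

For a row that should be kept, the subquery returns no row, the comparison evaluates to NULL, and the row survives; every other copy matches its own rowid and is deleted.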

Memento 2 Light Poster · Answer 4 · 2007-03-27T19:54:12+00:00

That is pretty slick... I'm going to have to save that.

davidcairns 1 Junior Poster · Answer 5 · 2007-04-11T03:08:18+00:00

Technically though if 3 copies of a single row existed you would only delete 1 with that script, hence why I used the 1 to keep instead of 1 to delete

Also in large tables that will be much slower

davidcairns 1 Junior Poster · Answer 6 · 2007-04-11T20:19:05+00:00

Fair enough, should read more carefully :)

In my experience with large databases, providing an exact match is far more efficient than using IN or EXISTS clauses. This may be improved with adequate use of indexes, and the latest versions of Oracle may have made further improvements, but in essence it does hold. The larger the table, the more pronounced this issue becomes. I have run this kind of thing over tables with row counts in excess of 20 million in production.
Also, joining using queries in the FROM clause is far more efficient in Oracle for at least versions 7, 8 and 9 (I have very limited experience working with 10).
Essentially, limiting all queries to only those rows that actually need to be returned is efficient for any database.
