Monday, March 7, 2011

Self Join to Delete Duplicate Rows:

The most effective way to detect duplicate rows is to join the table against itself as shown below.

select
book_unique_id,
page_seq_nbr,
image_key
from
page_image a
where
rowid >
(select min(rowid) from page_image b
where
b.key1 = a.key1
and
b.key2 = a.key2
and
b.key3 = a.key3
);

Please note that you must specify all of the columns that make the row a duplicate in the SQL where clause. Once you have detected the duplicate rows, you may modify the SQL statement to remove the duplicates as shown below:


delete from
table_name a
where
a.rowid >
any (select b.rowid
from
table_name b
where
a.col1 = b.col1
and
a.col2 = b.col2
)
;

No comments: