Deleting Duplicate Rows

Question

Post reply

Deleting Duplicate Rows

Steve DiDomenico

Ten Centuries

Points: 1078
More actions
June 3, 2004 at 4:23 pm

#152168

Hello,
I have duplicate rows in a table such as this:
id          donor_id
----------- -----------
933579      100009282
933580      100009282
933550      100020374
933549      100020374
933551      100029723
933552      100029723
Without anything unique to use in a DELETE statement, I cannot figure out how to delete these duplicate rows.
Can anyone offer a solution? I was thinking about using a CURSOR and checking the prev row to the curr row, but I ran into trouble accessing the field names.
Thanks in advance for your time,
Steve DiDomenico, Nashua, NH

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

5409045121009-7368 SSCoach Points: 15740 More actions · Answer 1

Create table #Tmp(Id int,Donor_ID int)

GO

Insert #Tmp

Select 933579, 100009282 union all

Select 933580, 100009282 union all

Select 933550, 100020374 union all

Select 933549, 100020374 union all

Select 933551, 100029723 union all

Select 933552, 100029723

GO

Select Min(tt.Id) as 'Min_ID',r.Donor_ID

from (Select Distinct t.Donor_ID from #Tmp as t) as r(Donor_ID)

Inner Join #Tmp as tt

On r.Donor_ID=tt.DOnor_ID

Group by r.Donor_ID

Order by r.Donor_ID

GO

Drop Table #Tmp

GO

5409045121009-7368 SSCoach Points: 15740 More actions · Answer 2

More Complete Solution!

Create table #Tmp(Id int,Donor_ID int)

GO

Insert #Tmp

Select 933579, 100009282 union all -- Multi duplicates

Select 933581, 100009282 union all

Select 933582, 100009282 union all

Select 933583, 100009282 union all

Select 933584, 100000000 union all -- Single

Select 933580, 100009282 union all

Select 933550, 100020374 union all

Select 933549, 100020374 union all

Select 933551, 100029723 union all

Select 933552, 100029723

GO

Delete from #Tmp

From (Select tt.Donor_ID,

Min(tt.Id) as 'Min_ID'

From #Tmp as tt

Group by tt.Donor_ID) as f(Donor_ID,Min_ID)

Where #Tmp.Donor_ID=f.Donor_ID

And #Tmp.ID<>f.Min_ID

GO

Select * from #Tmp Order by Donor_ID

GO

Drop Table #Tmp

GO

babuganesh2000 Hall of Fame Points: 3123 More actions · Answer 3

One more way

DELETE

FROM #Tmp

WHERE Id IN

(

SELECT Id

FROM #Tmp

WHERE Id NOT IN

(

SELECT MAX(Cast(Id AS varchar(15)))

FROM #Tmp

GROUP BY Donor_ID

)

Thanks,

Ganesh

ZenDada SSCarpal Tunnel Points: 4637 More actions · Answer 4

I keep this terrific article Chris wrote in my virtual briefcase:

http://qa.sqlservercentral.com/columnists/ccubley/findinganddeletingduplicatedata.asp

[font="Courier New"]ZenDada[/font]

Steve DiDomenico Ten Centuries Points: 1078 More actions · Answer 5

Thank you all for the posts.

I have a solution using the following sub select as a base and then additions to the WHERE clause:

(

SELECT MAX(Cast(Id AS varchar(15)))

FROM actual_table_name

GROUP BY donor_id

&nbsp

Brian Cidern SSC Veteran Points: 263 More actions · Answer 6

Simple solution -- I do this all the time.

DELETE a

FROM t_table a

INNER JOIN t_table b

ON (

a.donor_id = b.donor_id AND

a.id > b.id)

ZenDada SSCarpal Tunnel Points: 4637 More actions · Answer 7

Simple and elegant.

I am at a loss to understand why we inherit such tables in the first place?! Primary Keys, Unique Constraints... these seem to be foreign concepts to some.

[font="Courier New"]ZenDada[/font]

bcb Old Hand Points: 391 More actions · Answer 8

I've been using this one for quite some time. I like it because you can use it for combination keys (i.e. where the primary key is more than one column). Just substitute the key columns with fld1, fld1. (Also be sure to change the datatype for @fld1 and @fld2 as needed.)

/* You can delete your duplicates using a batch like:

(where fld1, fld2 are the fields that are the keys)

*/

DECLARE @Dups int, @fld1 int, @fld2 int

DECLARE Dups CURSOR FOR

SELECT fld1, fld2, COUNT(*) - 1 AS Duplicates

FROM Table1

GROUP BY fld1, fld2

HAVING COUNT(*) > 1

OPEN Dups

WHILE(1=1)

BEGIN

FETCH NEXT FROM Dups INTO @fld1, @fld2, @Dups

IF @@FETCH_STATUS < 0 BREAK

SET ROWCOUNT @Dups

DELETE Table1 WHERE fld1 = @fld1 And fld2 = @fld2

SET ROWCOUNT 0

END

DEALLOCATE Dups