Deleting Duplicate Rows

  • Hello,

    I have duplicate rows in a table such as this:

    id          donor_id   

    ----------- -----------

    933579      100009282

    933580      100009282

    933550      100020374

    933549      100020374

    933551      100029723

    933552      100029723

    Without anything unique to use in a DELETE statement, I cannot figure out how to delete these duplicate rows.

    Can anyone offer a solution?  I was thinking about using a CURSOR and checking the prev row to the curr row, but I ran into trouble accessing the field names.

    Thanks in advance for your time,

    Steve DiDomenico, Nashua, NH

     

  • Create table #Tmp(Id int,Donor_ID int)

    GO

    Insert #Tmp

     Select 933579,      100009282 union all

     Select 933580,     100009282 union all

     Select 933550,     100020374 union all

     Select 933549,     100020374 union all

     Select 933551,     100029723 union all

     Select 933552,     100029723

    GO

    Select Min(tt.Id) as 'Min_ID',r.Donor_ID

     from (Select Distinct t.Donor_ID from #Tmp as t) as r(Donor_ID)

      Inner Join #Tmp as tt

       On r.Donor_ID=tt.DOnor_ID

     Group by r.Donor_ID

     Order by r.Donor_ID

    GO

    Drop Table #Tmp

    GO

  • More Complete Solution!

    Create table #Tmp(Id int,Donor_ID int)

    GO

    Insert #Tmp

     Select 933579,      100009282 union all -- Multi duplicates

     Select 933581,      100009282 union all

     Select 933582,      100009282 union all

     Select 933583,      100009282 union all

     Select 933584,      100000000 union all -- Single

     Select 933580,     100009282 union all

     Select 933550,     100020374 union all

     Select 933549,     100020374 union all

     Select 933551,     100029723 union all

     Select 933552,     100029723

    GO

    Delete from #Tmp

     From (Select tt.Donor_ID,

        Min(tt.Id) as 'Min_ID'

       From #Tmp as tt

       Group by tt.Donor_ID) as f(Donor_ID,Min_ID)

     Where #Tmp.Donor_ID=f.Donor_ID

       And #Tmp.ID<>f.Min_ID

    GO

    Select * from #Tmp Order by Donor_ID

    GO

    Drop Table #Tmp

    GO

  • One more way

    DELETE

      FROM #Tmp

     WHERE Id IN

          (

          SELECT Id

            FROM #Tmp

           WHERE Id NOT IN

               (

                SELECT MAX(Cast(Id AS varchar(15)))

                  FROM #Tmp

                 GROUP BY Donor_ID

               )

           )

     

    Thanks,

    Ganesh

  • I keep this terrific article Chris wrote in my virtual briefcase:

    http://qa.sqlservercentral.com/columnists/ccubley/findinganddeletingduplicatedata.asp

    [font="Courier New"]ZenDada[/font]

  • Thank you all for the posts.

    I have a solution using the following sub select as a base and then additions to the WHERE clause:

     (

       SELECT MAX(Cast(Id AS varchar(15))) 

       FROM actual_table_name 

                 GROUP BY donor_id

    &nbsp

  • Simple solution -- I do this all the time.

    DELETE a

     FROM t_table a

     INNER JOIN t_table b

     ON (

      a.donor_id = b.donor_id AND

      a.id > b.id)

  • Simple and elegant. 

    I am at a loss to understand why we inherit such tables in the first place?!  Primary Keys, Unique Constraints... these seem to be foreign concepts to some.

    [font="Courier New"]ZenDada[/font]

  • I've been using this one for quite some time. I like it because you can use it for combination keys (i.e. where the primary key is more than one column). Just substitute the key columns with fld1, fld1. (Also be sure to change the datatype for @fld1 and @fld2 as needed.)

    /* You can delete your duplicates using a batch like:

     (where fld1, fld2 are the fields that are the keys)

    */

    DECLARE @Dups int, @fld1 int, @fld2 int

    DECLARE Dups CURSOR FOR

    SELECT fld1, fld2, COUNT(*) - 1 AS Duplicates

    FROM Table1

    GROUP BY fld1, fld2

    HAVING COUNT(*) > 1

    OPEN Dups

    WHILE(1=1)

    BEGIN

    FETCH NEXT FROM Dups INTO @fld1, @fld2, @Dups

    IF @@FETCH_STATUS < 0 BREAK

    SET ROWCOUNT @Dups

    DELETE Table1 WHERE fld1 = @fld1 And fld2 = @fld2

    SET ROWCOUNT 0

    END

    DEALLOCATE Dups

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply