Duplicate record delete

  • Hi All ,

    I want to delete the duplicate record from a table keeping 1 record aside.

    my base table is-info

    idnameclass

    2abc6a

    3abc6a

    4abc6a

    1abc6a

    2abc6a

    4abc6a

    4abc6a

    3abc6a

    3abc6a

    1abc6a

    2abc6a

    5abc6a

    id-int

    name-text

    class-varchar

    (there is no primary key in this table)

    Now i want the result in following way:

    idnameclass

    2abc6a

    3abc6a

    4abc6a

    1abc6a

    I have tried the following query and its running fine but its not a dynamic stuff.

    DELETE top (SELECT COUNT(*)-1 FROM aaa WHERE id ='3') --or put some number

    FROM aaa

    WHERE id ='3'-- or put some number

    So i was wondering if some one could help me in this.

    Many thanks,

    Preetpal kapoor:-)

  • Hi.. please provide table create / insert data scripts

    we have been here before 🙂

    http://qa.sqlservercentral.com/Forums/FindPost1246891.aspx

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Ya we were there before also. Thanks for your help.Actually i am new to this and trying to catch thing quickly.Here you go.....

    create table info (id int,name text,class varchar(50))

    insert into info values('2','abc','6a')

    insert into info values('3','abc','6a')

    insert into info values('4','abc','6a')

    insert into info values('1','abc','6a')

    insert into info values('2','abc','6a')

    insert into info values('4','abc','6a')

    insert into info values('4','abc','6a')

    insert into info values('3','abc','6a')

    insert into info values('3','abc','6a')

    insert into info values('1','abc','6a')

    insert into info values('2','abc','6a')

    insert into info values('5','abc','6a')

    😉

  • preetpalkapoor (2/5/2012)


    Ya we were there before also. Thanks for your help.Actually i am new to this and trying to catch thing quickly.Here you go.....

    create table info (id int,name text,class varchar(50))

    insert into info values('2','abc','6a')

    insert into info values('3','abc','6a')

    insert into info values('4','abc','6a')

    insert into info values('1','abc','6a')

    insert into info values('2','abc','6a')

    insert into info values('4','abc','6a')

    insert into info values('4','abc','6a')

    insert into info values('3','abc','6a')

    insert into info values('3','abc','6a')

    insert into info values('1','abc','6a')

    insert into info values('2','abc','6a')

    insert into info values('5','abc','6a')

    😉

    Name is a TEXT datatype??? Seriously??? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • This will do as you asked. Are you sure you want to delete items that only have a single instance like your original post seems to indicate where the ID = 5?

    WITH

    cteNumberTheDupes AS

    (

    SELECT DupeNum = ROW_NUMBER() OVER (PARTITION BY ID, CAST(Name AS VARCHAR(MAX)), Class ORDER BY (SELECT NULL)),

    DupeCount = COUNT(*) OVER (PARTITION BY ID, CAST(Name AS VARCHAR(MAX)), Class)

    FROM info

    )

    DELETE cteNumberTheDupes

    WHERE DupeNum > 1

    OR DupeCount = 1

    ;

    SELECT * FROM info;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • with cte (id,name,class,rn)

    as

    (select id,name,class,ROW_NUMBER() over (PARTITION by id ,name,class order by ID) from jtdup)

    delete from cte

    where rn>1;

    select * from jtdup

  • weston_086 (2/5/2012)


    with cte (id,name,class,rn)

    as

    (select id,name,class,ROW_NUMBER() over (PARTITION by id ,name,class order by ID) from jtdup)

    delete from cte

    where rn>1;

    select * from jtdup

    That won't work if the name column is actually of the TEXT datatype like the op posted. You'll need to convert it to VARCHAR(MAX).

    After that and if the single ID = 5 isn't supposed to be deleted, that'll work just fine.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks every one for your support.

    i got 3 methods to do that.

    1. To make temp table with i will check with my real time DB and is the easiest method.

    2. With CTE. Actually i dnt have any knowledge about CTE.

    3.It was my way but a lengthy one.

    DELETE top (SELECT COUNT(*)-1 FROM aaa WHERE id ='3') --or put some number

    FROM aaa

    WHERE id ='3'

    here i need to always change the ID.

    🙂

  • preetpalkapoor (2/6/2012)


    Thanks every one for your support.

    i got 3 methods to do that.

    1. To make temp table with i will check with my real time DB and is the easiest method.

    2. With CTE. Actually i dnt have any knowledge about CTE.

    3.It was my way but a lengthy one.

    DELETE top (SELECT COUNT(*)-1 FROM aaa WHERE id ='3') --or put some number

    FROM aaa

    WHERE id ='3'

    here i need to always change the ID.:-)

    A CTE is a very powerful tool, and it would be to your benefit to learn where it can be used, and how to use. That said I suggest you read these two articles:

    SQL Server CTE Basics

    http://www.simple-talk.com/sql/t-sql-programming/sql-server-cte-basics

    Using Common Table Expressions

    SQL Server 2008

    http://msdn.microsoft.com/en-us/library/ms190766(v=sql.100).aspx

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Bring you results into a cursor and then use rowcount -1 row to delete like for example, this will then delete all your duplicate rows bar 1

    declare @col1 datatype,

    @col2 datatype,

    @col3 datatype,

    @cnt int

    create del cursor read only

    for

    select col1,col2,col3,count(*)

    group by col1,col2,col3

    having count(*)>1

    open del

    fetch next from del into @co1,@col2@col3,@cnt

    while @@fetchstatus=0

    begin

    set rowcount @cnt-1

    delete from table where col1=@col1

    set rowcount 0

    fetch next from del into @co1,@col2@col3,@cnt

    end

    close del

    deallocate del

    ***The first step is always the hardest *******

  • SGT_squeeqal (2/6/2012)


    Bring you results into a cursor and then use rowcount -1 row to delete like for example, this will then delete all your duplicate rows bar 1

    declare @col1 datatype,

    @col2 datatype,

    @col3 datatype,

    @cnt int

    create del cursor read only

    for

    select col1,col2,col3,count(*)

    group by col1,col2,col3

    having count(*)>1

    open del

    fetch next from del into @co1,@col2@col3,@cnt

    while @@fetchstatus=0

    begin

    set rowcount @cnt-1

    delete from table where col1=@col1

    set rowcount 0

    fetch next from del into @co1,@col2@col3,@cnt

    end

    close del

    deallocate del

    Why would you do this when the proposed solution is much cleaner and will perform much better?

    Jared
    CE - Microsoft

  • SGT_squeeqal (2/6/2012)


    Bring you results into a cursor and then use rowcount -1 row to delete like for example, this will then delete all your duplicate rows bar 1

    declare @col1 datatype,

    @col2 datatype,

    @col3 datatype,

    @cnt int

    create del cursor read only

    for

    select col1,col2,col3,count(*)

    group by col1,col2,col3

    having count(*)>1

    open del

    fetch next from del into @co1,@col2@col3,@cnt

    while @@fetchstatus=0

    begin

    set rowcount @cnt-1

    delete from table where col1=@col1

    set rowcount 0

    fetch next from del into @co1,@col2@col3,@cnt

    end

    close del

    deallocate del

    Even after assigning the correct data types to the variables attempting this code results in:

    Msg 343, Level 15, State 1, Line 6

    Unknown object type 'del' used in a CREATE, DROP, or ALTER statement.

    Msg 137, Level 15, State 2, Line 15

    Must declare the scalar variable "@co1".

    Msg 137, Level 15, State 2, Line 17

    Must declare the scalar variable "@@fetchstatus".

    Msg 102, Level 15, State 1, Line 21

    Incorrect syntax near '-'.

    Msg 137, Level 15, State 2, Line 25

    Must declare the scalar variable "@co1".

    Briefly

    1. you do not CREATE a cursor the correct statement is:

    DECLARE del cursor

    2. "@@fetchstatus". should be @@FETCH_STATUS

    In general if you have more than a few rows to delete a cursor is NOT set based, rather in the words of Jeff Moden it is "ROW BY AGONIZING ROW". Where as a CTE is SET based

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Yer you spotted my mistake, i wasnt offering code i was offering a method that i use to remove duplicates :):-D

    my actual SQL i used in the past to delete duplicates

    Declare @count int,

    @pod int,

    @ordindx int,

    @value varchar(50),

    @del int

    Declare Duplicates cursor read_only

    for

    SELECT count(*),PropertyValueListOID,CAST(Value as varbinary)

    FROM ValueListMember (nolock)

    group by PropertyValueListOID,CAST(Value as varbinary)

    having COUNT(*)>1

    open Duplicates

    Fetch next from Duplicates into @count, @pod, @value

    while @@FETCH_STATUS = 0

    Begin

    set @del=@count-1

    set rowcount @del

    delete from ValueListMember

    where PropertyValueListOID=@pod

    and CAST(Value as varbinary)=@value

    set rowcount 0

    Fetch next from Duplicates into @count, @pod, @value

    END

    Close Duplicates

    Deallocate Duplicates

    ***The first step is always the hardest *******

  • SGT_squeeqal

    If you do use a cursor, you might want to compare its performance with a

    CTE such as this sample. I believe if you do and have a reasonable

    number of duplicate rows the CTE will give you better performance by far.

    CREATE TABLE #T(Date_Stamp DATETIME,KW1 DECIMAL(5,1), KW2 DECIMAL(5,1))

    INSERT INTO #T

    SELECT '12/10/2010', 5.3, 3.1 UNION ALL

    SELECT '12/10/2010', 5.3, 3.1 UNION ALL

    SELECT '12/9/2010', 4, 2 UNION ALL

    SELECT '12/8/2010', 3, 1 UNION ALL

    SELECT '12/7/2010', 7.4, 5 UNION ALL

    SELECT '12/7/2010', 7.4, 5 UNION ALL

    SELECT '12/7/2010', 7.4, 5

    ;with cte

    as (select row_number() over(partition by Date_Stamp,KW1,KW2 order by Date_Stamp) as rn,

    KW1,KW2

    from #T)

    --Use the DELETE ONLY after testing

    DELETE FROM cte WHERE rn > 1

    --SELECT * FROM cte --to show all results

    --Use the following to test the results

    --SELECT * FROM cte WHERE rn > 1 -- to show only items that will be deleted

    SELECT * FROM #T

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Yer will do thanks never used CTE's as only just migrated to SQL2008r2 from 2000 🙂 the joys:w00t:

    ***The first step is always the hardest *******

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply