How to find duplicates for complete raw without listing all columns ?

  • Hi,

    I have pretty big table, like 40+ columns and trying to find/delete complete duplicates, i.e. dups by all 45 columns, is there any easy way to do this without specifying whole list of columns in GROUP BY or doing DISTINCT * ?

    this is just sample to try for 5 columns, is it possible to sipmlify this select ?

    ;WITH c1 AS (

    SELECT 1 c1, 2 c2, 3 c3, 4 c4, 5 c5 UNION ALL

    SELECT 1 c1, 2 c2, 3 c3, 4 c4, 77 c5 UNION ALL

    SELECT 1 c1, 2 c2, 3 c3, 4 c4, 5 c5 )

    SELECT *, COUNT(*) cc FROM c1 GROUP BY c1,c2,c3,c4,c5 HAVING COUNT(*) > 1

    Tx mario

  • Not as far as I know.

    However, when you work in SSMS you can drag theh "columns" entry from object explorer on the SSMS query window, and it will drop a full list of all columns in the table. That might help you get a head start.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • No. How would you know what a duplicate is? This isn't a obvious as you might think.

  • If this is just a one-time ad-hoc statement you want an easy way to construct, then Hugo's way will work nicely.

    If you have to do this for a bunch of different tables, then it might make sense to dynamically construct the T-SQL statement instead.

    This illustrates the basic idea:

    DECLARE @sql nvarchar(max);

    SET @sql='SELECT row_count=COUNT(*),*

    FROM your_table

    GROUP BY '+(SELECT STUFF((SELECT ','+QUOTENAME(name)

    FROM sys.columns

    WHERE object_id=object_id('your_table')

    FOR XML PATH ('')),1,1,''))+'

    HAVING COUNT(*)>1';

    EXEC(@sql);

    Of course, there are still other concerns. In addition to what Steve said, being able to have rows that are exactly the same across all columns suggests there are design improvements possible. It may well be that the best thing going forward is to define an appropriate constraint so that exact duplicates are prevented.

    For the specific problem you mentioned, though, Hugo's SSMS tip or the dynamic T-SQL approach should make your job a bit easier.

    Cheers!

  • Tx much all !!!

    M

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply