Removing Duplicates

Question

Post reply

Removing Duplicates

JulieW

SSChasing Mays

Points: 600
More actions
October 15, 2003 at 8:20 am

#176801

Dear All,
I am migrating data over to a new system written in a completely different lauguage, the database is not SQL but one completely different.
In this DB we have a persons table, which can be summerised as follows; -
1Smith Peter
2Smith Mary
3Smith Tony
4Smith Heather
5Smith Peter
6Smith Mary
7Jones Matt
8Jones George
9Jones Henry
10Jones George
I need SQL to return the first occurances of the duplicates ID, so I am afrer sql which will return 1, 2, 3, 4, 7, 8, 9.
Can anyone help ?
Thanks
J

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

guarddata SSCertifiable Points: 6583 More actions · Answer 1

SELECT Name, MIN( ID )

FROM <table>

GROUP BY Name

Guarddata-

JulieW SSChasing Mays Points: 600 More actions · Answer 2

Hello,

My apologies I meant to put

1 Smith Peter

2 Smith Mary

3 Smith Tony

4 Smith Heather

7 Jones Matt

8 Jones George

9 Jones Henry

As the result so the Min(ID) solution will not work.

Thanks

J

guarddata SSCertifiable Points: 6583 More actions · Answer 3

OK - help me understand. If you just change the order of the column list, do you not get the information?

SELECT MIN(ID) as ID, Name

FROM <table>

GROUP BY Name

ORDER BY MIN(ID), Name

Guarddata-

Mohamed El-Deeb SSC Veteran Points: 263 More actions · Answer 4

Well, I guess these queries were enough if you ONLY need to return the first record for each DISTINCT nameIf you are migrating data and want to remove the extra duplicates here is another query:

DELETE FROM persons p1
WHERE (SELECT COUNT(*)
 FROM persons p2
 WHERE p1.name = p2.name) >1AND p1.ID <>
  (SELECT MIN(ID)
    FROM persons p3
    WHERE p1.name = p2.name)

This query is not so complex as it might appear; For each record in the persons table, if the same name did appear more than once, all these records will be deleted except for the first one (the one having the MIN(ID))

Mohamad Ad-Deeb
Senior Software Developer, Microtech
3 Mossadak st. Dokki 12311
Giza, Egypt.
Tel. +20 2 336 9855 (Ext.112)
Fax +20 2 749 8784
http://www.microtech-eg.com

covitz Old Hand Points: 322 More actions · Answer 5

A direct way to do this is using the "distinct" qualifier, as in:

select distinct names

into newtable

from oldtable

if you have a lot of columns in the old table then you can do something like:

select *

into newtable

from oldtable

where names in (select names from oldtable group by names having count(*)=1)

order by names

Charles

Calvin Lawson SSChampion Points: 11030 More actions · Answer 6

delete p

From #People p (nolock)

JOIN#People p1 (nolock) on p.pName = p1.pName

wherep.pID > p1.Pid

Here's a extremely simple high performance way to do it in one statement:

Join the table to itself on the value you want to be unique (pName is this case). In the where clause filter all records where the pID is less than itself (the minimum). Everything else gets' deleted:

FULL code below:

/*------------------------------------------

Create test data

*/----------------------------------------------

if object_id('tempdb..#people') is not null drop table #People

create table #People (pID int identity, pName varchar(50))

Insert #People Values ('Smith Peter')

Insert #People Values ('Smith Mary')

Insert #People Values ('Smith Tony')

Insert #People Values ('Smith Heather')

Insert #People Values ('Smith Peter')

Insert #People Values ('Smith Mary' )

Insert #People Values ('Jones Matt' )

Insert #People Values ('Jones George' )

Insert #People Values ('Jones Henry' )

Insert #People Values ('Jones George')

/*---------------------------------

test deletion

*/------------------------------------

select * from #People

delete p

From #People p (nolock)

JOIN#People p1 (nolock) on p.pName = p1.pName

wherep.pID > p1.Pid

select * from #People

Signature is NULL