need to filter rows from a table by distinct

  • i need to get all the distinct rows from a table. it works fine when i try

     

    select distinct ( col1) from tableA

     

    but when i try :

     

    select distinct(col1),col2,  col3 from tableA

    it doesnt work.

     

    it also doesnt wok if i try

    select distinct( col1,col2,col3)from tableA.

     

    i really need the rows from the table with a distinct col1 value. any ideas?

  • See this recent, similar thread:

    http://qa.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=282795

    Distinct works across all the columns in the result set. If you want unique col1 values, but also want col2 & col3 in your resultset, you need to provide the business rules about which records to include if col1 values repeat with different col2/3

     

  • i have read the post which you provided a link to, but im still a bit hazy.

    heres my exact problem. i some rows in my table( not very many) are duplicated exactly.

    i need to be able to 1 of each of the duplicated rows into a temp table, i can then delete all the duplicates from the main table and insert the rows back in from the temp table. i need all the data from all columns. i now realise that distinct only works on one col but how do i get the effect of a distinct for each row in this case?

     

  • DISTINCT syntax does not use parentheses. This should work:

    SELECT DISTINCT col1,col2,col3

    INTO DeDupedTable

    FROM tableA

  • the problem with that is it will select all rows in the table that are distinct, which is most rows. i only wnt the ones that are duplicates

  • I think the confusion might be based on your definition of the terms distinct and duplicate. Both imply that the combination of all of the columns in a select list are distinct, not just some of the values, but I'm thinking that's not the case with your situation.

    Are you trying to identify rows where col1 is distinct, and then return that as well as col2 and col3, or are you wanting the combination of col1, col2, and col3 to be distinct?

    If the latter, PW's code should do the trick for you.

  • Upon reading your latest message, it sounds like you want all rows where the values in col1, col2, and col3 taken together appear in more than 1 row.

    Try: SELECT col1, col2, col3 FROM tableA GROUP BY col1, col2, col3 HAVING Count(1) > 1

  • i want to return all the data from all the rows rows where col1 is not distinct. col1 isnt a primary key, but its still supposed to be distinct.

    so i want col1, col2, col3,..... where col1 is not distinct

  • here is the actual query i am running to get the data i require ( sorry its a bit long but it might demonstraight exactly what im looking to get:

     

    select

    InvoiceGuid

    ,CreditNoteID

    ,InvoiceHeaderID

    ,InvoiceBatchID

    ,BillToAccountName

    ,BilltoAccountCountry

    ,BillToAccountCountryCode

    ,BillToContactName

    from mcsInvoice

    goup

    by invoicebatchId

    having

    count(invoiceBatchId)>1

     

     

    i get this error:

     

    Msg 156, Level 15, State 1, Line 59

    Incorrect syntax near the keyword 'by'.

     

     

    Can anyone tell me what the problem is here?

  • There is no "r" in your GROUP BY statement.

    It won't do what it seems like you're wanting anyway, as you aren't aggregating. Based on what I think is your intent, try this:

    SELECT

     InvoiceGuid

     ,CreditNoteID

     ,InvoiceHeaderID

     ,InvoiceBatchID

     ,BillToAccountName

     ,BilltoAccountCountry

     ,BillToAccountCountryCode

     ,BillToContactName

    FROM

     mcsInvoice

    WHERE

     InvoiceBatchID IN (

          SELECT

           InvoiceBatchID

          FROM

           mcsInvoice

          GROUP BY

           InvoiceBatchID

          HAVING

           Count(1) > 1)

  • SELECT

     InvoiceGuid

     ,CreditNoteID

     ,InvoiceHeaderID

     ,InvoiceBatchID

     ,BillToAccountName

     ,BilltoAccountCountry

     ,BillToAccountCountryCode

     ,BillToContactName

    FROM

     mcsInvoice

    WHERE

     InvoiceBatchID IN (

          SELECT

           InvoiceBatchID

          FROM

           mcsInvoice

          GROUP BY

           InvoiceBatchID

          HAVING

           Count(1) > 1)

     

    i have tried this and it returns everything in my table. there are over 3000 rows in the table. i am pretty sure there are about 600 duplicates

  • When you say distinct and duplicate, are you referring to the InvoiceBatchID? I made that assumption based on the code you pasted.

    If that's the case, try "SELECT Count(Distinct InvoiceBatchID) FROM mcsInvoice" and tell us what number you get back. My code above, while untested, should do that, if InvoiceBatchID is indeed the column you want to find the dupes on.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply