how to delete all except results of select statement

  • How can I delete everything from a table except for the results of a select statement? Like this:

    delete from gwinn_hw where not in (select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, Max(d_vital)

    from gwinn_hw

    group by ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id)

    I'm just not seeing how to do this. Thanks for any ideas.

  • This should work, but you might want to run this against a copy of your table first to ensure that the results are what you are looking for.

    select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, Max(d_vital) as d_vital

    into #gh2

    from gwinn_hw gh2

    group by ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id

    Delete

    From gwinn_hw

    Where Not Exists

    (

    select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id,d_vital

    from #gh2 gh2

    Where gwinn_hw.ptid = gh2.ptid and

    gwinn_hw.pat_ext_id = gh2.pat_ext_id and

    gwinn_hw.pat_last_name = gh2.pat_last_name and

    gwinn_hw.pat_first_name = gh2.pat_first_name and

    gwinn_hw.gwn_practice_id = gh2.gwn_practice_id and

    gwinn_hw.kis_practice_id = gh2.kis_practice_id and

    gwinn_hw.d_vital = gh2.d_vital

    )

    If Object_ID(N'tempdb.dbo.#gh2'N'U') is not null

    Drop table #gh2

  • Like so much of SQL the answer is it depends. In this case is the number of rows which will remain, a in significant number or percentage of the total number of rows in the existing table. If it is you might want to consider.

    1. Create a table, lets call it gwinn_hw_2 to be an exact match of the existing table gwinn_hw. That is column sequence, data type, size etc., etc.

    2. User your

    (select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, Max(d_vital) from gwinn_hw to insert the data you wish to preserve into table gwinn_hw_2.

    (A most important advantage, is you can review what is in the table gwinn_hw_2 to be sure it is all that you require)

    3. Drop the original table gwinn_hw.

    4. Rename gwinn_hw_2 back to gwinn_hw

    Or

    1a. In the above where I have used gwinn_hw_2, make that a temp table named #gwinn_hw to be an exact match of the existing table gwinn_hw. That is column sequence, data type, size etc., etc.

    2a. User your SELECT statement to insert the data you wish to preserve into table #gwinn_hw

    (A side advantage, is you can review what is in the table #gwinn_hw to be sure it is all that you require.)

    3a. Truncate the table gwinn_hw

    4a. Insert the data from #gwinn_hw into gwinn_hw.

    In either case test, test, and test again before executing step 3 in either of the suggestions above.

    Realize when complete you should check your indexes, statistics .. etc.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Thank you for those ideas.

    I was hoping to do this without a second table though. I'm not using all the fields in the table to get the max date that I need.

    How could I add in the other fields, which are height and weight?

    My real goal here is to get the information for the most recent date. But when I add height and weight in I get everything because that can be different on different dates.

    Again thanks for your help.

  • Denise McMillan (3/30/2012)


    Thank you for those ideas.

    I was hoping to do this without a second table though. I'm not using all the fields in the table to get the max date that I need.

    How could I add in the other fields, which are height and weight?

    My real goal here is to get the information for the most recent date. But when I add height and weight in I get everything because that can be different on different dates.

    Again thanks for your help.

    The above will probably change any suggestion previously posted to be irrelevant to your actual problem.

    Could / would you post the table definition, some sample data, and required results.

    Please click on the first link in my signature block. The article referenced includes T-SQL code to allow you to do this quickly and easily.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Is something like this what you mean to do (try it in a Test database :-))?

    delete g

    FROM gwinn_hw g

    LEFT OUTER JOIN (

    select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id

    , Max(d_vital) as d_vital

    from gwinn_hw

    group by ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id) x

    ON x.ptid = g.ptid and x.pat_ext_id = g.pat_ext_id and x.pat_last_name = g.pat_last_name and

    x.pat_first_name = g.pat_first_name and x.gwn_practice_id = g.gwn_practice_id and

    x.kis_practice_id = g.kis_practice_id and x.d_vital <> g.d_vital

    WHERE ptid IS NULL

    Edit: I used = on d_vital but meant to use <>


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • My preference for this would be the use of CTEs (Common Table Expressions); I find this the easiest way to visualise a problem because you can break it into discreet testable steps. You don't need CTES, you could use nested SELECTs for co-Related sub queries.

    1) Get the records you want to keep

    2) get their unique keys

    3) delete records whose unique key is not in this list

    E.g.

    WITH CTE01 AS -- Get the last date for each combo

    (

    SELECT Item_ID,height,width,weight,max(insert_date) as 'insert_date'

    FROM myTable

    Group by Item_ID,height,width,weight,max(insert_date)

    )

    ,

    CTE02 AS -- get the primary key

    (

    Select PK_ID from myTable T

    JOIN CTE01 C where T.item_ID = C.Item_ID

    and T.height = C.height

    and T.width = c.width

    and T.weight = C.weight

    and T.insert_date = C.insert_date

    )

    Delete from myTable T where T.PK_ID not in (Select PK_ID from CTE02)

  • aaron.reese (4/2/2012)


    My preference for this would be the use of CTEs (Common Table Expressions); I find this the easiest way to visualise a problem because you can break it into discreet testable steps. You don't need CTES, you could use nested SELECTs for co-Related sub queries.

    1) Get the records you want to keep

    2) get their unique keys

    3) delete records whose unique key is not in this list

    E.g.

    WITH CTE01 AS -- Get the last date for each combo

    (

    SELECT Item_ID,height,width,weight,max(insert_date) as 'insert_date'

    FROM myTable

    Group by Item_ID,height,width,weight,max(insert_date)

    )

    ,

    CTE02 AS -- get the primary key

    (

    Select PK_ID from myTable T

    JOIN CTE01 C where T.item_ID = C.Item_ID

    and T.height = C.height

    and T.width = c.width

    and T.weight = C.weight

    and T.insert_date = C.insert_date

    )

    Delete from myTable T where T.PK_ID not in (Select PK_ID from CTE02)

    slight rewrite of the delete statement to get rid of the sub select if its the way you want to go.

    WITH CTE01 AS -- Get the last date for each combo

    (

    SELECT Item_ID,height,width,weight,max(insert_date) as 'insert_date'

    FROM myTable

    Group by Item_ID,height,width,weight,max(insert_date)

    )

    ,

    CTE02 AS -- get the primary key

    (

    Select PK_ID from myTable T

    JOIN CTE01 C where T.item_ID = C.Item_ID

    and T.height = C.height

    and T.width = c.width

    and T.weight = C.weight

    and T.insert_date = C.insert_date

    )

    --Nothing changed above, included to make things easier.

    Delete t from myTable T

    LEFT JOIN CTE02 c

    ON t.PK_ID = c.PK_ID

    WHERE t.PK_ID IS NULL

    EDIT: Missed /code tag


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • yes, that is probably a better syntax (personally I think it is a little harder to read) but it will work for multi-part primary keys whereas the IN (SELECT...) won't

  • aaron.reese (4/2/2012)


    yes, that is probably a better syntax (personally I think it is a little harder to read) but it will work for multi-part primary keys whereas the IN (SELECT...) won't

    i found removing sub selects when ever possible opens up more possibilities for selectivity. also makes code a little easier to read IMHO


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • Here is another possibility. Since you haven't provided test data, it's difficult to compare the various methods. This method only scans the table once, but has more logical reads.

    WITH CTE AS (

    SELECT ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, d_vital

    , ROW_NUMBER() OVER(

    PARTITION BY ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id

    ORDER BY d_vital DESC) AS rn

    FROM gwinn_hw

    )

    DELETE FROM CTE

    WHERE rn > 1

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply