I need to delete 20 million rows from a table...

  • Siberian Khatru (3/14/2014)


    For me, the keys were not needing to worry about getting all of the rows back, plus the minimal logging aspect.

    Well....

    Both delete and truncate can be rolled back, so there's no difference for the first requirement. Both delete and truncate are fully logged operations, so there's no difference for second requirement either.

    Truncate does not generate a lot of log, neither does DROP TABLE, they log much the same way, but that's a very, very, very different thing to saying they are minimally logged (they're not)

    /end pedantic

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/14/2014)


    Siberian Khatru (3/14/2014)


    For me, the keys were not needing to worry about getting all of the rows back, plus the minimal logging aspect.

    Well....

    Both delete and truncate can be rolled back, so there's no difference for the first requirement. Both delete and truncate are fully logged operations, so there's no difference for second requirement either.

    Truncate does not generate a lot of log, neither does DROP TABLE, they log much the same way, but that's a very, very, very different thing to saying they are minimally logged (they're not)

    /end pedantic

    Hopefully, the 'ligh-bulb' moment (:hehe:) will occur soon and ppl will twig what you're saying Gail.

    I don't think you're being pedantic either. There are already tooooo many myths/misunderstandings about some areas of SQL Server, and whist I don't believe they were malicious, there are some that can be downright dangerous :crazy:

  • humbleDBA (3/14/2014)


    Eric M Russell (3/13/2014)


    Siberian Khatru (3/12/2014)


    I have a table with @ 20 Million rows of data I want to truncate or delete. I don't need any of the data in it anymore at all. I also need to do this on disk space challenged servers. Each row is comprised of 71 characters divided as 35, 35 and 1 in the included columns. Would it be best to:

    - DELETE in batches (don't think so myself)

    - TRUNCATE the table (concerned how many pages get logged in this scenario)

    - Possibly? SCRIPT, DROP and RECREATE the offending table?

    The table itself is not foreign keyed anywhere, nor does it participate in an indexed view. It is not replicated anywhere, although it might be at some point. And yes, I know I need to get after the devs to create a process that limits growth here going forward -- that is not in my province to dictate however. I can merely "suggest" and hope which sucks, but I digress.

    I am an accidental, but long time and fairly capable DBA here where I work. In other words, I ask other experts before I leap on faith alone lol...

    What say you experts out there?

    Yes, attempting to approach this the wrong way will potentially hose your server for days, and rebooting a giant transaction half-way will only make things worse.

    I am confronted with the exact same scenario on occasion, backing out 10s or 100s of millions of rows from large tables in a data warehouse. You said that you're disk space challenged, so I'll tell you how to delete this 20 million row table while minimizing transaction log growth. It works by deleting the rows in batches.

    ...

    LOL...guess you haven't read the whole of the thread? Truncate did it in 1/2 a second :w00t:

    Sorry, couldn't resist :hehe:

    +100 to Gail, though. Great work and help. 🙂

    Well, if it took only 1/2 second to delete 20 million rows, then he's lucky. That's not typical; almost to the point of disbelief. The table must have had no indexes and very narrow rows. In the field, I've seen million row delete or truncate operations run for hours, hit an out of disk space error on the log drive half way (he said the server disk space challenged), and then spend all day long rolling back. By default, I'm sticking with and reccomending the batch delete method.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (3/14/2014)


    Well, if it took only 1/2 second to delete 20 million rows, then he's lucky. That's not typical; almost to the point of disbelief.

    TRUNCATE TABLE

    It doesn't process row by row, for a large table it will simply mark the entire table for deferred drop and return almost immediately, plus since it just has to log the deallocations not the actual rows it logs much, much less than delete. About the only thing which can cause truncate to take ages is it being blocked.

    For cases where truncate is not an option (need to keep some of the rows, table referenced by foreign keys, replicated), then deleting large amounts of rows in batches is essential.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Eric M Russell (3/14/2014)


    humbleDBA (3/14/2014)


    Eric M Russell (3/13/2014)


    Siberian Khatru (3/12/2014)


    I have a table with @ 20 Million rows of data I want to truncate or delete. I don't need any of the data in it anymore at all. I also need to do this on disk space challenged servers. Each row is comprised of 71 characters divided as 35, 35 and 1 in the included columns. Would it be best to:

    - DELETE in batches (don't think so myself)

    - TRUNCATE the table (concerned how many pages get logged in this scenario)

    - Possibly? SCRIPT, DROP and RECREATE the offending table?

    The table itself is not foreign keyed anywhere, nor does it participate in an indexed view. It is not replicated anywhere, although it might be at some point. And yes, I know I need to get after the devs to create a process that limits growth here going forward -- that is not in my province to dictate however. I can merely "suggest" and hope which sucks, but I digress.

    I am an accidental, but long time and fairly capable DBA here where I work. In other words, I ask other experts before I leap on faith alone lol...

    What say you experts out there?

    Yes, attempting to approach this the wrong way will potentially hose your server for days, and rebooting a giant transaction half-way will only make things worse.

    I am confronted with the exact same scenario on occasion, backing out 10s or 100s of millions of rows from large tables in a data warehouse. You said that you're disk space challenged, so I'll tell you how to delete this 20 million row table while minimizing transaction log growth. It works by deleting the rows in batches.

    ...

    LOL...guess you haven't read the whole of the thread? Truncate did it in 1/2 a second :w00t:

    Sorry, couldn't resist :hehe:

    +100 to Gail, though. Great work and help. 🙂

    Well, if it took only 1/2 second to delete 20 million rows, then he's lucky. That's not typical; almost to the point of disbelief. The table must have had no indexes and very narrow rows. In the field, I've seen million row delete or truncate operations run for hours, hit an out of disk space error on the log drive half way (he said the server disk space challenged), and then spend all day long rolling back. By default, I'm sticking with and reccomending the batch delete method.

    It wasn't indexed and it was narrow (three fields of 35, 35 and 1 char). Sadly, I don't have the ultimate say in who does what on a server I am responsible for. It's a regrettable situation for sure. But the issues you mentioned were my biggest concern as those problems would definitely unacceptable. Deleting by batch had occurred to me originally as you suggested, but I ended up using TRUNCATE.

  • humbleDBA (3/14/2014)


    GilaMonster (3/14/2014)


    Siberian Khatru (3/14/2014)


    For me, the keys were not needing to worry about getting all of the rows back, plus the minimal logging aspect.

    Well....

    Both delete and truncate can be rolled back, so there's no difference for the first requirement. Both delete and truncate are fully logged operations, so there's no difference for second requirement either.

    Truncate does not generate a lot of log, neither does DROP TABLE, they log much the same way, but that's a very, very, very different thing to saying they are minimally logged (they're not)

    /end pedantic

    Hopefully, the 'ligh-bulb' moment (:hehe:) will occur soon and ppl will twig what you're saying Gail.

    I don't think you're being pedantic either. There are already tooooo many myths/misunderstandings about some areas of SQL Server, and whist I don't believe they were malicious, there are some that can be downright dangerous :crazy:

    What is "twig" as used here? Also, I can't speak for others, but I can tell you that in my situation, I try my best to each time actually learn why an answer was given me so as to be able to spread the word to others. Trouble for me is that I am a DINO -- DBA in name (or blame) only, and people who know far less have serious impacts all the time. I have been thinking that it's simply time for me to spend a lot more time learning the true ins and outs of DBA work, and to get myself certified as what used to be called an MCDBA, if that still exists.

  • Siberian Khatru (3/14/2014)


    humbleDBA (3/14/2014)


    GilaMonster (3/14/2014)


    Siberian Khatru (3/14/2014)


    For me, the keys were not needing to worry about getting all of the rows back, plus the minimal logging aspect.

    Well....

    Both delete and truncate can be rolled back, so there's no difference for the first requirement. Both delete and truncate are fully logged operations, so there's no difference for second requirement either.

    Truncate does not generate a lot of log, neither does DROP TABLE, they log much the same way, but that's a very, very, very different thing to saying they are minimally logged (they're not)

    /end pedantic

    Hopefully, the 'ligh-bulb' moment (:hehe:) will occur soon and ppl will twig what you're saying Gail.

    I don't think you're being pedantic either. There are already tooooo many myths/misunderstandings about some areas of SQL Server, and whist I don't believe they were malicious, there are some that can be downright dangerous :crazy:

    What is "twig" as used here? Also, I can't speak for others, but I can tell you that in my situation, I try my best to each time actually learn why an answer was given me so as to be able to spread the word to others. Trouble for me is that I am a DINO -- DBA in name (or blame) only, and people who know far less have serious impacts all the time. I have been thinking that it's simply time for me to spend a lot more time learning the true ins and outs of DBA work, and to get myself certified as what used to be called an MCDBA, if that still exists.

    Hi Siberian,

    the 'twig' was simply what Gail had said several times...TRUNCATE is not a minimally logged operation, which you had mentioned in "..., plus the minimal logging aspect...."

    I used the term ppl, because you were not the only one to mention 'minimal logging' for TRUNCATE and I've known of other DBAs who misunderstand TRUNCATE, including saying that its not possible to roll back when using TRUNCATE - which is incorrect.

  • humbleDBA (3/17/2014)


    Siberian Khatru (3/14/2014)


    humbleDBA (3/14/2014)


    GilaMonster (3/14/2014)


    Siberian Khatru (3/14/2014)


    For me, the keys were not needing to worry about getting all of the rows back, plus the minimal logging aspect.

    Well....

    Both delete and truncate can be rolled back, so there's no difference for the first requirement. Both delete and truncate are fully logged operations, so there's no difference for second requirement either.

    Truncate does not generate a lot of log, neither does DROP TABLE, they log much the same way, but that's a very, very, very different thing to saying they are minimally logged (they're not)

    /end pedantic

    Hopefully, the 'ligh-bulb' moment (:hehe:) will occur soon and ppl will twig what you're saying Gail.

    I don't think you're being pedantic either. There are already tooooo many myths/misunderstandings about some areas of SQL Server, and whist I don't believe they were malicious, there are some that can be downright dangerous :crazy:

    What is "twig" as used here? Also, I can't speak for others, but I can tell you that in my situation, I try my best to each time actually learn why an answer was given me so as to be able to spread the word to others. Trouble for me is that I am a DINO -- DBA in name (or blame) only, and people who know far less have serious impacts all the time. I have been thinking that it's simply time for me to spend a lot more time learning the true ins and outs of DBA work, and to get myself certified as what used to be called an MCDBA, if that still exists.

    Hi Siberian,

    the 'twig' was simply what Gail had said several times...TRUNCATE is not a minimally logged operation, which you had mentioned in "..., plus the minimal logging aspect...."

    I used the term ppl, because you were not the only one to mention 'minimal logging' for TRUNCATE and I've known of other DBAs who misunderstand TRUNCATE, including saying that its not possible to roll back when using TRUNCATE - which is incorrect.

    I have to say I am learning a lot of new things in this PDF I've gotten from here on Transaction Log Management. It's amazing how much I've missed by simply not digging into it more. I haven't had to for years, but things have changed and so am I along with them now. Thanks for the feedback. 😀

  • +1 truncate. I ran into it recently but there is one cevate: if the table is referenced by foreign keys you can't truncate (even if the FK constraints are cascade delete). My (albeit shallow) understanding is that the truncation bypasses part of the logging that triggers (it seems that under the hood cascade delete is handled by hidden triggers) the triggers that get rid of the appropriate rows in the other table.

    Also truncate vs delete: if you think you might ever want to roll back this operation use delete, it will be sent to the transaction log so you can roll it back vs truncate which won't.

  • mike.gallamore (3/18/2014)


    +1 truncate. I ran into it recently but there is one cevate: if the table is referenced by foreign keys you can't truncate (even if the FK constraints are cascade delete). My (albeit shallow) understanding is that the truncation bypasses part of the logging that triggers (it seems that under the hood cascade delete is handled by hidden triggers) the triggers that get rid of the appropriate rows in the other table.

    Also truncate vs delete: if you think you might ever want to roll back this operation use delete, it will be sent to the transaction log so you can roll it back vs truncate which won't.

    I guess you haven't read the whole of this mike.gallamore. First off the OP said there were no FKs. However, thats not so important as you were just raising a valid point.

    However, your bit about TRUNCATE won't roll back is wrong. Gail had already mentioned this in the thread at least once, but here is one of her comments...

    GilaMonster (3/12/2014)


    karthik.bj (3/12/2014)


    But however if you see a need to roll back, use delete. DELETE LOGS ALL NECESSARY DATA FOR ROLLBACK and occupies space.

    So does truncate. Truncate can be rolled back exactly as delete can.

    It logs less because it operates at the page level not the row level, but truncate also logs all data necessary for a rollback and can be rolled back if run within a transaction (same as delete).

    other resources that confirm this are...

    http://sqlblog.com/blogs/kalen_delaney/archive/2010/10/12/tsql-tuesday-11-rolling-back-truncate-table.aspx

    http://sqlblog.com/blogs/denis_gobo/archive/2007/06/13/1458.aspx

    http://www.sqltechnet.com/2012/08/rollback-truncate-table-statement.html

    http://www.sqlperformance.com/2013/05/sql-performance/drop-truncate-log-myth

  • GilaMonster (3/14/2014)


    Eric M Russell (3/14/2014)


    Well, if it took only 1/2 second to delete 20 million rows, then he's lucky. That's not typical; almost to the point of disbelief.

    TRUNCATE TABLE

    It doesn't process row by row, for a large table it will simply mark the entire table for deferred drop and return almost immediately, plus since it just has to log the deallocations not the actual rows it logs much, much less than delete. About the only thing which can cause truncate to take ages is it being blocked.

    For cases where truncate is not an option (need to keep some of the rows, table referenced by foreign keys, replicated), then deleting large amounts of rows in batches is essential.

    You're right, Gail. Sorry for the confusion.

    I actually rarely ever truncate any tables in production. However, there have been more than a handful of occasions where someone has kicked off a script to delete 10 million rows from a 100 million row table prior to leaving out of the day, and then I get a 2AM call from operations asking for advice on how to get the database back online again after rebooting failed to cancel the rollback. That disaster is fresh in memory, and I guess I'm still in recovery.

    However, if one needs to delete out all rows in table, then like you said, a simple TRUNCATE is the way to go, and there is no comparison to what happens with a mass delete operation.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • mike.gallamore (3/18/2014)


    Also truncate vs delete: if you think you might ever want to roll back this operation use delete, it will be sent to the transaction log so you can roll it back vs truncate which won't.

    Damn this is a persistent myth.

    Truncate is logged. Truncate is a fully logged operation. Truncate can be rolled back just the same as delete can. There are NO unlogged operations in user databases in SQL Server.

    Trivial to test (and left as an exercise to the reader)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Well there is logged and then there is logged. I seem to recall from by db courses and anywhere I've read online: ex. http://www.mssqltips.com/sqlservertip/1080/deleting-data-in-sql-server-with-truncate-vs-delete-commands/ that the deletion of the pages is logged (and probably the statement itself). But the rows that were on those pages are supposedly non-recoverable. You probably could do things with a powerful tool like restore those data pages to what they were before in a new database then just backup the rows on those pages, go back to your live database and "restore" those rows letting sql figure out what new pages those rows need to map to for you. But after the truncate my understanding is that you can't just play back the transaction logs to get back to where you were because those pages might not exist anymore or might exist with new data etc.

  • mike.gallamore (3/18/2014)


    Well there is logged and then there is logged. I seem to recall from by db courses and anywhere I've read online: ex. http://www.mssqltips.com/sqlservertip/1080/deleting-data-in-sql-server-with-truncate-vs-delete-commands/ that the deletion of the pages is logged (and probably the statement itself). But the rows that were on those pages are supposedly non-recoverable. You probably could do things with a powerful tool like restore those data pages to what they were before in a new database then just backup the rows on those pages, go back to your live database and "restore" those rows letting sql figure out what new pages those rows need to map to for you. But after the truncate my understanding is that you can't just play back the transaction logs to get back to where you were because those pages might not exist anymore or might exist with new data etc.

    You might want to read Kalen's bolg on this. I think all your concerns are explained in there and in the Comments section.

  • That doesn't really clear it up for me. She talks about rolling back the same transaction that the truncate was done in. There are exclusive locks: "and just like all X locks, they are held until the end of the transaction" . So if you use a fire and forget query in SSMS or whatever to truncate your "not needed" table, or simply think you're happy and commit the transaction then a week later want to restore the db (or a subset of the data that was in the table) what do you do? It isn't clear to me that you aren't then pouched.

Viewing 15 posts - 31 through 45 (of 47 total)

You must be logged in to reply to this topic. Login to reply