Setting a value for an 11,000,000-row table

  • Good morning all!

    I have created a new column:

    ALTER TABLE table1 ADD column1 int DEFAULT 0 NULL

    So now it has 11,000,000 rows with a NULL value. I need to replace the NULLs with 0.

    Here is what I tried already:

    1) ALTER TABLE table1 ADD column1 int DEFAULT 0 NOT NULL

    2) UPDATE table1 SET column1 = 0
    WHERE column1 IS NULL

    I waited for 3 hours and the transaction log grew to 3 GB, so I had to cancel it.

    Is there another way to do it, something like a non-logged operation?

    Thank you in advance.

  • You could break it up into batches.

    Something like setting your record count to some number, say 5 or 10 thousand, and looping until you get them all (see the sketch below). You could even set up a scheduled job to run the loop every 5 or 10 minutes. If you keep the same WHERE column1 IS NULL criterion and monitor how long each run takes, then once the job starts finishing very quickly you will know it is done. Done in batches like this, it is much less likely to block processes hitting the table, or at least will only block them briefly rather than for hours. Also, by monitoring the job's duration like that, you don't have to keep checking the actual records, which would of course tie up the table even more.
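
    A minimal sketch of that batching loop, assuming SQL Server 2000's SET ROWCOUNT to cap each pass (the batch size of 10,000 is just an example; the table and column names are the ones from this thread):

    -- Repeat small updates until no NULL rows remain
    SET ROWCOUNT 10000
    WHILE 1 = 1
    BEGIN
        UPDATE table1 SET column1 = 0 WHERE column1 IS NULL
        IF @@ROWCOUNT = 0 BREAK   -- nothing left to update
    END
    SET ROWCOUNT 0   -- reset the limit

    Each pass commits on its own, so the log space can be reused between batches (especially under the SIMPLE recovery model) instead of having to hold the whole 11,000,000-row change at once.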


    If you aren't bothered about recovery, the fastest way to do it will be to use SELECT INTO, because SELECT INTO is minimally logged (under the simple or bulk-logged recovery model).

    e.g.

    SELECT 0 AS column1, column2, column3
    INTO newtable1 FROM table1

    Then drop table1 and use sp_rename to rename newtable1 to table1 (the full sequence is sketched below).

    This method gets more complicated if there are any foreign key constraints involving table1.

    I have used this technique often, and it is many times quicker than using UPDATE.
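
    A sketch of the whole swap, with my assumptions flagged: newtable1 is just an illustrative name, and any indexes, constraints, and permissions on table1 would need to be re-created on the new table afterwards:

    SELECT 0 AS column1, column2, column3
    INTO newtable1 FROM table1

    DROP TABLE table1
    EXEC sp_rename 'newtable1', 'table1'

    -- re-create indexes, constraints, and permissions here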

  • Thanks again!!!!!!

    Will try that on the weekend.

    The key problem is the UPDATE statement in your approach. Instead, drop the column with an ALTER statement, then add it back with

    ALTER TABLE table1 ADD column1 int DEFAULT 0 NOT NULL

    This should be fairly quick and painless, with no transaction log growth.
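
    A sketch of that drop-and-re-add, with one assumption flagged: the DEFAULT added earlier created a constraint whose actual (system-generated) name you have to look up first; DF_table1_column1 below is a hypothetical stand-in.

    -- find the real default-constraint name with: sp_help table1
    ALTER TABLE table1 DROP CONSTRAINT DF_table1_column1
    ALTER TABLE table1 DROP COLUMN column1
    ALTER TABLE table1 ADD column1 int DEFAULT 0 NOT NULL

    Because the column is re-added as NOT NULL with a default, SQL Server fills the existing rows with 0 as part of the ALTER.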

    Tried that already, same picture. Waited for 3 hours, lost my patience and canceled this madness :)

    So I'm going to try what was suggested above.

    Thanks anyway!

    Wow, I have several tables with 7,000,000 rows where I had to add a column. I didn't see this happen in my case; it took more like 15 minutes. Oh well, hopefully the other approach will fix it for you.

    The lesson here is that NULL plus DEFAULT should be a trigger that you have an invalid table definition. If you are supplying a default, then the column should be NOT NULL.

    You could try changing your DB recovery model to SIMPLE, making your change, then changing it back to FULL. Make backups before and after. (A sketch follows.)
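
    A sketch of that switch, assuming SQL Server 2000 syntax; mydb and the backup paths are placeholders:

    BACKUP DATABASE mydb TO DISK = 'D:\backups\mydb_before.bak'   -- backup beforehand

    ALTER DATABASE mydb SET RECOVERY SIMPLE
    -- ... run the update or table rebuild here ...
    ALTER DATABASE mydb SET RECOVERY FULL

    BACKUP DATABASE mydb TO DISK = 'D:\backups\mydb_after.bak'    -- restart the log backup chain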

    Even then, the time it takes will still be the same. I think the other method is still going to take a while, but you can do it in spurts that you are comfortable with.

    I thought the time involved was very excessive for such a small update as well, and decided there had to be other factors slowing it down. I was figuring foreign keys or constraints, or maybe no indexes, or possibly even an I/O-bound system. That's why I suggested batching it rather than rebuilding the table. I was also thinking of it from a production perspective, where you would want to keep the table available while the change was occurring.

    With the transaction log growing to 3 GB during this update, it seems that there is much more happening than a simple update from a NULL value to a 0, even for twice that number of rows. Notice that when Antares686 suggested the drop and add back with NOT NULL, slava_davidchuk says he attempted that with the same symptoms. That's the most efficient method of accomplishing this that there is, yet it still took the same amount of time. Obviously, something else is going on. I'm betting on foreign key constraint checks, or on this column being part of a composite index.

    When this update is running, what does Enterprise Manager show for the process? Are there any associated waits being reported? What is the table structure being dealt with: PKs, FKs, indexes, etc.? (A sketch for checking the waits directly follows this post.)

    I still think that batching the updates is your best generic solution, providing the most concurrency and the least impact on your database while accomplishing the job, but it would be interesting to know what caused your situation, so that a better solution could be worked out. Possibly something as simple as dropping an FK for the duration of your rebuild.
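
    If Enterprise Manager isn't showing enough detail, here is a sketch of querying the system tables directly, assuming SQL Server 2000's sysprocesses (55 is a stand-in for the spid actually running the UPDATE):

    SELECT spid, status, blocked, waittype, lastwaittype, waitresource, cmd
    FROM master..sysprocesses
    WHERE spid = 55   -- replace with the spid of the UPDATE

    A nonzero blocked value or a long-lived lastwaittype would point at locking or I/O rather than the update itself.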

    No FK involved, and the column is not part of a composite index. Simple table: about 20 columns, a clustered index, and 2 nonclustered. So here is what happened: the first time I updated only 50 rows (took 1 sec), the second time 1,000 rows (took 1 sec), then 1,000,000 rows (took 3 sec). Updating the remaining 10,000,000 rows took about 2-3 min. I think it has something to do with the query optimizer...

    What do you think?

    I think SQL pulls some odd crap sometimes. Adding the column with a default of 0 via ALTER should have been just as quick, so obviously it does something else besides what we tell it.

    So, all together, going that route it took about 4 minutes? That seems MUCH more in line with what I would expect. Did the transaction log still show the huge amount of growth? Did you change recovery to SIMPLE for the update? I still don't see the logical reason for the issue you started with; without knowing why, it's likely to affect you again.

    The trans log grew to 1.2 GB.

    Both before and this last time, no transactions were waiting on the update. My fault, I forgot to mention the recovery model was changed to SIMPLE...

    How can that affect it? Isn't the update one big transaction?

    Updates are bad. If you are altering large amounts of data, renaming the table, creating a new one, and then copying the data back in is the quickest way to do it. You also end up with a less fragmented table (hopefully) than by using updates, where page splits occur.

    In 6.5 the difference was ridiculous: updates would take hours, inserts would take seconds.

    Simon Sabin
    SQL Server MVP
    Co-author of SQL Server 2000 XML Distilled
    http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
    http://sqlblogcasts.com/blogs/simons
