changing data type

  • I have  a database table with 15 million rows of data. In this table one of the columns datatype is float,

    but now I want to change it to decimal datatype . what is the most efficient way of achieving this. can I do this change in enterprise manager or Quesry analyzer would be good.

    Thanks.

  • Learn to use QA, EM more often than not just copies the table with the required changes and reinsert all the data. However a script in QA will make the server do the minimum work.

    ALTER TABLE dbo.TableName

    ALTER COLUMN X Decimal(18,4) not null

  • Enterprise Manager will use a sytax like

    BEGIN TRANSACTION

    CREATE TABLE dbo.Tmp_MyTableName

     (

     MyTableName numeric(38, 16) NOT NULL)  ON [PRIMARY]

    GO

    IF EXISTS(SELECT * FROM dbo.MyTableName)

      EXEC('INSERT INTO dbo.Tmp_MyTableName (MyFieldName)

      SELECT CONVERT(numeric(38, 16), MyFieldName) FROM dbo.MyTableName TABLOCKX')

    GO

    DROP TABLE dbo.MyTableName

    GO

    EXECUTE sp_rename N'dbo.Tmp_MyTableName', N'MyTableName', 'OBJECT'

    GO

    COMMIT

    This is creating a new table and insert data (15 million rows forget it).

    I would use alter table in Query analyzer.

    Damn - I was creating this syntax in EM and 8 min late

    Regards,
    gova

  • Thanx for completing my point. One more thing is that you can save such scripts for documentation and rerun or roll them back when needed.

  • If you have sufficient disk space, copying the data to build a new table may run much faster than updating the existing table.  Obviously it is a lot more work, but a 15-million row table may take forever to update.

    Decimal(18,4) will require 9 bytes for storage vs 8 for Float.  Will this cause page splits?  Probably not with only a one byte difference, unless the rows are small and there are many rows in a page.  A type change likely to cause page splits should almost certainly be done by copying to new table.

    If you want to copy instead of ALTER but do it in QA, you can enter the change in EM and copy the script it generates to QA.  The script will have all the code for dealing with indexes, foreign keys, extended properties, etc.  It has to be run carefully because it drops the table whether or not the data is copied successfully.  I either run it in two parts, or put a test like IF (SELECT COUNT(*) FROM Original) = (SELECT COUNT(*) FROM New) in it.  Or change the DROP TABLE to a rename, and drop it manually after verification.

  • If you have sufficient disk space, copying the data to build a new table may run much faster than updating the existing table.

    That is a very BIG may

    1. When you have triggers on a table I have to give you the bad news that Triggers have to changed manually. Alter perform without any problems

    2. If you have permissions restricted on that table The creation of the new one better take care of that before and after or those will be lost. Alter perform just fine

    3. Allocation time for the WHOLE dataset is much more likely to increase file size with alter you may get some pages splitted but not that big of a deal when the sizes are that close

    4. All indexes, foreign keys, and constraint will have to be recreated. Alter perform just fine 

    5. I never recommed EM over QA to perform such actions becuse you are not in control of what is happeneing 

    6. Why risk that many things when you know that the safest and recommended way is ALTER

     


    * Noel

  • Anyways.. how about decimal (16, 4) > 8 for 8 .

    Actually what size is actually needed for that task????

    I'd defnitly go with QA in this case. I don't see em's dance outperform the alter in this case.

  • That's a very tiny MAY, in my experience.  I can INSERT/SELECT 15 million rows in minutes on most of the servers I work with, but I have seen ALTER take hours on a large table.  ALTER has to generate at least twice as much log activity to update every row.  I may be slightly paranoid (not a bad thing in a DBA), but I like having a saved copy of the original table in case the data type change was a bad idea.

    Yes, you have to make sure you get all permissions, triggers, indexes, keys, and so on, but that is why I use EM to create the original script.

  • Yes, you have to make sure you get all permissions, triggers, indexes, keys, and so on, but that is why I use EM to create the original script.

    And the rebuild of all those is tiny ?

    Yes, when you have alot of activity on the table that you are altering the table is locked  and sql has to contend with other connections trying to access it none recomended to do it a peak hours though

     I don't know anything about your server but something is definitely wron whe creating a FULL 15 millon rows + ALL INDEXES + ETC is FASTER that altering from float to decimal

    I don't think Alter generates twice as much activity either what is more when you change a varchar(x) to varchar(X+y) NOTHING happens from allocation standpoint.

    So every case is different but no way this is one for that !!

     


    * Noel

  • Oh and one more thing

    When you said: I can INSERT/SELECT 15 million rows in minutes ...

    I hope you meant :  I can SELECT INTO ....

    The amount of logging on INSERT/SELECT is VERY HIGH when compared with SELECT INTO

     


    * Noel

  • You're welcome to stop by any time and troubleshoot my server configuration.

    Try to remember that I didn't say to never use ALTER TABLE.  I said that ALTER may have a worst-case behavior you should consider before deciding what to do.  Any action on tables that large should cause you to consider risks and alternatives that you wouldn't waste time on for smaller tables.  I also put a higher priority on risk avoidance than on maximum performance for something like a schema change.

    I didn't mean SELECT INTO because the whole point was to precisely define the new table.  I suppose SELECT INTO would work (with a CAST on the redefined column), but I'm not positive all table properties (identity, defaults, etc) would be recreated.  My preferred method is to take the EM-generated script and make a few changes to make it more robust.  SELECT INTO would require a little more thought and experiment.

    An ALTER that doesn't increase the size of the rows is probably going to run in almost no time.  With a one-byte row size difference, that's probably what will happen in this case.  An ALTER that increases the row size and incurs page splits can take forever.  I don't like the downside risk.  (If the float is the only column in the table you can get 578 rows per page, changing to dec(18,4) can only fit 539 rows per page and there will be page splits.  I realize it is extremely unlikely we're discussing a one-column table, just thought I'd mention it.)

    The copy method is more predictable.  (If you don't have the disk space I predict it won't work.)  The table copy and index recreation will be much faster than a worst-case ALTER.  I like having the opportunity to look over the script EM generates to make sure I haven't overlooked some other property of the table that could be impacted.  I don't have every index, constraint, and permission in every database memorized.  I also like modifying the script to rename and save the original table while the effects of the ALTER are tested.

    If you're certain that your intended action won't be a worst-case, then ALTER away with my blessing.  If the ALTER is safe, you might waste a few minutes doing the copy instead.  But if ALTER bogs down, you might waste an indeterminate abount of time.  Or you might decide you've waited long enough, kill the ALTER, and wait even longer for the rollback.  ALTER may be quicker 99.9% of the time, but I'm paranoid enough to worry about the 0.1% chance that I'll ruin my afternoon.

  • That's exactly why we test our script on a recent backup of the server to make sure nothing screws up... but thanx for the feedback, I never thaught of those situations.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply