insert into table taking forever....

  • Hi,

    I have a table with 7,238,368 rows in it and running an insert into another table.

    I'm selecting an explicit list of columns from the large table, and the destination table is truncated prior to the insert.

    This runs as an overnight process each night, but lately it's been running very, very slowly, and this insert can take hours.

    I know that 7+ million rows are being moved, but it shouldn't take that long.

    The database is sized correctly.

    When I run SELECT COUNT(*) FROM TABLE_NAME NOLOCK during the insert, it just returns zero. What is SQL Server doing here, evaluating the size of the data? I'm not sure.

    Any help welcome.

    Eamon

    It would be a bit more helpful if you posted the code so we can see what you did wrong. For example, in the SELECT you posted in your message, the way you used NOLOCK is incorrect.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

    As Jeff stated, more info would help, but a few things pop out. First, does the table you are inserting into already have records in it? If so, how many? Is it indexed? If so, what are the indexes on it? What does the table structure look like, etc.?

    You are correct that 7 million rows shouldn't take very long, but on a heavily indexed table, especially one with a clustered index with little free space to work with, performance can drop drastically. If you're inserting into an empty table, you might also consider a SELECT INTO, especially if you are using the Simple or Bulk-Logged Recovery Model, assuming you are using SQL Server 7.0 (SP1) or later (which you are if you can even select the Simple Recovery Model). There are some potential "gotchas", such as computed columns becoming physical columns, etc., but it will be damned fast.

    There is much debate over whether to use SELECT INTO or not, but I personally feel a lot of it is leftover from the 6.5 and early 7.0 days, when it caused all sorts of locking problems. I still use CREATE TABLE/INSERT when performance isn't an issue, as it makes for more self-documenting code (you see the table structure at a glance instead of having to go look it up), but when performance is paramount, SELECT INTO is what I use.

    If the table into which you are inserting is empty prior to the insert, and if your SELECT Count(*) is running against that table, I'm not surprised that it reports 0 records. EM also shows 0 records during an insert until the insert is complete.

    Edit: I just reread, and noticed that you answered the question about whether the destination is populated or not. My indexing questions still stand, as well as my advice to at least try SELECT INTO as an option. You'll often find that a SELECT INTO/CREATE INDEX query will outperform an INSERT/SELECT into an already created table.
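
    To make that concrete, here is a minimal sketch of the SELECT INTO approach; the table and column names are invented placeholders, not the actual schema from this thread:

    -- SELECT INTO creates the target table on the fly; it must not already exist.
    -- Under the Simple or Bulk-Logged recovery model the load can be minimally logged.
    -- Computed columns in the source become ordinary physical columns in the copy.
    SELECT Col1, Col2, Col3
    INTO dbo.NewTable
    FROM dbo.SourceTable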

  • Can you run it in some batches? I like David's thoughts above, but I wonder whether 1,000, 10,000, or 100,000 rows complete in a reasonable time.

    Has the data size changed lately?
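
    Purely as a sketch of one way to batch it (the key column and batch size below are assumptions, not taken from the thread):

    -- Copies 100,000 rows per pass until nothing is left to copy.
    -- Assumes the source has a unique key (KeyCol here is hypothetical) so
    -- rows already copied can be skipped on the next pass.
    DECLARE @Rows int
    SET @Rows = 1

    WHILE @Rows > 0
    BEGIN
        INSERT INTO dbo.DestTable (KeyCol, Col1, Col2)
        SELECT TOP 100000 s.KeyCol, s.Col1, s.Col2
        FROM dbo.SourceTable s
        WHERE NOT EXISTS (SELECT 1 FROM dbo.DestTable d WHERE d.KeyCol = s.KeyCol)

        SET @Rows = @@ROWCOUNT
    END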

  • The real killer would be the presence of a clustered index or a great many non-clustered indexes.  Unless you use SELECT/INTO, the Simple Recovery Mode won't be of much help and, even then, the presence of indexes during the insert will likely blow any chances for high speed non-logged performance.

    Any triggers present could also pose a significant problem...

    Still, I'd like to see the code for this as well as the full schema for the table including indexes and triggers.  The fact that the table is being truncated prior to the insert has no bearing on speed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi,

    the destination table gets truncated before the INSERT happens, and yes, it does have a clustered index and several non-clustered indexes. It took almost one hour to run.

    I can't copy the actual command in as I type this, but it's something like:

    INSERT INTO DestTable
    SELECT Col1, Col2, ..., Col50 --yes, there are about 50 columns
    FROM SourceTable NOLOCK

    I think Jeff is correct and SELECT INTO would work best (followed by creating the indexes afterwards?)

    Question: does an INSERT statement also evaluate the actual data being inserted for invalid data, or does it just compare the corresponding column types? I think it's the second one, right?

    Anyway, it worked, and because it happened to be a UK tax year-end report, tomorrow's run will only have about 5,000 rows to insert because the new tax year starts!!

    Eamon

    PS... and what's wrong with my NOLOCK?

  • 50 columns...

    Have you done an estimate of the amount of data to be passed?

    What about an estimate of the transaction information to be recorded in the log file?

    Try a SELECT TOP 10000, then 100,000. It will give you an idea of how bad this idea really is.

    > PS... and what's wrong with my NOLOCK?

    Syntax.

    It shows that what you posted is not actually the query you run.

    _____________
    Code for TallyGenerator

  • Actually, that nolock syntax does work.

    Here's an odd idea, may help, may not. After the truncate, update the statistics of the table you've truncated. Truncate does not affect the rowmodcnt, and hence won't trigger a stats update.

    I would suggest that you drop the nonclustered indexes before the insert and recreate them after. It will be much quicker than trying to insert with several NC indexes in place.
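
    As a rough sketch of that sequence (the table, column, and index names are made up for illustration):

    TRUNCATE TABLE dbo.DestTable

    -- TRUNCATE doesn't bump the row-modification counter, so refresh stats by hand.
    UPDATE STATISTICS dbo.DestTable

    -- Drop each nonclustered index before the load...
    -- (older table.index syntax; newer versions also accept DROP INDEX ... ON ...)
    DROP INDEX DestTable.IX_DestTable_Col2

    INSERT INTO dbo.DestTable (Col1, Col2)
    SELECT Col1, Col2
    FROM dbo.SourceTable WITH (NOLOCK)

    -- ...and recreate them once the insert is done.
    CREATE NONCLUSTERED INDEX IX_DestTable_Col2 ON dbo.DestTable (Col2)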

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • That syntax does not work.

    It just assigns the alias "NOLOCK" to the table.

    _____________
    Code for TallyGenerator

  • What's wrong with your NOLOCK is just what Sergiy said... because it's not in parentheses, the word NOLOCK is being used as a table alias and not as a table optimizer hint. Get in the habit of typing WITH (NOLOCK) so that if you ever forget the parentheses again, you will get an error and can find your mistake.
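
    In other words, using the placeholder names from the query posted above:

    -- NOLOCK here is parsed as a table alias, not a hint:
    SELECT Col1 FROM SourceTable NOLOCK

    -- This actually applies the locking hint:
    SELECT Col1 FROM SourceTable WITH (NOLOCK)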

    Although I'll likely stir up some folks who have a problem with the brief locking of a system table, SELECT/INTO would be the way to go for the highest speed, but only if you have the SIMPLE recovery model active. If it isn't, it's best not to change it just for this.

    Otherwise, the best bet is to simply drop the table, recreate it, load it, then apply the clustered index and the other indexes.  You should be able to load at least a million rows a minute depending, of course, on how complicated the joins on the INSERT/SELECT are.

    Still don't understand why you don't post the code, but whatever.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Yeah, but it's not going to throw a syntax error. It just won't do what he's expecting, which he probably didn't notice.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • I agree about the indexes. I had a nightly batch that did something like 14 million records (75 fields) per night and it also took forever. My suggestions:

    Remove the indexes before the load. The reason for this is that when you leave indexes on the table, it has to update the clustered and non-clustered indexes for every record. Think about it logically: you have disk I/O for the data and then additional disk I/O to update your indexes.

    Open Perfmon and look at the current disk queue length for the physical drive holding your database file. Your count should be something like 2 for every 1 spindle (check the description in Perfmon); you will probably notice that your current disk queue length is around 100 - 300, which means your hard drive can't keep up.

    Then look at your estimated execution plan: what is your total cost? Maybe PM me your execution plan.

    Last but not least, from what I have done in the same scenario: if at all possible, spend the time to populate your destination table incrementally, meaning just the new data and not all the same data every night (see the sketch just below).

    Hope this helps or gives you some ideas and guidance.
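
    A bare-bones illustration of the incremental idea; the CreatedDate column is hypothetical, and any column that reliably identifies new rows would do:

    -- Only copy rows added since the previous night's load instead of reloading everything.
    INSERT INTO dbo.DestTable (Col1, Col2)
    SELECT s.Col1, s.Col2
    FROM dbo.SourceTable s
    WHERE s.CreatedDate >= DATEADD(day, -1, GETDATE())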

  • Run DBCC SHOWCONTIG WITH ALL_INDEXES. I bet that table is heavily fragmented.
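
    For example, pointed at the destination table (the name is a stand-in):

    -- SQL 2000-era syntax; later versions expose the same information
    -- through sys.dm_db_index_physical_stats.
    DBCC SHOWCONTIG ('dbo.DestTable') WITH ALL_INDEXES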

  • Eamon,

    Drop the target table (takes indexes with it) and do a SELECT collist INTO tgttable FROM srctable WITH (NOLOCK).  Then, reapply your indexes.  If it takes more than 10 minutes to copy the 7 million rows (should take more like 7 minutes), there is something really wrong with your server, server configuration, or the disk system.
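
    Sketched out end to end (all object and index names here are placeholders for whatever the real schema uses):

    -- Dropping the table removes its indexes along with it.
    IF OBJECT_ID('dbo.DestTable') IS NOT NULL
        DROP TABLE dbo.DestTable

    -- Recreate and load it in one step (minimally logged under Simple or Bulk-Logged recovery).
    SELECT Col1, Col2
    INTO dbo.DestTable
    FROM dbo.SourceTable WITH (NOLOCK)

    -- Reapply the indexes only after the data is in.
    CREATE CLUSTERED INDEX CIX_DestTable_Col1 ON dbo.DestTable (Col1)
    CREATE NONCLUSTERED INDEX IX_DestTable_Col2 ON dbo.DestTable (Col2)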

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions
