SQL Select for 70 Million records

  • Hi,

    We have two servers. Server A is the source server; it has a table with 5 columns (2 int, 2 varchar(50), 1 datetime) and 70 million records in it. Daily, an SSIS package on Server B reads the table and inserts the rows into a destination table on Server B.

    Due to business logic I am unable to explain the reasons for not using a lookup or other such options right now. Currently the destination table is truncated daily before the load. "Select Col1, Col2, Col3, Col4, Col5 from my70million" is the SQL statement in the SSIS source SQL task.

    The query used to take about 25 minutes to complete; sometimes it takes 35-45 minutes, which might be because other processes run in parallel. How can I reduce the time consumption and keep the runtime consistent across days, with only a minimal increase as the record count grows?

    Any suggestions to reduce the time consumption would be appreciated. There is no WHERE clause or any joins on the table.

  • Hardware.

    Since you're not in any way filtering data, the only solution is great big giant hardware. Spend lots of money and this will go faster.

    Now, you could look at breaking the query down into chunks, processing, say, 100,000 rows at a time. You may find this is faster. It will use less transaction log space, allow for a little less locking and blocking, and use fewer resources overall for each transaction. All of that may result in a performance improvement.
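    Purely as an illustration of that chunking pattern (none of this is from the original reply): a minimal T-SQL sketch, assuming Col1 is an indexed, ascending integer key and a hypothetical destination table dbo.DestTable. In SSIS you would parameterize the source query the same way.

        -- Hypothetical sketch: copy the table in 100,000-row key ranges
        -- instead of one 70-million-row scan. Assumes Col1 is an indexed int key.
        DECLARE @LastId int = 0,
                @BatchSize int = 100000;

        WHILE 1 = 1
        BEGIN
            DECLARE @MaxId int;

            -- Find the upper key boundary of the next batch.
            SELECT @MaxId = MAX(Col1)
            FROM (SELECT TOP (@BatchSize) Col1
                  FROM dbo.my70million
                  WHERE Col1 > @LastId
                  ORDER BY Col1) AS batch;

            IF @MaxId IS NULL BREAK;  -- nothing left to copy

            -- Each batch is its own small transaction.
            INSERT INTO dbo.DestTable (Col1, Col2, Col3, Col4, Col5)
            SELECT Col1, Col2, Col3, Col4, Col5
            FROM dbo.my70million
            WHERE Col1 > @LastId AND Col1 <= @MaxId;

            SET @LastId = @MaxId;
        END;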

    Otherwise, spend money.

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • As it's a full table scan, there's not much you can do.

    You could read this and then experiment with the DefaultBuffer settings (DefaultBufferMaxRows and DefaultBufferSize).

    Definitely use Fast Load and destination table lock.

    If there are constraints on the target table, or a PK or indexes, consider removing them before doing the load.

    Fine-tune the window and perform the run when systems are not busy with other tasks.
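    As a hedged illustration of the index suggestion above (the table and index names are hypothetical, not from this reply): disable nonclustered indexes on the destination before the load, then rebuild them once it completes.

        -- Hypothetical: disable nonclustered indexes before the bulk load.
        -- Leave any clustered index enabled, or the table becomes inaccessible.
        ALTER INDEX IX_DestTable_Col2 ON dbo.DestTable DISABLE;

        -- ... run the SSIS load here ...

        -- Rebuild afterwards; REBUILD also re-enables disabled indexes.
        ALTER INDEX ALL ON dbo.DestTable REBUILD;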

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • 1. If the varchar columns repeat values frequently, encode the varchar(50) columns as ints, using a separate encoding/lookup table (see the sketch after this list).  This would be, by far, the biggest performance improvement.

    2. Page compress the table on the originating server, if it saves significant space (it wouldn't save much after encoding, but until then it could).  You can check this using EXEC sys.sp_estimate_data_compression_savings.  Check whether compressing the table on the destination server helps or not.

    3. Use triggers to mark adds / changes / deletes on the original table so you don't have to copy the whole table every time.  The triggers need to be very efficient.  Don't try to do this without triggers, since it's too easy to miss modifications.  This assumes you don't have processes that load data in a way that avoids firing triggers.

    Edit: Any bulk inserts that bypass the triggers will have to make sure they themselves populate the columns that mark modifications.
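    Purely as an illustration of points 1 and 2 (the table and column names are hypothetical, not from the post): keep each distinct varchar value once in a lookup table and ship only its int surrogate; the compression estimate uses the real system procedure sys.sp_estimate_data_compression_savings.

        -- Hypothetical encoding/lookup table for one of the varchar(50) columns.
        CREATE TABLE dbo.Col3Lookup
        (
            Col3Id  int IDENTITY(1,1) NOT NULL PRIMARY KEY,
            Col3Val varchar(50) NOT NULL UNIQUE
        );

        -- Encode any values not seen before.
        INSERT INTO dbo.Col3Lookup (Col3Val)
        SELECT DISTINCT s.Col3
        FROM dbo.my70million AS s
        WHERE NOT EXISTS (SELECT 1 FROM dbo.Col3Lookup AS l
                          WHERE l.Col3Val = s.Col3);

        -- Transfer 4-byte ints instead of up-to-50-byte strings;
        -- decode on the destination with a join when needed.
        SELECT s.Col1, s.Col2, l.Col3Id, s.Col4, s.Col5
        FROM dbo.my70million AS s
        JOIN dbo.Col3Lookup AS l
          ON l.Col3Val = s.Col3;

        -- Point 2: estimate PAGE compression savings on the source table.
        EXEC sys.sp_estimate_data_compression_savings
             @schema_name = 'dbo',
             @object_name = 'my70million',
             @index_id = NULL,
             @partition_number = NULL,
             @data_compression = 'PAGE';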

    SQL DBA, SQL Server MVP ('07, '08, '09)
    "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • In the OLE DB destination, change the commit and batch sizes from the default values.  The default values essentially send all 70 million rows as a single transaction, which then has to be committed in full after all rows have been loaded.
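    For reference (the values below are illustrative starting points, not recommendations from this reply), the two fast-load settings in question appear in the OLE DB Destination editor as:

        OLE DB Destination (Data access mode: Table or view - fast load)
            Rows per batch:              100000
            Maximum insert commit size:  100000
            -- The default commit size (2147483647) effectively commits
            -- the whole load as one transaction.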

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks Phil. As suggested by you, I checked: the buffer settings have already been increased by 50% over the defaults, the fast load option is already enabled, and a table lock on the destination table is already in place. Are there any other options to look into besides the link you shared? As far as I can see, most of the measures are already in place.

  • yuvipoy wrote:

    Thanks Phil. As suggested by you, I checked: the buffer settings have already been increased by 50% over the defaults, the fast load option is already enabled, and a table lock on the destination table is already in place. Are there any other options to look into besides the link you shared? As far as I can see, most of the measures are already in place.

    The advice given by Jeffrey Williams is also worth a try.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.
