Insert slowing down

  • Hi all,

    I am looking for some professional opinions here with a slight annoyance I'm facing.

    I program for a company that imports quite a bit of data in a week's time (up to 4 million rows).

    Some background:

    The data is imported from a data file into the database using a custom-designed software tool;

    this tool encrypts certain data portions before saving the data to the database.

    Up to 9 tables and up to 100 columns in total may be affected, and up to 400k rows (the limit)

    may be imported per sitting.

    My problem is this:

    The import starts out fine at a relatively good speed (encryption slows it down), but it then slows, and a run can

    take up to 1 hour 40 minutes (give or take a few minutes). Yes, this is a time-sensitive process.

    The data is alphanumeric and varies in length and complexity. Encryption and data formatting are

    handled by the client PC; only the saving of the data is done by the server.

    And yes I've tried this with and without indexes, just so you know...

    Any thoughts?

    Thanks,

    Leonard

    PS: Quite the mouthful 🙂

  • You need to specify a lot more detail.

    You say that the processing starts fast and then slows down. How do you know this?

    Is the bottleneck on the client or the server? Is the bottleneck CPU or disk IO?

    Exactly what technology are you using in the client program? Is it C#, C++, VB.NET, VB6, or something else? Could you show us the code that is performing the insert?

  • Well, let's elaborate on the situation:

    Technologies used:

    The server is a quad Xeon 2.45 GHz (or something like that), 16 GB RAM, 1.5 TB of total storage divided into 4 drives: 1 for Windows, 1 for the mdf, 1 for the ldf, etc. MS Windows Server 2003 R2 64-bit, SQL Server 2005 64-bit.

    Development software is VB.NET 2003 with a direct link to the server, i.e. a closed network with approximately 15~20 concurrent users.

    Hope that helps a bit to start off with.

    Speed is determined by the number of rows read and inserted per minute; this includes reading from the file, formatting, encryption (certain items only) and then saving to the DB. It generally starts at 134~180 entries/minute, then falls to 40~66 entries/min.

    I am unable to determine the bottleneck point; as far as I can see there is none.

    Hope this helps...

  • lb_henning (8/3/2010)


    Well, let's elaborate on the situation:

    Technologies used:

    The server is a quad Xeon 2.45 GHz (or something like that), 16 GB RAM, 1.5 TB of total storage divided into 4 drives: 1 for Windows, 1 for the mdf, 1 for the ldf, etc. MS Windows Server 2003 R2 64-bit, SQL Server 2005 64-bit.

    Development software is VB.NET 2003 with a direct link to the server, i.e. a closed network with approximately 15~20 concurrent users.

    Hope that helps a bit to start off with.

    Speed is determined by the number of rows read and inserted per minute; this includes reading from the file, formatting, encryption (certain items only) and then saving to the DB. It generally starts at 134~180 entries/minute, then falls to 40~66 entries/min.

    I am unable to determine the bottleneck point; as far as I can see there is none.

    Hope this helps...

    You first need to determine if it is the client program that is slowing down or if it is the database.

    I suspect that it is the client program that is doing something bad.

    Start by running the SQL profiler to see exactly what the client program is doing. Compare durations of insert statements at the beginning of the run with durations at the end of the run.

    If you see that the insert statements are actually slowing down, you can continue diagnosing what is happening at the database level. If each insert statement still takes the same amount of time, you need to look closely at what the client program is doing.
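    If the Profiler GUI is too heavy to leave running, a lightweight server-side trace captures the same information. This is only a sketch (the file path and the 50 MB size cap are placeholders you would change):

    -- Event 12 = SQL:BatchCompleted; columns 1 = TextData, 12 = SPID, 13 = Duration, 14 = StartTime
    DECLARE @TraceID int, @on bit, @maxsize bigint
    SELECT @on = 1, @maxsize = 50
    EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Temp\import_trace', @maxsize
    EXEC sp_trace_setevent @TraceID, 12, 1, @on
    EXEC sp_trace_setevent @TraceID, 12, 12, @on
    EXEC sp_trace_setevent @TraceID, 12, 13, @on
    EXEC sp_trace_setevent @TraceID, 12, 14, @on
    EXEC sp_trace_setstatus @TraceID, 1   -- start; after the run, stop with status 0 and close with status 2

    -- Then compare early vs. late durations (Duration comes back in microseconds on SQL 2005):
    SELECT StartTime, Duration / 1000 AS Duration_ms, CAST(TextData AS nvarchar(max)) AS Batch
    FROM ::fn_trace_gettable('C:\Temp\import_trace.trc', DEFAULT)
    ORDER BY StartTime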

  • Makes sense, I'll take a look at script execution times.

    Thanks

  • I checked the script durations and it went like this:

    For the first 10 minutes the duration is 0 ms; at 10 minutes it climbs to 35 ms, at 20 minutes to 110 ms, and after 30 minutes it goes to 500 ms and slower.

    The scripts remain the same; it's just the values that change.

  • Sounds like you are doing this a row at a time? Can I ask why? Lots of things go bad when you do row-by-row processing. Can you batch the work up into a set-based operation?
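    For example, if the client bulk-loads the raw rows into a staging table first (all names here are hypothetical), the server-side insert collapses into one set-based statement per target table:

    -- client bulk-loads dbo.ImportStaging with the already formatted/encrypted rows, then:
    INSERT INTO dbo.TargetTable (Col1, Col2, Col3)
    SELECT s.Col1, s.Col2, s.Col3
    FROM dbo.ImportStaging AS s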

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • We did use batch processing originally, I can't remember exactly why we changed, but we found that the row-by-row approach worked more efficiently back then, when we were still using SQL 2000 (now SQL 2005 Std).

    We run two types of imports: the first uses alphanumeric data and the second (which is otherwise identical to this process) uses numeric data only. The latter takes 15 to 20 minutes max, and that includes encryption.

    So yes, this is row-by-row processing.

  • lbdw (8/4/2010)


    I checked the script durations and it went like this:

    For the first 10 minutes the duration is 0 ms; at 10 minutes it climbs to 35 ms, at 20 minutes to 110 ms, and after 30 minutes it goes to 500 ms and slower.

    The scripts remain the same; it's just the values that change.

    Please post an example of a script that is taking 500ms to execute.

    Also please post table definitions and indexes for any tables referenced in the script.

  • The script looks like this:

    INSERT INTO [OPR_SUPER_BOX] (SBox_No,Job_ID) VALUES ('000000000000000',152)

    INSERT INTO [OPR_BOX] (Box_No,Job_ID,SBox_No) VALUES ('000000000000000000000',152,'0000000000000000')

    INSERT INTO [OPR_BATCHES] (Batch_No,Box_No,Job_ID) VALUES ('00000000000000000000000','00000000000000000000000',152)

    INSERT INTO [OPR_SERIALS] (Serial,Job_ID,Batch_No,Exp_Date,Misc,SCID) VALUES ('0000000000000',152,'00000000000000000000000','','','')

    INSERT INTO [OPR_PINS] (Serial,Job_ID,Pin_Num,Pin_Set,Pin_Serial) VALUES ('000000000000',152,'0000000000000000000000000000000000000000000000000000000000000000',1,'000000000000')

    INSERT INTO [OPR_PINS] (Serial,Job_ID,Pin_Num,Pin_Set,Pin_Serial) VALUES ('000000000000',152,'0000000000000000000000000000000000000000000000000000000000000000',2,'000000000000')

    INSERT INTO [OPR_PINS] (Serial,Job_ID,Pin_Num,Pin_Set,Pin_Serial) VALUES ('000000000000',152,'0000000000000000000000000000000000000000000000000000000000000000',3,'000000000000')

    INSERT INTO [OPR_PINS] (Serial,Job_ID,Pin_Num,Pin_Set,Pin_Serial) VALUES ('000000000000',152,'0000000000000000000000000000000000000000000000000000000000000000',4,'000000000000')

    INSERT INTO [OPR_PINS] (Serial,Job_ID,Pin_Num,Pin_Set,Pin_Serial) VALUES ('000000000000',152,'0000000000000000000000000000000000000000000000000000000000000000',5,'000000000000')

    The zeros are placeholders for actual data, matching the real value lengths.

    There is an average of 2 to 3 indexes per table: one unique PK and 1~2 clustered indexes on the most-searched items.

    In this case the OPR_PINS table has PK_PINS (the PK) on Unique_ID [bigint], plus the index IX_OPR_PINS on Job_ID [int], Serial [varchar](50).

    No triggers are used and no additional stored procedures.

    That's it; the other tables are similar. BTW, I did a software check: a dummy import to local memory only took about 30 seconds.
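    Roughly, from memory, OPR_PINS looks something like this (exact column sizes and whether Unique_ID is an IDENTITY I'd have to double-check, so treat this as approximate):

    CREATE TABLE dbo.OPR_PINS (
        Unique_ID   bigint IDENTITY(1,1) NOT NULL CONSTRAINT PK_PINS PRIMARY KEY,
        Serial      varchar(50)  NOT NULL,
        Job_ID      int          NOT NULL,
        Pin_Num     varchar(100) NOT NULL,   -- holds the encrypted pin (about 64 chars in the sample above)
        Pin_Set     int          NOT NULL,
        Pin_Serial  varchar(50)  NOT NULL
    )
    CREATE INDEX IX_OPR_PINS ON dbo.OPR_PINS (Job_ID, Serial)

    If PK_PINS is the clustered index on the ever-increasing Unique_ID, new rows simply append at the end of the table, so the clustered insert itself shouldn't get slower over time.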

  • Have you tried running the load with PerfMon capturing counters? Check disk IO activity, waits, reads/writes per second, CPU usage, memory usage, and buffer cache hit ratio.

    Loading 4 million rows, at 400k per sitting (is that 400,000?), all in one batch is a lot of data. How many GB are you loading?

    I'd say you are starting to get a really big log file, possibly have a bottleneck in your RAM, and have a lot of TempDB activity.

    I recommend you build in some sort of batch control to load about 100,000 rows at a time and then do a commit. You would probably need to test which batch size gives the best results.
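    From the client the pattern would look something like this (the row counting really lives in the VB.NET code, and 100,000 is only a starting point to tune):

    BEGIN TRAN
    -- ... the import issues its INSERT statements as usual ...
    -- after every 100,000 rows:
    COMMIT TRAN
    BEGIN TRAN
    -- ... next 100,000 rows ...
    COMMIT TRAN   -- final commit when the file is finished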

    Cheers

    Leo

    Leo
    Nothing in life is ever so complicated that with a little work it can't be made more complicated.

  • Leo.Miller (8/5/2010)


    I recommend you build in some sort of batch control to load about 100,000 rows at a time and then do a commit. You would probably need to test which batch size gives the best results.

    I agree. Your problem could be caused by performing too many commits. If you do not explicitly use BEGIN TRAN and COMMIT, each insert statement will be its own transaction, which means that each insert has to wait until the data is actually written to the log disk.

    It is possible that everything runs fast at the start of the job and then slows down dramatically once the caches on the disk and disk controller are full.

    If this is the problem then performing fewer commits will probably fix it.

  • Thanks for the feedback guys, but unfortunately I already only upload 400K entries at a time, never more, using one transaction with a commit at the end.

    Also, the isolation level is set for minimal locking during the import (set through the VB interface).

    Memory floats around 4 GB total usage or so, CPU usage is 10~15% (averaged across cores), and disk I/O write ops average around 500/sec but fall as SQL slows down; everything seems to mimic SQL as it slows down.

    This DB is only 10 GB or so, and the files loaded are generally around 20 MB; after formatting that could double to 40 MB.

    I have another DB of about 140 GB with the same design, and it never has problems.

    To me this is kinda perplexing

  • lbdw (8/6/2010)


    Thanks for the feedback guys, but unfortunately I already only upload 400K entries at a time, never more, using one transaction with a commit at the end.

    Also, the isolation level is set for minimal locking during the import (set through the VB interface).

    Memory floats around 4 GB total usage or so, CPU usage is 10~15% (averaged across cores), and disk I/O write ops average around 500/sec but fall as SQL slows down; everything seems to mimic SQL as it slows down.

    This DB is only 10 GB or so, and the files loaded are generally around 20 MB; after formatting that could double to 40 MB.

    I have another DB of about 140 GB with the same design, and it never has problems.

    To me this is kinda perplexing

    I agree, it sounds strange.

    What do Avg. Disk sec/Write, % Disk Time, and Disk Write Bytes/sec look like for each physical disk? Does Avg. Disk sec/Write increase as the program runs? Are any of the disks pegged at 100% disk time or above?

    Have you analyzed wait states in SQL Server? What are the main wait types seen during the import?

    See this article for more information: http://download.microsoft.com/download/4/7/a/47a548b9-249e-484c-abd7-29f31282b04d/Performance_Tuning_Waits_Queues.doc
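    On SQL 2005 the wait-stats DMV gives a quick first look. This is just a sketch; if you want only the import's waits, clear the counters with DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR) right before the run:

    SELECT TOP 10 wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH',
                            'BROKER_TASK_STOP', 'CLR_AUTO_EVENT', 'WAITFOR')
    ORDER BY wait_time_ms DESC
    -- High WRITELOG points at commit/log-flush pressure, PAGEIOLATCH_* at data-file IO,
    -- and ASYNC_NETWORK_IO at time spent waiting on the client or network round trips.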

    Is there any mirroring, log shipping, replication, or anything like that in use?

    Is it possible that there is a heavy SQL Profiler trace running on the server? Perhaps by another user?

    Use SELECT * FROM ::fn_trace_getinfo(0) to check.

  • Well, let me list some more stats:

    Average IO write ops are 4~47/sec, averaging 188,434 bytes/sec (running RAID, thus using averages). Halfway through it starts peaking at 100+ sustained, with write ops at 600+/sec, then drops again, and so on and so on...

    Batch requests start at 80/sec, then drop to 40/sec (+-12). Lock waits: none.

    SQL buffer cache hit ratio: 99.33% sustained

    Page Life Expectancy 194~288+

    Page write 0~1048/sec (as required)

    No locks or blocking locks on the active process.

    No mirroring

    No replication

    Logging kept to minimum

    All this seems to indicate rather stable conditions; nothing really jumps out at me, well, maybe the IO write fluctuations, but still...

    I tested all of this on my test server as well (quad core, 4 GB RAM, 800 GB RAID, Windows XP, SQL 2005 32-bit) and the results are along the same lines. Go figure 🙂

    What can I say...
