Long ETL process running whole night leading to service failure

  • There is an ETL job, which runs whole night and finally at 8:30 morning SQL Server stops responding to requests (Server is in cluster). Then, we will manually recycle the service.

    Database details are:

    Size: 99742.38 MB

    Data file: Initial size (61230 MB) and Growth: 2048 MB

    Log File: Initial Size( 38513 MB) and Growth: 1024 MB

    Database is in simple recovery model.

    Server is X86 model system and has got 19 GB RAM.

    AWE is enabled

    Min Server memory: 128 MB

    Max Server memory: 2147483647 MB

    Drive on which datafile is hosted is having 49 GB free space and Drive on which log file is hosted is having 16 GB free space.

    Actually, i am planning to go for following

    -> Increase tempdb data files to match processor count of 16

    -> Increase Number of log files of that particular database and place it on other drives..

    Any suggestions..

    Thanks,

    Sudhie.

  • My suggestion would be to have a look at the ETL process and see if there are ways you can re-write it to be more efficent.

    What does the process involve? how much data are you moving?

  • Its Cognos data reader, third party tool pushing data to SQL. Databases are created for itself, i am trying to find a guy who should be able to tell me the data volume. All i can see from error log or backup set is daily, it grows for some 1 GB, but not a correct figure though. Log might be growing and getting truncated on restart, as they are ETL processes, they might be involved in lot of calculations.

    One more thing is, they do not have proper indexes it seems. Databases is created automatically from tools...

  • If your server has 19GB of memory you should set the max server memory to 17GB

    SQL could just be choking down on the OS causing the lock ups and issues that you are experiencing.



    Shamless self promotion - read my blog http://sirsql.net

  • Nicholas Cain (5/28/2009)


    If your server has 19GB of memory you should set the max server memory to 17GB

    SQL could just be choking down on the OS causing the lock ups and issues that you are experiencing.

    Allocating 17GB of memory to SQL Server is still too much. At most, you should allocate no more than 15GB leaving at least 4GB free for the OS.

    And, since you are also running an ETL process I would drop the max memory to 12GB and leave plenty of available memory for the other processes.

    As for tempdb - unless you are seeing contention there really is no need to add the additional files. If you still decide that you want to add the files, start with no more than four files (I am assuming you have 4 quad core processors, and you should start with one file per socket instead of one file per core). You can increase this if needed - but you really shouldn't go above 8 files as there is really no benefit after that point.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks for the reply,

    This morning i observed something, Just before service got stopped, i was below 2 spids occupying most of the memory.

    60981200x00422656PAGEIOLATCH_SH 2:1:64 2210453521822009-05-28 08:48:41.9932009-05-29 08:21:39.71700suspended 0xFF6C395B3E89B2409A0271727EF1A48B00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000SMIUKLONMGT04 Orion Alerting Engine 1720 DECLARE 001185C3CDA0TCP/IP svc-orion-p 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000x01004C00F02EB12068C1520F0000000000000000000

    617384600x00025109LCK_M_SCH_M MD: database_id = 2 METADATA_CACHE($hash = 0x1:0x20000000) 2222648352522009-05-28 08:47:56.5402009-05-29 08:21:44.82702suspended 0x15A35AC340919A46B1D1D0E0BF89C60900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000SMIUKLONAP05P MagicProcessJobs Module 2172 CREATE TABLE 005056B17B56TCP/IP _SMSYSADMIN_ 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000x03002400BEFD8D4A62D452010A9B0000010000006087020

    These are Solar Winds monitoring SPID's and database_id 2 is TEMPDB. when i tried to find out the SPID, services got stopped.

    Any idea, if solar winds creates something?

    Thanks,

    Sudhie.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply