Why is log file for tempdb filling up?

  • Greetings,

    In the application event log I am seeing the error 9002, Severity 17, State 6. The log file for the database tempdb is full. Back up the transaction log to free some space. I don't understand why this is happening. The tmepdb datafile is 3.6GB and is mostly unallocated. THere is 20GB free space on the drive on which it resides and the log file for the tempdb database is only 52MB. Both the datafile and the transaction log are set to autogrow by 10% and the file size is unlimited. So I don't understand why this is happening.

    Poking around with google I find a couple of suggestions, one is that perhaps there is some query doing lots of sorting which could potentially cause the problem. Another suggestion has to do with timeing of backups and in fact the the regularly scheduled backup of the transaction log for the production database is started about 6 minutes before this event occurred. I also saw something about collating tables but I don't understand the relevance of that.

    The question is twofold, 1) is there a method whereby I can determine what caused this and 2) what can I do to prevent it form occurring again?

    Thanks.

  • I do not recommend ever setting the growth to a percent. Set it to 100MB, see if that resolves the issue. If not, increase it to 200 or more.

    There are many things that can use the TEMPDB. Usually that error just means the log file couldn't grow fast enough for the current activity.

    -SQLBill

  • This error may not cause your system down. The database tempdb is automatically shrunk normally. There is an "bug" here. We may get an error message if a user sends a large transaction while the database is shrinking. It happened.

  • SQLBill, I am curious as to why you suggest setting growth to a value rather than a percent. Are there preformance issues or is it merely because of the increasing uneven growth that makes possible?

    Thanks.

  • Setting a percent will cause exponent growth. It's always growing 10% of the current size. So as the db gets larger it needs to expand even more.

    Let's say you have a 10 mb db and you get 1 mb of growth a day. By 10%, the first growth will be 1 mb to 11 mb. The next growth will be 10% of that 1.1 mb to 12.1 mb, then 10% of that. Eventually, your db could be growing much larger than the space you really need. Let's say you have 10 GB of disk space and 9 gb is used. But you are auto-growing by 10%. The next autogrow will be 1 GB (remember you only need 1 mb a day) and now your database is full and cannot grow anymore. Adjusting it to grow by the proper MB size can help you to keep better control of the growth.

    -SQLBill

  • Thanks, that is what I suspected.

    I have seen many references to this problem being caused by bugs or large transactions or transactions occurring as the backup is running. Is there a means to prevent this problem or to at least determine what specific event caused it?

    Thanks.

  • I have read that if autogrowth is set to a %, tempdb grows at 10% of the initial size. I always want to make sure this, but BOL doesn't explicitly specify it. Can anyone please confirm?

    Thanks 

  • Nope, everything I have seen and been told in MS Classes is that it grows by 10% of it's CURRENT size.

    -SQLBill

  • From the BOL...

    To specify that the file should grow by a percentage of the current file size, select By percent and specify a value.

    -SQLBill

  • Tempdb issues in SQL 2000 : I do these two things for tempdb:

    1) Apply MS KB recommendations for tempdb concurrency.[/url] This is done in 2005 as a production setup standard, too. To summarize, you create as many tempdb data files as there are physical processors, sized the same, and with autogrowth turned OFF. Log can be autogrow ON. You apply trace flag 1118 as a startup pramater.

    For a 4 proc box, I usually set up 4 tempdb data files of 1 GB each. Log is set to 4 GB initial size with growth increments of 500 MB. I never set up autogrowth by %. Autogrow is really a failsafe thing - you don't want to depend on it to maintain your dbs. You should be sizing the dbs to allow for free space and monitor for low free space, to increase the free space after hours.

    2) For tempdb that fills up fast -it is usually a developers job that is mis-using tempdb ... where a temp table is created and then one large transaction is launched, rather than a looped transaction with commits along the way.

    Another cause I have seen is that replication (or a 3rd party replication product like Datamirror) using tempdb. Also in SQL 2005, certain index maint options can drive up tempdb to extremely large sizes very quickly.

    In this case I put a SQL Agent job on the server to keep tempdb down -

    step 1 in the job is simply to checkpoint tempdb.

    step 2 is to truncate tempdb log.

    Here is the script for the SQL 2000 job - try it in non-production first.[/b] I have been using it for two years in production on SQL 2000.

    Adjust the schedule by observing tempdb in Enterprise Manager taskpad view and refresh the view every so often to see the data usage and log usage. In the case of this job script, tempdb log had grown to 135 GB, and the log fill rate was about a 500 MB every minute. I manually ran a checkpoint on tempdb, then shrank the log down to 4GB, then applied this job script to msdb db. I set the schedule for every 2 minutes, and so far tempdb has been fine.

    (Haven't tried in SQL 2005 but should work fine).

    [font="Arial Narrow"]

    BEGIN TRANSACTION

    DECLARE @JobID BINARY(16)

    DECLARE @ReturnCode INT

    SELECT @ReturnCode = 0

    IF (SELECT COUNT(*) FROM msdb.dbo.syscategories WHERE name = N'Database Maintenance') < 1

    EXECUTE msdb.dbo.sp_add_category @name = N'Database Maintenance'

    -- Delete the job with the same name (if it exists)

    SELECT @JobID = job_id

    FROM msdb.dbo.sysjobs

    WHERE (name = N'! Tempdb')

    IF (@JobID IS NOT NULL)

    BEGIN

    -- Check if the job is a multi-server job

    IF (EXISTS (SELECT *

    FROM msdb.dbo.sysjobservers

    WHERE (job_id = @JobID) AND (server_id <> 0)))

    BEGIN

    -- There is, so abort the script

    RAISERROR (N'Unable to import job ''! Tempdb'' since there is already a multi-server job with this name.', 16, 1)

    GOTO QuitWithRollback

    END

    ELSE

    -- Delete the [local] job

    EXECUTE msdb.dbo.sp_delete_job @job_name = N'! Tempdb'

    SELECT @JobID = NULL

    END

    BEGIN

    -- Add the job

    EXECUTE @ReturnCode = msdb.dbo.sp_add_job @job_id = @JobID OUTPUT , @job_name = N'! Tempdb', @owner_login_name = N'sa', @description = N'MBEADLES: This job was placed on the server 1/8/2008 because tempdb Log file was growing very large (136 GB + ). This will checkpoint tempdb at 2 minute intervals and then truncate the tempdb log. I have used it on my prod servers for at least one year and no problems.', @category_name = N'Database Maintenance', @enabled = 1, @notify_level_email = 0, @notify_level_page = 0, @notify_level_netsend = 0, @notify_level_eventlog = 2, @delete_level= 0

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    -- Add the job steps

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 1, @step_name = N'checkpoint', @command = N'checkpoint', @database_name = N'tempdb', @server = N'', @database_user_name = N'', @subsystem = N'TSQL', @cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, @output_file_name = N'', @on_success_step_id = 0, @on_success_action = 3, @on_fail_step_id = 0, @on_fail_action = 2

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 2, @step_name = N'backup', @command = N'Backup log tempdb

    with truncate_only', @database_name = N'master', @server = N'', @database_user_name = N'', @subsystem = N'TSQL', @cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, @output_file_name = N'', @on_success_step_id = 0, @on_success_action = 1, @on_fail_step_id = 0, @on_fail_action = 2

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    EXECUTE @ReturnCode = msdb.dbo.sp_update_job @job_id = @JobID, @start_step_id = 1

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    -- Add the job schedules

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id = @JobID, @name = N'sch', @enabled = 1, @freq_type = 4, @active_start_date = 20060417, @active_start_time = 0, @freq_interval = 1, @freq_subday_type = 4, @freq_subday_interval = 2, @freq_relative_interval = 0, @freq_recurrence_factor = 0, @active_end_date = 99991231, @active_end_time = 235959

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    -- Add the Target Servers

    EXECUTE @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @JobID, @server_name = N'(local)'

    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

    END

    COMMIT TRANSACTION

    GOTO EndSave

    QuitWithRollback:

    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION

    EndSave:

    [/font]

  • Chaohua Wang (10/5/2006)


    This error may not cause your system down. The database tempdb is automatically shrunk normally. There is an "bug" here. We may get an error message if a user sends a large transaction while the database is shrinking. It happened.

    Unless you mean something else, I gotta say, No... it's NOT! You just gotta show me where you got that bad piece of information so I can go throw a rock at them... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks for the script, Beadles, just what I needed to manage my problem with TempDB. Three times in the last month it has filled the hard drive and given me error 9002. I have not yet been able to determine the root cause.

    I have a question. Is there a way to trigger this checkpoint/truncate when TempDB or TempDB log file hits a certain size?

  • Unless you've changed the recovery model of TempDB to something it shouldn't be (i.e. anything other than "simple") - it should automatically truncate as transactions commit.

    So - the trick is to keep the impact of your transactions/operations on TempDB to something manageable.

    Do you have a good handle on what is happening when TempDB "blows up"? Something with a missing join/badly build join can easily grab bunches and bunches of data (even if something later in the process then tosses a whole bunch of that junk out), etc...

    If you do - give us rundown of what it is and we can see what will help fix your issue. If you don't, well - you need to find out. It's trying to do something it's been told to do, so it's going to keep running out of room until you find and prosecute said offender to any allowable lengths you can find.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Trigger by file size? Absolutely.

    You could have a job that runs periodically to put the free space in the files into variables, and then when the free space is at some threshold you set, fire off the tempdb job.

    But I think this over-complicates the process. There is no downside to running this tempdb job once every two minutes (as I do on one box) or every two hours (as I do on another). Just as long as you know your tempdb fill patterns and set the schedule to accomodate it.

    BTW - I had to set the two minute one to maintain tempdb log that is filling up via processes unknown to the application developer side. I am currently considering hardware causes (I see 47 errors thrown the Virtual Drive service) and OS patch level is different from a nearly identical server with no tempdb problem.

    But this tempdb job lets us do the troubleshooting without fear of a server halt.

  • Other things you can do to analyze tempdb hammering:

    1) Find out what sp's hit tempdb:

    Script (best if run with results to text mode with space as delimiter):

    Exec sp_msforeachdb @command1 = 'use ?; Select ''?''; select distinct ''Found in ? '', b.name from syscomments a, sysobjects b where a.id = b.id and CHARINDEX(''#'', a.text) > 0'

    Note - '' = two single quotes, not a double quote mark.

    2) Run Profiler with the Standard template, add in a few other data columns like HostName, DatabaseID, DBUsername, Application, NTUsername -- and put in a filter in TextData for LIKE '%#%'

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply