couldn't kill a query as it is in roll back process..but it has been rolling back fro more than a day now

  • we are having a situation were a transaction log backup of a database was blocked by another transaction log backup of the same database. one was a regular transaction log backup job which runs every 30 mins and the other was a part of job which performs log backup after log truncation. we killed the blocking backup, so now it is in some kind of limbo rolling back. can you please help me understand what is happening and how to resolve this issue.

    thanks in advance

  • That's your problem right there.

    Don't SHRINK. It's a logged operation and there's a whole hell of a lot of stuff to log.

    If you're doing that in a maintenance plan take it out RIGHT NOW, it just causes more problems than you can imagine.

    As for the rollback it's going to take whatever time it needs. Maybe this can give you an estimate of the time left but I don't know for sure.

    SELECT r.[session_id]

    , c.[client_net_address]

    , s.[host_name]

    , c.[connect_time]

    , [request_start_time] = s.[last_request_start_time]

    , [current_time] = CURRENT_TIMESTAMP

    , r.[percent_complete]

    , [estimated_finish_time] = DATEADD(MILLISECOND, r.[estimated_completion_time], CURRENT_TIMESTAMP)

    , [estimated_seconds_left] = DATEDIFF(s, CURRENT_TIMESTAMP, DATEADD(MILLISECOND, r.[estimated_completion_time], CURRENT_TIMESTAMP))

    , current_command = SUBSTRING ( t.[text], r.[statement_start_offset]/2, COALESCE(NULLIF(r.[statement_end_offset], -1)/2, 2147483647) )

    , module = COALESCE(QUOTENAME(OBJECT_SCHEMA_NAME(t.[objectid], t.[dbid])) + '.' + QUOTENAME(OBJECT_NAME(t.[objectid], t.[dbid])), '<ad hoc>')

    FROM sys.dm_exec_requests AS r

    INNER JOIN sys.dm_exec_connections AS c

    ON r.[session_id] = c.[session_id]

    INNER JOIN sys.dm_exec_sessions AS s

    ON r.[session_id] = s.[session_id]

    CROSS APPLY sys.dm_exec_sql_text(r.[sql_handle]) AS t

    WHERE r.[percent_complete] <> 0;

  • Do you by any chance use any third party backup tools? The reason I ask is that I have seen this once on a server using Hyperbac. Something must have gone wrong with Hyperback because the backup never completed, and when I killed the process then rollback never completed. Once I restarted the Hyperbac service then the rollback completed.

  • Sorry some changes in the previous statement

    The shrink never happened. we perform a log backup before the shrink operation.

    the backup got stuck, even before the shrink operation.

    I ran the query and the results are intimidating, it shows another 5 days to complete.

    the rollback is stuck in 19% from yesterday. only the time to complete the rollback is increasing.

    Please check the attachment for the query results.

    Please advise. Will a sql server restart help..thanks.

  • Nils Gustav Stråbø (8/2/2011)


    Do you by any chance use any third party backup tools? The reason I ask is that I have seen this once on a server using Hyperbac. Something must have gone wrong with Hyperback because the backup never completed, and when I killed the process then rollback never completed. Once I restarted the Hyperbac service then the rollback completed.

    we don't use any third party software for backup. the backup command was issued from a .net utility

  • thirulok_t (8/2/2011)


    Sorry some changes in the previous statement

    The shrink never happened. we perform a log backup before the shrink operation.

    the backup got stuck, even before the shrink operation.

    I ran the query and the results are intimidating, it shows another 5 days to complete.

    the rollback is stuck in 19% from yesterday. only the time to complete the rollback is increasing.

    Please check the attachment for the query results.

    Please advise. Will a sql server restart help..thanks.

    I, like many other, preffer screenshots rather than the office suit (because of potential viruses in the .vba of the documents).

    Sorry for any inconvinience.

  • Nevermind my last comment.

    There must be blocking somewhere.

    What is the rollback waiting on?

    --Borrowed from the activity monitor via profiler

    SELECT

    [Session ID] = s.session_id,

    [User Process] = CONVERT(CHAR(1), s.is_user_process),

    [Login] = s.login_name,

    [Database] = ISNULL(db_name(p.dbid), N''),

    [Task State] = ISNULL(t.task_state, N''),

    [Command] = ISNULL(r.command, N''),

    [Application] = ISNULL(s.program_name, N''),

    [Wait Time (ms)] = ISNULL(w.wait_duration_ms, 0),

    [Wait Type] = ISNULL(w.wait_type, N''),

    [Wait Resource] = ISNULL(w.resource_description, N''),

    [Blocked By] = ISNULL(CONVERT (varchar, w.blocking_session_id), ''),

    [Head Blocker] =

    CASE

    -- session has an active request, is blocked, but is blocking others or session is idle but has an open tran and is blocking others

    WHEN r2.session_id IS NOT NULL AND (r.blocking_session_id = 0 OR r.session_id IS NULL) THEN '1'

    -- session is either not blocking someone, or is blocking someone but is blocked by another party

    ELSE ''

    END,

    [Total CPU (ms)] = s.cpu_time,

    [Total Physical I/O (MB)] = (s.reads + s.writes) * 8 / 1024,

    [Memory Use (KB)] = s.memory_usage * 8192 / 1024,

    [Open Transactions] = ISNULL(r.open_transaction_count,0),

    [Login Time] = s.login_time,

    [Last Request Start Time] = s.last_request_start_time,

    [Host Name] = ISNULL(s.host_name, N''),

    [Net Address] = ISNULL(c.client_net_address, N''),

    [Execution Context ID] = ISNULL(t.exec_context_id, 0),

    [Request ID] = ISNULL(r.request_id, 0)

    -- [Workload Group] = ISNULL(g.name, N'')

    FROM sys.dm_exec_sessions s LEFT OUTER JOIN sys.dm_exec_connections c ON (s.session_id = c.session_id)

    LEFT OUTER JOIN sys.dm_exec_requests r ON (s.session_id = r.session_id)

    LEFT OUTER JOIN sys.dm_os_tasks t ON (r.session_id = t.session_id AND r.request_id = t.request_id)

    LEFT OUTER JOIN

    (

    -- In some cases (e.g. parallel queries, also waiting for a worker), one thread can be flagged as

    -- waiting for several different threads. This will cause that thread to show up in multiple rows

    -- in our grid, which we don't want. Use ROW_NUMBER to select the longest wait for each thread,

    -- and use it as representative of the other wait relationships this thread is involved in.

    SELECT *, ROW_NUMBER() OVER (PARTITION BY waiting_task_address ORDER BY wait_duration_ms DESC) AS row_num

    FROM sys.dm_os_waiting_tasks

    ) w ON (t.task_address = w.waiting_task_address) AND w.row_num = 1

    LEFT OUTER JOIN sys.dm_exec_requests r2 ON (s.session_id = r2.blocking_session_id)

    --LEFT OUTER JOIN sys.dm_resource_governor_workload_groups g ON (g.group_id = s.group_id)--TAKE THIS dmv OUT TO WORK IN 2005

    LEFT OUTER JOIN sys.sysprocesses p ON (s.session_id = p.spid)

    ORDER BY s.session_id;

  • its not being blocked by any other process..

    please check the attachment for the query results

  • Restart SQL. I've seen it a few times, a restart fixes it.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/2/2011)


    Restart SQL. I've seen it a few times, a restart fixes it.

    afraid to ask...

    and nothing else?

    Couldn't it be that it's waiting a lot on IO to complete the rollback?

  • there is no IO wait. we have RAID 10 15k disks with fiber channel for log drive, it usually takes 3-5 minutes for doing log backups for 700 databases.

    last night i made some inserts and updates in a table and took a full back.

    i restored the database and it had the changes in it.

    i also tried to drop all connections by detaching the database and also tried taking it offline. none of this worked.

    so the only option i have is restart..

  • Ninja's_RGR'us (8/2/2011)


    GilaMonster (8/2/2011)


    Restart SQL. I've seen it a few times, a restart fixes it.

    afraid to ask...

    and nothing else?

    Nothing that I've tried in the cases I've seen

    Couldn't it be that it's waiting a lot on IO to complete the rollback?

    If it was, it would show an IO-related wait.

    p.s. thirulok, please post .jpg or .png for image files. Some of us are still on slow internet connections.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • And by some of us she means herself :-D.

  • sorry Gila..

    i will upload .jpg images, next time.

    regarding IO waits, i just see the regular activity in our subsystem.

    And as mentioned earlier, our Disk subsystem is pretty good.

  • thirulok_t (8/2/2011)


    sorry Gila..

    i will upload .jpg images, next time.

    regarding IO waits, i just see the regular activity in our subsystem.

    And as mentioned earlier, our Disk subsystem is pretty good.

    We're all out of ideas, and by we I mean I ;-).

    Let us know how the reboot went...

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply