DBCC CHECKDB Hangs

  • I have SQL 2005 SP2 64 bit running on W2K3 SP2 64 bit and when I run DBCC CHECKDB it hangs. I have run it against the same database on SQL 2005 SP2 32 bit and it completes in 30 mins.

    Any idea's would be appreciated

  • Hey,

    This may be due to new data purity checks that SQL Server 2005 automatically performs for databases created within SQL Server 2005. If you created your databases using this version, this is default behaviour. Try running DBCC CHECKDB with PHYSICAL_ONLY option, which does not perform the data purity check.

    I would also suggest another test by upgrading to Cumulative Update 5 for SQL Server 2005 Service Pack 2 and re-running DBCC CHECKDB again....Test environment only.

    Thanks,

    Phillip Cox

  • Phil,

    Thanks for that I will try running DBCC CHECKDB with PHYSICAL_ONLY option. and also upgrade to the cumulative update.

    Unfortunately whilst I can run it against a test database I have to use my live server as it is the only 64 bit server I have.

    Once I have tried both I will let you know either way

    Regards

    Sean

  • Hi Sean,

    Ok, look forward to update.

    Thanks,

    Phillip Cox

  • Hi Paul,

    Unfortunately both of your suggestions failed. It seems that with SQL x64 standard edition you cannot lock pages in memory so DBCC CHECKDB keeps paging and therefore gets itself in a loop, I think I will have to contact PSS on Monday or update to EE.

    Thanks anyway

    Regards

    Sean

  • Not that CHECKDB gets itself in a loop, just that because it effectively cycles the buffer pool and uses lots of memory, it could cause lots of paging depending on your architecture. You'd see the same behavior with a complex join between two huge tables as well.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul,

    Thanks for the reply, you are right I dont think it is CHECKDB that is causing the problem as we took the same DB onto a 32 bit system and ran CHECKDB and it completed. I think the problem is with 64 bit SE SQL 2005 on W2K3 64 bit with paging problems, I will have to speak to PSS on Monday and try and find a resolution.

    If you have any other ideas please let me know, there is a beer for the first solution!

    Regards

    Sean

  • One solution that I recommend as a way for customers to offload the CPU and IO workload of CHECKDB from production servers is to restore your backups on another system and run CHECKDB on them there.

    More info in this blog post - http://www.sqlskills.com/blogs/paul/2007/11/19/CHECKDBFromEveryAngleConsistencyCheckingOptionsForAVLDB.aspx

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul

    That is what we have done, we took a backup of the DB and moved it to another server and ran checkdb, it took about 20 mins to complete.

    What is frustrating is that the server that managed to complete checkdb is a 32bit box with 2 GB of RAM and a single processor whereas the production box is a 4 way 64bit box with 8 GB of RAM, I would have expected it to run much quicker on the production box but it doesn't because of paging problems.

    I want to speak to PSS to see if there is a solution as most of the posts I have seen say the only way is to either upgrade to enterprise edition or go back to 32bit, which if true means that SQL 2005 64bit Standard Edition is pretty much useless!

    Regards

    Sean

  • Hi Sean,

    Thanks for update and sorry suggested test didn't solve problem.

    I am still looking into this and just wanted to know the size of the database you are checking? Have you tried running DBCC CHECKDB on a smaller or sample database?

    Please let me know and I am pretty sure we get to cause, as I have worked on a number of x64 SE systems supporting fairly large databases and haven't experienced this problem, so very interested.

    If possible, please provide following information:

    1. sp_configure for production box (x64) vs sp_configure for recovery box (32-bit)

    2. # of filegroups for concerned database

    Have you tried to run DBCC CHECKDB in sub-phases to break-down? If not, try the following to see where is takes the longest:

    1. DBCC CHECKALLOC

    2. DBCC CHECKCATALOG

    3. DBCC CHECKTABLE

    4. Version of SQL Server (e.g. select @@version)

    5. Check for read-only filegroup(s)

    Also, any information on the actual server build(s) would be helpful to understand (e.g. CPU type), including storage layout and RAID levels.

    Thanks,

    Phillip Cox

    Thanks,

    Phillip Cox

  • Phillip,

    The database is just over 17GB.

    The results from sp_configure are:-

    Test Box:-

    name minimum maximum config_value run_value

    ----------------------------------- ----------- ----------- ------------ -----------

    allow updates 0 1 0 0

    clr enabled 0 1 0 0

    cross db ownership chaining 0 1 0 0

    default language 0 9999 0 0

    max text repl size (B) 0 2147483647 65536 65536

    nested triggers 0 1 1 1

    remote access 0 1 1 1

    remote admin connections 0 1 0 0

    remote login timeout (s) 0 2147483647 20 20

    remote proc trans 0 1 0 0

    remote query timeout (s) 0 2147483647 600 600

    server trigger recursion 0 1 1 1

    show advanced options 0 1 0 0

    user options 0 32767 0 0

    Live Box :-

    name minimum maximum config_value run_value

    ----------------------------------- ----------- ----------- ------------ -----------

    Ad Hoc Distributed Queries 0 1 0 0

    affinity I/O mask -2147483648 2147483647 0 0

    affinity mask -2147483648 2147483647 0 0

    affinity64 I/O mask -2147483648 2147483647 0 0

    affinity64 mask -2147483648 2147483647 0 0

    Agent XPs 0 1 1 1

    allow updates 0 1 0 0

    awe enabled 0 1 0 0

    blocked process threshold 0 86400 0 0

    c2 audit mode 0 1 0 0

    clr enabled 0 1 0 0

    cost threshold for parallelism 0 32767 5 5

    cross db ownership chaining 0 1 0 0

    cursor threshold -1 2147483647 -1 -1

    Database Mail XPs 0 1 0 0

    default full-text language 0 2147483647 1033 1033

    default language 0 9999 0 0

    default trace enabled 0 1 1 1

    disallow results from triggers 0 1 0 0

    fill factor (%) 0 100 0 0

    ft crawl bandwidth (max) 0 32767 100 100

    ft crawl bandwidth (min) 0 32767 0 0

    ft notify bandwidth (max) 0 32767 100 100

    ft notify bandwidth (min) 0 32767 0 0

    index create memory (KB) 704 2147483647 0 0

    in-doubt xact resolution 0 2 0 0

    lightweight pooling 0 1 0 0

    locks 5000 2147483647 0 0

    max degree of parallelism 0 64 0 0

    max full-text crawl range 0 256 4 4

    max server memory (MB) 16 2147483647 7168000 7168000

    max text repl size (B) 0 2147483647 65536 65536

    max worker threads 128 32767 0 0

    media retention 0 365 0 0

    min memory per query (KB) 512 2147483647 1024 1024

    min server memory (MB) 0 2147483647 2048000 2048000

    nested triggers 0 1 1 1

    network packet size (B) 512 32767 4096 4096

    Ole Automation Procedures 0 1 0 0

    open objects 0 2147483647 0 0

    PH timeout (s) 1 3600 60 60

    precompute rank 0 1 0 0

    priority boost 0 1 0 0

    query governor cost limit 0 2147483647 0 0

    query wait (s) -1 2147483647 -1 -1

    recovery interval (min) 0 32767 0 0

    remote access 0 1 1 1

    remote admin connections 0 1 0 0

    remote login timeout (s) 0 2147483647 20 20

    remote proc trans 0 1 0 0

    remote query timeout (s) 0 2147483647 600 600

    Replication XPs 0 1 0 0

    scan for startup procs 0 1 1 1

    server trigger recursion 0 1 1 1

    set working set size 0 1 0 0

    show advanced options 0 1 1 1

    SMO and DMO XPs 0 1 1 1

    SQL Mail XPs 0 1 0 0

    transform noise words 0 1 0 0

    two digit year cutoff 1753 9999 2049 2049

    user connections 0 32767 0 0

    user options 0 32767 0 0

    Web Assistant Procedures 0 1 0 0

    xp_cmdshell 0 1 0 0

    There is only 1 filegroup.

    Box Layouts are:-

    Test Box:-

    Processor - Intel P3 hyperthreading enabled

    Memory - 1Gb

    Discs - 2 Discs OS on 1st Disc Data on second

    OS - Windows 2003 SE SP1 32Bit

    Production Box

    Processors - Dual Xeon 1.6 Dual Core

    RAM - 8GB

    Discs - 1 x RAID 0 Array OS and Progs, 1 x Raid 10 array Data

    OS - Windows 2003 R2 SP2 64bit

    Splitting DBCC CHECKDB gave the following results:-

    CHECKALLOC

    Test Box - 1min 43 seconds

    Production Box - 29 seconds

    CHECKCATALOG

    Test Box - 22 seconds

    Production Box - 3 seconds

    DBCC CHECKTABLE

    Do I need some parameters with this as I get the following error message on both boxes if I don't use any:-

    Msg 2583, Level 16, State 3, Line 1

    An incorrect number of parameters was given to the DBCC statement.

    @@version returns:-

    Test Box - Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86)

    Feb 9 2007 22:47:07

    Copyright (c) 1988-2005 Microsoft Corporation

    Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

    Production Box - Microsoft SQL Server 2005 - 9.00.3215.00 (X64)

    Dec 8 2007 17:58:16

    Copyright (c) 1988-2005 Microsoft Corporation

    Standard Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)

    All the filegroups are not read only

    Just for your info we are running this because we had index corruption which we fixed by running:-

    sp_msforeachtable 'dbcc dbreindex (''?'')'

    Hope with all this you can shed some more light

    Regards

    Sean

  • Hi,

    I'm having the exact same issue on a production server that is almost identical as yours.

    - SQL 2005 SP2 64 bit running on W2K3 SP2 64

    - Two Xeon 2Ghz Quad-Core 64 bits

    - 4 Gigs RAM

    We have DBCC CHECKDB running every night on 6 databases (the biggest being about 25 gigs). The operation fails every 3-4 days since two weeks ago.

    We tried to remove the CheckDB for the bigger database and the operation still failed on friday night (It worked on 2 little 500Mb databases and a 2gig one, but failed on a fourth which is also about 2gig in size).

    Our database server has 4 gigs RAM and the DBCC CHECKDB seems to fail after processing about 4 gig worth of databases. I don't know if this is related to our issue but it is somewhat strange. In fact, I don't think this is related because the DBCC CHECKDB works 3 times out of 4.

    Our checkDbs are done in an SSIS Package.

    I also found this thread on SQLTeam which may be a good place to start : http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=67196

    So, if your find a solution, please post it here. In the meantime, I'll continue searching on this issue and keep you informed of my findings.

    Stanislas Biron

  • Hey,

    Ok, we'll try a few quick tests to see if we can uncover the root cause.

    First, please run the following command and post results. In addition, please let me know the size of tempdb too.

    DBCC CHECKDB ('your problem database') WITH ESTIMATEONLY

    DBCC CHECKTABLE ('your table name')

    Also, if possible, can we setup a profiler trace to monitor DBCC CHECKDB work while it runs? If you can, please attach to post to allow review.

    Also, can you confirm disk layout, as previous post states 1x RAID 0 and 1 x RAID 10.

    Thank you,

    Phillip Cox

    MCITP - Database Administrator

  • Hi,

    Here are the results if the DBCC CHECKDB ('MyDatabase') WITH ESTIMATEONLY :

    Estimated TEMPDB space needed for CHECKALLOK (KB)

    4737

    Estimated TEMPDB space needed for CHECKTABLES (KB)

    848092

    These are the numbers for the 25 gigs database. For the others the numbers are in the same proportion to their size.

    The size of the tempdb database is currently 211 megs.

    Our disk layout is a Raid 5 array.

    I'll setup a profiler to monitor what happens in the maintenance operations tonight and post the results tomorrow morning.

    It doesn't seem that the problem is caused by a particular database, since the sequence of events in the Server's application log don't stop at the same DBCC CHECKDB everytime. It has always been on the 25 gigs database, but after I removed the check on that particular database, the issue arose again in another database.

    We didn't rule out hardware problems but we would have seen others symptoms of this by now.

    Also, for your information, I disabled the auto create statistics and auto update statistics on the tempdb database last week to correct another problem we had with a vendor's software that was over-using the tempdb database (creating about 40 temp tables per second while importing some data in their system). Needless to say that this system is no longer on our production server.

    Kb article related to this: http://support.microsoft.com/kb/916086

    Thanks for your help !

    Stanislas Biron

  • Stanislas,

    When you say that it 'failed' - what exactly do you mean? It seemed to hang? It complained that it didn't have enough disk space? Or some other error?

    Your tempdb is 211MB and CHECKDB says it needs over 800MB to run - that's going to to cause tempdb to grow. If you have tempdb not set to autogrow, or there isn't enough disk space where tempdb is located, CHECKDB will fail. Are any of these the case?

    Is you tempdb on the RAID5 array? If so, it's going to suffer badly for write performance, as with RAID5 you pay a penalty on writes. This could be the cause of your issue with the 3rd party software. 40 temp tables per second doesn't seem high to me (depending on the application) and SS2005 specifically has tempdb allocation algorithms to not cause contention with massive temp table creation/deletion.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply