Database cleanup?

  • We have a fairly large SQL database, a little over 30 GB. I know that Access has a database cleanup (compact) feature that removes all the unneeded space allocation from the database. I am just wondering whether SQL 2005 has a feature like that, or does it automatically keep the size to a minimum?

  • Reducing the size of a database file is a multistep process that can be quite complex, because some of the steps cause database fragmentation, which has a significant impact on performance.

    Are there maintenance plans to rebuild indexes?

    SQL = Scarcely Qualifies as a Language

  • The server itself does a complete backup to a single file at noon every day, and there is a complete backup of all databases by Symantec Backup Exec every midnight. I am assuming this is what you were asking about with the maintenance plan on indexes.

  • Where would I begin to learn about this process?

  • Index maintenance: this includes checking the fragmentation of indexes using DBCC SHOWCONTIG: http://msdn.microsoft.com/en-us/library/ms175008.aspx

    Rebuilding indexes means rebuilding the fragmented ones, which you can identify either with the command above or with sys.dm_db_index_physical_stats, sys.dm_db_index_usage_stats, and sys.dm_db_index_operational_stats.

    Run SELECT * FROM sys.dm_db_index_physical_stats (DB_ID('DatabaseName'), NULL, NULL, NULL, 'DETAILED') and check the avg_fragmentation_in_percent column to figure out the fragmentation level of your indexes. Exclude the heap tables, and rebuild your indexes accordingly using ALTER INDEX statements (a sketch follows at the end of this post).

    http://technet.microsoft.com/en-us/magazine/cc162476(TechNet.10).aspx

    Shrink database: to shrink the size of the database data and log files, use DBCC SHRINKDATABASE: http://msdn.microsoft.com/en-us/library/ms190488.aspx. Also check out DBCC SHRINKFILE.
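
    For illustration, here is a minimal sketch of that fragmentation check. The database name is a placeholder, the join to sys.indexes is only there to get readable index names, and the query should be run from within the database you want to examine; use 'DETAILED' instead of 'LIMITED' if you also want page-density figures.

    -- Sketch: list indexes by fragmentation, worst first (placeholder database name).
    SELECT  OBJECT_NAME(ips.object_id)        AS table_name,
            i.name                            AS index_name,
            ips.avg_fragmentation_in_percent,
            ips.page_count
    FROM    sys.dm_db_index_physical_stats (DB_ID(N'DatabaseName'), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN    sys.indexes AS i
            ON  i.object_id = ips.object_id
            AND i.index_id  = ips.index_id
    WHERE   ips.index_id > 0    -- exclude heaps
    ORDER BY ips.avg_fragmentation_in_percent DESC;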

  • Thank you for all the great replies. I ran DBCC SHOWCONTIG on our largest database; it took about 5 minutes to run. I am not sure I understand exactly what the output means. Here is a clip of the output for two of the larger tables. Can someone help me decode what it is saying?

    DBCC SHOWCONTIG scanning 'GLPJD' table...

    Table: 'GLPJD' (84195350); index ID: 1, database ID: 27

    TABLE level scan performed.

    - Pages Scanned................................: 259395

    - Extents Scanned..............................: 32674

    - Extent Switches..............................: 32935

    - Avg. Pages per Extent........................: 7.9

    - Scan Density [Best Count:Actual Count].......: 98.45% [32425:32936]

    - Logical Scan Fragmentation ..................: 1.31%

    - Extent Scan Fragmentation ...................: 74.36%

    - Avg. Bytes Free per Page.....................: 116.8

    - Avg. Page Density (full).....................: 98.56%

    DBCC SHOWCONTIG scanning 'GLJED' table...

    Table: 'GLJED' (1943677972); index ID: 1, database ID: 27

    TABLE level scan performed.

    - Pages Scanned................................: 312199

    - Extents Scanned..............................: 39328

    - Extent Switches..............................: 72062

    - Avg. Pages per Extent........................: 7.9

    - Scan Density [Best Count:Actual Count].......: 54.15% [39025:72063]

    - Logical Scan Fragmentation ..................: 39.75%

    - Extent Scan Fragmentation ...................: 63.41%

    - Avg. Bytes Free per Page.....................: 2142.5

    - Avg. Page Density (full).....................: 73.53%

  • As an introduction to database physical storage is typically a third-year Computer Science course, and the subject is studied in depth at the graduate level, this will be difficult to explain in a forum post (I took this back in 1983).

    Dr Charles Bachman developed this storage technique back in the 1960s.

    Simply put, a logical row has a physical implementation as a record on a unit of disk storage called a page. A page can contain only one type of record. Pages are combined into extents, and extents exist within files. Pages are also used to store index entries.

    For SQL Server, a page is 8 KB in size, and there are 8 pages in an extent, so an extent holds 64 KB. SQL Server can read a single page, a single extent, or up to 8 extents (512 KB) with one I/O request.

    Performance is optimal when the order specified by the clustered index matches the order of the physical pages, that is, when the pages are contiguous. If the pages are out of order, more I/Os are needed to fulfill a request.

    From the SQL Server Books OnLine:

    Scan Density [Best Count: Actual Count]

    Is a percentage. It is the ratio Best Count to Actual Count. This value is 100 if everything is contiguous; if this value is less than 100, some fragmentation exists. Best Count is the ideal number of extent changes if everything is contiguously linked. Actual Count is the actual number of extent changes.

    Logical Scan Fragmentation

    Percentage of out-of-order pages returned from scanning the leaf pages of an index. This number is not relevant to heaps. An out-of-order page is a page for which the next physical page allocated to the index is not the page pointed to by the next-page pointer in the current leaf page.

    Extent Scan Fragmentation

    Percentage of out-of-order extents in scanning the leaf pages of an index. This number is not relevant to heaps. An out-of-order extent is one for which the extent that contains the current page for an index is not physically the next extent after the extent that contains the previous page for an index.

    DBCC SHOWCONTIG results:

    Table GLPJD

    - Scan Density [Best Count:Actual Count].......: 98.45% [32425:32936]

    - Logical Scan Fragmentation ..................: 1.31%

    - Extent Scan Fragmentation ...................: 74.36%

    Table GLJED

    - Scan Density [Best Count:Actual Count].......: 54.15% [39025:72063]

    - Logical Scan Fragmentation ..................: 39.75%

    - Extent Scan Fragmentation ...................: 63.41%

    Analysis:

    Table GLPJD shows some fragmentation and should be optimized.

    Table GLJED shows major fragmentation, as the optimal number of extents should be 39,025 (about 2.4 GB) but the actual is 72,063 (about 4.6 GB).

    Optimization can fix the fragmentation, but it works by making a complete copy of the table and all of its indexes and then releasing the old space for reuse. While this copy is occurring, no updates to the data are allowed, i.e., the table is locked. The amount of free space in the database file must be at least the size of the largest table plus its indexes. There is also space needed for sorting, which by default is done in the user database, although there is an option to use tempdb for sorting (see the sketch below).
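
    As a sketch of that tempdb sorting option (the table name is taken from your output purely for illustration; make sure tempdb has enough free space before trying this):

    -- Sketch: rebuild every index on one table, doing the sort work in tempdb
    -- instead of the user database (illustrative table name).
    ALTER INDEX ALL ON dbo.GLJED
    REBUILD WITH (SORT_IN_TEMPDB = ON);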

    To optimize all tables within a database, sqlmaint.exe can be run, for example:

    sqlmaint.exe -RebldIdx -SupportComputedColumn -S server_name[\instance_name] -D database_name

    After optimization has run, the amount of free space can be checked with this SQL:

    exec sp_spaceused

    Warning: When you drop or rebuild large indexes, or drop or truncate large tables, the Database Engine defers the actual page deallocations, and their associated locks, until after the transaction commits. Deferred drop operations do not release allocated space immediately. Therefore, the values returned by sp_spaceused immediately after dropping or truncating a large object may not reflect the actual disk space available.

    SQL = Scarcely Qualifies as a Language

  • The bottom line is... if you expect the database to grow back to the same size in the next week or two, don't shrink it...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Carl. Thanks for the excellent reply. I appreciate you breaking it down for me like that. I can see that I need to spend some time in study to really grasp this stuff. Just a couple of questions to clarify, if I could.

    I remember when we were upgraded to SQL 2005, the guy doing it said that attempts to migrate our SQL 2000 data to the 2005 format kept failing, so he left it in the 2000 format. Is all the fragmentation showing up in the database a possible reason for this? I was looking through a few others, and the ones I showed you aren't even the worst.

    I also read a write-up on external database fragmentation. I understand from the article that it can also cause your database to take a huge performance hit, which makes sense to me. Should I try to take care of internal or external fragmentation first?

    I suspect a good practice would be to do a complete backup of each database before I go ahead with any type of optimization, so I can restore it in case something falls apart?

    In the case where you said optimal use would be 2.4 GB and we are using 4.6 GB, would that be to say we would clean 2.2 GB out of the raw database file if we ran an optimization?

    With the warning you gave me at the bottom of your response, were you saying that after I run the optimization I just need to be patient for the changes to show up?

    How often should a database such as ours be gone over and maintained on a regular basis?

    Thanks for your time.

  • http://www.karaszi.com/SQLServer/info_dont_shrink.asp

    Massive fragmentation is the usual result of a shrink operation. Then as the database grows back up it causes more internal AND external fragmentation. Index maintenance isn't effective, etc, etc. Just don't go there.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • OK. Correct me if I am wrong. I want to defragment but not shrink the database?

  • Usually. Rebuild indexes to defragment; shrinking usually causes fragmentation.
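
    For instance, a minimal sketch of the rebuild-instead-of-shrink approach (the table and index names are placeholders, and the thresholds are the commonly cited guidelines rather than hard rules):

    -- Lightly fragmented (roughly 5-30%): reorganize in place (placeholder names).
    ALTER INDEX IX_SomeIndex ON dbo.SomeTable REORGANIZE;

    -- Heavily fragmented (above roughly 30%): rebuild.
    ALTER INDEX ALL ON dbo.SomeTable REBUILD;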

  • Shrink it each weekend.


    [font="Arial"]MCDBA, MCITP (DB Dev, DB Admin), MCSE,MCTS, OCA 10g[/font]

  • When the weekend shrink job runs, include the reindex job.


    [font="Arial"]MCDBA, MCITP (DB Dev, DB Admin), MCSE,MCTS, OCA 10g[/font]

  • sqlservercenter (7/7/2008)


    when the weekend shrink job runs, include the reindex job.

    Why shrink the database in the first place if it is just going to grow again over the next week? What exactly are you gaining by doing this?

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs
