Suggestion on Updating Statistics

  • In which cases do we have to update the index statistics? Could anyone please shed some light on this?

  • Mostly it's if you notice queries running poorly because of out-of-date statistics. Normally that happens on large tables (a million or more rows) where the index is on an increasing column (like date inserted) and your queries often look for the latest dates.

    If you do frequent index rebuilds, you don't need to update stats. Statistics are updated as part of an index rebuild.
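    As a minimal sketch of the two options just described (the table and index names here are made up):

    ```sql
    -- An index rebuild refreshes that index's statistics with the
    -- equivalent of a full scan, so no separate update is needed:
    ALTER INDEX IX_Orders_DateInserted ON dbo.Orders REBUILD;

    -- Without a rebuild, update the index's statistics directly:
    UPDATE STATISTICS dbo.Orders IX_Orders_DateInserted WITH FULLSCAN;
    ```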

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You may also need to update statistics on smaller tables where large percentages of the data change regularly.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Taking your answer into consideration: say I have a table on a production server with 8 key columns that are frequently searched, and heavy insert/update/delete activity is also done on it.

    In this case, do we need to update the statistics frequently?

    Abhijit - http://abhijitmore.wordpress.com

  • Maybe. It shouldn't do any harm (unless you do a sampled update right after an index rebuild, which would replace the rebuild's full-scan statistics with less accurate sampled ones). It may or may not help.

    How often do you rebuild the indexes on that table?

    Grant Fritchey


    You may also need to update statistics on smaller tables where large percentages of the data change regularly.

    Wouldn't the auto update take care of that?

    Gail Shaw
  • If you have the database options

    AUTO_CREATE_STATISTICS and

    AUTO_UPDATE_STATISTICS on,

    you don't have to update them manually (check the current settings in sys.databases). You should turn on these options:

    ALTER DATABASE myDB SET AUTO_CREATE_STATISTICS ON;

    ALTER DATABASE myDB SET AUTO_UPDATE_STATISTICS ON;

    /* instead you might want to consider AUTO_UPDATE_STATISTICS_ASYNC */

    On every table with heavy DML operations, AUTO STATS are a must. If you don't want to enable AUTO STATS database-wide, check sp_autostats in BOL.
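    For reference, a hedged sketch of the per-table route just mentioned (the table name is a placeholder):

    ```sql
    -- Show the current auto-update setting for every statistic
    -- on one table:
    EXEC sp_autostats 'dbo.MyBigTable';

    -- Enable automatic statistics updates for that table only
    -- (useful when you don't want AUTO STATS database-wide):
    EXEC sp_autostats 'dbo.MyBigTable', 'ON';
    ```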

  • GilaMonster (7/22/2008)


    Grant Fritchey


    You may also need to update statistics on smaller tables where large percentages of the data change regularly.

    Wouldn't the auto update take care of that?

    I've found that you need to nudge the auto-update along on occasion. Worse yet, one of my co-workers was able to demonstrate a place where the statistics got out of date and the random sampling didn't fix them. We had to do a full scan with a manual statistics update to get them back in line. This now runs automatically about once every two weeks for two tables in the database.

    Most of the time, the automatic statistics handle everything on their own.

    Grant Fritchey

  • Grant Fritchey (7/22/2008)


    GilaMonster (7/22/2008)


    Grant Fritchey


    You may also need to update statistics on smaller tables where large percentages of the data change regularly.

    Wouldn't the auto update take care of that?

    I've found that you need to nudge the auto-update along on occasion. Worse yet, one of my co-workers was able to demonstrate a place where the statistics got out of date and the random sampling didn't fix them. We had to do a full scan with a manual statistics update to get them back in line. This now runs automatically about once every two weeks for two tables in the database.

    Most of the time, the automatic statistics handle everything on their own.

    Hi Grant,

    please check this KB article:

    http://support.microsoft.com/kb/195565/en-us

    Does that explain the strange behavior?

  • Thanks. Great description of automatic statistics maintenance, although it's for SQL Server 2000. Here's the BOL entry for 2005. There are a few differences, but nothing major.

    I guess I wasn't clear in my post. We have auto update and auto create of statistics enabled. They work fine, most of the time. However, some tables' statistics get out of date despite the automatic updates. So we update them manually, on most systems once a week or less, during off-hours, on a weekend. A very few tables that we've found so far get so badly out of date that simply using the sampling mechanism, which randomly pulls pages from the index or table in question as described in the BOL entry above, doesn't work, and we've been forced to enable a full scan manually. Even these are only scheduled occasionally, and the automatic statistics maintenance does the job the rest of the time.

    As Gail has pointed out, statistics get updated as indexes are rebuilt. We also have an index maintenance routine that examines the fragmentation of the index and, based on that fragmentation level, either does a full rebuild or just a reindex.

    Grant Fritchey

  • Grant Fritchey (7/22/2008)


    However, some tables' statistics get out of date despite the automatic updates. So we update them manually, on most systems once a week or less, during off-hours, on a weekend. A very few tables that we've found so far get so badly out of date that simply using the sampling mechanism, which randomly pulls pages from the index or table in question as described in the BOL entry above, doesn't work, and we've been forced to enable a full scan manually.

    Interesting.

    I've got a bunch of tables that have to be updated manually: around 200 million rows, loading 2 million a day, clustered index on a datetime (date inserted).

    Since auto-update only kicks in if 20% of the table has changed, it would be 10 days between updates. It only takes 3 days for the stats to get far enough wrong that the optimiser thinks there are no rows for the current day when there are actually 2 million.
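    A rough sketch of the kind of scheduled manual update this situation calls for (the names are invented); run daily, it keeps the histogram's highest step near the current date:

    ```sql
    -- The ~20% auto-update threshold fires far too rarely for an
    -- ascending datetime key on a large table, so refresh the
    -- statistics on a daily schedule instead:
    UPDATE STATISTICS dbo.BigLoadTable IX_BigLoadTable_DateInserted
    WITH FULLSCAN;
    ```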

    Gail Shaw
  • Grant Fritchey (7/22/2008)


    Thanks. Great description of automatic statistics maintenance, although it's for SQL Server 2000. Here's the BOL entry for 2005. There are a few differences, but nothing major.

    I guess I wasn't clear in my post. We have auto update and auto create of statistics enabled. They work fine, most of the time. However, some tables' statistics get out of date despite the automatic updates. So we update them manually, on most systems once a week or less, during off-hours, on a weekend. A very few tables that we've found so far get so badly out of date that simply using the sampling mechanism, which randomly pulls pages from the index or table in question as described in the BOL entry above, doesn't work, and we've been forced to enable a full scan manually. Even these are only scheduled occasionally, and the automatic statistics maintenance does the job the rest of the time.

    We have experienced the exact same issue.

    The way to tell whether adequate sampling is being employed while auto-updating stats is to run the following:

    DBCC SHOW_STATISTICS ('tblName', 'idxName')

    Look at the Rows and Rows Sampled columns. If Rows Sampled is much less than Rows and you are dealing with a large table, then you've got a problem that only FULL_SCAN will fix.

    That has also solved many of our prod issues.
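    Putting the check and the fix together as a hedged sketch (table and index names are placeholders):

    ```sql
    -- STAT_HEADER returns one row whose "Rows" and "Rows Sampled"
    -- columns can be compared directly:
    DBCC SHOW_STATISTICS ('dbo.BigTable', 'IX_BigTable_Key')
    WITH STAT_HEADER;

    -- If Rows Sampled is a small fraction of Rows on a large,
    -- skewed table, force a full scan:
    UPDATE STATISTICS dbo.BigTable IX_BigTable_Key WITH FULLSCAN;
    ```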

    __________________________________________________________________________________
    SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
    Persisting SQL Server Index-Usage Statistics with MERGE
    Turbocharge Your Database Maintenance With Service Broker: Part 2

  • Possibly more accurate to say, "If Rows Sampled is much less than Rows and you are dealing with a large table and you have a non-uniform data distribution, then you've got a problem that only FULL_SCAN will fix"

    I've got a couple of 100-million-plus-row tables that are fine with sampled statistics, because the data distribution in those indexes is uniform and the sampling gives quite accurate results.

    Take a look at the histogram that Show_Statistics outputs. If you know that you have a non-uniform data distribution and the histogram doesn't reflect that, or misses rows entirely, then you may need a higher sample rate or a full scan.
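    A short sketch of that inspection (the names are made up):

    ```sql
    -- Show only the histogram steps for one statistic:
    DBCC SHOW_STATISTICS ('dbo.BigTable', 'IX_BigTable_Key')
    WITH HISTOGRAM;

    -- If the histogram misses known skew, try a higher sample rate
    -- before resorting to a full scan:
    UPDATE STATISTICS dbo.BigTable IX_BigTable_Key
    WITH SAMPLE 50 PERCENT;
    ```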

    Gail Shaw
  • GilaMonster (8/1/2008)


    Possibly more accurate to say, "If Rows Sampled is much less than Rows and you are dealing with a large table and you have a non-uniform data distribution, then you've got a problem that only FULL_SCAN will fix"

    I've got a couple of 100-million-plus-row tables that are fine with sampled statistics, because the data distribution in those indexes is uniform and the sampling gives quite accurate results.

    Take a look at the histogram that Show_Statistics outputs. If you know that you have a non-uniform data distribution and the histogram doesn't reflect that, or misses rows entirely, then you may need a higher sample rate or a full scan.

    Good point, dealing with a skewed data distribution is key to this.


  • And that's why we've only had to do the FULL SCAN in a few places and on a somewhat irregular basis. Most indexes, most of the time, if they were designed properly, have an even enough distribution for the sampling mechanism to work very well.

    Grant Fritchey

  • Grant Fritchey (8/2/2008)


    And that's why we've only had to do the FULL SCAN in a few places and on a somewhat irregular basis. Most indexes, most of the time, if they were designed properly, have an even enough distribution for the sampling mechanism to work very well.

    So is this a question of design? I must admit I've never thought about it this way.

    Are there any strategies to minimize the possibility of uneven data distributions through proper indexing? Now that would be an interesting article for SSC.

    FULL SCAN, though it works to alleviate production issues, is a maintenance nightmare. One cannot run it on the whole database all the time because it will block other processes running on the instance. Instead, a few tables are usually chosen arbitrarily. It's messy, it's dirty, and I don't like it. If there is a way to minimize its use through proper indexing, I want to know about it. 🙂

