Pro-Active SQL Monitoring

  • Hi

    We have a clustered environment and the important database is mirrored. It is a highly transactional database. We also have a RO (Read Only) environment which is also clustered and use transactional replication to move data from Primary to RO's. With respect to availability, we are doing great. What I need pointers is, what other things I can do for Pro-Active monitoring and to show that we are doing a great job in system availability now and down the road, especially for management.

    We also have alerts for job failures, mirroring issues, replication issues etc

    Some things I can think and need help with is:

    1. Reindexing online where thats not happening now (we cannot afford more than 10 or 15 min downtime). So, how can we do this for about 250GB db?

    2. Capture performance counters, but which one are more important for pro-active monitoring?

    3. Estimating DB size growth over a period so we increase size accordingly

    4. Server uptime over a period, so we show reliability and uptime over a period

    Any others you guys can suggest?

    What else

  • There is one observation on your setup configuration.

    As failover clustering and db mirroring are present on the same server there be a clash when by any chance failover occurs.

    Cluster failover period is longer than database mirroring time out value. So when failover happens, db mirroring will assume principal server has failed. You need to increase the db mirroring time out value accordingly to prevent it. Hope this is addressed already.

    M&M

  • Some things I can think and need help with is:

    1. Reindexing online where thats not happening now (we cannot afford more than 10 or 15 min downtime). So, how can we do this for about 250GB db?

    --I do two things dont reindex everything, find out which tables have more then 100 pages and only work on those. now you need to set the fill factor right to eliminate as much page splits as possible this will cut down on the number of time you need to rebuild. One big thing to keep in mind since you are on a SAN Reindexing does not have a very big effect, this is not true for a DAC but with a SAN you have SAN cache, and you generally will have many more spindels that can spread the load. PLease read the following article...lincha did some interesting testing on SAN and indexes....I have found his testing to be pretty much true in my environmants

    .http://sqlblog.com/blogs/linchi_shea/archive/2008/12/07/performance-impact-file-fragmentation-and-san.aspx

    2. Capture performance counters, but which one are more important for pro-active monitoring? I try and capture 5 things CPU, memory(page life exptancy, DISK reads and DISK writes for both logs and databases. then I do this about once a month so you can trend

    3. Estimating DB size growth over a period so we increase size accordingly

    4. Server uptime over a period, so we show reliability and uptime over a period

    Any others you guys can suggest?

    you need to track down your wait statistics...best place to pre act....keep your wait stats to a minimum adn you are ahead of the game

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply