Database Health

  • Comments posted to this topic are about the item Database Health

  • There are some members of staff in my team at work, who are extremely untrusting.

    Every morning, I will query the database for jobs which have not completed and are not switched off. Everything else, therefore must still be waiting, running or have failed.

    However, the others view the entire table to include a list of the items which have succeeded.

    I guess it's like a comfort blanket - "I'm doing well, because look at everything that works!"

  • In my environment we have a whole raft of health check reports and alerts

    • Disk space
    • Tables nearing the limit of their identity range
    • Failed jobs
    • Databases that will need to grow. We need to plan for expanding the database, not just let it happen willy nilly
    • Error logs that need cycling
    • Index fragmentation
    • Alerts for application performance such as "No customers bought in the past 15 minutes"
    • All replication jobs notify the DBAs if they fail
    • Batch load summary jobs - i.e. make sure that even if a batch load worked it has what you expect in it

    It is basically identifying anything that would require a peremptive action from the back room boys to keep things running seamlessly for the rest of the business.

  • Hi,

    I totally agree, I do a monthly management report for my bosses and this contains stuff like;

    The main jobs and their times

    Size of databases

    Counts of core data such as no of companys etc

    Average growth of main tables

    etc...

    All on nice color charts I might add

    I then get the more granular system based reports from my team..cpu, disk use etc

    The business doesn't care about cpu...but they do like a colourful chart 🙂

    We are trying to get all these fancy charts moved over to something a little more automated...so we are looking at Report server.

    Graeme

  • I think we seem to all be covering the same checks whcih is a releif!!

    I used to work in an environment with approx 400 SQL servers

    we used to have a job run centrally each morning to provide us with a central list of issues, rather than one page per server.

    we would run the following checks against each server and report back to a central table which could be read via some simple asp.net web pages with ordering and sorting so we could see all of the data but order it according to how we want to look at it (ie. disk space in lowest to highest order)

    -DBCC SQLPERF(logspace) - looking for databases in wrong recovery mode

    -XP_fixeddrives

    -databases than have not been backed up in the last 7 days (i.e new databases)

    -SQL agent state (runing, stopped)

    -Sql service uptime

    -Failed jobs in the last 3 days ( in case of weekend failures) and the last 3 days history of that job as a drill down so you can check if its a recurring failure.

    -database status (single user etc)

    -database create date

    -database names not in the known list

    -SQL agent jobs that have a last run date >2 weeks ago or are disabled.

    the checks ran from a single sql 2005 instance using a CLR assembly that i wrote that could execute and query against any server without using linked servers.

    I can post the CLR and Sample Database/ web pages if anyone is interested.

    MVDBA

  • 400?! What did your company do?!

  • They were in the leisure industry (casinos and the like) - with servers at each of the sites, plus servers to control electronic gaming and fruit machines.

    400 doesn't count the the 3000+ MSDE instances installed at reception and information portals

    MVDBA

  • We do monitoring on ~125 SQL Servers. We unfortunately do not have access to them unless we call the site and connect to their system through a VPN. This disconnected state causes us to have to pull a ton of information on a monthly basis and do some trending.

    We gather the MSInfo32 and SrvInfo information into a flat file. We also add the SQL Server configuraiton, Job setup, Job history, SQL Server Error Logs, and misc other stuff

    we do also pull stats from the individual databases to trend the growht and processing.

  • Hmmm, not a lot of responses. Guess lots of people don't do this, don't care, or are watching the markets.

    Fisher-Price interfaces are good for management. Big buttons, lots of color.

  • Steve, I agree with your editorial's statement that managers are probably not interested in CPU usage, database sizes, disk space remaining, etc. That information is important to DBAs but no one else. But some of these things can be turned around into something that DOES (or at least should) mean something to them. They often require some trend analysis and other supporting data to produce. For example:

    -- Applications (not servers) needing additional disk space within the next x months (affects budgets)

    -- Applications needing additional tuning or upgrades (due to CPU utilization, response times, downtime, etc. exceeding thresholds)

    Other things that ought to go up to management regularly are fairly static but could, with a little work, be automated. For example:

    -- Critical applications having a database that is a single point of failure. If this is too technical (I hope not!) it could be expressed as risk of critical applications experiencing unscheduled downtime (high, medium, low). Expressing it this way could also let you consider other factors besides single points of failure.

    It would be nice if these kind of reports could look at applications as a whole (web servers, application servers, network throughput, etc. as well as database servers) but that's beyond the communication and management abilities of any large company I've ever worked for.

    This kind of information would have to be distilled even further before getting up to the executive level (pretty charts etc., as you said) but managers just below the executive level could really use this, in my opinion. It's all about getting rid of the tech-speak and putting the data we should already have into a format that means something to them.

  • I'm interested in seeing the CLR assembly that you have built. I currently poll all my sql server instances and send email reports to myself. I'm looking for a way to incorporate all of this into one instance that will run and give me all the information that I need. I would love insight into how you created your CLR assembly to do your reporting for you. Thanks for posting.

  • Ok,

    I've had a couple of private messages for the CLR - i'll ask Steve Jones Nicely if he can host the .net code and also scripts to implement it for those of you who don't have visual studio. - hopefully then post back a link into the forum...

    It's a small assembly but the code is too much to post here

    MVDBA

  • That would be great...............if he won't post it I can provide an email address to send it to. Thanks for the reply. 😀

  • Ok guys,

    here are the basics attached to this post.

    one file contains the C# code in case you want to play with this - i.e change the authentication method etc - don't worry you won't need this as the assembly binary code is included in the other script

    the other file contains a single script with all of the components in you need

    here are the basics of how you use it and whats in the script - follow the script in sequence and alter to suit your own requirements

    1) create database monitoring

    2) create table of servers

    3) populate table of servers with some servers on your site.

    4) add the assembly to the database

    5) create a stored proc to access the assembly

    6) create a stored proc to loop through your server list and perform the relevant proc calls

    in the example i have included a proc that goes through your server list and gets the data on disk free space... feel free to try it with any other queries

    a few notes on the rotator2 assembly

    1) it uses windows authentication

    2) it returns 2 additional columns 2 your query (at the begining) Servername and status

    3) it can only handle varchar(4000) as the max query text length

    4) returns 0 if there is an error

    a few notes on the proc usp_fixeddrives

    1) if the rotator proc fails and returns 0 it inserts a record indicating the conection has failed.

    2) creates and truncates the results table (tbl xp_fixeddrives) every time it is run (you could alter this if you want)

    3) uses a cursor to call the rotator proc once for each server and return results back to centralised table

    4) results are accessible using Select * from tbl_xp_fixeddrives - which can be easily emailed, or

    I've just knocked all this together in a rush, so apologies if there are any mistakes. If there are any errors post them back here and i'll attempt to correct. - if you can't figure it out.

    Also - feel free to use and modify this code for your own purposes. if anyone wishes to tart it up and submit it as an article or featured script then feel free - i won't be offended or accuse you of plagurism

    Hope this is helpfull.

    Mike V

    MVDBA

  • Mike has contacted me and should be sending me some code. Not sure if it's more than attached above, but I'll get it posted and drop a note in here.

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply