Where are the good Senior Level DBA's?

  • BWAAA-HAAAA!!!! I found a simple and easy way to handle all of this. I wake up in the morning hating everyone and make exceptions from there. 😛 By the 3rd cup of coffee, the list has usually dwindled to only repeat offenders. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • tgarland (5/24/2012)


    So assigning blame and making sure it's handled correctly is paramount in a big company.

    This may be one of the scariest statements I have seen in a while. I would hate to work for a company that fosters this kind of attitude. I work for a big company, and we focus on fixing the issue; if we can educate someone in the process, then it was a success. I feel that companies that play the blame game promote a culture where individuals feel that they cannot make mistakes, which leads them to try to hide their mistakes. By focusing on fixing the issue and on education, we promote an open culture where people freely admit mistakes because they are not worried about getting blamed.

    I said "pointing finger in a constructive way" for a very good reason. If it's constructive, then they don't hide and they don't get shouted at, but they do have to take responsibility for what happened and make sure it doesn't happen again.

    You said you try to educate someone when you can. Here we do the same thing, but we do it every single time to save time.

    We learned that just fixing things or "covering" for others is a poisoned gift: it helps them in the short term but does nothing for the long term.

  • Lynn Pettis (5/24/2012)


    The mistake can come from poor development or from using the application the wrong way.

    In my experience, the mistakes come from the way the application was developed: they simply didn't plan for, or didn't have the skill to build, an application that can handle the kind of load it was sold to handle (it doesn't matter whether the problem comes from the developers or from the salesperson selling the application).

    In some very rare cases the business is pushing the application beyond the limits specified in the SLA/contract.

    This part scares me too. You pay for a 3rd party application to accomplish a task, and it accomplishes this task. Then you use it in a way that was not anticipated by the vendor and now you are telling the vendor they didn't know what they were doing? That is arrogance. Most times, what I have found is that the users push the system to the limits, and when it doesn't do exactly what they want, they find ways around the system. This usually works for a while but it usually catches up with them and the system starts to fail. Are you still going to blame the vendor and their developers for not anticipating what the users might do?

    Everyone seems to understand the opposite of what I say; I guess I suck at English :-).

    We blame the application if it doesn't do its task.

    We blame the user if he makes the application do things it's not supposed to do (including overloading the application).

    We blame ourselves if the server is not doing what it's supposed to do.

    We don't blame people who are not responsible for the issue at hand.

    We don't send them to the corner if they've been bad.

    We only shout at them if they are total *** and there is no other way to get the message across.

  • Oliiii (5/25/2012)


    Everyone seems to understand the opposite of what I say; I guess I suck at English :-).

    We blame the application if it doesn't do its task.

    We blame the user if he makes the application do things it's not supposed to do (including overloading the application).

    We blame ourselves if the server is not doing what it's supposed to do.

    We don't blame people who are not responsible for the issue at hand.

    We don't send them to the corner if they've been bad.

    We only shout at them if they are total *** and there is no other way to get the message across.

    Here is the problem:

    We blame the application if it doesn't do its task.

    We blame the user if he makes the application do things it's not supposed to do (including overloading the application).

    We blame ourselves if the server is not doing what it's supposed to do.

    You are playing a blame game. Let's blame the sys admin for a hard drive failure, or the network admin for a switch failing. Let's blame the users for overloading the system when the business has changed processes or workload has simply increased. Let's blame the developer because the system doesn't do what the user wants, but does what they requested.

    Things happen, and you can't always blame someone or something.

    When you have a failure, you do a root cause analysis, not to assess blame, but to determine what happened, why it happened (not assigning blame), and what needs to be done to keep it from occurring again.

  • Lynn Pettis (5/25/2012)


    Here is the problem:

    We blame the application if it doesn't do its task.

    We blame the user if he makes the application do things it's not supposed to do (including overloading the application).

    We blame ourselves if the server is not doing what it's supposed to do.

    You are playing a blame game. Let's blame the sys admin for a hard drive failure, or the network admin for a switch failing. Let's blame the users for overloading the system when the business has changed processes or workload has simply increased. Let's blame the developer because the system doesn't do what the user wants, but does what they requested.

    Things happen, and you can't always blame someone or something.

    When you have a failure, you do a root cause analysis, not to assess blame, but to determine what happened, why it happened (not assigning blame), and what needs to be done to keep it from occurring again.

    Ah, here is why you don't seem to get what I'm saying 🙂

    I'm not talking about any failure or incident. I'm only talking about mistakes.

    When things go right (a failure is handled correctly and so on), it doesn't reach us. When something reaches us, that means something went wrong and wasn't handled correctly (in other words, someone made a mistake).

    So we don't really care if a disk crashes; that happens all the time. We care if it crashes and it was not protected by RAID when it should have been (specifically requested or by default). In that case it is a mistake by the storage team, so they are the ones to be blamed for the issue.

    The result of that blame might not be some guy being shouted at by his manager. It might simply end with the team being reinforced by two more people, because their management found out they made the mistake because they were simply overworked.

    With the number of applications we have for SQL Server alone, even if an application has an average of 1 big issue every 10 years (that's being massively optimistic), that's still over 2 per day for us. So I might seem a bit hard on some issues or ways of working, but once you start managing a lot of servers and databases, there are things you can't do anymore and everyone needs to do their work correctly.

    By applying that way of working to our own team first, we managed to get from hundreds of incidents a week to only 1 or 2, and we expect to reach 1 or 2 a month by the end of the year.

    By starting to assign blame to the right team, we went from having a bunch of monkeys in the staging department to a system that can stage physical machines and VMs in minutes.

    That may not seem incredible to some, but heck, a year ago they gave us a few clusters with one Windows 2008 node and one Windows 2003 node, and a few months ago we were happy if we had to spend less than a day fixing configuration issues on brand new clusters. So by continuing to point fingers in the same direction for valid reasons, they finally got the budget, training and manpower required to deliver a good quality service...

    We explain things to people as nicely as possible and as often as we can, but that doesn't work for everyone and is not always possible. Assigning blame to a team can be done by just letting them know something should have been done differently and letting the business know exactly what happened without trying to cover things up.

    We are seeing very good and very real improvement only because we started to blame the right people.

    Every time we assigned blame, it ended with an overall improvement. We may have bruised a few egos, but things always end up better for everyone.

  • The kind of "blame"/"responsibility" being proposed here is the healthy kind. Find out who messed up and why, and educate them on the right way to do it. That's a good thing.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Oliiii (5/25/2012)


    Ah, here is why you don't seem to get what I'm saying 🙂

    I'm not talking about any failure or incident. I'm only talking about mistakes.

    When things go right (a failure is handled correctly and so on), it doesn't reach us. When something reaches us, that means something went wrong and wasn't handled correctly (in other words, someone made a mistake).

    So we don't really care if a disk crashes; that happens all the time. We care if it crashes and it was not protected by RAID when it should have been (specifically requested or by default). In that case it is a mistake by the storage team, so they are the ones to be blamed for the issue.

    The result of that blame might not be some guy being shouted at by his manager. It might simply end with the team being reinforced by two more people, because their management found out they made the mistake because they were simply overworked.

    With the number of applications we have for SQL Server alone, even if an application has an average of 1 big issue every 10 years (that's being massively optimistic), that's still over 2 per day for us. So I might seem a bit hard on some issues or ways of working, but once you start managing a lot of servers and databases, there are things you can't do anymore and everyone needs to do their work correctly.

    By applying that way of working to our own team first, we managed to get from hundreds of incidents a week to only 1 or 2, and we expect to reach 1 or 2 a month by the end of the year.

    By starting to assign blame to the right team, we went from having a bunch of monkeys in the staging department to a system that can stage physical machines and VMs in minutes.

    That may not seem incredible to some, but heck, a year ago they gave us a few clusters with one Windows 2008 node and one Windows 2003 node, and a few months ago we were happy if we had to spend less than a day fixing configuration issues on brand new clusters. So by continuing to point fingers in the same direction for valid reasons, they finally got the budget, training and manpower required to deliver a good quality service...

    We explain things to people as nicely as possible and as often as we can, but that doesn't work for everyone and is not always possible. Assigning blame to a team can be done by just letting them know something should have been done differently and letting the business know exactly what happened without trying to cover things up.

    We are seeing very good and very real improvement only because we started to blame the right people.

    Every time we assigned blame, it ended with an overall improvement. We may have bruised a few egos, but things always end up better for everyone.

    I get what you are saying, I just disagree with how you are saying it. I personally have a problem with finding who is "to blame" for a problem/mistake/error. There is too much negative connotation with the phrase "to blame" because it is used more often than not when trying to deflect responsibility.

    I don't care how nicely you may put it, but once you use the phrase "you are to blame" with me, I am already on the defensive, even when I know I may have made the mistake or error.

    I am all for identifying the who, what, why, where, how of a problem, error or mistake, and for identifying policies or procedures that will prevent or mitigate such problems, errors or mistakes from happening again. It is necessary for improvement of individuals, groups, and organizations.

    My suggestion: move away from using the phrase "to blame." In my opinion, this phrase simply carries too much of a negative connotation.

  • I get what you are saying, I just disagree with how you are saying it. I personally have a problem with finding who is "to blame" for a problem/mistake/error. There is too much negative connotation with the phrase "to blame" because it is used more often than not when trying to deflect responsibility.

    I don't care how nicely you may put it, but once you use the phrase "you are to blame" with me, I am already on the defensive, even when I know I may have made the mistake or error.

    I am all for identifying the who, what, why, where, how of a problem, error or mistake, and for identifying policies or procedures that will prevent or mitigate such problems, errors or mistakes from happening again. It is necessary for improvement of individuals, groups, and organizations.

    My suggestion: move away from using the phrase "to blame." In my opinion, this phrase simply carries too much of a negative connotation.

    Very well said, Lynn.

  • I'm with you. Blame is absolutely a negative word.

    "Who is responsible" is nicer but still smacks of my mother being mad about something and demanding to know "who is responsible for this??"

    If staffed with mature individuals that know their responsibilities and accept consequences of actions in areas for which they are responsible, there can be good change.

    But "blame" sounds a lot like sitting around pointing fingers and shaming people. That's not an environment I could stay in long (been in them, left them). On the job I have now, we all work closely together and each of us is responsible for certain aspects. If something goes wrong, we each look through our own areas to determine the cause, rather than the blame.

    Happily, we each readily admit our errors when we find them, because we know that others have work intertwined with our own, impacts are rarely isolated to "just my stuff", and we don't have time to do anything but fix the problem. We don't have time or room for egos, and we accept that each of us is imperfect and will make mistakes.

    We do post-mortems later to see what we have learned by what went right, what went wrong, and what we had to abandon because we just couldn't make it happen correctly.

    It sounds like the Blame scenario doesn't get personal so maybe it's just how it was phrased. But based on some jobs I have had, the word Blame is absolutely negative to me.

  • herladygeekedness (5/25/2012)


    I'm with you. Blame is absolutely a negative word.

    "Who is responsible" is nicer but still smacks of my mother being mad about something and demanding to know "who is responsible for this??"

    I tend to agree here. The best way I've seen to handle this in the past is a review of "what went wrong" and "what could we have done to prevent this?" without any discussion of who missed something.

    Focus on the future, building better DR planning. If management doesn't ever single out people, only events, it becomes easier to accept mistakes and improve for the future.

  • Heh... I just call it a "post mortem review"... that way everyone is equally insulted. 😀

    --Jeff Moden



  • I always blame Chuck Norris when anything goes wrong. He is unstoppable.

    Chuck Norris was born on May 6th, 1945

    Nazi Germany surrendered on May 7th, 1945.

  • JamesMorrison (5/25/2012)


    I always blame Chuck Norris when anything goes wrong. He is unstoppable.

    Chuck Norris was born on May 6th, 1945

    Nazi Germany surrendered on May 7th, 1945.

    Nice joke, but he was born March 10, 1940.

  • Lynn Pettis (5/25/2012)


    Nice joke, but he was born March 10, 1940.

    That also means that he didn't start it. 😛

    --Jeff Moden



  • Chuck Norris can gargle peanut butter.

    Chuck Norris can slam a revolving door.

    Chuck Norris can design databases so amazing that they don't even need indexes.

Viewing 15 posts - 166 through 180 (of 187 total)
