Query tuning advice

  • Hi All,

    I need some direction on tuning a stored procedure. The proc contains multiple SQL statements and is ~500 lines of code, including a MERGE statement.

    What's happening is that, in production, this stored procedure is taking more than an hour.

    Note: the parameters passed to this stored proc change between calls.

    I asked the dev team to provide test cases for a GOOD run and a BAD run. I have attached those as a zip file, which also contains the actual execution plans for both runs.

    The good-run case took ~5 seconds.

    The bad-run case, which they claim takes more than 60 minutes, took around 4 minutes when I ran it in an SSMS query window.

    I'm not sure why there is such a big difference between 60 minutes and 4 minutes.

    I also want to understand why the bad run takes 4 minutes at all.

    The app team's expectation for this query is 60 seconds. Anything more than that and the application calls are TIMING OUT.

    Now they have come back to the DB team to ask why the query was taking more than an hour. I said it could be blocking during that time frame.

    It could also be a parameter sniffing issue; I am not sure and cannot say for certain. The app team wants to run a trace on prod.

    I am thinking about enabling a trace during that time frame, especially when we are seeing the issue.

    So, I'm looking for some help on the following:

    1. I need help to see if there is any difference in the plans based on different parameters.

    2. If I have to run a trace for live troubleshooting, which events should I select?

    Which columns should I select, and what filters should I keep, to minimize the overhead of the trace? I need to be able to capture blocking info, stored proc statements, any parameter sniffing issues, any missing indexes - all of that.

    3. Also, what questions or data should I be collecting from the user/dev team so that we can narrow down the issue and take it to a resolution or workaround?

    SQL Server version is Microsoft SQL Server 2017 (RTM-CU23).

    Thank you.

    Sam

  • First, not trace. Use Extended Events. It puts far, far less of a load on the system. Plus, you're on SQL Server 2017. Every single piece of new functionality is monitored through Extended Events. Use that.

    1. Overall, in terms of performance, it's hard to say what the issue is. For example, 43% of the estimated cost, in both plans, is the MERGE. The plans have the same hash value, so they're probably identical. Is it then bad parameter sniffing? Probably not, but it's hard to be 100% sure based on what you've provided. The two plans that each were 25% of the estimated cost tell the same story. At this point, we don't have enough data to pin down where the difference between the good and bad runs originates. Although, the bad run hit a lot more data. If you scroll down to the last query with 25% estimated cost, it read 104k rows in various scans, whereas the "good" run read zero. Funny enough, both queries ended up inserting no data, but one was reading a bunch and one was not. What's the difference between the two runs? On a guess, it's the data loaded into #EmailDomainRegionVariation. One run had a bunch of data there, the other didn't.
    2. RPC_COMPLETED, SP_STATEMENT_COMPLETED, and I would use causality tracking to get everything ordered neatly (there's a sketch of such a session just below this list). That's going to give you overall stored procedure performance and the individual statements, because what you really need is to focus on the statement giving you trouble, then figure out why. You could also look to the wait statistics to see why a given query is running slow.
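    For illustration, here's a minimal sketch of such a session. The session, database, and procedure names are placeholders, since the proc isn't named in this thread; look up the proc's object_id first with SELECT OBJECT_ID(N'dbo.YourProc') and substitute it into the sp_statement_completed filter.

        -- Hypothetical names throughout; adjust the database, proc, and file name.
        CREATE EVENT SESSION ProcTuning ON SERVER
        ADD EVENT sqlserver.rpc_completed
            (WHERE sqlserver.database_name = N'YourDatabase'
               AND object_name = N'YourProc'),
        ADD EVENT sqlserver.sp_statement_completed
            (WHERE sqlserver.database_name = N'YourDatabase'
               AND object_id = 123456789)  -- value from OBJECT_ID(N'dbo.YourProc')
        ADD TARGET package0.event_file (SET filename = N'ProcTuning')
        WITH (TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY = 5 SECONDS);

        ALTER EVENT SESSION ProcTuning ON SERVER STATE = START;

    The database and object filters keep the capture limited to the one procedure, and TRACK_CAUSALITY lets you group each statement back to the specific call that produced it.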

    Also, you have a bunch of tuning opportunities throughout. For example, this: AND TERR.GEO_KEY LIKE '%MSP%'. No statistics use or index use because of this. Nothing but table scans. Guaranteed performance issues. I didn't unpack all the queries, but it looks like a lot of repetition and possibly unnecessary data movement (so many scans that end up with one or zero rows). A careful breakdown will likely identify more tuning opportunities.

    One clarity issue: the queries are using the old-school join syntax that was superseded in the standard more than 30 years ago. What you have will work (as long as it's always inner joins), but it makes for less clear code and will cause issues as soon as you try to use outer joins. I'd strongly recommend changing that.
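    As a rough illustration (the table and column names here are made up, not taken from your proc), the difference is:

        -- Old-style comma join: the join condition is buried in the WHERE clause.
        SELECT  t.GEO_KEY, a.AccountName
        FROM    dbo.Territory t, dbo.Account a
        WHERE   a.TerritoryId = t.TerritoryId
          AND   t.GEO_KEY LIKE '%MSP%';

        -- ANSI join syntax: the join is explicit, and switching to an outer
        -- join later is a one-word change instead of a rewrite.
        SELECT  t.GEO_KEY, a.AccountName
        FROM    dbo.Territory AS t
                INNER JOIN dbo.Account AS a
                    ON a.TerritoryId = t.TerritoryId
        WHERE   t.GEO_KEY LIKE '%MSP%';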

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • If the MERGE statement is just updating or inserting, then change it to an UPDATE statement followed by an INSERT ... WHERE NOT EXISTS statement.
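    As a rough sketch of that pattern (the table and column names are made up, since the real tables aren't shown here), wrapped in a transaction to keep the MERGE-like atomicity:

        BEGIN TRANSACTION;

        -- Update rows that already exist in the target.
        UPDATE tgt
        SET    tgt.SomeValue = src.SomeValue
        FROM   dbo.TargetTable AS tgt
               INNER JOIN dbo.SourceTable AS src
                       ON src.KeyCol = tgt.KeyCol;

        -- Insert rows that don't exist in the target yet.
        INSERT INTO dbo.TargetTable (KeyCol, SomeValue)
        SELECT src.KeyCol, src.SomeValue
        FROM   dbo.SourceTable AS src
        WHERE  NOT EXISTS (SELECT 1
                           FROM   dbo.TargetTable AS tgt
                           WHERE  tgt.KeyCol = src.KeyCol);

        COMMIT TRANSACTION;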

     

  • Hi Grant Sir,

    Thanks for the pointers and for sharing the link on capturing waits. I will check with them about replacing the old join syntax with ANSI joins.

    A couple of questions:

    1. For conditions like <col> LIKE '%MSP%', how can we rewrite them or otherwise improve performance for such queries?
    2. I wanted to give OPTION(RECOMPILE) a try. Will it help?
    3. Is there a good way to track down or monitor slow-running queries within a stored procedure?

    Thanks,

    Sam

  • Hi Jonathan,

    Thanks for the input. I would like to know if there are any implications/performance issues of using a MERGE statement, especially when dealing with large sets of data?

     

  • vsamantha35 wrote:

    Hi Jonathan,

    Thanks for the input. I would like to know if there are any implications/performance issues of using a MERGE statement, especially when dealing with large sets of data?

     

    I've found that replacing a MERGE that does an update and an insert with an update followed by an insert makes it run in about 60% of the time it takes with the MERGE. This is on large datasets (100 million rows).

  • vsamantha35 wrote:

    Hi Grant Sir,

    Thanks for the pointers and for sharing the link on capturing waits. I will check with them about replacing the old join syntax with ANSI joins.

    A couple of questions:

    1. For conditions like <col> LIKE '%MSP%', how can we rewrite them or otherwise improve performance for such queries?
    2. I wanted to give OPTION(RECOMPILE) a try. Will it help?
    3. Is there a good way to track down or monitor slow-running queries within a stored procedure?

    Thanks,

    Sam

    1. If the data is 'something MSP something' and you MUST find the MSP value, then you're kind of stuck. However, if the data is 'MSP something', then eliminate the first wildcard. If you have to, add another column that says, yes, this is an 'MSP' row, assuming that's actually how the data is being filtered (there's a sketch of that approach just after this list). It all comes down to storing information in a way that is easily retrieved, not just easily stored. That's a mistake lots of people make when it comes to databases.
    2. There are no immediate indications of parameter sniffing, let alone bad parameter sniffing. So no, I don't think that will help. In fact, considering the number of statements in this procedure, it'll likely hurt, because you'll be using up CPU and other resources to recompile multiple statements, with the associated blocking. Nope. Utterly bad idea. Don't just poke at things. Gather metrics. Understand what they're telling you. Make structural and code changes in response to those metrics, not just on a whim.
    3. Yeah, Extended Events, like I said in the post above. RPC_COMPLETED shows the full proc call, with all statements. SP_STATEMENT_COMPLETED shows the individual statements. Use these two in combination with 'Track Causality' and you've got everything you need to easily group events by a given call. Filter for the OBJECT_ID and you'll only capture the calls to the one procedure, reducing the overhead of the capture.
    4. Yeah, you didn't have a #4. MERGE is notorious as a performance bottleneck. Remember, the execution plan for a MERGE statement gets compiled with the first call, regardless of whether that call results in an INSERT, UPDATE or DELETE through the MERGE. This means you may get an optimal plan for the UPDATE but a crappy one for the INSERT or DELETE, because the parameters passed aren't the same as they would be for those queries.
    5. Also, with a really big procedure like this, with lots of conditional statements, you may be better off making some of the queries into secondary procedures so that they can be compiled independently from the broader procedure. That's effectively the same issue as the MERGE statement.
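    One way to sketch the flag-column idea from point 1 (the table name and columns here are made up; the real schema isn't shown in the thread) is a persisted computed column, which makes the "is this an MSP row?" question indexable:

        -- Hypothetical table; the real GEO_KEY column lives in a table we haven't seen.
        ALTER TABLE dbo.Territory
            ADD IsMspRegion AS CASE WHEN GEO_KEY LIKE '%MSP%' THEN 1 ELSE 0 END PERSISTED;

        CREATE INDEX IX_Territory_IsMspRegion
            ON dbo.Territory (IsMspRegion)
            INCLUDE (GEO_KEY);

        -- The query can then filter on the flag instead of the leading-wildcard LIKE:
        -- AND TERR.IsMspRegion = 1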


  • Jonathan AC Roberts wrote:

    vsamantha35 wrote:

    Hi Jonathan, Thanks for the input. I would like to know if there are any implications/performance issues of using a MERGE statement, especially when dealing with large sets of data?

    I've found that replacing a MERGE that does an update and an insert with an update followed by an insert makes it run in about 60% of the time it takes with the MERGE. This is on large datasets (100 million rows).

    Absolute agreement by the way. MERGE is a compromise. Like many compromises, it's not as good as the things it's replacing.


  • From only a quick glance at the execution plans, this looks like it could be a parameter sniffing issue. You could go all surgical on this, but I'm thinking that building a recompile into the proc would solve your issue, especially considering that a recompile would take a whole lot less time than what you're currently suffering. If it doesn't work (but I think it might), it's easy enough to remove from the proc.
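    If you do test that, here's a minimal sketch of the two usual ways to add it (the procedure, parameter, and table names below are placeholders, not the real proc):

        -- Option A: recompile the whole procedure on every call.
        ALTER PROCEDURE dbo.YourProc
            @RegionKey varchar(20)
        WITH RECOMPILE
        AS
        BEGIN
            -- Option B (often cheaper on a big proc): add OPTION (RECOMPILE)
            -- only to the statement that is sensitive to the parameter values.
            SELECT t.GEO_KEY
            FROM   dbo.Territory AS t
            WHERE  t.GEO_KEY = @RegionKey
            OPTION (RECOMPILE);
        END;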

    As the saying goes, "One good test is worth a thousand expert opinions".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Many thanks Grant sir for the detailed explanation.

    Thanks Jeff and Jonathan for the feedback.

