Execution time increases dramatically when using variable in query

  • I have a question about a query im trying to execute.. The query looks like

    DECLARE @Limit DATETIME

    SET @Limit = '2007-11-02 14:00:00'

    SELECT

    a, b, c

    FROM

    tbl1 LEFT OUTER JOIN tbl2 LEFT OUTER JOIN tbl3, LEFT OUTER JOIN tbl4

    WHERE

    tbl1.TimeStamp > @Limit

    This query never finished (at least takes very long time to execute) BUT if I dont use the @Limit variable

    like this:

    SELECT

    a, b, c

    FROM

    tbl1 LEFT OUTER JOIN tbl2 LEFT OUTER JOIN tbl3, LEFT OUTER JOIN tbl4

    WHERE

    tbl1.TimeStamp > '2007-11-02 14:00:00'

    the query executes in less than a minute. tbl1.TimeStamp has datatype DATETIME.

    Does anyone know the reason that I`m not able to use a variable?

    Any help appreciated!

  • I'm not sure what you're running this in, but the fact that you're running 3 cross joins (meaning joins with NO join criteria) is going to very quickly return VERY large recordsets.

    If each of your tables had 100 records, the query (with no WHERE clause) would return 100 million rows. If they each had 1000 rows, that's a 1 Trillion row data set.

    Depending on what you're using to return those results, you will find that the query wasn't done running when it showed you some results.

    So - start by adding join criteria (how does table1 relate to table2/ relate to table3?). Then look at the indexing and make sure that it's right. You're going to need some index involving the date field on table1, as well as indexes on the primary and foreign keys.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hi Matt

    I have criterias on the joins, but i did'nt write them in the post, sorry for that. Even though I have criterias its still a very large dataset!

    Why do the indexing matter, All I do to make it work is to type the variable value directly into the query instead of using the variable?

  • THAT doesn't make much sense. you should find that both of them should be the same execution plan.

    Did you run DBCC Freeproccache between the two executions? I have a feeling the "second" run reused some of the stats from the first.

    the indexing would make a lot of difference, since it would get it to the data much faster (in both scenarios).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Get execution plan for each of them. Compare them. If execution plans are very different, check when statistic was last updated for all indexes for all these tables. If not recently, update them.

  • Try this

    DECLARE @Limit DATETIME

    SET @Limit = '2007-11-02 14:00:00'

    SELECT

    a, b, c

    FROM

    tbl1 LEFT OUTER JOIN tbl2 LEFT OUTER JOIN tbl3, LEFT OUTER JOIN tbl4

    WHERE

    tbl1.TimeStamp > (SELECT @Limit)

    I had the same problem once and this small change works well :hehe:

  • If that last suggestion works, it would be amazing to me. It shouldn't matter, other than forcing a new execution plan. But if the optimum plan is chosen, then it should be the same as including the (SELECT).

  • Classic parameter sniffing problem.

    When you use the value in the query (or a parameter), the optimiser, when it compiles the query, knows the value used in the condition, and can make appropriate decisions based on the number of rows that will be affected (based on the statistics on the table)

    When you use a variable, the optimiser cannot see the value and has to make a guess as to how much of the table will be affected by the query. It will assume 30%, and will select a plan based on that.

    If you're using a variable, and the 30% os very far off the actual rows, you may end up with a very sub-optimal execution plan, and hence very long execution times.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • do you have any common field between the tables ?

  • Yes, I have some common fields, and these are used in the join clause, so the result dataset is as small as possible.

    Is there some solution to this or is the only way not to use a variable? Can I make the optimizer read the value instead of guessing?

    I`m quite new to SQL-server and really appreciate your help, thanks.

  • You can try the optimise for query hint, if there's a specific value that you usually use. Problem with that is if you pass in other values with vastly different row counts, you'll get poor performance again.

    SELECT ... FROM ...

    WHERE SomeColumn > @YourVariableName

    OPTION (OPTIMIZE FOR (@YourVariableName = 'Common value')

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The two most likely possible issues here are as mentioned:

    Parameter sniffing, which will result in different execution plans. (can we see them, or have you at least checked them?)

    Caching of data. have you repeated this test, changed order etc. To ensure this is the only influencing factor?

    I ask because there are many factors, one of which is blocking. Also consider profiling the process to see what it is getting hungup on.

  • Gail Shaw's reply tells you why you've seen the performance difference and others have given you clues about what you can do about it.

    Adding to these, here's my suggestion.

    Look at the Execution plans for the query when run with/without the parameter and try to understand the differences. I've seen similar situations and the key difference has been the choice of Index used to look up the required rows. If this is also your problem, you can use an Index-hint to force the optimizer to adopt a more efficient execution plan. This may work for you, but beware that it comes with it's own baggage.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply