How to optimize this query

  • I have this query where each select runs very fast, approx 1 to 2 seconds on a table with almost a million record but when combined, it takes forever.

    select answered = sum(CASE WHEN

    sf.answerNumeric IS NULL AND (sf.answerText IS NULL

    OR rtrim(sf.answerText) = '') THEN 0 ELSE 1 END)

    FROM surveyFilled sf

    Where sf.surveyInstanceID = 4153 and sf.questionID in (select svr.questionID

    from svCategoryQuestionRisk svr

    where svr.categoryID in

    (select cc.ID from vw_categoryColsQuestionLORAll cc where cc.ID in(

    SELECT distinct fl.categoryID

    FROM vw_surveyCategoryHierarchyFlat fl

    WHERE ( fl.fullPath like '%>67>%' or fl.fullPath like '%>67' )

    )

    and cc.type = 'Category' and cc.surveyInstanceID = 4153) and svr.surveyVersionID = 5)

  • please post full DDL of all involved objects ( tables, views, indexes )

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Going to need more information, such as the execution plans.

    Check out the second link down on the left in my signature, it'll walk you through what we'll need to effectively help you.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • As a first try you could try replacing the subqueries with joins. Also verify the right indexes are in place.

    Jayanth Kurup[/url]

  • This is what your query looks like when formatted, but seems like an extra parentesis is at the end:

    select answered = sum(

    CASE WHEN

    sf.answerNumeric IS NULL AND (sf.answerText IS NULL OR rtrim(sf.answerText) = '')

    THEN 0 ELSE 1 END)

    FROM surveyFilled sf

    Where sf.surveyInstanceID = 4153

    and sf.questionID in

    (

    select svr.questionID

    from svCategoryQuestionRisk svr

    where svr.categoryID in

    (

    select cc.ID

    from vw_categoryColsQuestionLORAll cc

    where cc.ID in

    (

    SELECT distinct fl.categoryID

    FROM vw_surveyCategoryHierarchyFlat fl

    WHERE ( fl.fullPath like '%>67>%' or fl.fullPath like '%>67')

    )

    and cc.type = 'Category' and cc.surveyInstanceID = 4153

    )

    and svr.surveyVersionID = 5)

    So, yeah, that's a bunch of nested queries.

    Without being able to test, looks equivalent to this:

    select answered = sum(

    CASE WHEN

    sf.answerNumeric IS NULL AND (sf.answerText IS NULL OR rtrim(sf.answerText) = '')

    THEN 0 ELSE 1 END)

    FROM surveyFilled sf

    inner join svCategoryQuestionRisk svr on sf.questionID = svr.questionID and sf.surveyInstanceID = 4153

    inner join vw_categoryColsQuestionLORAll cc on svr.categoryID = cc.ID and svr.surveyVersionID = 5

    inner join vw_surveyCategoryHierarchyFlat fl on cc.ID = fl.categoryID

    and ( fl.fullPath like '%>67>%' or fl.fullPath like '%>67')

    and cc.type = 'Category'

    and cc.surveyInstanceID = 4153

    So give that a shot see if that performs a bit better. The indexes on the search fields and FKs will still be more important.

    ______________________________________________________________________________
    How I want a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.

  • Jayanth_Kurup (6/21/2011)


    As a first try you could try replacing the subqueries with joins.

    http://sqlinthewild.co.za/index.php/2010/01/12/in-vs-inner-join/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sorry folks for the incomplete information:

    Designs:

    svCategoryRisk

    Column name Data Type Allow Nulls

    surveyVersionIDint Unchecked

    categoryID int Unchecked

    questionID int Unchecked

    questionOrderint Unchecked

    riskFormula varchar(MAX)Unchecked

    dateUpdateddatetime Unchecked

    surveyFilled

    surveyInstanceID intUnchecked

    questionID intUnchecked

    answerNumeric floatChecked

    answerText nvarchar(255)Checked

    answeredDate datetimeUnchecked

    answeredBy_luserID intChecked

    levelOfRisk floatChecked

    vw_categoryColsQuestionLORAll is a view with design:

    SELECT a.surveyInstanceID, a.surveyVersionID, a.ID, a.type, COALESCE (q.text, c.title) AS name, a.[level], CAST(a.totalRisk AS numeric(15, 2)) AS totalRisk,

    a.questions, a.answered, CAST(a.totalRisk / a.answered AS numeric(15, 2)) AS lor, GETDATE() AS dateUpdated

    FROM (SELECT surveyInstanceID, surveyVersionID, questionID AS ID, 'Question' AS type, 1 AS typeID,

    CASE WHEN lvl1ID = categoryID THEN 2 WHEN lvl2ID = categoryID THEN 3 WHEN lvl3ID = categoryID THEN 4 WHEN lvl4ID = categoryID THEN

    5 WHEN lvl5ID = categoryID THEN 6 WHEN lvl6ID = categoryID THEN 7 ELSE NULL END AS [level], levelOfRisk AS totalRisk,

    1 AS questions, CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl0ID AS ID, 'Category' AS type, 2 AS typeID, 0 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl0ID

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl1ID AS ID, 'Category' AS type, 2 AS typeID, 1 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl1ID

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl2ID AS ID, 'Category' AS type, 2 AS typeID, 2 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl2ID

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl3ID AS ID, 'Category' AS type, 2 AS typeID, 3 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl3ID

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl4ID AS ID, 'Category' AS type, 2 AS typeID, 4 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl4ID

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl5ID AS ID, 'Category' AS type, 2 AS typeID, 5 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl5ID

    UNION

    SELECT surveyInstanceID, surveyVersionID, lvl6ID AS ID, 'Category' AS type, 2 AS typeID, 6 AS [level], SUM(levelOfRisk) AS totalRisk,

    COUNT(questionID) AS questions, SUM(CASE WHEN NOT levelOfRisk IS NULL THEN 1 ELSE 0 END) AS answered

    FROM dbo.vw_categoryColsQuestionLOR

    GROUP BY surveyInstanceID, surveyVersionID, lvl6ID) AS a LEFT OUTER JOIN

    dbo.question AS q ON a.type = 'Question' AND a.ID = q.questionID LEFT OUTER JOIN

    dbo.category AS c ON a.type = 'Category' AND a.ID = c.categoryID

    WHERE (NOT (a.ID IS NULL))

    vw_surveyCategoryHierarchyFlat is a view with design:

    SELECT DISTINCT

    sv.surveyVersionID, sv.surveyVersion, l0.childID AS categoryID, CASE WHEN NOT l9.parentID IS NULL

    THEN 1 ELSE 0 END + CASE WHEN NOT l8.parentID IS NULL THEN 1 ELSE 0 END + CASE WHEN NOT l7.parentID IS NULL

    THEN 1 ELSE 0 END + CASE WHEN NOT l6.parentID IS NULL THEN 1 ELSE 0 END + CASE WHEN NOT l5.parentID IS NULL

    THEN 1 ELSE 0 END + CASE WHEN NOT l4.parentID IS NULL THEN 1 ELSE 0 END + CASE WHEN NOT l3.parentID IS NULL

    THEN 1 ELSE 0 END + CASE WHEN NOT l2.parentID IS NULL THEN 1 ELSE 0 END + CASE WHEN NOT l1.parentID IS NULL

    THEN 1 ELSE 0 END + CASE WHEN NOT l0.parentID IS NULL THEN 1 ELSE 0 END + 1 AS [level], COALESCE (CAST(l9.parentID AS VARCHAR) + '>', '')

    + COALESCE (CAST(l8.parentID AS VARCHAR) + '>', '') + COALESCE (CAST(l7.parentID AS VARCHAR) + '>', '')

    + COALESCE (CAST(l6.parentID AS VARCHAR) + '>', '') + COALESCE (CAST(l5.parentID AS VARCHAR) + '>', '')

    + COALESCE (CAST(l4.parentID AS VARCHAR) + '>', '') + COALESCE (CAST(l3.parentID AS VARCHAR) + '>', '')

    + COALESCE (CAST(l2.parentID AS VARCHAR) + '>', '') + COALESCE (CAST(l1.parentID AS VARCHAR) + '>', '')

    + COALESCE (CAST(l0.parentID AS VARCHAR) + '>', '') + CAST(l0.childID AS VARCHAR) AS fullPath, COALESCE (CAST(l9.parentName AS VARCHAR) + '>',

    '') + COALESCE (CAST(l8.parentName AS VARCHAR) + '>', '') + COALESCE (CAST(l7.parentName AS VARCHAR) + '>', '')

    + COALESCE (CAST(l6.parentName AS VARCHAR) + '>', '') + COALESCE (CAST(l5.parentName AS VARCHAR) + '>', '')

    + COALESCE (CAST(l4.parentName AS VARCHAR) + '>', '') + COALESCE (CAST(l3.parentName AS VARCHAR) + '>', '')

    + COALESCE (CAST(l2.parentName AS VARCHAR) + '>', '') + COALESCE (CAST(l1.parentName AS VARCHAR) + '>', '')

    + COALESCE (CAST(l0.parentName AS VARCHAR) + '>', '') + CAST(l0.childName AS VARCHAR) AS fullName

    FROM dbo.category AS c INNER JOIN

    dbo.vw_surveyCategoryHierarchy AS l0 ON c.categoryID = l0.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l1 ON l0.parentID = l1.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l2 ON l1.parentID = l2.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l3 ON l2.parentID = l3.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l4 ON l3.parentID = l4.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l5 ON l4.parentID = l5.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l6 ON l5.parentID = l6.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l7 ON l6.parentID = l7.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l8 ON l7.parentID = l8.childID LEFT OUTER JOIN

    dbo.vw_surveyCategoryHierarchy AS l9 ON l8.parentID = l9.childID LEFT OUTER JOIN

    dbo.vw_surveyVersion AS sv ON l0.surveyVersionID = sv.surveyVersionID

    Thanks

  • The query performs a lot faster but it does not give the right output. I am going to try to work on it myself. I have also attached some information about the tables that I am using in the query.

    Thanks a lot.

  • 3 IN clauses deep with a DISTINCT using LIKE leading percent OR WHERE clause. Nice. 😛

    I didn't see any variables here so it is fixed query, right? Perhaps once you find the optimal plan you can force it with plan guide.

    If variable inputs required, will almost certainly need to disassemble this into one or perhaps 2 interim steps to give the optimizer a decent chance at getting the optimal plan for given input.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I've seen this sort of thing fairly often. A query that does a lot of searching being done in one fell swoop takes minutes. Breaking it out into temp tables (not table variables) takes seconds. If this is happening then you're probably getting intermediate record sets written to TempDB anyhow.

    Just break it down so you get the smallest record set into a temp table and then use that from there on.

    Kevin has it right - divide and conquer.

    Todd Fifield

  • Can you show me an example as how to store the result set of an inner query to a temp table and then use it. And why does it take less time?

    Thanks,

    Ratan

  • Hello Everybody,

    Thanks to toddasd, Kevin and Todd, I was able to come up with a solution. This is the query I came up with:

    CREATE TABLE #theseQuestions(questionID int)

    Insert Into

    #theseQuestions(questionID)

    select distinct sf.questionID

    from surveyFilled sf

    inner join svCategoryQuestionRisk svr on sf.questionID = svr.questionID and sf.surveyInstanceID = 4153

    inner join vw_categoryColsQuestionLORAll cc on svr.categoryID = cc.ID and svr.surveyVersionID = 5

    inner join vw_surveyCategoryHierarchyFlat fl on cc.ID = fl.categoryID

    and ( fl.fullPath like '%>67>%' or fl.fullPath like '%>67')

    and cc.type = 'Category'

    and cc.surveyInstanceID = 4153 and fl.surveyVersionID = cc.surveyVersionID

    select answered = sum(

    CASE WHEN

    sff.answerNumeric IS NULL AND (sff.answerText IS NULL OR rtrim(sff.answerText) = '')

    THEN 0 ELSE 1 END)

    FROM surveyFilled sff

    INNER JOIN #theseQuestions tq ON sff.questionID = tq.questionID and sff.surveyInstanceID = 4153

    drop table #theseQuestions

    It runs almost 167 times faster now but still takes around 18 seconds. Right now, I think its just because of the large amount of data. If anyone can improve it further, I will much appreciate and also explain why the change makes it faster.

    Thanks,

    Ratan Deep Gupta

  • without seeing the actual execution plan it's almost impossible to tune it any further. Especially since there seem to be views involved (vw_categoryColsQuestionLORAll vw_surveyCategoryHierarchyFlat). Those views might contain code that's responsible for a large amount of the duration (e.g. calling a complex function in the WHERE clause, recursive cte's or the like).

    Just a wild guess:

    One option might be to query vw_categoryColsQuestionLORAll separately and store the data in a temp table (maybe indexed) and limit the number of rows stored in that temp table by using WHERE cc.type = 'Category' and cc.surveyInstanceID = 4153.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hello Lutz,

    Using a separate table for the view made a difference of about 2 seconds. Why does using a separate table makes a query faster?

    Thanks,

    Ratan

  • A view (assuming it's not an indexed view) usually just holds the definition of the query.

    When referencing such a view in another query, SQL Server will use the underlying definition trying to come up with a better execution plan. Furthermore, it might consider all rows being returned by the view applying the filter afterwards instead of limiting the rows to be considered in the first place.

    By storing the data in a temp table the final query becomes easier to analyze and will (usually) have less rows to be processed. It's also (at least sometimes) a lot easier to add indexes required for the specific query in question.

    It's a lot easier to explain the specific reason in a given scenario when comparing both actual execution plans.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply