Help us tune a query

  • Excellent. That's what I was hoping for. How does that perform?

    Add these lines before the query and post the details from the messages tab

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    GO

    Now that the select seems to be under control, can you test out the two options for the insert (one with the PointsCategory, one without) and with the added index on the UserPoints table.

    CREATE INDEX idx_UserPoints_PointsCategoryUserID ON UserPoints (PointsCategory, UserID) INCLUDE (Date, PointsScored)

    -- When the category is specified

    INSERT INTO #TopScores (

    UserID,

    RecentPoints,

    AllPoints,

    RowNumber

    )

    SELECT [UserPoints].UserID,

    SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) AS RecentPoints,

    SUM([PointsScored]) AS AllPoints,

    ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) DESC,

    SUM([PointsScored]) DESC) as RowNumber

    FROM [UserPoints]

    WHERE PointsCategory = @PointsCategory

    AND UserID 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query

    GROUP BY [UserPoints].UserID

    and

    -- When the category is not

    INSERT INTO #TopScores (

    UserID,

    RecentPoints,

    AllPoints,

    RowNumber

    )

    SELECT [UserPoints].UserID,

    SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) AS RecentPoints,

    SUM([PointsScored]) AS AllPoints,

    ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) DESC,

    SUM([PointsScored]) DESC) as RowNumber

    FROM [UserPoints]

    WHERE UserID 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query

    GROUP BY [UserPoints].UserID

    What's that cutoffdate and what values can it have?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That sure looks a lot better. 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • GSquared (7/13/2009)


    Table variables are always estimated as if they were going to only have 1 row in them. That will give you a very false low on the cost, and can end up with all kinds of other problems in the execution plan.

    They're generally okay if you can be certain they will not ever exceed about 100 rows. That's a rule-of-thumb and kind of arbitrary, but it generally works.

    Just as an FYI:

    Recently I was using a table variable, one date column, 36 rows. Clustered PK on the column. Date values from first day of current month to first day of CurrentMonth-35.

    This table was being joined to another table with < 10,000 rows.

    Execution time: 3+ minutes.

    converted table variable to temporary table (the only changes were to change @table to #table and changed DECLARE @table TABLE to CREATE TABLE #table). Execution time < 30ms.

    So, with only 1/3 of GSquared's rule-of-thumb, table variables can still be massive performance problems. IMO, just avoid them unless you need some specific feature about them that temporary tables don't have.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • would it be worth trying to add a non-clustered index to the temp table to try and get ride of the key lookup?

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Christopher Stobbs (7/14/2009)


    would it be worth trying to add a non-clustered index to the temp table to try and get ride of the key lookup?

    Could widen the nonclustered to include all the columns, but it's only 6 rows and it can't be more than that..

    It's odd, SQL should be doing that lookup before it does the merge for the OR. If it did that, it would only be 1 row in the key lookup. The other 5 are retrieved from the cluster.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ok, lots of information to post after various different tests. Attached are the execution plans of the insert statement. There are separate runs for it with and without points category, and with and without the addition of the index suggested by Gail.

    The outputs of the messages window are also in the zip file, with similar names to the execution plans.

    The select statement at the end seems to be performing nicely now. CPU time required is way down on what it used to be according to the statistics generated in the messages window. We haven't put it live yet - holding off til we think we've got the first part running nicely.

    Mel HarbourProject ManagerRed Gate Software

  • Oh, and I forgot to mention what @cutoffdate is doing. That's actually only ever set to one value, namely 30 days ago. It happens to calculate this value in the some of the C# classes and then pass it to the stored procedure. Again, if it would improve the query to have this calculation done in the SP, then that's fair game.

    Mel HarbourProject ManagerRed Gate Software

  • Mel,

    I haven't had the time to look at all the fine suggestions folks may have given, but if I look at your original query, I see the potential for a major slow down in the form of an ORDER BY on a SUM. I believe a little "pre-aggregation" (as Perter Larson calls it) along with a little "Divide'N'Conquer" via a temp table would go a very long way in speeding up such a query.

    I did see where you provided a table structure... I'll try to make a more substantial suggestion in the form of code tonight (if someone hasn't already beat me to it).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (7/14/2009)


    I haven't had the time to look at all the fine suggestions folks may have given, but if I look at your original query, I see the potential for a major slow down in the form of an ORDER BY on a SUM. I believe a little "pre-aggregation" (as Perter Larson calls it) along with a little "Divide'N'Conquer" via a temp table would go a very long way in speeding up such a query.

    That's already been done in Mel's initial post. The original one with the order on the sum was replaced with one with a table variable, since changed to a temp table.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Just tried Jeff's suggestion, restructured the query to do a pre-aggregation with a CTE, and that slowed the performance down (more CPU time).

    Mel HarbourProject ManagerRed Gate Software

  • Mel Harbour (7/14/2009)


    Just tried Jeff's suggestion, restructured the query to do a pre-aggregation with a CTE, and that slowed the performance down (more CPU time).

    It will. A CTE, despite it's name, is just a named subquery. Jeff was suggesting temp table, which is exactly what the revised query you initially posted used (well, after we changed table var to temp table)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Was Jeff referring to the Order By in the ROW_NUMBER() function?

    A side note, I think a summary of the steps and results of this thread would make a great article on How To Performance Tune a query.

  • I agree with Jack.

    Also Mel could you post the latest copy of the solution , I've lost track of where you all got to.

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Sure!

    The current query is:

    CREATE TABLE #TopScores

    (

    UserID INT,

    RecentPoints INT,

    AllPoints INT,

    RowNumber INT

    )

    CREATE CLUSTERED INDEX idx_TopScores_RowNumber ON #TopScores (RowNumber)

    CREATE INDEX idx_TopScores_UserID ON #TopScores (UserID)

    INSERT INTO #TopScores (

    UserID,

    RecentPoints,

    AllPoints,

    RowNumber

    )

    SELECT UserPoints.UserID,

    SUM(CASE WHEN [Date] >= '2009-06-13' THEN [PointsScored] ELSE 0 END) AS RecentPoints,

    SUM([PointsScored]) AS AllPoints,

    ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= '2009-06-13' THEN [PointsScored] ELSE 0 END) DESC, SUM([PointsScored]) DESC) as RowNumber

    FROM UserPoints

    WHERE PointsCategory = ISNULL(NULL, PointsCategory)

    AND UserID 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query

    GROUP BY [UserPoints].UserID

    SELECT t.RowNumber, u.UserID, u.DisplayName, t.RecentPoints, t.AllPoints

    FROM #TopScores t

    JOIN Users u ON u.[UserID] = t.UserID

    WHERE t.RowNumber <= 5 OR 769521 = t.UserID

    ORDER BY t.RowNumber

    DROP TABLE #TopScores

    Performance-wise, the problems are all restricted to the INSERT-SELECT statement.

    Mel HarbourProject ManagerRed Gate Software

  • This might be sound stupid, I'm not sure if it makes a difference at all I might need to do some testing.

    but creating the clustered Index on #TopScores after the actual insert?

    Would this help at all?

    I was under the assumption that adding to a clustered index had slightly more over head than add a clustered index, this could be rubbish so please let me know.

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life

Viewing 15 posts - 31 through 45 (of 103 total)

You must be logged in to reply to this topic. Login to reply