Join performance

  • The join below takes 18 seconds on development box and on production box its taking more than 5 mins. As I do not have enough access to execution plan on production box. I posted execution plan from development box. Number of rows are same both in development and in production. We can not afford that much time for this join.

    GEN_SUBORDERS has 35 million rows

    STAGE_RETURNS_CLS_TMP has 2283 rows.

    I have also attached Execution plan.

    Select

    S.SITE_ID,

    S.SUBORDER_ID,

    S.SUBORDER_RECEIVED_DATE

    FROM

    dbo.GEN_SUBORDERS S INNER JOIN

    dbo.STAGE_RETURNS_CLS_TMP C ON C.DS_ORDER_ID = S.SUBORDER_ID

  • Please ask someone who does have sufficient permission to get the actual execution plan from the production server.

    Is that the entire query? No filters, no aggregation?

    Have you tried exists, since it appears that you don't actually want any columns from the one table?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • From what information you have given, I believe you should look at indexing for this.

    At a minimum, you would probably want indexes with the join columns as the left-most column to support the join, e.g. gen_suborders.suborder_id and stage_returns_cls_tmp.ds_order_id.

    In SQL 2005 or later you can use include as well and add get_suborders.site_id and gen_suborders.suborder_received_date to an index on site_id for an index seek.

    Of course, the topic of indexing is a complex one and there are many considerations when adding new ones, so knowledge of the domain can affect these choices (existing clustered index, available disk space, select vs insert/update/delete etc).

  • can you please explain this further...I did not get it.... "In SQL 2005 or later you can use include as well and add get_suborders.site_id and gen_suborders.suborder_received_date to an index on site_id for an index seek."

  • include is a statement that is used when creating the index.

    USE AdventureWorks2008R2;

    GO

    CREATE INDEX IX_Document_Title

    ON Production.Document (Title, Revision)

    INCLUDE (FileName);

  • Someone droped the index at production?

    If can only make selects at production box try a select top 10 id column order by id column for each table.

    if its too slow then there are no indexes for it since it ill need to perform a full table scan.

  • This is the task we are performing....sometimes it takes 40 to 45 minutes.....and sometimes it just takes 1 minutes...

    I am also attaching actual execution plan on production...it did not take long though but, I want to over come those high bumps of 40 or 45 mins.

    UPDATE

    STAGE_RETURNS_CLS_TMP

    SET

    MATCH_FLAGS = CASE WHEN MATCH_FLAGS&1 = 1 THEN 4 ELSE MATCH_FLAGS + 4 END,

    SITE_ID_CLEAN = S.SITE_ID,

    DS_ORDER_ID_CLEAN = S.SUBORDER_ID,

    ORDER_CREATED_DATE = S.SUBORDER_RECEIVED_DATE,

    DML_OPCODE = 2,

    DML_UPDATE_DATE = GETDATE()

    FROM

    STAGE_RETURNS_CLS_TMP C INNER JOIN

    GEN_SUBORDERS S ON C.DS_ORDER_ID = S.SUBORDER_ID

    WHERE

    C.MATCH_FLAGS = 1 AND

    C.DS_ORDER_ID_CLEAN IS NULL

  • The plans are identical, so it's not missing indexes or anything like that.

    Have you checked for blocking when it runs slow? Have you checked what the wait_type is when it runs slow.

    p.s. What is that update query? If doesn't match the execution plan (which is a straight select)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • One thing I did notice is that the memory grant for the hash join is much smaller on production. Possibly greater workload so less available memory. Have you checked for hash spills (SQLTrace)?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • if that query ran multiple times and you are setting your isolation level to READ COMMITTED, it would explain its 40 minute run time behavior.

  • What happens when you put your where conditions into your join?

    Jared
    CE - Microsoft

  • SQLKnowItAll (2/1/2012)


    What happens when you put your where conditions into your join?

    If you're talking about the update, nothing will happen. It's an inner join, conditions in the where or join are completely equivalent in results and performance.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (2/1/2012)


    SQLKnowItAll (2/1/2012)


    What happens when you put your where conditions into your join?

    If you're talking about the update, nothing will happen. It's an inner join, conditions in the where or join are completely equivalent in results and performance.

    Good to know! I suppose the optimizer realizes to treat it this way? Because in my head it made sense in an order of operations, so my guess is that the optimizer knows what's going on and treats it as such.

    Jared
    CE - Microsoft

  • Both filters in the WHERE clause refer to the smaller table, so they won't impact performance so much. I would recommend creating the following index on the development server and testing, since by the query plan it is running an index scan on the large table and this would use an index seek, which is much faster. I also agree with the previous suggestion that blocking could be an issue if multiple updates are happening against the same table, but we don't know this from the information given.

    CREATE INDEX IX_YourNamingConvention

    ON dbo.GEN_SUBORDERS (SUBORDER_ID)

    INCLUDE (SITE_ID, SUBORDER_RECEIVED_DATE);

    If you use a separate filegroup for indexes, create it on that filegroup, of course.

  • This is something of a long shot based on observation. There are two key changes.

    UPDATE c SET

    MATCH_FLAGS = CASE WHEN c.MATCH_FLAGS&1 = 1 THEN 4 ELSE c.MATCH_FLAGS + 4 END,

    SITE_ID_CLEAN = S.SITE_ID,

    DS_ORDER_ID_CLEAN = S.SUBORDER_ID,

    ORDER_CREATED_DATE = S.SUBORDER_RECEIVED_DATE,

    DML_OPCODE = 2,

    DML_UPDATE_DATE = GETDATE()

    FROM STAGE_RETURNS_CLS_TMP C

    WHERE EXISTS (SELECT 1 FROM GEN_SUBORDERS S WHERE C.DS_ORDER_ID = S.SUBORDER_ID)

    AND C.MATCH_FLAGS = 1

    AND C.DS_ORDER_ID_CLEAN IS NULL

    Edit: changed to where exists


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply