OUTER APPLY performance

  • I have recently been introduced to the wonders of using APPLY. I've been working on a report, and I'm using OUTER APPLY on a subquery and I'm wondering if there might be some hidden performance cost that I'm not aware of.

    in the following sample, each customer is assigned to a shipment. a shipment can have multiple customers. each customer_carton is assigned to a customer and a customer can have multiple customer_cartons. I'm trying to get the bar_code of the oldest carton for each customer.

    SELECT s.shipment_id

    ,c.customer_id

    ,c.customer_name

    ,bc.bar_code

    FROM shipment s

    JOIN customer c ON s.shipment_id = c.shipment_id

    OUTER APPLY (

    SELECT cc.customer_id

    ,cc.bar_code

    ,RANK() OVER (PARTITION BY cc.customer_id

    ORDER BY cc.create_date ASC) AS bc_order

    FROM customer_carton cc

    WHERE cc.customer_id = c.customer_id ) bc

    WHERE bc.bc_order = 1

    If anyone has any advice on this, I'd love to hear it.

    /*

    here are some statistics

    customer_carton Scan count 9, logical reads 99, physical reads 0

    customer Scan count 10, logical reads 21, physical reads 0

    shipment Scan count 0, logical reads 20, physical reads 5

    the tables don't have much data in them at the moment so it's hard for me to tell

    what the performance might be like with more data

    */

  • Do you have the actual execution plan? Zip it up & post it.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • My employer doesn't permit me do that (the object names in the sample are fake). Looking at the execution plan the two biggest cost % are an index scan and sort inside the OUTER APPLY subquery.

    -------------------------------

    A Clustered Index Scan on customer_carton 51%

    Predicate:

    customer_carton.customer_id as cc.customer_id =customer.customer_id as c.customer_id

    Object:

    customer_carton.PK_customer_carton cc

    Output List:

    cc.create_date, cc.bar_code

    -------------------------------

    Sort 41%

    Output List:

    customer_carton.create_date, customer_carton.bar_code

    Order By:

    customer_carton.create_date asc

    I know I need to put an index on customer_carton.customer_id, but i don't know what to do about the Order By in the Rank() function. Also, I don't know what the db engine does to interpret the OUTER APPLY.

  • Estimated costs are not always the best measures. However, a clustered index scan is the same thing as a table scan. So you need to adjust the query to make better use of the clustered index or you should consider creating a nonclustered index, preferably one that is covering.

    An index on the column being sorted could help.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Frequently an APPLY can be as fast as a JOIN, however, there are many times when it will not be. It is very rare that an APPLY is ever faster than an equivalent JOIN (exceptions occur sometimes with multi-statement TVF's).

    I would suggest that you try your query with an OUTER JOIN to see if it makes any difference:

    SELECT s.shipment_id

    ,c.customer_id

    ,c.customer_name

    ,bc.bar_code

    FROM shipment s

    JOIN customer c ON s.shipment_id = c.shipment_id

    LEFT OUTER JOIN (

    SELECT cc.customer_id

    ,cc.bar_code

    ,RANK() OVER (PARTITION BY cc.customer_id

    ORDER BY cc.create_date ASC) AS bc_order

    FROM customer_carton cc) bc

    ON cc.customer_id = bc.customer_id

    AND bc.bc_order = 1

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks for the help people. I'll have to compare the two methods once we get the database fully loaded, but right now the left outer joins are a bit faster. Propper indexing will pick things up the rest of the way.

    On a side note, has anyone found an efficient use for APPLY that doesn't involve UDF's?

  • RBarryYoung how were you able to make a join between cc.Customer_ID and bc.Customer_ID ? cc is inside your subquery and it is not visible outside that query.

    Your query should have thrown an error "The multi-part identifier "cc.Customer_ID" could not be bound".

    did you mean to write c.Customer_ID = bc.Customer_ID

    Thanks

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply