May 8, 2009 at 9:38 am
I have recently been introduced to the wonders of using APPLY. I've been working on a report, and I'm using OUTER APPLY on a subquery and I'm wondering if there might be some hidden performance cost that I'm not aware of.
in the following sample, each customer is assigned to a shipment. a shipment can have multiple customers. each customer_carton is assigned to a customer and a customer can have multiple customer_cartons. I'm trying to get the bar_code of the oldest carton for each customer.
SELECT s.shipment_id
,c.customer_id
,c.customer_name
,bc.bar_code
FROM shipment s
JOIN customer c ON s.shipment_id = c.shipment_id
OUTER APPLY (
SELECT cc.customer_id
,cc.bar_code
,RANK() OVER (PARTITION BY cc.customer_id
ORDER BY cc.create_date ASC) AS bc_order
FROM customer_carton cc
WHERE cc.customer_id = c.customer_id ) bc
WHERE bc.bc_order = 1
If anyone has any advice on this, I'd love to hear it.
/*
here are some statistics
customer_carton Scan count 9, logical reads 99, physical reads 0
customer Scan count 10, logical reads 21, physical reads 0
shipment Scan count 0, logical reads 20, physical reads 5
the tables don't have much data in them at the moment so it's hard for me to tell
what the performance might be like with more data
*/
May 8, 2009 at 10:53 am
Do you have the actual execution plan? Zip it up & post it.
----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software
May 8, 2009 at 12:21 pm
My employer doesn't permit me do that (the object names in the sample are fake). Looking at the execution plan the two biggest cost % are an index scan and sort inside the OUTER APPLY subquery.
-------------------------------
A Clustered Index Scan on customer_carton 51%
Predicate:
customer_carton.customer_id as cc.customer_id =customer.customer_id as c.customer_id
Object:
customer_carton.PK_customer_carton cc
Output List:
cc.create_date, cc.bar_code
-------------------------------
Sort 41%
Output List:
customer_carton.create_date, customer_carton.bar_code
Order By:
customer_carton.create_date asc
I know I need to put an index on customer_carton.customer_id, but i don't know what to do about the Order By in the Rank() function. Also, I don't know what the db engine does to interpret the OUTER APPLY.
May 8, 2009 at 12:54 pm
Estimated costs are not always the best measures. However, a clustered index scan is the same thing as a table scan. So you need to adjust the query to make better use of the clustered index or you should consider creating a nonclustered index, preferably one that is covering.
An index on the column being sorted could help.
----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software
May 8, 2009 at 2:04 pm
Frequently an APPLY can be as fast as a JOIN, however, there are many times when it will not be. It is very rare that an APPLY is ever faster than an equivalent JOIN (exceptions occur sometimes with multi-statement TVF's).
I would suggest that you try your query with an OUTER JOIN to see if it makes any difference:
SELECT s.shipment_id
,c.customer_id
,c.customer_name
,bc.bar_code
FROM shipment s
JOIN customer c ON s.shipment_id = c.shipment_id
LEFT OUTER JOIN (
SELECT cc.customer_id
,cc.bar_code
,RANK() OVER (PARTITION BY cc.customer_id
ORDER BY cc.create_date ASC) AS bc_order
FROM customer_carton cc) bc
ON cc.customer_id = bc.customer_id
AND bc.bc_order = 1
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
May 8, 2009 at 2:29 pm
Thanks for the help people. I'll have to compare the two methods once we get the database fully loaded, but right now the left outer joins are a bit faster. Propper indexing will pick things up the rest of the way.
On a side note, has anyone found an efficient use for APPLY that doesn't involve UDF's?
March 9, 2011 at 9:57 am
RBarryYoung how were you able to make a join between cc.Customer_ID and bc.Customer_ID ? cc is inside your subquery and it is not visible outside that query.
Your query should have thrown an error "The multi-part identifier "cc.Customer_ID" could not be bound".
did you mean to write c.Customer_ID = bc.Customer_ID
Thanks
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply