Blog Post

On vs. Where

,

Does it matter if you put your criteria in the ON clause or the WHERE clause?  Well, as with most things SQL the answer is, “It depends”.  If you are dealing with INNER JOIN’s then it really doesn’t matter because the query optimizer is smart enough to come up with the same execution plan for both.  For example these 2 queries evaluate to the same execution plan:

SELECT
SOD.SalesOrderDetailID
,
SOH.SalesOrderID
FROM
Sales.SalesOrderHeader AS SOH
JOIN
Sales.SalesOrderDetail AS SOD
ON
SOH.SalesOrderID = SOD.SalesOrderID
WHERE
SOH.OrderDate >= '7/1/2004'
AND
SOH.OrderDate <
'8/1/2004'

SELECT
SOD.SalesOrderDetailID
,
SOH.SalesOrderID
FROM
Sales.SalesOrderHeader AS SOH
,
Sales.SalesOrderDetail AS SOD
WHERE
SOH.SalesOrderID = SOD.SalesOrderID
AND
SOH.OrderDate >= '7/1/2004'
AND
SOH.OrderDate < '8/1/2004'

Execution Plan

The old SQL 6.5 OUTER JOIN syntax (*= and =*) has been discontinued beginning with SQL Server 2005, so you have to do the JOIN for OUTER JOIN’s in the ON as demonstrated in this code:

SELECT

   SOD.SalesOrderDetailID,

   SOH.SalesOrderID

FROM

   Sales.SalesOrderHeader AS SOH LEFT JOIN

   Sales.SalesOrderDetail AS SOD

        ON SOH.SalesOrderID = SOD.SalesOrderID

WHERE

   SOH.OrderDate >='7/1/2004' AND

   SOH.OrderDate <'8/1/2004'

Now let’s create a sandbox to play in.

If OBJECT_ID('sales.orders', 'U') Is Not Null
Begin
Drop Table
sales.orders;
End;

If OBJECT_ID('sales.order_items', 'U') Is Not Null
Begin
Drop Table
sales.order_items;
End;

If SCHEMA_ID('sales') Is Not Null
Begin
Drop Schema
sales;
End;

Go

Create Schema
sales;

Go
/*
Tables to hold sample data
*/
Create Table sales.orders
(
order_id INT IDENTITY(1,1)PRIMARY KEY,
customer_id INT
);

Create Table sales.order_items
(
order_detail_id INT IDENTITY(1, 1)PRIMARY KEY,
order_id INT,
product_id INT,
quantity INT
)

/*
Load Sample data
*/
INSERT INTO sales.orders (customer_id)
SELECT TOP 5
AO.[object_id]
FROM
sys.all_objects AS AO;

INSERT INTO sales.order_items
(
order_id,
product_id,
quantity
)
SELECT
1,
1,
7
Union ALL
Select
2,
1,
4
Union ALL
Select
3,
2,
6
Union ALL
Select
4,
2,
11
Union ALL
Select
5,
3,
1;

Now we want to return all the customers who have placed an order, but we only want to return the items where the quantity is greater than 5.  Here is method 1:
Select
O.customer_id,
OI.order_id,
OI.product_id,
OI.quantity
From
sales.orders AS O LEFT OUTER JOIN
sales.order_items AS OI ON
O.order_id = OI.order_id
Where
OI.quantity > 5;

This returns:

customer_id order_id    product_id  quantity
----------- ----------- ----------- -----------
3 1 1 7
7 3 2 6
8 4 2 11

Hmmm, we know we have orders from five customers, but this only returns the three rows.  Let’s look at the execution plan:

image

What’s that nest loops (inner join) operator?  Well, by putting the criteria for the RIGHT (second) table in the WHERE clause we essentially converted our LEFT OUTER JOIN to an INNER JOIN.  The correct way to get the data we want would be this way:

SELECT

    O.customer_id,

    OI.order_id,

    OI.product_id,

    OI.quantity

FROM

    sales.orders AS O LEFT OUTER JOIN

    sales.order_items AS OI ON

        O.order_id = OI.order_id AND

        OI.quantity > 5;

This returns what we would expect to see:

customer_id order_id    product_id  quantity
----------- ----------- ----------- -----------
3 1 1 7
5 NULL NULL NULL
7 3 2 6
8 4 2 11
17 NULL NULL NULL

And here is the execution plan:

image

Where you can see the Nested Loops (left outer join) operator.

So yes, it does matter where you put your criteria when dealing with OUTER JOINS.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating