August 7, 2009 at 2:50 pm
Hi,
Can anyone just clarify something for me. I'm gradually teaching myself SQL (with help from everyone at SSC:-) and have been working through the exercises in 'SQL Queries for Mere Mortals by Viescas and Hernandez' and have just written the following query:
select
CustomerID,
CustFirstName,
CustLastName
from
customers
where
CustomerID in
(
select CustomerID
from Orders
inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber
inner join Productson Products.ProductNumber = Order_Details.ProductNumber
inner join Categories on Categories.CategoryID = Products.CategoryID
where CategoryDescription like 'bike%'
) and
not exists
(
select customerID
from Orders
inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber
inner join Productson Products.ProductNumber = Order_Details.ProductNumber
where Products.ProductName like '%helmet%'
)
which doesn't work as it returns an empty result set. To get it to return the correct 2 records the final 'not exists' subquery needs to be correlated, using the customerID field, i.e.
select
CustomerID,
CustFirstName,
CustLastName
from
customers
where
CustomerID in
(
select CustomerID
from Orders
inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber
inner join Productson Products.ProductNumber = Order_Details.ProductNumber
inner join Categories on Categories.CategoryID = Products.CategoryID
where CategoryDescription like 'bike%'
) and
not exists
(
select customerID
from Orders
inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber
inner join Productson Products.ProductNumber = Order_Details.ProductNumber
where Products.ProductName like '%helmet%' and Orders.CustomerID = Customers.CustomerID
)
What I don't understand is why the additional ' and Orders.CustomerID = Customers.CustomerID' is needed?
It isn't required if the NOT EXISTS is replaced with NOT IN. Can someone help explain why this is?
Sorry if this is one of those questions that is always asked.
Thanks......Jay
August 7, 2009 at 3:01 pm
Easier to understand if you switch them around.
In is looking for matching in a list of values. So a where clause predicate like this one
WHERE SomeColumn IN (1,2,3,4,5)
will return true for rows where SomeColumn has a value of 1, 2, 3, 4 or 5. It works the same with a subquery
DECLARE @SomeTable (IntCol int)
Insert into @SomeTable (IntCol) Values (1)
Insert into @SomeTable (IntCol) Values (2)
Insert into @SomeTable (IntCol) Values (3)
Insert into @SomeTable (IntCol) Values (4)
Insert into @SomeTable (IntCol) Values (5)
... WHERE SomeColumn IN (Select IntCol FROM @SomeTable)
works exactly the same as the above IN.
Clear so far?
Exists doesn't check for a match, it just checks for existance of a row, so the match (if one is required) has to be done as a correlation funtion.
This will always return true
WHERE EXISTS (SELECT 1)
because SELECT 1 always, always returns a row.
DECLARE @SomeTable (IntCol int)
Insert into @SomeTable (IntCol) Values (1)
Insert into @SomeTable (IntCol) Values (2)
Insert into @SomeTable (IntCol) Values (3)
Insert into @SomeTable (IntCol) Values (4)
Insert into @SomeTable (IntCol) Values (5)
... WHERE EXISTS (Select IntCol FROM @SomeTable)
This will also return true for every single row, because Select IntCol FROM @SomeTable returns 5 rows and hence the EXISTS predicate is always true. To make it work like the IN, we need to add a comparison
DECLARE @SomeTable (IntCol int)
Insert into @SomeTable (IntCol) Values (1)
Insert into @SomeTable (IntCol) Values (2)
Insert into @SomeTable (IntCol) Values (3)
Insert into @SomeTable (IntCol) Values (4)
Insert into @SomeTable (IntCol) Values (5)
... WHERE EXISTS (Select IntCol FROM @SomeTable WHERE SomeColumn = IntCol)
Now this will behave like the IN because it's checking for matching rows and only returning true when there is a match
Exists is better for when comparisons are needed on two or more columns. For eg, this cannot be done easily with an IN
DECLARE @SomeTable (IntCol int, charCol char(1))
Insert into @SomeTable (IntCol, charCol) Values (1, 'a')
Insert into @SomeTable (IntCol, charCol) Values (2, 'a')
Insert into @SomeTable (IntCol, charCol) Values (3, 'a')
Insert into @SomeTable (IntCol, charCol) Values (4, 'b')
Insert into @SomeTable (IntCol, charCol) Values (5, 'b')
... WHERE EXISTS (Select IntCol FROM @SomeTable WHERE SomeColumn = IntCol AND SomeOtherColumn = charCol)
Do things make more sense now?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 7, 2009 at 3:09 pm
You explained that way better than I was...
CEWII
August 11, 2009 at 2:14 pm
Thanks for the answer Gail. I think I understand it now:- Without an additional clause such as a correlation or WHERE criteria then EXISTS can only return a single TRUE or FALSE, so if TRUE a 'full' result set will be returned ('full' in the sense of all records meeting the rest of the query's criteria), and if FALSE an empty set will be returned.
And my original query had an uncorrelated NOT EXISTS - so, because the condition returned true, the 'NOT' effectively inverted this and returned an empty set (NOT EXISTS and TRUE being logically equivalent to EXISTS and FALSE).
Does that make sense, or is my understanding a little garbled ?:-)
And does that mean any uncorrelated EXISTS clause that returns false will always return an empty result set?
Thanks for your help,
Regards....Jason
August 11, 2009 at 2:47 pm
jason (8/11/2009)
Does that make sense, or is my understanding a little garbled ?:-)
Pretty much. I'm busy writing up a blog post that'll go into more detail.
And does that mean any uncorrelated EXISTS clause that returns false will always return an empty result set?
Depends how and where it's used. If it's combined with another criteria in an OR, then no
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 12, 2009 at 9:09 am
not exists
(
select 'Nothing'
from Orders
inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber
inner join Products on Products.ProductNumber = Order_Details.ProductNumber
where Products.ProductName like '%helmet%' and Orders.CustomerID = Customers.CustomerID
)
Gail does a great job of explaining it.
I was told this is a boolean yes or no, which can be a way to boost performance at times.
I usually only put the columns in for reference on the joins I need, then replace them with 'nothing'.
We have a gui query tool to run against the AS400, and when a user looks at one of these queries, it tends to keep them from 'tweaking' it. 🙂
There are some limitations to how you can use EXISTS / NOT EXISTS.
Greg E
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply