NOT EXISTS ?

  • Hi,

    Can anyone just clarify something for me. I'm gradually teaching myself SQL (with help from everyone at SSC:-) and have been working through the exercises in 'SQL Queries for Mere Mortals by Viescas and Hernandez' and have just written the following query:

    select

    CustomerID,

    CustFirstName,

    CustLastName

    from

    customers

    where

    CustomerID in

    (

    select CustomerID

    from Orders

    inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber

    inner join Productson Products.ProductNumber = Order_Details.ProductNumber

    inner join Categories on Categories.CategoryID = Products.CategoryID

    where CategoryDescription like 'bike%'

    ) and

    not exists

    (

    select customerID

    from Orders

    inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber

    inner join Productson Products.ProductNumber = Order_Details.ProductNumber

    where Products.ProductName like '%helmet%'

    )

    which doesn't work as it returns an empty result set. To get it to return the correct 2 records the final 'not exists' subquery needs to be correlated, using the customerID field, i.e.

    select

    CustomerID,

    CustFirstName,

    CustLastName

    from

    customers

    where

    CustomerID in

    (

    select CustomerID

    from Orders

    inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber

    inner join Productson Products.ProductNumber = Order_Details.ProductNumber

    inner join Categories on Categories.CategoryID = Products.CategoryID

    where CategoryDescription like 'bike%'

    ) and

    not exists

    (

    select customerID

    from Orders

    inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber

    inner join Productson Products.ProductNumber = Order_Details.ProductNumber

    where Products.ProductName like '%helmet%' and Orders.CustomerID = Customers.CustomerID

    )

    What I don't understand is why the additional ' and Orders.CustomerID = Customers.CustomerID' is needed?

    It isn't required if the NOT EXISTS is replaced with NOT IN. Can someone help explain why this is?

    Sorry if this is one of those questions that is always asked.

    Thanks......Jay

  • Easier to understand if you switch them around.

    In is looking for matching in a list of values. So a where clause predicate like this one

    WHERE SomeColumn IN (1,2,3,4,5)

    will return true for rows where SomeColumn has a value of 1, 2, 3, 4 or 5. It works the same with a subquery

    DECLARE @SomeTable (IntCol int)

    Insert into @SomeTable (IntCol) Values (1)

    Insert into @SomeTable (IntCol) Values (2)

    Insert into @SomeTable (IntCol) Values (3)

    Insert into @SomeTable (IntCol) Values (4)

    Insert into @SomeTable (IntCol) Values (5)

    ... WHERE SomeColumn IN (Select IntCol FROM @SomeTable)

    works exactly the same as the above IN.

    Clear so far?

    Exists doesn't check for a match, it just checks for existance of a row, so the match (if one is required) has to be done as a correlation funtion.

    This will always return true

    WHERE EXISTS (SELECT 1)

    because SELECT 1 always, always returns a row.

    DECLARE @SomeTable (IntCol int)

    Insert into @SomeTable (IntCol) Values (1)

    Insert into @SomeTable (IntCol) Values (2)

    Insert into @SomeTable (IntCol) Values (3)

    Insert into @SomeTable (IntCol) Values (4)

    Insert into @SomeTable (IntCol) Values (5)

    ... WHERE EXISTS (Select IntCol FROM @SomeTable)

    This will also return true for every single row, because Select IntCol FROM @SomeTable returns 5 rows and hence the EXISTS predicate is always true. To make it work like the IN, we need to add a comparison

    DECLARE @SomeTable (IntCol int)

    Insert into @SomeTable (IntCol) Values (1)

    Insert into @SomeTable (IntCol) Values (2)

    Insert into @SomeTable (IntCol) Values (3)

    Insert into @SomeTable (IntCol) Values (4)

    Insert into @SomeTable (IntCol) Values (5)

    ... WHERE EXISTS (Select IntCol FROM @SomeTable WHERE SomeColumn = IntCol)

    Now this will behave like the IN because it's checking for matching rows and only returning true when there is a match

    Exists is better for when comparisons are needed on two or more columns. For eg, this cannot be done easily with an IN

    DECLARE @SomeTable (IntCol int, charCol char(1))

    Insert into @SomeTable (IntCol, charCol) Values (1, 'a')

    Insert into @SomeTable (IntCol, charCol) Values (2, 'a')

    Insert into @SomeTable (IntCol, charCol) Values (3, 'a')

    Insert into @SomeTable (IntCol, charCol) Values (4, 'b')

    Insert into @SomeTable (IntCol, charCol) Values (5, 'b')

    ... WHERE EXISTS (Select IntCol FROM @SomeTable WHERE SomeColumn = IntCol AND SomeOtherColumn = charCol)

    Do things make more sense now?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You explained that way better than I was...

    CEWII

  • Thanks for the answer Gail. I think I understand it now:- Without an additional clause such as a correlation or WHERE criteria then EXISTS can only return a single TRUE or FALSE, so if TRUE a 'full' result set will be returned ('full' in the sense of all records meeting the rest of the query's criteria), and if FALSE an empty set will be returned.

    And my original query had an uncorrelated NOT EXISTS - so, because the condition returned true, the 'NOT' effectively inverted this and returned an empty set (NOT EXISTS and TRUE being logically equivalent to EXISTS and FALSE).

    Does that make sense, or is my understanding a little garbled ?:-)

    And does that mean any uncorrelated EXISTS clause that returns false will always return an empty result set?

    Thanks for your help,

    Regards....Jason

  • jason (8/11/2009)


    Does that make sense, or is my understanding a little garbled ?:-)

    Pretty much. I'm busy writing up a blog post that'll go into more detail.

    And does that mean any uncorrelated EXISTS clause that returns false will always return an empty result set?

    Depends how and where it's used. If it's combined with another criteria in an OR, then no

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • not exists

    (

    select 'Nothing'

    from Orders

    inner join Order_Details on Order_Details.OrderNumber = Orders.OrderNumber

    inner join Products on Products.ProductNumber = Order_Details.ProductNumber

    where Products.ProductName like '%helmet%' and Orders.CustomerID = Customers.CustomerID

    )

    Gail does a great job of explaining it.

    I was told this is a boolean yes or no, which can be a way to boost performance at times.

    I usually only put the columns in for reference on the joins I need, then replace them with 'nothing'.

    We have a gui query tool to run against the AS400, and when a user looks at one of these queries, it tends to keep them from 'tweaking' it. 🙂

    There are some limitations to how you can use EXISTS / NOT EXISTS.

    Greg E

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply