LEFT OUTER JOINS turns to INNER JOIN automatically ????

  • Hi,

    I observed a very curious situation today while translating a query from old *= syntax to ANSI syntax in our project. The outputs were different so I analysed the Execution plan and this is what I found:

    Consider the Query (1):

    Select A.intSeq,B.intSeq,  A.dtTimeStamp,B.dtTimeStamp

    From tbl_pars_translate_text A LEFT OUTER join tbl_pars_translate_text B

    on A.strASpName=B.strASpName and A.intSeq=B.intSeq

    and B.intLanguage=2

    where A.strASpname='EmployeeProfileData' and A.intLanguage=1 order by A.intSeq

    This is self join on a table with intSeq, dtTimeStamp, intLanguage as the columns.

    The Execution plan for this query is correct, SQL Server applies left outer join on tables A and B

    However, when I change the query to (2):

    Select A.intSeq,B.intSeq,  A.dtTimeStamp,B.dtTimeStamp

    From tbl_pars_translate_text A  LEFT OUTER join tbl_pars_translate_text B

    on A.strASpName=B.strASpName and A.intSeq=B.intSeq

    where A.strASpname='EmployeeProfileData' and A.intLanguage=1

    and B.intLanguage=2

    order by A.intSeq

    the execution plan shows INNER JOIN between tables A and B and hence the output is incorrect.

    Can you explain why SQL Server 2000 changes the join type in the second instance even though LEFT OUTER JOIN has been specified.

    ?????

     

    Thanks.

     


    Prateek.
    ~~~~~~~~
    "Remember: 80% of bugs are just undocumented features waiting to be discovered!"

  • Hi Prateek,

    Check this article

    http://support.microsoft.com/default.aspx?scid=kb;en-us;176480

    Well it doesnot really elaborate on the internal logic and implications of AND and WHERE clauses for Outer Joins but it does throw little light on how they work.

     

    Prasad Bhogadi
    www.inforaise.com

  • Hi Prasad,

    I had checked this one out before I posted this query.

    What the KB article tells is actually opposite to what I had observed using the execution plan. The execution plan actually showed the table scans on the tables with the WHERE clause applied on them BEFORE the application of the JOIN / MERGE .... and besides this KB article does not talk about the conversion of the OUTER JOIN to INNER JOIN .... and that is what I actually found strange and disturbing... as it may be affect many seemingly correct queries and return incorrect results.

     

    Thanks anyway... !


    Prateek.
    ~~~~~~~~
    "Remember: 80% of bugs are just undocumented features waiting to be discovered!"

  • By moving the predicate from the join clause to the where clause, you are restricting the result set to just those rows satisfying the predicate, rather than including rows that do not.  As there is another predicate in the where clause that restricts the outer instance of the table in the self-join to a different value in the same column, the query optimazer is "smart" enough to realize that this "outer join" is actually the same as an inner join.



    --Jonathan

  • Hi,

    This is perfect logical.

    To get the second query working, use the following WHERE clause:

    ... and ( B.intLanguage=2 OR B.intLanguage is null )

    Why?

    Because the LEFT OUTER JOIN returns the result set:

    A.intseq, B.intseq, A.intlanguage, B.intlanguage

    1            A             1                   A

    2           <NULL>    2                   <Null>

     

    And then it will filter the B.intlanguage with the WHERE clause.

    Guess which records are dropped with the original version !

    Regards,

    Geert

  • Your second query implicitly removes the

    (or B.intLanguage is null)

    test by moving the test

    from the join condition

    to the predicate condition. 

    Remember the JOIN QUALIFIER tells the server how to handle null conditions on different sides of the equivalence conditions.

    So the condition is logically transformed from

    on a.col = b.col

    to the following:

    to on a.col = b.col -- INNER JOIN identity tranform

    to on (a.col = b.col or b.col is null) -- LEFT OUTER JOIN

    to on (a.col = b.col or a.col is null) -- RIGHT OUTER JOIN

    to on (a.col = b.col or a.col is null or b.col is null) -- FULL OUTER JOIN

    Note the optimizer must be able to deduce in your case that the

    join condition on A.strASpName=B.strASpName and A.intSeq=B.intSeq

    results in an inner join because either column probably cannot be null regardless of the requested JOIN QUALIFIER "LEFT OUTER JOIN"

    Peter Evans (__PETER Peter_)

  • Am I having a deja vu or is the above post more or less the same as Jonathan already said three posts above????

    Not to be offending, but once in a while it might be helpful to read through a thread before posting.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply