inner join vs correlated query

  • Hi All,

    I want to fetch records from 4 tables. Fetching a single field from first 3 tables and more fields from 4th table.

    Query using join is like this

    Select a.field,b.field,c.field,d.* from d inner join a on a.id=d.id inner join c on a.id=c.id inner join b on a.id=b.id

    Query using correlated query is like this

    select (select field from a where a.id=d.id) as field1,(select field from b where b.id=d.id) as field2,

    (select field from c where c.id=d.id) as field3,d.* from d

    can anyone suggest me which approach is better

    thanks in advance

    Raghavendra

  • - Personally I don't like those column level select statements in the select clause.

    Maybe the engine will even parse it into a join for you, but IMO it is a pain for maintenance afterward.

    - What are your test results with regards to statistics IO and time ?

    - Does sqlserver handle the query as you would expect ?

    -- if not: why not ?

    You may even want to post both sqlplans of your actual queries and have us take a look at them. ( post as attachment )

    That provides the huge advantage of us being able to interpret the statistics sqlserver used to compile your query.

    You should expect the column level correlated selects will produce RBAR behaviour. ( Read Jeff Modens articles on how to prevent that ! )

    Always start with the simplest version. ( KISS )

    i.e. select columnlist from object1 joinclause object2 onclause ...

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZDBA (12/17/2010)[hrYou should expect the column level correlated selects will produce RBAR behaviour. ( Read Jeff Modens articles on how to prevent that ! )

    "It Depends". The joins used in correlated subqueries won't cause the subquery to necessarily act as RBAR and. you are correct, correlated subqueries are frequently converted to joins. Sometimes, they're not and sometimes they can be even faster than a regular join. For example, even a WHERE IN can be faster than an INNER JOIN (lots of code races on this site to substantiate that claim) although the two operate slightly diffently when it comes to NULLS.

    I can, however, just about guarantee that a correlated subquery with a SUM() or COUNT() along with an inequality in the correlation will cause a Triangular Join. Unless you're as tricky with internals as Paul White and know exactly which index to use, then that's where you run into problems with correlation.

    The bottom line is to always test with enough data to show if something is scalable or not or get real good at reading the arrows in the actual execution plan.

    That being said... I do agree that I prefer the "look" of joins over correlated subqueries and have found that joins are a lot easier to troubleshoot, as well. That's just a personal opinion.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi,

    Please find attached scripts to create and insert data into tables, also I uploaded execution plan for both the queries.

    I have written 2 queries one with inner join and other with correlated query, both are taking same amount of time to retrieve 1 million records.

    first query

    Select pname,location,scheduleinfo from providerschedule as ps with (nolock)

    inner join provider as p with (nolock) on ps.pid=p.pid

    inner join m_locations as ml with (nolock) on ps.locationid=ml.locationid

    second query

    Select (select pname from provider (nolock) where pid=ps.pid) as pname,

    (select location from m_locations (nolock) where locationid=ps.locationid)as location,

    scheduleinfo from providerschedule as ps with (nolock)

  • I stick with my first advise on this one.

    Keep it simple and code regular joins.

    The column level nested selects will only give you benefits if cardinality meets a treshhold.

    I certainly wouldn't accept your second query unless is was preceded with a nice comment on why this columnar select has been used.

    ( Although I see them on a regular basis, they always result in a remark and are a PITA for maintenance and performance tuning. Figuring out what the programmer wanted to do is always hard with these.)

    As you have coded, you only expect No rows or a single row of data in your nested selects. :crazy:

    You would expect the same result as a left join, but since it is at row level, multiple return values will generate an error.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply