Multi Row Subquery

  • The example below is fictious, but hopefully explains what I want to do:

    I have two tables:

    Sales Order Header (Primary Key: SOID)

    Sales Order Details (Primary Key: SOID and Line Number)

    The Sales Order Details table also contains a field called ItemToSell.

    I need to return a list of Sales Order Headers where one or more of the Sales Order Details sells a specific item. Below is the query that I wrote. I think it is correct, but I am not sure. I would appreciate it if someone could confirm that this is the correct method or point out a better one.

    Thanks

    Select SOID, CustomerNumber, PoNumber

    From SalesOrderHeader

    Where SOID

    In (Select Distinct(SOID) From SalesOrderDetails where ItemToSell = "CAR")

    ps. For the sake of this example I hardcoded "CARS". I would not normally do this.

  • The query you've written appears to be valid. However, a better performing and more commonly accepted methodology is to use a JOIN rather than a subquery in a case like this. Consider the following:

    Select SOID, CustomerNumber, PoNumber

    From SalesOrderHeader h

    INNER JOIN SalesOrderDetails d ON h.SOID = d.SOID and d.ItemToSell = 'CAR'

    This may yield better performance, particularly on large sets of data.

    hth,

    Tim

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Tim Mitchell (7/13/2009)


    The query you've written appears to be valid. However, a better performing and more commonly accepted methodology is to use a JOIN rather than a subquery in a case like this. Consider the following:

    Select SOID, CustomerNumber, PoNumber

    From SalesOrderHeader h

    INNER JOIN SalesOrderDetails d ON h.SOID = d.SOID and d.ItemToSell = 'CAR'

    This may yield better performance, particularly on large sets of data.

    hth,

    Tim

    I probably was not clear in my post. The reason that I didn't do a join is because I could have had more then one SalesOrderDetails row where ItemToSell was CAR. This would have resulted in the Sales Order Header being returned multiple times.

    I only wanted a unique list of Sales Order Headers where one or more of the corresponding SalesOrderDetails was a CAR.

    Am I making any sense?

    Thanks

  • Sure... you can actually do a DISTINCT query on all items:

    SELECT DISTINCT h.SOID, h.CustomerNumber, h.PoNumber

    FROM SalesOrderHeader h

    INNER JOIN SalesOrderDetails d ON h.SOID = d.SOID and d.ItemToSell = 'CAR'

    This will return only the distinct combination of all of your data elements.

    hth,

    Tim

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Tim Mitchell (7/13/2009)


    Sure... you can actually do a DISTINCT query on all items:

    SELECT DISTINCT h.SOID, h.CustomerNumber, h.PoNumber

    FROM SalesOrderHeader h

    INNER JOIN SalesOrderDetails d ON h.SOID = d.SOID and d.ItemToSell = 'CAR'

    This will return only the distinct combination of all of your data elements.

    hth,

    Tim

    Thanks. I had actually thought of your solution initially, but for some reason it seemed wrong. Now after looking at it again, it looks like the best solution. I guess I will chalk up my initial hesitancy as a brain cramp.

    Thanks again.

  • Hi,

    You can use the following querry for better performance

    SELECT h.SOID, h.CustomerNumber, h.PoNumber

    FROM SalesOrderHeader h

    WHERE EXISTS

    (Select top 1 1

    from SalesOrderDetails d

    where h.SOID = d.SOID

    and d.ItemToSell = 'CAR')

  • manish.singh (7/16/2009)


    Hi,

    You can use the following querry for better performance

    SELECT h.SOID, h.CustomerNumber, h.PoNumber

    FROM SalesOrderHeader h

    WHERE EXISTS

    (Select top 1 1

    from SalesOrderDetails d

    where h.SOID = d.SOID

    and d.ItemToSell = 'CAR')

    While I am familar with the Top 1 of something I am not sure what the top 1 1 means?

    Thanks

  • Hi,

    The querry checks for the first occurernce of the Sales Order Id in the Sales order detail table. we can right select top 1 salesorderid (or some other column) from the details table. So it fetches the value of that particular column.

    But if we write select "1" from a particular table then it will return "1" in the result set instead of fetching values from table. Its a kind of dummy value. We can give any thing like

    Select 'A' from salesorderdetails;

    We also use these dummy values while doing the record counts or exsits checks. For example:

    Select count(*) from salesorderdetail

    can be written as

    Select count(salesorderid) from salesorderdetail

    and it can be further written as

    Select count(1) from salesorderdetail or

    Select count(2) from salesorderdetail or

    Select count('A') from salesorderdetail

    or any dummy value.

  • The second "1" is just a constant. For example, if you do the following:

    SELECT 1 FROM SaleOrderDetails

    Then it will just return a single column with 1 in it, the number of rows being the number of rows in the table.

    The theory is that returning a constant is less expensive than retrieving an actual table column, especially if said column is not covered by an index...

  • Mansfield (7/16/2009)


    The second "1" is just a constant. For example, if you do the following:

    SELECT 1 FROM SaleOrderDetails

    Then it will just return a single column with 1 in it, the number of rows being the number of rows in the table.

    The theory is that returning a constant is less expensive than retrieving an actual table column, especially if said column is not covered by an index...

    I want to thank all of the posters for helping me out with this.

    Thanks again

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply