Selecting non assigned items

Question

Post reply

Selecting non assigned items

grtn316

Ten Centuries

Points: 1239
More actions
March 24, 2008 at 2:12 pm

#180344

Hello,
I have an assigned_properties and a properties table.
I want to select all property_Ids that are not found in assigned_properties. This sounds so simple but for some reason I can't get my mind around it.
Examples?
Thanks!

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply

Jason Selburg SSC-Insane Points: 24641 More actions · Answer 1

What have you tried?

______________________________________________________________________

Personal Motto: Why push the envelope when you can just open it?

If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

Jason L. Selburg

RBarryYoung SSC Guru Points: 143329 More actions · Answer 2

Select ID

From Properties

Where ID Not IN (Select Assigned.PropertyID From Assigned)

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

grtn316 Ten Centuries Points: 1239 More actions · Answer 3

I know this is completely wrong as it selects the assigned records (and sorta duplicates efforts) but just so you can have an example:

Select * From Properties INNER JOIN (Select P1.* From Properties As P1 INNER JOIN Assigned_Project_Properties

ON Assigned_Project_Properties.Property_Id = P1.Property_Id) As P2 ON P2.Property_id = Properties.Property_Id

grtn316 Ten Centuries Points: 1239 More actions · Answer 4

grtn316

Ten Centuries

Points: 1239

March 24, 2008 at 2:35 pm

#793223

Thanks! I'm on the right track now.

Adam Haines SSC-Insane Points: 23217 More actions · Answer 5

SELECT *

FROM Property a

left JOIN Assigned_properties b

on a.propertyid = b.propertyid

where b.property_id is null

Also, you could use except

SELECT Property_id

from Property

except

select property_id

from assigned_properties

My blog: http://jahaines.blogspot.com

Jason Selburg SSC-Insane Points: 24641 More actions · Answer 6

FYI, I only asked what you've tried to make sure this wasn't homework. It still may be, but you've shown that you've put forth effort.

So ....

SELECT

*

FROM

Properties AS p

LEFT JOIN Assigned_Project_Properties AS ap

ON Assigned_Project_Properties.Property_Id = P1.Property_Id

WHERE

p.Property_Id IS NULL

Think of it this way, You are combining "Properties" and "Assigned_Project_Properties" data together on the "Property_Id". The records that do not have a cooresponding record in the "Assigned_Project_Properties" or RIGHT table because it's on the RIGHT side of the JOIN, will have a vaule of NULL for each field. So limiting the results to those gets you what you want.

Take of the WHERE clause, and it may become more apparent.

______________________________________________________________________

Personal Motto: Why push the envelope when you can just open it?

If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

Jason L. Selburg

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 7

Just my 2 cents on this one... I've found that the WHERE NOT IN that Barry used is usually more performant than outer joins. The speed difference isn't huge over a million rows, but it is faster. Compare the execution plans to see why.

I've not tried the EXCEPT intersection (or rather lack of intersection) yet.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Jason Selburg SSC-Insane Points: 24641 More actions · Answer 8

Jeff Moden (3/25/2008)
Just my 2 cents on this one... I've found that the WHERE NOT IN that Barry used is usually more performant than outer joins. The speed difference isn't huge over a million rows, but it is faster. Compare the execution plans to see why.
I've not tried the EXCEPT intersection (or rather lack of intersection) yet.

Really? I'll have to keep that in mind and do some tests. It's not that I don't believe you, I'm just a see for myself kind of guy.

Now you've got me curious enought to see why this is the case, and to look deeper into the EXCEPT keyword.

Thanks for the 2 cents Jeff. Keep it up and I will have a whole $3.27 from you. 😉

______________________________________________________________________

Personal Motto: Why push the envelope when you can just open it?

If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

Jason L. Selburg

Adam Haines SSC-Insane Points: 23217 More actions · Answer 9

Except operates as an outer join. It compares the results of both queries and returns distinct values of the left table. These are the values that exist in the left table, but not in the right. Thus, it would provide the same execution plan as the left outer join.

My blog: http://jahaines.blogspot.com

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 10

Adam Haines (3/25/2008)
Except operates as an outer join. It compares the results of both queries and returns distinct values of the left table. These are the values that exist in the left table, but not in the right. Thus, it would provide the same execution plan as the left outer join.

Thanks for the info, Adam... have you actually done a comparison/performance test on EXCEPT?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 11

Jason Selburg (3/25/2008)
Really? I'll have to keep that in mind and do some tests. It's not that I don't believe you, I'm just a see for myself kind of guy.
Now you've got me curious enought to see why this is the case, and to look deeper into the EXCEPT keyword.

I've got some old tests laying around somewhere... I'll see if I can dig them up, as well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Adam Haines SSC-Insane Points: 23217 More actions · Answer 12

I have done a small scale test (not the million row type you do) and it did in fact use the same query plan. Obviously, the main difference here is the tables are not joined. The results of each query is compared, not just key columns.

For every result that differs, it is considered distinct and thus returned. To avoid this behavior on keys you should only list values that should be common among the two queries. For example, if a column named update_dt is different but everything else in the row is the same, the row is still considered distinct, therefore, the update_dt column should not be included.

My blog: http://jahaines.blogspot.com

Adam Haines SSC-Insane Points: 23217 More actions · Answer 13

After a little research it seems that it is equivlant to a left join, but the actual plan is more like a NOT IN statement.

See the link below:

http://blog.sqlauthority.com/2007/05/22/sql-server-2005-comparison-except-operator-vs-not-in/

My blog: http://jahaines.blogspot.com

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 14

Thanks for the link, Adam... that reminded me where I put my test code... and I added "EXCEPTION" to it...

First, I just don't trust execution plans, estimated or actual... especially where the % of batch comes in.

Remember I said that WHERE NOT IN was better than LEFT OUTER JOIN with a null search? Here's the proof... with a caveat that I'll explain in a minute...

--===== Select the database to use

USE AdventureWorks

GO

--===== Test the WHERE NOT IN method

PRINT REPLICATE('=',80)

PRINT '========== Test the WHERE NOT IN method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT ProductID

FROM Production.Product

WHERE ProductID

NOT IN (SELECT ProductID

FROM Production.WorkOrder)

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

--===== Test the WHERE NOT EXISTS method

PRINT REPLICATE('=',80)

PRINT '========== Test the WHERE NOT EXISTS method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT p.ProductID

FROM Production.Product p

WHERE NOT EXISTS (SELECT 1 FROM Production.WorkOrder w WHERE w.ProductID = p.ProductID)

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

--===== Test the EXCEPT method

PRINT REPLICATE('=',80)

PRINT '========== Test the EXCEPT method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT ProductID

FROM Production.Product

EXCEPT --------------------->>>

SELECT ProductID

FROM Production.WorkOrder

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

--===== Test the LEFT OUTER JOIN method

PRINT REPLICATE('=',80)

PRINT '========== Test the LEFT OUTER JOIN method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT p.ProductID

FROM Production.Product p

LEFT OUTER JOIN Production.WorkOrder w

ON p.ProductID = w.ProductID

WHERE w.ProductID IS NULL

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

... the caveat is that the fewer number of rows there are, the worse the LEFT OUTER JOIN does... but use a million rows looking for only 50,000 missing rows, and everyone becomes equal...

--===== Create the test tables

SELECT TOP 1000000

RowNum = IDENTITY(INT,1,1)

INTO #JBMTestBig

FROM Master.dbo.SysColumns t1,

Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

ALTER TABLE #JBMTestBig

ADD CONSTRAINT PK_JBMTestBig_RowNum PRIMARY KEY CLUSTERED (RowNum)

SELECT TOP 950000

RowNum = IDENTITY(INT,1,1)

INTO #JBMTestSmall

FROM Master.dbo.SysColumns t1,

Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

ALTER TABLE #JBMTestSmall

ADD CONSTRAINT PK_JBMTestSmall_RowNum PRIMARY KEY CLUSTERED (RowNum)

GO

--===== Rerun from here down for addition testing in same connection

--===== Test the WHERE NOT IN method

PRINT REPLICATE('=',80)

PRINT '========== Test the WHERE NOT IN method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT RowNum

FROM #JBMTestBig

WHERE RowNum

NOT IN (SELECT RowNum

FROM #JBMTestSmall)

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

--===== Test the WHERE NOT EXISTS method

PRINT REPLICATE('=',80)

PRINT '========== Test the WHERE NOT EXISTS method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT p.RowNum

FROM #JBMTestBig p

WHERE NOT EXISTS (SELECT 1 FROM #JBMTestSmall w WHERE w.RowNum = p.RowNum)

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

--===== Test the EXCEPT method

PRINT REPLICATE('=',80)

PRINT '========== Test the EXCEPT method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT RowNum

FROM #JBMTestBig

EXCEPT --------------------->>>

SELECT RowNum

FROM #JBMTestSmall

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

--===== Test the LEFT OUTER JOIN method

PRINT REPLICATE('=',80)

PRINT '========== Test the LEFT OUTER JOIN method =========='

SET STATISTICS IO ON

SET STATISTICS TIME ON

SELECT p.RowNum

FROM #JBMTestBig p

LEFT OUTER JOIN #JBMTestSmall w

ON p.RowNum = w.RowNum

WHERE w.RowNum IS NULL

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

GO

If they're all the same on the big tables and the OUTER JOIN is the slowest on smaller tables, why would you ever use the OUTER JOIN method to find the exceptions?

Also, on the link that Adam was good enough to share, there may be some problems with using TOP with EXCEPTION (dunno why you might want to do that, anyway, but who knows).

For a single column comparison, WHERE NOT IN is the easiest to write.

Like I said before, WHERE NOT IN is better than the LEFT OUTER JOIN... if more than 1 column is involved, WHERE NOT EXISTS would be my next choice. I doubt if I'd use EXCEPTION because I don't trust it yet, especially when folks say it has problems in certain areas.

But, as long as this post is, it's just my opinion. 😉 Use what you feel more comfortable with.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions