Should I Type * or 1 With the EXISTS Logical Operator

Question

Should I Type * or 1 With the EXISTS Logical Operator

Viewing 15 posts - 16 through 30 (of 38 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 1

Hugo Kornelis - Monday, April 24, 2017 9:33 AM
And scanning a table with 1,000 integer columns takes more time, because the row size is bigger and hence less rows fit on the page.

It's only going to read one row though, won't scan the whole table.
Could be the index tree depth is a factor.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Hugo Kornelis SSC Guru Points: 64748 More actions · Answer 2

GilaMonster - Monday, April 24, 2017 3:40 PM
Hugo Kornelis - Monday, April 24, 2017 9:33 AM
And scanning a table with 1,000 integer columns takes more time, because the row size is bigger and hence less rows fit on the page.
It's only going to read one row though, won't scan the whole table.
Could be the index tree depth is a factor.

Nope. Look at the plan. It's doing a Hash Match (Outer Join) between the two tables, that are each fully scanned. (With a bitmap pushed into the second scan to reduce the number of rows returned - they're still read but not returned to the scan operator unless they are a match or a hash collission with a match).
A single full scan over both tables is more efficient than scanning the product table and doing a seek for each of its one million rows.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 3

Hugo Kornelis - Monday, April 24, 2017 4:04 PM
GilaMonster - Monday, April 24, 2017 3:40 PM
Hugo Kornelis - Monday, April 24, 2017 9:33 AM
And scanning a table with 1,000 integer columns takes more time, because the row size is bigger and hence less rows fit on the page.
It's only going to read one row though, won't scan the whole table.
Could be the index tree depth is a factor.
Nope. Look at the plan. It's doing a Hash Match (Outer Join) between the two tables, that are each fully scanned. (With a bitmap pushed into the second scan to reduce the number of rows returned - they're still read but not returned to the scan operator unless they are a match or a hash collission with a match).
A single full scan over both tables is more efficient than scanning the product table and doing a seek for each of its one million rows.

Sorry, I read the article hours before replying, thought I remembered a loop join.

In that case, absolutely it's the size of the rows that causing the time increase, nothing to do with the EXISTS itself. A join would show the same behaviour

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

v.malyutin SSChasing Mays Points: 602 More actions · Answer 4

GilaMonster - Monday, April 24, 2017 4:13 PM
Hugo Kornelis - Monday, April 24, 2017 4:04 PM
GilaMonster - Monday, April 24, 2017 3:40 PM
Hugo Kornelis - Monday, April 24, 2017 9:33 AM
And scanning a table with 1,000 integer columns takes more time, because the row size is bigger and hence less rows fit on the page.
It's only going to read one row though, won't scan the whole table.
Could be the index tree depth is a factor.
Nope. Look at the plan. It's doing a Hash Match (Outer Join) between the two tables, that are each fully scanned. (With a bitmap pushed into the second scan to reduce the number of rows returned - they're still read but not returned to the scan operator unless they are a match or a hash collission with a match).
A single full scan over both tables is more efficient than scanning the product table and doing a seek for each of its one million rows.
Sorry, I read the article hours before replying, thought I remembered a loop join.
In that case, absolutely it's the size of the rows that causing the time increase, nothing to do with the EXISTS itself. A join would show the same behaviour

Why are you talking about scans when the last section of the article is about the compilation time? And once again.

Query Store and ostress

Never give up. There could be one more useful test. I am going to compile each query 100 times and take a look at statistics from Query Store. Again,

The queries are #1 and #3
The indexes are re-created
The count of columns in ProductOrder is 1024
The OPTION(RECOMPILE) is used

RonKyle SSC-Dedicated Points: 31482 More actions · Answer 5

Can you show me up an execution plan from Access?

The versions of Access that I used regularly (Access 95 - 2003) and learned TOP 1 * did not have the ability to show an execution plan. I don't work with Access on a routine basis anymore, though I recently did a desktop application in Access 2016. If the newer versions can show execution plans, that is news to me. I'll have to look.

simon.barnes Old Hand Points: 370 More actions · Answer 6

I have it in my head that the exists (select * …) construct is adherent to the ANSI standards. I don't have a source on that unfortunately.

Hugo Kornelis SSC Guru Points: 64748 More actions · Answer 7

simon.barnes - Tuesday, April 25, 2017 6:46 PM
I have it in my head that the exists (select * â€¦) construct is adherent to the ANSI standards. I don't have a source on that unfortunately.

That's what I thought, but I checked and it turns out I was wrong.

I do have several versions of the ANSI standard (the final version of SQL1992, and late draft versions of SQL2003 and SQL2011). In all those versions, the EXISTS predicate is described as accepting a "<table subquery>". If the cardinality of that subquery is zero, the EXISTS predicate is false, otherwise it is true.
A <table subquery> is any subquery that returns zero or more rows of one or more columns.

So, SELECT * is allowed in this context, but not required.

(Personally, I do like how the * represents the row instead of individual columns, which matches the semantics of EXISTS: checking the existence of a row)

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

simon.barnes Old Hand Points: 370 More actions · Answer 8

Hugo Kornelis - Wednesday, April 26, 2017 3:47 AM
simon.barnes - Tuesday, April 25, 2017 6:46 PM
I have it in my head that the exists (select * â€¦) construct is adherent to the ANSI standards. I don't have a source on that unfortunately.
That's what I thought, but I checked and it turns out I was wrong.
I do have several versions of the ANSI standard (the final version of SQL1992, and late draft versions of SQL2003 and SQL2011). In all those versions, the EXISTS predicate is described as accepting a "<table subquery>". If the cardinality of that subquery is zero, the EXISTS predicate is false, otherwise it is true.
A <table subquery> is any subquery that returns zero or more rows of one or more columns.
So, SELECT * is allowed in this context, but not required.
(Personally, I do like how the * represents the row instead of individual columns, which matches the semantics of EXISTS: checking the existence of a row)

Ah, nice! Yeah, I would've checked it out but I finally went to go and actually grab copies of 92 and the latest a few weeks ago and hadn't realised until then that they weren't free! I'm a lot more open to using proprietary functions and constructs than I used to be, on the basis that the likelihood of migrating to another DBMS is low and the likelihood of doing so seamlessly is even lower; but in a situation where it's possible to adhere to ANSI without incurring penalties I'd rather do so for readability's sake. That said, the 1/0 version does appeal to the trickster in me as well as offering the opportunity to flag up "true" SELECT * cases more easily 😀

SQLRNNR SSC Guru Points: 281324 More actions · Answer 9

Luis Cazares - Monday, April 24, 2017 9:55 AM
By the way, thank you for writing about this.
I actually had a conversation on the same subject with a co-worker today (and with another one some days ago). It's good to have proof of someone else testing it. However, one of my favorite tests is when I use WHERE EXISTS( SELECT 1/0 FROM Table....) or today I used SELECT COUNT(1/0) FROM Table.
Everyone is amazed with that construct.

I love using that construct. It throws everybody for a loop.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

VSSGeorge SSCrazy Eights Points: 8146 More actions · Answer 10

I cannot take a printout of this article. It is very small when I printed.

lankymart SSC Veteran Points: 207 More actions · Answer 11

What about using?

EXISTS (SELECT NULL ...)

This reply was modified 4 years, 8 months ago by lankymart.

Banana-823045 SSCommitted Points: 1689 More actions · Answer 12

FWIW, I'd advocate writing SELECT NULL in an EXISTS mainly for the semantics. It feels weird to see * or 1 or even anything in a SELECT list that'll be discarded. Luis already mentioned the case where SELECT 1/0 won't error in EXISTS which is a solid argument that we should simply not care what there is. What better way to indicate that we really don't care by using NULL?

Richard.Bielawski SSC Rookie Points: 41 More actions · Answer 13

Did you miss a big point? There is actually a very good reason to name a specific column!

If you don't have access to every single column in the table then any value except a specific column that you do have access to will cause a security failure.

All generic values such as *, 1 or null cause all columns to be checked. If you name a specific column, that's the only column that will be checked.

Michael L John One Orange Chip Points: 26551 More actions · Answer 14

Richard.Bielawski wrote:

Did you miss a big point? There is actually a very good reason to name a specific column!
If you don't have access to every single column in the table then any value except a specific column that you do have access to will cause a security failure.
All generic values such as *, 1 or null cause all columns to be checked. If you name a specific column, that's the only column that will be checked.

In an WHERE EXISTS, the columns do not get checked. That's kind of the point of this article.

Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/

Melvis SSC Journeyman Points: 82 More actions · Answer 15

It is MUCH easier to type "ssf" and have SQL Prompt convert that to "SELECT * FROM" 😉