T-SQL counts

Question

T-SQL counts

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply

TomThomson SSC Guru Points: 104773 More actions · Answer 1

Jeff Moden (10/1/2013)
Actually, this is going to cause a mess for me when we upgrade from 2005 because NULL is considered to be a DISTINCT value. For 2005, the correct answer is "4". For 2008 and above, the correct answer is 3. The 2008+ version works as I would expect because just a simple COUNT(ID) would ignore the NULL. In this particular case, that would still return a 4 because even though the "2" is duplicated, there are 4 non-null values.
The reason why it's going to make a mess for me is because a lot of the developers used it fully expecting NULL to be considered a DISTINCT value. This is one of those changes in how SQL Server works that could produce some serious undetected problems if you built your code in 2005 and you upgrade later.

Your developers must be doing smething pretty queer. BOL for 2005 says COUNT(DISTINCT x) returns a count of distinct non-null values. SQL 2000 always returned a count of distinct non-null values for me, and I used it throuhghout its fully supported life and a bit longer, it was the same whether you had enterprise, standard, or MSDE versions and teh same whichever service pack you had. Maybe they are setting some queer options that break it (like ansi nulls off or something? although it would be surprising if that change the behaviour of COUNT). And vv_kirov's test shows that BOL is right for at least for one version of 2005, and Jens-Peter hasthe same with a later 2005 build.

Tom

PChiragS SSCarpal Tunnel Points: 4965 More actions · Answer 2

PChiragS

SSCarpal Tunnel

Points: 4965

October 2, 2013 at 8:10 am

#1655224

easy one..

Thanks Smith..

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 3

I struck out my previous comments. I don't know what happened last night but I can't seem to duplicate the problem today. I hate it when this happens.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

SQLRNNR SSC Guru Points: 281324 More actions · Answer 4

Jeff Moden (10/2/2013)
I struck out my previous comments. I don't know what happened last night but I can't seem to duplicate the problem today. I hate it when this happens.

I think we have all done that somewhere along the way 😉

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Dana Medley SSCertifiable Points: 6764 More actions · Answer 5

Great question. Almost overlooked the distinct in the count().

Everything is awesome!

Revenant SSC-Forever Points: 42467 More actions · Answer 6

Revenant

SSC-Forever

Points: 42467

October 2, 2013 at 7:53 pm

#1655370

Nice and easy - thanks, Samith!

Mr. Kapsicum SSCertifiable Points: 6128 More actions · Answer 7

Mr. Kapsicum

SSCertifiable

Points: 6128

October 3, 2013 at 12:03 am

#1655397

didn't need to think twice,,,

easy one. 🙂

marlon.seton SSCrazy Points: 2623 More actions · Answer 8

marlon.seton

SSCrazy

Points: 2623

October 3, 2013 at 2:52 am

#1655450

Hany Helmy (10/2/2013)
Easy.

Ditto.

Jamsheer Ten Centuries Points: 1136 More actions · Answer 9

Nice and simple question. Lot to think.. thanks Samith..

Jamsheer Ten Centuries Points: 1136 More actions · Answer 10

tom.w.brannon (10/2/2013)
I believe the problem is a difference in how distinct gets used in different contexts. For the following queries
select count(distinct id) from #temp_test;
select distinct id from #temp_test;
select count(*) from (select distinct id from #temp_test) a;
the results are 3, 4 rows returned, and 4. So NULL is a distinct value but not counted as a distinct value. Maybe somebody else can answer why this makes sense.

Good work Tom.. If you use 'select count(id)' in the last select query instead of 'select count(*)' you will get the result as 3. If we don't use the field name in count(distinct) clause, it will not eliminate NULL.

Tee Time Hall of Fame Points: 3745 More actions · Answer 11

Tee Time

Hall of Fame

Points: 3745

October 3, 2013 at 6:28 am

#1655531

Thanks!

sqlnaive SSCoach Points: 17435 More actions · Answer 12

Jamsheer (10/3/2013)
tom.w.brannon (10/2/2013)
I believe the problem is a difference in how distinct gets used in different contexts. For the following queries
select count(distinct id) from #temp_test;
select distinct id from #temp_test;
select count(*) from (select distinct id from #temp_test) a;
the results are 3, 4 rows returned, and 4. So NULL is a distinct value but not counted as a distinct value. Maybe somebody else can answer why this makes sense.
Good work Tom.. If you use 'select count(id)' in the last select query instead of 'select count(*)' you will get the result as 3. If we don't use the field name in count(distinct) clause, it will not eliminate NULL.

Tom, a value of NULL indicates that the value does not exists and is unknown. For the DISTINCT keyword, null values are considered to be duplicates of each other. When DISTINCT is included in a SELECT statement, only one NULL is returned in the results, regardless of how many null values are encountered.Check the following link:

http://technet.microsoft.com/en-us/library/ms187831(v=sql.105).aspx

In case of count(DISTINCT), it returns the number of unique non null values. See the following link for count(DISTINCT):

http://technet.microsoft.com/en-us/library/ms175997.aspx