Count(*) Question

  • I've heard that Count(*) is a bad practice...

    Can anyone tell me why?

    Thanks

  • You should use SELECT COUNT(some_column) instead. If you use the *-sign you get much more I/O.

  • SQL Server has to count each and every row when you use COUNT(*). But the same data is kept elsewhere.

    Test this:

    SELECT rowcnt

    FROM sysindexes

    WHERE id = (SELECT id

                      FROM sysobjects

                      WHERE name = 'yourtablename')

    Obviously, replace yourtablename with the name of the table you want the row count from. Run it in the database that has the table you want to check and run DBCC UPDATEUSAGE ('yourdbname', 'yourtablename') first - that will update sysindexes.

    -SQLBill

  • Oh, forgot about that one. That's a better way of doing it if you just want to count all the rows i the table. But if you need a WHERE clause as well...

  • OOOPPPS ... I left out something important.

    It should have been:

    SELECT rowcnt

    FROM sysindexes

    WHERE id = (SELECT id

                      FROM sysobjects

                      WHERE name = 'yourtablename')

        AND indid < 2

    I ran the above script against one of my huge tables and it returned: 87274758 in less than one second. Running SELECT COUNT(*) against the same table gave the same result but it took longer (1 minute 1 second).

    Argument against COUNT(*)....takes a while.

    Argument for COUNT(*)....it will give the count of the rows at the moment it is run.

    Argument for the rowcnt version...very quick

    Argument against the rowcnt version...it is not up-to-the-second accurate. If your database is constantly getting new rows, this is only as accurate as the last DBCC UPDATEUSAGE.

    Bottom line: What do you need to know?

    The amount of rows at this second?

       -use COUNT(*)

    The amount of rows and there's not a lot of rows being inserted?

      -use DBCC UPDATEUSAGE followed by the rowcnt script.

    -SQLBill

  • Why does SELECT COUNT(*) take more I/O than SELECT COUNT(some_column) ? One would think it still has to step through each row to count.

     

  • No idea. I just know because I asked the same question about a year ago and someone smarter told me that.

    Damn, I never get the chance to shine

  • Also, depending on ANSI settings, a COUNT(*) can cause NULL entries to be excluded from the COUNT.

    --------------------
    Colt 45 - the original point and click interface

  • COUNT(*) counts all rows in the table for both base tables and views. The sysindexes query works for base tables and not views.

    Also, COUNT(Some_Column) may return a lower number than COUNT(*). BOL states that COUNT(expression) evaluates the expression for each row in a group and returns the number of nonnull values. So be careful which column you choose to count - make sure you use a required column or better still the primary key for the table.

    Keith

  • Count(some_column) does not work with certain datatypes eg. uniqueidentifier

  • sp_spaceused tablename will give the no of rows - much faster

  • sp_spaceused can also give the "possible innaccurate" value from sysindexes,

    unless the paramter @updateusage is set true.

    The belief that count(*) is bad practice and that you should use count(0) or count(column_name) instead is a myth.

    Maybe on some other dbsystems or earlier versions of sql server this was true, but on sql server 2000 it simply is not true.

    If you use COUNT(column_name) on a column that allows NULL than it will not count the NULL values, and it will be slower than COUNT(*) because sql server has to weed out the NULL values.

    Compare execution plans if you are not convinced.

    /rockmoose


    You must unlearn what You have learnt

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply