Multiple count() in a single SQL

  • siva 20997 (3/6/2012)


    If effeincey was the objective the C++ lanaguage would have been kicked in the first place. All object oriented lanaguages added so much to burden for the application. But we still pursued here we are today we use nothing but Object oriented code in every language. If Microsoft thought that this 386 Intel chip is useless for Windows we would have never had the windows.

    We are talking about SQL here not programming languages.

    I want to address the point of scalability. I think of scalability has reaching the masses. Not increasing the size.

    When you increase the number of users you increase the amount of data. This is when slow processes become painfully obvious. This is why t-sql needs to be scalable. What will work with 10k rows may be horrible with 100k rows.

    Microsoft wanted to kill Access in favour of SQLExpress far back as year 2000. It still has not managed it. Only by developing system entireley in SQL then you can shift to millions and then put your own front end on it be it chinese or english write in ASP,PHP, C or VB. Just like changing the skin on a mobile phone. Just like people use paypal functions to post payments use TSQL stored procedure to post invoices etc. No need to undertand how it does it or check the data. because it is packaged and shipped as such.

    HUH???

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Some food for thought.......

    100M row table (100 MILLION)

    SQL2008R2 12GB RAM

    SELECT TOP 100000000 ---- 100 MILLION rows .....

    ID = IDENTITY(INT, 1, 1),

    igroup = 1 + CAST(Abs(Checksum(Newid()) %4 ) AS INT)

    INTO SivaTest

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    CROSS JOIN sys.all_columns ac4

    ALTER TABLE [dbo].[SivaTest] ADD CONSTRAINT [PKst_] PRIMARY KEY CLUSTERED ([ID] ASC)

    GO

    CREATE NONCLUSTERED INDEX [ix] ON [dbo].[SivaTest]

    ([igroup] ASC)

    ********* HowardW's COUNT Method *********

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 17, logical reads 174192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 62977 ms, elapsed time = 4395 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    ------------------------------------------------------------------------------------------------------------------------------------------------------

    ********* Siva's WITH & PIVOT Method *********

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 34, logical reads 349864, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 47955 ms, elapsed time = 3556 ms.

    edit execution plan attached

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • are you saying my method was faster even with the UNION ?

  • siva 20997 (3/6/2012)


    are you saying my method was faster even with the UNION ?

    Why are you obsessed with speed? Did you not see the reads and scans?

    Jared
    CE - Microsoft

  • As I said before my background is from else where

    Can you explin to me what the stats shows ?

  • siva 20997 (3/6/2012)


    As I said before my background is from else where

    Can you explin to me what the stats shows ?

    It means that your method doubled the logical reads and the scans. So this statement "Also I do not beilve as someone said that I am scanning the databse twice etc." is false. These things aren't opinions, they are facts that you are refusing to believe. Whether you believe them or not, they are in fact true.

    Jared
    CE - Microsoft

  • What I would like to know Livingstone is if everything imroves if we take the UNION out

    I think then it will half in time

    If it did that i can suggest a differnt way of working the total again based on cascading routines

  • What I meant was that it didnt read it twice because of my Group and Pivot policy

    It read twice because of the UNION

    if Livinstone will let me have the stats after removing the union and the Total column I can improce on this by improving the Pivot code or cascading to anther level with CTE

  • siva 20997 (3/6/2012)


    What I would like to know Livingstone is if everything imroves if we take the UNION out

    I think then it will half in time

    If it did that i can suggest a differnt way of working the total again based on cascading routines

    Siva...do you have access to SQL at the moment?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • yes I have but I dont know how to get the stats etc

    I get your point.

    I will try

  • siva 20997 (3/6/2012)


    yes I have but I dont know how to get the stats etc

    I get your point.

    I will try

    cut and paste the code into a SSMS query window...suggest you change the code to 1M rows (will be faster)

    run the querys and look at the "messages" tab in the results window.

    if not clear post back....good luck

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • After the test Livingstone did I changed my code so that it does not scan twice and this is the code

    With GroupCTE(iGroup,nCount) as

    (SELECT igroup,COUNT(id) FROM SivaTest group by igroup),

    PivotedCTE(Group1,Group2,Group3,Group4) as

    (select coalesce([1],0) as Group1,coalesce([2],0) as Group2,coalesce([3],0) as Group3,coalesce([4],0) as Group4 from

    (Select iGroup,nCount from groupCTE)

    DataTable Pivot

    (Sum (nCount)

    for iGroup in ([1],[2],[3],[4])

    ) as PivotTable)

    Select (Group1+Group2+Group3+Group4) as Total,Group1,Group2,Group3,Group4 From PivotedCTE

    with that I tested 10 million rows. The timings I got are

    ************************* Howards Method *********************

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 1, logical reads 17347, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3782 ms, elapsed time = 3779 ms.

    *************************** With Siva's new method ****************

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 1, logical reads 17347, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1562 ms, elapsed time = 1556 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    I have my own theorey whay this is so fast, but I like others comment first P

  • After I posted the last bit I had ideas of improving the code like

    With PivotedCTE(Group1,Group2,Group3,Group4) as

    (select coalesce([1],0) as Group1,coalesce([2],0) as Group2,coalesce([3],0) as Group3,coalesce([4],0) as Group4 from

    (Select iGroup,id from SivaTest)

    DataTable Pivot

    (count (id)

    for iGroup in ([1],[2],[3],[4])

    ) as PivotTable)

    Select (Group1+Group2+Group3+Group4) as Total,Group1,Group2,Group3,Group4 From PivotedCTE

    However the Time doubled. This again confirms a theorey I have above parralell processing but someone looking at the execution plan might confirm that. Once confirmed I will write what I think

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 1, logical reads 17347, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3875 ms, elapsed time = 3882 ms.

  • With this, below, I have to agree with Jared. It tells us that you are unwilling or unable to question your own processes. That even if shown a better method you won't except it.

    SQLKnowItAll (3/6/2012)


    siva 20997 (3/6/2012)


    I may be defensive but I am commited to my methods. No one is going to changing that.

    There's your problem. Good luck to you.

    But seeing this tells us that you may have seen the light. The reason to challenge what you yourself have written. A sudden willingness to start looking at alternatives. By doing this, you are now going to be able to grow further in your knowledge and skill in developing solid, high performance, scalable code. Keep going in this direction.

    siva 20997 (3/6/2012)


    After the test Livingstone did I changed my code so that it does not scan twice and this is the code

    With GroupCTE(iGroup,nCount) as

    (SELECT igroup,COUNT(id) FROM SivaTest group by igroup),

    PivotedCTE(Group1,Group2,Group3,Group4) as

    (select coalesce([1],0) as Group1,coalesce([2],0) as Group2,coalesce([3],0) as Group3,coalesce([4],0) as Group4 from

    (Select iGroup,nCount from groupCTE)

    DataTable Pivot

    (Sum (nCount)

    for iGroup in ([1],[2],[3],[4])

    ) as PivotTable)

    Select (Group1+Group2+Group3+Group4) as Total,Group1,Group2,Group3,Group4 From PivotedCTE

    with that I tested 10 million rows. The timings I got are

    ************************* Howards Method *********************

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 1, logical reads 17347, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3782 ms, elapsed time = 3779 ms.

    *************************** With Siva's new method ****************

    (1 row(s) affected)

    Table 'SivaTest'. Scan count 1, logical reads 17347, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 1562 ms, elapsed time = 1556 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    I have my own theorey whay this is so fast, but I like others comment first P

  • I just know I'm going to regret getting involved with this one. 😛 Testing in progress...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 31 through 45 (of 56 total)

You must be logged in to reply to this topic. Login to reply