The "Numbers" or "Tally" Table: What it is and how it replaces a loop.

  • It looks like your code as written fails if it encounters February 28 in a non leap year. (when I did the rbar approach using a cursor, I found the row it failed at was 2/28/09).

  • Additionally, cutting and pasting your last example into my ssms and running it as is

    --===== Presets

    DECLARE @DateStart DATETIME

    DECLARE @DateEnd DATETIME

    SELECT @DateStart = '2008-01-01 06:00',

    @DateEnd = DATEADD(yy,5,@DateStart)

    --===== Display the shift number and date/times

    SELECT (t.N-1)%3 + 1 AS ShiftNumber,

    DATEADD(hh,8*(t.N-1),@DateStart) AS ShiftStart,

    DATEADD(hh,8*(t.N ),@DateStart) AS ShiftEnd

    FROM dbo.Tally t

    WHERE DATEADD(hh,8*(t.N-1),@DateStart) <= @DateEnd

    resulted in 4335 rows , and the following message:

    Msg 517, Level 16, State 1, Line 9

    Adding a value to a 'datetime' column caused overflow.

  • DECLARE @DateStart DATETIME

    DECLARE @DateEnd DATETIME

    --===== Find the min and max dates in the range of data

    SELECT @DateStart = MIN(SalesDate),

    @DateEnd = MAX(SalesDate)

    FROM dbo.vwTableau_GP_Shipments

    SELECT t.N-1+@DateStart AS ShippedDate

    FROM tempdb.dbo.Tally t

    WHERE t.N-1+@DateStart <= @DateEnd

    What values end up being used for the start and end date variables in the code above?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I also need to know how many rows you actually have in your Tally table, please.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • DavidL (9/24/2009)


    Additionally, cutting and pasting your last example into my ssms and running it as is

    --===== Presets

    DECLARE @DateStart DATETIME

    DECLARE @DateEnd DATETIME

    SELECT @DateStart = '2008-01-01 06:00',

    @DateEnd = DATEADD(yy,5,@DateStart)

    --===== Display the shift number and date/times

    SELECT (t.N-1)%3 + 1 AS ShiftNumber,

    DATEADD(hh,8*(t.N-1),@DateStart) AS ShiftStart,

    DATEADD(hh,8*(t.N ),@DateStart) AS ShiftEnd

    FROM dbo.Tally t

    WHERE DATEADD(hh,8*(t.N-1),@DateStart) <= @DateEnd

    resulted in 4335 rows , and the following message:

    Msg 517, Level 16, State 1, Line 9

    Adding a value to a 'datetime' column caused overflow.

    I'm not quite sure what's happening on your machine. I just tested this code on 3 different machines (copied and pasted from above) and it worked just fine for all rows.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff:

    Jan 3 2000 12:00AM is the beginning date

    Sep 22 2009 12:00AM is the ending date.

    d lewis

  • :blush:

    Well, it is a large one (if a little is good, big is better, no?)

    Before I even get to the answer, let me try and explain away my ignorance.

    It's a slow day here and I got carried away with the cross join testing when creating the table. I worked my way up to (ahem) 16m rows, and figured it would be enough. There is a little bit of 'reasoning' behind that in that I have a few db's that gather production line data from plc's and end up storing masses of data. I don't yet know how detailed of info I need but it may turn out I need by the second, etc. 30 years of seconds = 15.7m. So that's where I decided to stop.

    Not much of a reason, but I guess it really means I don't yet know how to use a tally table (hence my experiments).

    So, if you please, could you explain why having too many rows in the tally table will cause your code to error out? It seems that the range of the result sets are limited by the start and end dates. BTW, thanks for all your articles! d. lewis

  • In case this is relevant, the tally table was created in tempdb as a permanent table (please comment if this is not a good thing to do), using the script from your article:

    --=============================================================================

    -- Uncomment the "use TempDB" to run this.

    -- From http://qa.sqlservercentral.com/articles/T-SQL/62867/

    -- Article by Jeff Moden 5/7/08

    --=============================================================================

    --USE TempDB --DB that everyone has where we can cause no harm

    SET NOCOUNT ON --Supress the auto-display of rowcounts for appearance/speed

    DECLARE @StartTime DATETIME --Timer to measure total duration

    SET @StartTime = GETDATE() --Start the timer

    --=============================================================================

    -- Create and populate a Tally table

    --=============================================================================

    --===== Conditionally drop

    IF OBJECT_ID('dbo.Tally') IS NOT NULL

    DROP TABLE dbo.Tally

    --===== Create and populate the Tally table on the fly

    SELECT TOP 16000000 --equates to more than 30 years of dates to the second

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Let the public use it

    GRANT SELECT, REFERENCES ON dbo.Tally TO PUBLIC

    --===== Display the total duration

    SELECT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Milliseconds duration'

    I asked our sysadmin to test the code on his machine, and he got the same error as I did.

  • I ran the following select statement on the tally table created by the above script:

    select * from tempdb.dbo.tally where n<=100000

    And up to N=4354, the 'row number' is identical to 'N'. At row number 4355, however, N jumps to 20527. It looks to me that the tally table isn't storing the values I inserted in ascending order (or at least it is not returning them in that order).

    It looks a little suspicious that

    SELECT (t.N-1)%3 + 1 AS ShiftNumber,

    DATEADD(hh,8*(t.N-1),@DateStart) AS ShiftStart,

    DATEADD(hh,8*(t.N ),@DateStart) AS ShiftEnd

    FROM tempdb.dbo.Tally t

    WHERE DATEADD(hh,8*(t.N-1),@DateStart) <= @DateEnd

    failed at row 4336 (very close to the point in the above query where the N value jumps.

    I will drop and rebuild the tally table with fewer values and see what happens. Regards, d lewis

  • DavidL (9/24/2009)


    Additionally, cutting and pasting your last example into my ssms and running it as is

    --===== Presets

    DECLARE @DateStart DATETIME

    DECLARE @DateEnd DATETIME

    SELECT @DateStart = '2008-01-01 06:00',

    @DateEnd = DATEADD(yy,5,@DateStart)

    --===== Display the shift number and date/times

    SELECT (t.N-1)%3 + 1 AS ShiftNumber,

    DATEADD(hh,8*(t.N-1),@DateStart) AS ShiftStart,

    DATEADD(hh,8*(t.N ),@DateStart) AS ShiftEnd

    FROM dbo.Tally t

    WHERE DATEADD(hh,8*(t.N-1),@DateStart) <= @DateEnd

    resulted in 4335 rows , and the following message:

    Msg 517, Level 16, State 1, Line 9

    Adding a value to a 'datetime' column caused overflow.

    Hi David,

    The error occurs because SQL Server has to calculate the result of the DATEADD expression for each row in the Tally table, and with your extra huge amount of rows, this causes overflow at some point. But even without that, you are ruining performance.

    Try changing the WHERE clause to

    WHERE t.N <= DATEDIFF(hh, @DateStart, @DateEnd) / 8 + 1

    Oh and by the way - creating the tally table in tempdb is not a good option. The tempdb database is dropped and recreated every time the server restarts. You should either create it in a permanent database (maybe you have a single central database for objects used by all other databases?), or you can create it in the model database so that it will henceforth exist in every newly created database (including tempdb after every server restart).


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • DavidL (9/24/2009)


    It looks to me that the tally table isn't storing the values I inserted in ascending order (or at least it is not returning them in that order).

    No surprise there. You didn't have an ORDER BY on the query, so SQL Server is free to return the rows in any order you want it to.

    I didn't check the code you used to create and fill the tally table, so I have no idea if these numbers might be missing or were simply returned somewhere else in the return set.

    By the way - do you have a PRIMARY KEY on your tally table? If so, did you use the default clustered index to support it, or did you specify a nonclustered one?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • A reasonably sized (12000 rows) tally table took care of

    DECLARE @DateStart DATETIME

    DECLARE @DateEnd DATETIME

    --===== Find the min and max dates in the range of data

    SELECT @DateStart = MIN(SalesDate),

    @DateEnd = MAX(SalesDate)

    FROM dbo.vwTableau_GP_Shipments

    PRINT CONVERT(varchar(25),@datestart)

    PRINT CONVERT(varchar(25),@dateend)

    SELECT t.N-1+@DateStart AS ShippedDate

    FROM tempdb.dbo.Tally t

    WHERE t.N-1+@DateStart <= @DateEnd

    not running correctly.

  • The code used to create the tally table was cut and pasted from Jeff's article (with a few of my comments in the header):

    --=============================================================================

    -- Uncomment the "use TempDB" to run this.

    -- From http://qa.sqlservercentral.com/articles/T-SQL/62867/

    -- Article by Jeff Moden 5/7/08

    --=============================================================================

    USE TempDB --DB that everyone has where we can cause no harm

    SET NOCOUNT ON --Supress the auto-display of rowcounts for appearance/speed

    DECLARE @StartTime DATETIME --Timer to measure total duration

    SET @StartTime = GETDATE() --Start the timer

    --=============================================================================

    -- Create and populate a Tally table

    --=============================================================================

    --===== Conditionally drop

    IF OBJECT_ID('dbo.Tally') IS NOT NULL

    DROP TABLE dbo.Tally

    --===== Create and populate the Tally table on the fly

    SELECT TOP 15000 --equates to more than 30 years of dates

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Let the public use it

    GRANT SELECT, REFERENCES ON dbo.Tally TO PUBLIC

    --===== Display the total duration

    SELECT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Milliseconds duration'

    It appears he didn't allow for (ahem) unorthodox applications of it -- you may not have read my earlier post, but the tally table I eventually ended up using was massive. When I dropped and recreated it the sample queries from his article ran correctly.

    Thanks for the comment about where to keep the tally table. I'll create in a centrally available db. Regards, d. lewis

  • Hi David,

    Based on visual inspection of your code, I fail to see any way that this code could cause "gaps" in the tally table. I think what you saw was the result of the clustered index scan being partitioned over multiple CPU cores.

    What can happen with the code is that you get less than the 15,000 rows you expect. This depends on the number of rows in syscolumns. You use a cross join of that table to itself; the result of that will have an amount of rows equal to the square of the amount in syscolumns. So if syscolumns has less than 123 rows (the square root of 15,000, rounded up), the tally table will have less than 15,000 rows.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (9/24/2009)


    Hi David,

    Based on visual inspection of your code, I fail to see any way that this code could cause "gaps" in the tally table. I think what you saw was the result of the clustered index scan being partitioned over multiple CPU cores.

    What can happen with the code is that you get less than the 15,000 rows you expect. This depends on the number of rows in syscolumns. You use a cross join of that table to itself; the result of that will have an amount of rows equal to the square of the amount in syscolumns. So if syscolumns has less than 123 rows (the square root of 15,000, rounded up), the tally table will have less than 15,000 rows.

    Nah... That's not it, Hugo. Look at the code... it's using Master.dbo.SysColumns. Even a brand new server will have more than 4,000 entries there.

    I've actually figure out what the problem is and I'm doing a write up on it now. I'll be back soon.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 256 through 270 (of 497 total)

You must be logged in to reply to this topic. Login to reply