Splitting a date range into multiple rows.Is a loop/cursor needed?

  • I have a need to process records with a date range and hours associated with them and insert a record for each day of the range into a different table with the appropriate hours. I have come up with the following WHILE Loop option but am wondering if this is the best approach.

    This process only processes at most a few hundred records at a time. Only certain records will have a date range so an after thought to the code below would be to use a set based operation to insert all the single day records first and then process the ranges. Is there a better approach than the one I have below to process these date ranges?

    Any input would be greatly appreciated.

    Thanks,

    Chad

    -- Drop tables if they exist

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[TimeoffSource]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

    DROP TABLE [dbo].[TimeoffSource]

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[TimeoffDestination]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

    DROP TABLE [dbo].[TimeoffDestination]

    -- Create tables to work with / Source and Destination

    CREATE TABLE dbo.TimeoffSource (ID INT IDENTITY(1,1), employeeid INT, startdate SMALLDATETIME,

    enddate SMALLDATETIME, hours INT, processed BIT)

    GO

    CREATE TABLE dbo.TimeoffDestination (ID INT, seq INT, employeeid INT, postdate SMALLDATETIME, hours INT)

    GO

    -- Insert test data

    INSERT TimeoffSource VALUES (111, '2/16/2009', '2/18/2009', 24, 0)

    INSERT TimeoffSource VALUES (222, '2/19/2009', '2/19/2009', 4, 0)

    INSERT TimeoffSource VALUES (333, '3/2/2009', '3/6/2009', 40, 0)

    INSERT TimeoffSource VALUES (444, '2/12/2009', '2/12/2009', 8, 0)

    -- View test data that needs to be worked

    SELECT * FROM TimeoffSource

    DECLARE @RecordCount INT, @Count INT

    DECLARE @ID INT, @empid INT, @start SMALLDATETIME, @end SMALLDATETIME, @hours INT, @dayhours INT

    -- Get number of rows to work with

    SELECT @RecordCount = count(*)

    FROM TimeoffSource

    -- Loop through each record

    WHILE @RecordCount > 0

    BEGIN

    SELECT TOP 1 @ID = ID, @empid = employeeid, @start = startdate, @end = enddate, @hours = hours

    FROM TimeoffSource

    WHERE processed = 0

    ORDER BY ID

    -- Get number of days in range

    SET @Count = DATEDIFF(d, @start, @end) + 1

    -- Get hours for each day

    SET @dayhours = @hours / @Count

    -- Based off date range we will insert a record for each day

    WHILE @Count > 0

    BEGIN

    INSERT TimeoffDestination VALUES (@ID, @Count, @empid, DATEADD(d, @Count-1, @start) , @dayhours)

    SET @Count = @Count-1

    CONTINUE

    END

    SET @RecordCount = @RecordCount-1

    UPDATE TimeoffSource

    SET Processed = 1

    WHERE ID = @ID

    CONTINUE

    END

    -- View final data

    SELECT * FROM TimeoffDestination

    ORDER BY id, seq

  • Does this query give you what you need?

    It appears to replicate the output from your script.

    A couple of points to note:

    1) I've used floating point division to calculate the hours. If you use integer division you might "lose" a few hours due to integer truncation.

    2) You need a Tally table of consecutive integers starting from 1 and including at least as many records as the largest date range in your source data.

    SELECT X.Id, seq = T.N, X.employeeid, postdate = DATEADD(day, T.N - 1, X.startdate), hours = X.hours / X.span

    FROM (

    SELECT Id, employeeid, startdate, span = DATEDIFF(day, startdate, enddate) + 1, hours = CONVERT(float, hours)

    FROM TimeoffSource

    ) X

    JOIN Tally T ON (T.N >= 1 AND T.N <= X.span)

    ORDER BY id, seq

    or similarly:

    SELECT X.Id, seq = T.N, X.employeeid, postdate = DATEADD(day, T.N - 1, X.startdate), hours = CONVERT(float, X.hours) / (DATEDIFF(day, X.startdate, X.enddate) + 1)

    FROM TimeoffSource X

    JOIN Tally T ON (T.N >= 1 AND T.N <= (DATEDIFF(day, X.startdate, X.enddate) + 1))

    ORDER BY X.Id, T.N

  • Thanks for your quick reply! Your code works great based off what I asked for. After I posted it I also thought about the truncation of hours but have been told that the front end system requires the hours to match their full day hours if it is a date range. In any case, I can see where this would still be good to build in case they change their minds in the future.

    Unfortunately, as with all things it seems, accurate requirements don't always get delivered to us up front. I found out a special circumstance concerning weekends that I need to address and would love any input anyone could offer. If they are entering a single day off request then it can be any day of the week (including a Saturday or Sunday). But, if they enter a date range and it overlaps a Saturday and/or Sunday then the hours for those days are excluded and I will not write a record for that day even though they fall in the range. They also will not count towards dividing the hours.

    So if we take my original code and included these additional Insert statements:

    INSERT TimeoffSource VALUES (555, '2/12/2009', '2/17/2009', 32, 0)

    INSERT TimeoffSource VALUES (666, '2/15/2009', '2/15/2009', 8, 0)

    These will test this case. The 555 employee is really taking off Thursday, Friday, Monday, and Tuesday so the hours are 8 hours each day (total 32). The 666 employee is actually taking the Sunday off (8 hours).

    Any thoughts on how I would handle this situation?

    A big thanks in advance for your help with this.

    Thanks,

    Chad

  • What about national holidays? Should they be treated like Saturdays and Sundays? If so, you will need to use a calendar table.

    A calendar table might be the best option even if national holidays don't need to be considered.

  • I could actually see it going either way. I know the Front-end system currently does not do anything special for holidays that fall in a date range but that could always change. So I would agree that a calendar table makes more sense.

  • Well I didn't make it as far with this as I wanted by now and leave on vacation tomorrow. I don't have a problem excluding the dates (either Sat/Sun) or using a Calendar table. But I am having a problem in dividing up the hours accordingly since this is based off the actual number of days I am writing out. Any thoughts on this?

    Chad

  • Does this do what you want?

    The first SELECT includes a row for each single-day row in the source data regardless of the weekday.

    The second SELECT in the UNION uses a Calendar table to filter out days that shouldn't be counted, such as Saturdays and Sundays, for rows in the source data where startdate < enddate.

    SELECT T.Id,

    seq = 1,

    T.employeeid,

    postdate = T.startdate,

    hours = CONVERT(float, T.hours)

    FROM TimeoffSource T

    WHERE (T.startdate = T.enddate)

    UNION ALL

    SELECT T.Id,

    seq = (SELECT COUNT(*) FROM Calendar CC

    WHERE (CC.[Date] >= T.startdate AND CC.[Date] <= C.[Date])),

    T.employeeid,

    postdate = C.[Date],

    hours = CONVERT(float, T.hours)

    / (SELECT COUNT(*) FROM Calendar CC

    WHERE (CC.[Date] >= T.startdate AND CC.[Date] <= T.enddate))

    FROM TimeoffSource T

    JOIN Calendar C ON (C.[Date] >= T.startdate AND C.[Date] <= T.enddate)

    WHERE (T.StartDate < T.EndDate)

    ORDER BY Id, seq

  • Andrew,

    Thank you for your help with this. I was stuck on trying to figure out the hours until you sent your last code. It helps to see someone else's ideas. 🙂

    Your code worked great to meet this need but I noticed that it required all dates to be in the Calendar table other than those that needed to be skipped. I decided to pursue it where I only need to enter the dates I want to skip. I also specifically coded for skipping Sat/Sun so they do not have to be in the table and I combined the ID and seq fields together to give me a unique value in one field (another requirement). Below is our combined effort. Would you mind taking a look at let me know if there is anything you would change about it if it was all yours?

    Thanks again for all the valuable input and examples you provided. It is great to learn and see that I did not need a loop or cursor to do this.

    Thanks,

    Chad

    -- Get All Single Day Entries

    SELECT CONVERT(INT,(CAST(T.Id AS VARCHAR(20)) + '01')) AS [ID],

    T.employeeid,

    postdate = T.startdate,

    weekday = DATEPART(dw,T.startdate),

    hours = CONVERT(float, T.hours)

    FROMTimeoffSource T

    WHERE(T.startdate = T.enddate)

    UNION ALL

    -- Get Ranges but exclude Saturday and Sunday

    SELECTCONVERT(INT,(CAST(T.Id AS VARCHAR(20)) + RIGHT('00' + CAST(ty.N AS VARCHAR(2)), 2))) AS ID,

    T.employeeid,

    postdate = DATEADD(day, ty.N - 1, T.startdate),

    weekday = DATEPART(dw,DATEADD(day, ty.N - 1, T.startdate)),

    hours = CONVERT(float, T.hours)

    / (SELECT COUNT(*) FROM TimeoffSource T1

    JOIN Tally ty2 ON (ty2.N >= 1 AND ty2.N <= (DATEDIFF(day, T1.startdate, T1.enddate) + 1))

    WHERE (T1.StartDate < T1.EndDate)

    AND T1.[ID] = T.[ID]

    AND DATEADD(day, ty2.N - 1, T1.startdate) NOT IN (SELECT [Date] FROM Calendar2)

    AND DATEPART(dw,DATEADD(day, ty2.N - 1, T1.startdate)) NOT IN (6,7) --DATEFIRST is set to 1 (Monday)

    )

    FROM TimeoffSource T

    JOIN Tally ty ON (ty.N >= 1 AND ty.N <= (DATEDIFF(day, T.startdate, T.enddate) + 1))

    WHERE (T.StartDate < T.EndDate)

    AND DATEADD(day, ty.N - 1, T.startdate) NOT IN (SELECT [Date] FROM Calendar2)

    AND DATEPART(dw,DATEADD(day, ty.N - 1, T.startdate)) NOT IN (6,7)

    ORDER BY ID

  • Looks OK to me.

    Presumably the number of consecutive days that an employee can take off is never going to exceed 100, so you won't run into problems with your synthetically generated ID column?

    I wouldn't bother with the casting to and from the varchar datatype to generate the ID column - just use integer arithmetic.

    --SELECT CONVERT(INT,(CAST(T.Id AS VARCHAR(20)) + '01')) AS [ID],

    SELECT T.Id * 100 + 1 AS [ID], ...

    --SELECT CONVERT(INT,(CAST(T.Id AS VARCHAR(20)) + RIGHT('00' + CAST(ty.N AS VARCHAR(2)), 2))) AS ID

    SELECT T.Id * 100 + ty.N AS [ID], ...

  • It is amazing the little simple things that you can't think of after a very long day. :doze:

    They can only request the days that fall into a pay period which is at most a month. The old VB program that I am replacing with this process was simply sticking the value (no leading zero) on the end causing all types of havoc with the ID field. This way will allow the number to stay unique and keep it together as well.

    Andrew, Thank you so much for your valuable input! It was most helpful.

    Thanks,

    Chad

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply