find the maximum length of sequences in a column

  • I have a table as follows:

    date (uk format) result

    01/01/2010 -1

    02/01/2010 -1

    03/01/2010 1

    04/01/2010 1

    05/01/2010 -1

    06/01/2010 -1

    07/01/2010 -1

    08/01/2010 -1

    09/01/2010 1

    etc

    date is sequential and result is only 1 or -1

    I want to find the longest sequence of 1's. Same for -1's

    Using above date, for 1's it's 2 (03-04/01/2010) and for -1's it's 4 (05-08/01/2010).

    There are thousands of rows and the 1's & -1's are totally random

    I've tried using various ranking/partition functions but to no avail :crazy:

    Seems so simple but it's driving me nuts

    Any suggestions?

    Thanks

  • looking at it but my first try didn't make sense;

    here's sample data for other posters to fiddle with:

    ;with MySampleData (TheDate,TheValue)

    AS

    (

    SELECT CONVERT(datetime,'20100101'),'-1' UNION ALL

    SELECT '20100201','-1' UNION ALL

    SELECT '20100301','1' UNION ALL

    SELECT '20100401','1' UNION ALL

    SELECT '20100501','-1' UNION ALL

    SELECT '20100601','-1' UNION ALL

    SELECT '20100701','-1' UNION ALL

    SELECT '20100801','-1' UNION ALL

    SELECT '20100901','1'

    )

    SELECT

    --ROW_NUMBER() OVER (PARTITION BY TheValue ORDER BY TheDate) AS RW,

    MySampleData.*

    FROM MySampleData

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • A suggestion .. read this article by Jeff Moden, it just may be what you are looking for:

    http://qa.sqlservercentral.com/articles/T-SQL/71550/

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Oh,

    and just to make life even more interesting I forgot to say the dates are only ever Monday through Friday so a sequence of 1's may have to account for a hole in the dates for weekends. ie a 1 on a friday is in the same range as a 1 on the following monday.

    Ouch!!!

  • bitbucket-25253 (9/15/2011)


    A suggestion .. read this article by Jeff Moden, it just may be what you are looking for:

    http://qa.sqlservercentral.com/articles/T-SQL/71550/

    +1 for this as you have a somewhat classic gaps & islands problem. Adding the weekend gap does complicate it some.

  • Here is one way to tackle the problem - not yet including the weekend aspect....

    IF OBJECT_ID('tempdb..#tagged') IS NOT NULL

    DROP TABLE #tagged

    IF OBJECT_ID('tempdb..#tagged') IS NOT NULL

    DROP TABLE #counted

    ;WITH MySampleData (TheDate,TheValue)

    AS

    (

    SELECT CONVERT(DATETIME,'20100101'),'-1' UNION ALL

    SELECT '20100102','-1' UNION ALL

    SELECT '20100103','1' UNION ALL

    SELECT '20100104','1' UNION ALL

    SELECT '20100105','-1' UNION ALL

    SELECT '20100106','-1' UNION ALL

    SELECT '20100107','-1' UNION ALL

    SELECT '20100108','-1' UNION ALL

    SELECT '20100109','1'

    )

    SELECT

    MySampleData.*,

    ROW_NUMBER() OVER(ORDER BY TheDate) -

    ROW_NUMBER() OVER(PARTITION BY TheValue ORDER BY TheDate) AS group_tag

    INTO #tagged

    FROM MySampleData

    SELECT TheValue

    , group_tag

    , COUNT(*) group_count

    INTO

    #counted

    FROM

    #tagged

    GROUP BY

    TheValue

    , group_tag

    SELECT TheValue

    , MAX(group_count)

    FROM

    #counted

    GROUP BY

    TheValue

    For the weekend problem, will you have dates in the table with maybe a zero in TheValue or will weekend dates not be there at all?

    Also, ignoring weekends for the moment, will you have a row for every date in the range or will there be gaps that break the chains? So if a date is missing it is equivalent to a break in a sequence, even if the values either side are the same?

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Thanks to those who reminded me that this is ultimately just a gaps & island problem.

    In reality I was only interested in Fridays so all the dates are 7 days apart.

    The trick is to generate a reference table of all possible dates together with a sequential row_number and then join that to my actual table.

    Please see attachment.

    Any performance improvements appreciated 🙂

    Lotusnotes

  • Viewing 7 posts - 1 through 6 (of 6 total)

    You must be logged in to reply to this topic. Login to reply