a very COMPLEX aggregation query

  • Below is code using Grouping Sets SSRS 2008...1 table scan

    DECLARE @Case_Review TABLE

    (Case_ID int, Legal_Omn varchar(30),PPL_Area varchar(30),Revision int

    ,Review_Status varchar(30),Change_Date datetime);

    insert into @Case_Review

    SELECT 220, 'Civil', 'WA_Tenant', 1, 'Proposed', '2012-04-17 12:17'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 2, 'Proposed', '2012-04-17 14:05'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',3, 'Proposed', '2012-04-18 17:03'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',4, 'Proposed', '2012-04-18 17:05'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 5, 'Active', '2012-04-19 12:20'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 6, 'Active', '2012-04-19 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',7, 'Active', '2012-04-20 08:05'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 8, 'Resolved', '2012-05-27 12:21'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',9, 'Resolved', '2012-05-27 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',10, 'Resolved', '2012-05-28 12:00'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',11, 'Resolved', '2012-05-28 12:01'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 1, 'Proposed', '2012-04-17 01:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 2, 'Proposed', '2012-05-29 07:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 3, 'Active', '2012-05-29 12:00'

    UNION ALL SELECT 230,'Civil', 'WA_Traffic', 4, 'Active', '2012-05-29 12:05'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 5, 'Active', '2012-05-29 12:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 6, 'Closed', '2012-06-01 13:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 7, 'Closed', '2012-06-01 13:01'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer', 1, 'Proposed', '2012-04-17 14:00'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer', 2, 'Active', '2012-05-20 14:25'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer',3, 'Active', '2012-07-10 15:00'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer',4, 'Closed', '2012-10-20 15:20';

    SELECT

    Legal_Omn,

    PPL_Area,

    Review_Status,

    '>30 days' = ISNULL(SUM(CASE WHEN Change_Date < DATEADD(d,30,GETDATE()) THEN 1 ELSE NULL END),0),

    '30-60 days' = ISNULL(SUM(CASE WHEN Change_Date BETWEEN DATEADD(d,30,GETDATE()) AND DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END),0),

    '>60 days' = ISNULL(SUM(CASE WHEN Change_Date > DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END),0)

    FROM @Case_Review

    GROUP BY GROUPING SETS((Legal_Omn,PPL_Area,Review_Status),(Legal_Omn,PPL_Area),(Legal_Omn))

  • drew, when I ran your code I got error:

    Msg 208, Level 16, State 1, Line 1

    Invalid object name 'CTE'.

    BUT you are correct. Proposed time for WA_Traffic should be 30-60

    --Quote me

  • SQL Padawan, I like! it is very simple. But the numbers are wrong.

    --Quote me

  • After reading this thread I'm not even close to being clear as to what you want but here is my interpretation. If there duplicate review statuses for a ppl_area(tenant)...I choose the earliest date...

    DECLARE @Case_Review TABLE

    (Case_ID int, Legal_Omn varchar(30),PPL_Area varchar(30),Revision int

    ,Review_Status varchar(30),Change_Date datetime);

    insert into @Case_Review

    SELECT 220, 'Civil', 'WA_Tenant', 1, 'Proposed', '2012-04-17 12:17'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 2, 'Proposed', '2012-04-17 14:05'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',3, 'Proposed', '2012-04-18 17:03'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',4, 'Proposed', '2012-04-18 17:05'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 5, 'Active', '2012-04-19 12:20'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 6, 'Active', '2012-04-19 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',7, 'Active', '2012-04-20 08:05'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 8, 'Resolved', '2012-05-27 12:21'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',9, 'Resolved', '2012-05-27 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',10, 'Resolved', '2012-05-28 12:00'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',11, 'Resolved', '2012-05-28 12:01'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 1, 'Proposed', '2012-04-17 01:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 2, 'Proposed', '2012-05-29 07:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 3, 'Active', '2012-05-29 12:00'

    UNION ALL SELECT 230,'Civil', 'WA_Traffic', 4, 'Active', '2012-05-29 12:05'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 5, 'Active', '2012-05-29 12:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 6, 'Closed', '2012-06-01 13:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 7, 'Closed', '2012-06-01 13:01'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer', 1, 'Proposed', '2012-04-17 14:00'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer', 2, 'Active', '2012-05-20 14:25'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer',3, 'Active', '2012-07-10 15:00'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer',4, 'Closed', '2012-10-20 15:20';

    ;WITH CTE

    AS (

    SELECT

    Case_ID,

    Legal_Omn,

    PPL_Area,

    Review_StatusID =

    CASE

    WHEN Review_Status = 'Proposed' THEN 1

    WHEN Review_Status = 'Active' THEN 2

    WHEN Review_Status = 'Resolved' OR Review_Status = 'Closed' THEN 3

    END,

    Change_Date = min(Change_Date)

    FROM @case_review

    GROUP BY

    Case_ID,

    Legal_Omn,

    PPL_Area,

    CASE

    WHEN Review_Status = 'Proposed' THEN 1

    WHEN Review_Status = 'Active' THEN 2

    WHEN Review_Status = 'Resolved' OR Review_Status = 'Closed' THEN 3

    END

    )

    SELECT

    Legal_Omn,

    PPL_Area,

    Review_StatusID =

    CASE

    WHEN Review_StatusID = 1 THEN 'Proposed'

    WHEN Review_StatusID = 2 THEN 'Active'

    ELSE NULL

    END,

    '<30 days' =

    CASE WHEN GROUPING(PPL_Area) = 1 AND GROUPING(review_statusid) = 1 THEN SUM(CASE WHEN review_statusid = 3 AND Change_Date < DATEADD(d,30,GETDATE()) THEN 1 ELSE NULL END)

    WHEN Review_StatusID = 3 THEN NULL

    ELSE ISNULL(SUM(CASE WHEN Change_Date < DATEADD(d,30,GETDATE()) THEN 1 ELSE NULL END),0)

    END,

    '30-60 days' =

    CASE WHEN GROUPING(PPL_Area) = 1 AND GROUPING(review_statusid) = 1 THEN SUM(CASE WHEN review_statusid = 3 AND Change_Date BETWEEN DATEADD(d,30,GETDATE()) AND DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END)

    WHEN Review_StatusID = 3 THEN NULL

    ELSE ISNULL(SUM(CASE WHEN Change_Date BETWEEN DATEADD(d,30,GETDATE()) AND DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END),0)

    END,

    '>60 days' =

    CASE WHEN GROUPING(PPL_Area) = 1 AND GROUPING(review_statusid) = 1 THEN SUM(CASE WHEN review_statusid = 3 AND Change_Date > DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END)

    WHEN Review_StatusID = 3 THEN NULL

    ELSE ISNULL(SUM(CASE WHEN Change_Date > DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END),0)

    END

    FROM CTE

    GROUP BY GROUPING SETS((Legal_Omn,PPL_Area,Review_StatusID),(Legal_Omn))

    ORDER BY Legal_Omn,PPL_Area,Review_StatusID

  • I am going to explain better what is needed. It's all very close but I finally realize what I haven't been saying. Note the asterisked words.

    I need to capture two statistics:

    1) The total count of legal Case_IDs that take a total of <30 days (first column), 30-60 days (second column), and >60 days (third column) to go from Proposed to Resolved status. These numbers are aggregated by Legal_Omn (eg. Civil)

    2) Next, by PPL_Area the *AVG* length of time that case_ids *grouped by* a given PPL_Area sit in Proposed and Active Status. <30days, 30-60days, >60 days (which will show where in the flow cases are sitting the longest).

    *3) the query needs to be able to handle more than three PPL_Areas as there are many more in than I have provided DDL for.

    Updated DDL. I've added two more Legal_Omn for total of three (Civil, Family, Criminal).

    DECLARE @Case_Review TABLE

    (Case_ID int, Legal_Omn varchar(30),PPL_Area varchar(30),Revision int

    ,Review_Status varchar(30),Change_Date datetime);

    insert into @Case_Review

    SELECT 220, 'Civil', 'WA_Tenant', 1, 'Proposed', '2012-04-17 12:17'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 2, 'Proposed', '2012-04-17 14:05'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',3, 'Proposed', '2012-04-18 17:03'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',4, 'Proposed', '2012-04-18 17:05'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 5, 'Active', '2012-04-19 12:20'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 6, 'Active', '2012-04-19 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',7, 'Active', '2012-04-20 08:05'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 8, 'Resolved', '2012-05-27 12:21'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',9, 'Resolved', '2012-05-27 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',10, 'Resolved', '2012-05-28 12:00'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',11, 'Resolved', '2012-05-28 12:01'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 1, 'Proposed', '2012-04-17 01:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 2, 'Proposed', '2012-05-29 07:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 3, 'Active', '2012-05-29 12:00'

    UNION ALL SELECT 230,'Civil', 'WA_Traffic', 4, 'Active', '2012-05-29 12:05'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 5, 'Active', '2012-05-29 12:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 6, 'Closed', '2012-06-01 13:00'

    UNION ALL SELECT 230, 'Civil', 'WA_Traffic', 7, 'Closed', '2012-06-01 13:01'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer', 1, 'Proposed', '2012-04-17 14:00'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer', 2, 'Active', '2012-05-20 14:25'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer',3, 'Active', '2012-07-10 15:00'

    UNION ALL SELECT 260, 'Civil', 'WA_Consumer',4, 'Closed', '2012-10-20 15:20'

    UNION ALL SELECT 270, 'Family', 'WA_Parent_Relocation', 1, 'Proposed', '2012-02-17 01:00'

    UNION ALL SELECT 270, 'Family', 'WA_Parent_Relocation', 2, 'Proposed', '2012-03-29 07:00'

    UNION ALL SELECT 270, 'Family', 'WA_Parent_Relocation', 3, 'Active', '2012-03-29 12:00'

    UNION ALL SELECT 270,'Family', 'WA_Parent_Relocation', 4, 'Active', '2012-03-29 12:05'

    UNION ALL SELECT 270, 'Family', 'WA_Parent_Relocation', 5, 'Active', '2012-03-29 12:00'

    UNION ALL SELECT 270, 'Family', 'WA_Parent_Relocation', 6, 'Closed', '2012-04-01 13:00'

    UNION ALL SELECT 270, 'Family', 'WA_Parent_Relocation', 7, 'Closed', '2012-04-01 13:01'

    UNION ALL SELECT 280, 'Criminal', 'WA_Burglary', 1, 'Proposed', '2012-01-17 14:00'

    UNION ALL SELECT 280, 'Criminal', 'WA_Burglary', 2, 'Active', '2012-04-20 14:25'

    UNION ALL SELECT 280, 'Criminal', 'WA_Burglary',3, 'Active', '2012-05-10 15:00'

    UNION ALL SELECT 280, 'Criminal', 'WA_Burglary',4, 'Closed', '2012-06-20 15:20';

    select * from @case_review;

    Next I've updated the desired result in an image file.

    --Quote me

  • Based on the explanation,testdata and your desired results...why would WA_TENANT be in the column for 30-60 days for "Active"? Look at your test data:

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 5, 'Active', '2012-04-19 12:20'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 6, 'Active', '2012-04-19 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',7, 'Active', '2012-04-20 08:05'

  • SQL Padawan (4/19/2012)


    Based on the explanation,testdata and your desired results...why would WA_TENANT be in the column for 30-60 days for "Active"? Look at your test data:

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 5, 'Active', '2012-04-19 12:20'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 6, 'Active', '2012-04-19 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',7, 'Active', '2012-04-20 08:05'

    That's really great point. The answer is because we are measuring length of time between when case was opened as Active and when case transition to Closed/Resolved. See, how if we use the first and last Active date we are ignoring that time that elapses between 04-20 and 05-27? In fact the case is active that whole time. That puts it into the 30-60 days column.

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant', 5, 'Active', '2012-04-19 12:20'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 6, 'Active', '2012-04-19 12:45'

    UNION ALL SELECT 220, 'Civil', 'WA_Tenant',7, 'Active', '2012-04-20 08:05'

    UNION ALL SELECT 220,'Civil', 'WA_Tenant', 8, 'Resolved', '2012-05-27 12:21'

    --Quote me

  • Here is my final attempt...You probably want to re-check your test data and desired results to see if everything is sync'd properly.

    ;WITH CTE

    AS (

    SELECT

    c1.Case_ID,

    c1.Legal_Omn,

    c1.PPL_Area,

    Review_Status =

    CASE

    WHEN c1.Review_Status NOT IN ('Proposed','Active') THEN 'Resolved'

    ELSE c1.Review_Status

    END,

    Change_date =

    CASE

    WHEN MIN(c1.change_date) < MIN(DATEADD(mi,-1,c2.change_date)) THEN MIN(DATEADD(mi,-1,c2.change_date))

    ELSE MIN(c1.change_date)

    END

    FROM @case_review c1 LEFT JOIN @Case_Review c2 ON c1.Case_ID = c2.Case_ID

    AND c1.Legal_Omn = c2.Legal_Omn AND c1.PPL_Area = c2.PPL_Area

    AND CASE WHEN c1.Review_Status = 'Proposed' THEN 1 WHEN c1.Review_Status = 'Active' THEN 2 ELSE 3 END + 1

    =

    CASE WHEN c2.Review_Status = 'Proposed' THEN 1 WHEN c2.Review_Status = 'Active' THEN 2 ELSE 3 END

    GROUP BY c1.Case_ID,c1.Legal_Omn,c1.PPL_Area,CASE

    WHEN c1.Review_Status NOT IN ('Proposed','Active') THEN 'Resolved'

    ELSE c1.Review_Status

    END

    )

    SELECT * FROM

    (

    SELECT

    Legal_Omn,

    PPL_Area,

    Review_Status,

    '<30 days' =

    CASE

    WHEN GROUPING(PPL_Area) = 1 AND GROUPING(review_status) = 1 THEN SUM(CASE WHEN review_status = 'Resolved' AND Change_Date < DATEADD(d,30,GETDATE()) THEN 1 ELSE NULL END)

    WHEN Review_Status = 'Resolved' THEN NULL

    ELSE ISNULL(SUM(CASE WHEN Change_Date < DATEADD(d,30,GETDATE()) THEN 1 ELSE NULL END),0)

    END,

    '30-60 days' =

    CASE

    WHEN GROUPING(PPL_Area) = 1 AND GROUPING(review_status) = 1 THEN SUM(CASE WHEN review_status = 'Resolved' AND Change_Date BETWEEN DATEADD(d,30,GETDATE()) AND DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END)

    WHEN Review_Status = 'Resolved' THEN NULL

    ELSE ISNULL(SUM(CASE WHEN Change_Date BETWEEN DATEADD(d,30,GETDATE()) AND DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END),0)

    END,

    '>60 days' =

    CASE

    WHEN GROUPING(PPL_Area) = 1 AND GROUPING(review_status) = 1 THEN SUM(CASE WHEN review_status = 'Resolved' AND Change_Date > DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END)

    WHEN Review_Status = 'Resolved' THEN NULL

    ELSE ISNULL(SUM(CASE WHEN Change_Date > DATEADD(d,60,GETDATE()) THEN 1 ELSE NULL END),0)

    END

    FROM CTE

    GROUP BY GROUPING SETS((Legal_Omn,PPL_Area,Review_Status),(Legal_Omn))

    ) x

    WHERE Review_Status IN ('Active','Proposed') OR Review_Status IS NULL

    ORDER BY Legal_Omn,PPL_Area,CASE Review_Status WHEN 'Proposed' THEN 1 WHEN 'Active' THEN 2 ELSE 3 END

  • I appreciate it. studying it.

    --Quote me

  • Within the CIVIL category the results are all correct.

    Padawan. . .

    WA_Parent_Relocation Propoosed should be 30-60 days. Yours is <30 days.

    WA_Parent_Relocation Active is Correct.

    WA_Burglary Proposed should be >60 days. Yours is <30 days.

    WA_Burglary Active should be 30-60 days. Your is >60 days.

    .......soooo close. :crying:

    --Quote me

  • Sorry about the error in my code. I forgot to copy the entire code.

    Here is an update with changes to provide an average rather than a count. The additional categories were already accounted for.

    WITH CTE AS (

    SELECT Case_ID, Legal_Omn, PPL_Area

    ,DATEDIFF(DAY

    ,MIN(CASE WHEN Review_Status = 'Proposed' THEN Change_Date END)

    ,MIN(CASE WHEN Review_Status = 'Active' THEN Change_Date END)

    ) AS Proposed

    ,DATEDIFF(DAY

    ,MIN(CASE WHEN Review_Status = 'Active' THEN Change_Date END)

    ,MIN(CASE WHEN Review_Status IN ('Resolved', 'Closed') THEN Change_Date END)

    ) AS Active

    FROM @Case_Review

    GROUP BY Case_ID, Legal_Omn, PPL_Area

    )

    SELECT Legal_Omn, PPL_Area, Review_Status

    ,AVG(CASE WHEN Days < 31 THEN Days END) AS [<30]

    ,AVG(CASE WHEN Days BETWEEN 31 AND 60 THEN Days END) AS [30-60]

    ,AVG(CASE WHEN Days > 60 THEN Days END) AS [>60]

    FROM CTE

    CROSS APPLY (

    SELECT 'Proposed' AS Review_Status, Proposed AS Days

    UNION ALL

    SELECT 'Active', Active

    ) AS rs

    GROUP BY Legal_Omn, PPL_Area, Review_Status

    WITH ROLLUP

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Drew, I appreciate your help. Your query does almost everythinig but the numbers appear to represent the number of days a given ticket is in any status and putting THAT number under the respective column.

    Can you tweak so that numbers represent total count of case_ids found to be in the different review_statuses?

    --Quote me

  • polkadot (4/20/2012)


    Drew, I appreciate your help. Your query does almost everythinig but the numbers appear to represent the number of days a given ticket is in any status and putting THAT number under the respective column.

    Can you tweak so that numbers represent total count of case_ids found to be in the different review_statuses?

    See if you can figure it out for yourself. You need to be able to understand the code anyhow, if you're going to support it in production. This will be a good opportunity to see for yourself whether you understand.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Yes, today I will problem solve and compare contrast the 3 queries offered here.

    Drew's is the shortest and looks very elegant. Dwains was a break through and pretty much works except of one little column, and Padawan's numbers are based on another metric and is long. I personally can't tell on the surface, which will provide best performance, but Drew's looks it, anyway.

    It is very interesting to see how many approaches there are to the same problem. Drew using MIN, Dwain using Partition By and Order By DESC and Padawan GROUPING and SETS. The last one is most alien to me. All Great learning.

    Thanks. I appreciate everyone's help.

    --Quote me

  • Drew,

    please explain how 'Days' is introduced. Since you and Dwain are both using DATEDIFF to extract

    day portion of the datetime, I would have expected your CASE statements to refer to 'day' and not 'Days'.

    Where did you instantiate Days?

    --Quote me

Viewing 15 posts - 31 through 45 (of 59 total)

You must be logged in to reply to this topic. Login to reply