Get the OrderID's that only match the criteria.

  • Hello Everyone, I am stuck with a query that I am working on. Hope you can help me with your expertise.

    I have data like below.

    DECLARE @Input TABLE

    (

    OrderID INT,

    AccountID INT,

    StatusID INT,

    Value INT

    )

    INSERT INTO @Input VALUES (1,1,1,15), (2,1,1,20), (3,2,1,5), (4,2,2,40), (5,3,1,20), (6,1,2,40), (7,1,2,40)

    If an Account's value reaches 20 for StatusID = 1 and 40 for StatusID = 2, that is a called "Good". I want to find out which order made the Account become "Good".

    By looking at the data, it is understandble that AccountID 1 crossed Status ID 1's limit of 20 with order 2, but the status ID 2's limit was only crossed after the 6th order was placed. So my output should show 6 for AccountID 1.

    For AccountID 2, value of statusID 1 was 5 with orderid 3, but it reached the limit for status id 2 of 40 with order 4. But the first condition was not met. so it shouldn't be seen in the output.

    Same with AccountID 3 as well, It reached the limit of status id 1 with order 5 but the limit for order 2 wasn't reached so it should be ignored as well.

    I wrote the code as below, its working fine but I still know there are better ways to write since I will be working with atleast a million records. So I want to know if there is a better way to change this or any idea is appreciated.

    ;WITH CTE AS

    (

    SELECT OrderID,AccountID, StatusID, SUM(Value) OVER(Partition By AccountID, StatusID ORDER BY OrderID) AS RunningTotal

    FROM @Input

    )

    , CTE2 AS (

    SELECT * FROM CTE WHERE StatusID = 1 AND RunningTotal >= 20

    Union all

    SELECT * FROM CTE WHERE StatusID = 2 AND RunningTotal >= 40

    )

    , CTE3 AS (

    SELECT *, ROW_NUMBER() OVER (Partition BY AccountID ORDER BY OrderID) AS Rnum

    FROM CTE2

    )

    SELECT OrderID, AccountID

    FROM CTE3

    WHERE Rnum = 2

    If you think SSIS would be a good choice, please suggest that route as well. I am open for options and willing to learn if there is anything I am unaware of...

  • The following should give you better performance than what you currently have...

    WITH cte AS (

    SELECT

    MIN(i.OrderID) AS OrderID,

    i.AccountID,

    i.StatusID

    FROM

    #Input i

    WHERE

    (i.StatusID = 1 AND i.Value = 20) OR (i.StatusID = 2 AND i.Value = 40)

    GROUP BY

    i.AccountID,

    i.StatusID

    )

    SELECT

    MAX(t.OrderID) AS OrderID,

    t.AccountID

    FROM

    cte t

    GROUP BY

    t.AccountID

    HAVING

    COUNT(1) = 2

    AND MAX(t.StatusID) = 2;

    It still has a nasty sort operator in the execution plan but that would be cleaned up with the following index...

    CREATE NONCLUSTERED INDEX ix_Input_AccountID_Status ON #Input (AccountID,StatusID);

    If that index doesn't already exist and/or you aren't able to create it on the Input table, you could also consider replacing the CTE with a temp table so that you can index that...

    IF OBJECT_ID('tempdb..#temp') IS NOT NULL

    DROP TABLE #temp;

    SELECT

    MIN(i.OrderID) AS OrderID,

    ISNULL(i.AccountID, 0) AS AccountID,

    ISNULL(i.StatusID, 0) AS StatusID

    INTO #temp

    FROM

    @Input i

    WHERE

    (i.StatusID = 1 AND i.Value = 20) OR (i.StatusID = 2 AND i.Value = 40)

    GROUP BY

    i.AccountID,

    i.StatusID;

    ALTER TABLE #temp ADD PRIMARY KEY CLUSTERED (AccountID, StatusID);

    SELECT

    MAX(t.OrderID) AS OrderID,

    t.AccountID

    FROM

    #temp t

    GROUP BY

    t.AccountID

    HAVING

    COUNT(1) = 2

    AND MAX(t.StatusID) = 2;

    Obviously, creating the temp table index, on the fly, every time the query is executed, is expensive... But in certain circumstances the time needed to create it can be more than offset by the improvement to the final select. In any case it should be worth testing.

    HTH,

    Jason

  • Edit: removed post.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply