Cursor within DTS

  • The cursor below runs within a dts package and obviously takes quite a while to run over approx 6 million records. Would it be possible to replace the cursor with update scripts? Any ideas to replace the cursor would be appreciated. This is an inherited package running within SQL2000.

    SET NOCOUNT ON

    DECLARE @income money,

    @cost money,

    @costAdd money,

    @order_id bigint,

    @order_value bigint,

    @order_line_no bigint,

    @promoID int,

    @PromoCheck int,

    @entCode bigint

    DECLARE hhrsales_cursor CURSOR FORWARD_ONLY FOR

    SELECT net_income, cost, PL_Order, PL_Txno, PL_Order_Value, PromotionID, PromoCheckID, Entity_Code

    FROM tmp_hhrsales2

    FOR UPDATE of [Cost]

    OPEN hhrsales_cursor

    FETCH NEXT FROM hhrsales_cursor

    INTO @income, @cost, @order_id, @order_line_no, @order_value, @promoID, @promoCheck, @entCode

    set @costAdd = 0

    WHILE @@FETCH_STATUS = 0

    BEGIN

    IF (@order_value <> 0) and (@promoCheck <> 60)

    BEGIN

    IF @entCode <> 10000001054

    BEGIN

    IF @income = 0 and @cost <> 0

    BEGIN

    set @costAdd = @costAdd + @cost

    UPDATE tmp_hhrsales2

    set cost = 0

    WHERE CURRENT OF hhrsales_cursor

    insert into log_hhrsales_cos_change

    select @order_id, @order_line_no, @cost, 0

    END

    IF @income <> 0 and @costAdd <> 0

    BEGIN

    UPDATE tmp_hhrsales2

    SET cost = cost + @costAdd

    WHERE CURRENT OF hhrsales_cursor

    insert into log_hhrsales_cos_change

    select @order_id, @order_line_no, @cost, @cost+@costAdd

    set @costAdd = 0

    END

    END

    END

    -- Get the next record.

    FETCH NEXT FROM hhrsales_cursor

    INTO @income, @cost, @order_id, @order_line_no, @order_value, @promoID, @promoCheck, @entCode

    END

    CLOSE hhrsales_cursor

    DEALLOCATE hhrsales_cursor

    GO

  • At a glance it looks like the cursor is there to enforce an insert into a logging table. Is there any reason why this could not be done with a SQL trigger?

    If you can use a trigger then it would just be a standard update statement - obviously you would need to include all the IF statements into a WHERE clause - which incidently should have been done in the SELECT statement to populate the cursor.

    Rich

    Hope this helps,
    Rich

    [p]
    [/p]

  • Don't know why a trigger or any other option was not used, just inherited the package.

    Could you give an example script of your idea pls?

  • The UPDATE of the table is a special kind called a Running Totals Update. Jeff Moden has written some excellent and easily consumable articles about them for this site, well worth a read, and you will need to invest a little time on them. Your whole batch could be replaced by an UPDATE and an INSERT ... FROM, something like this:

    UPDATE tmp_hhrsales2 SET

    @costAdd = CASE

    WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost

    WHEN net_income <> 0 and @costAdd <> 0 THEN 0

    ELSE @costAdd END,

    cost = CASE

    WHEN net_income = 0 and cost <> 0 THEN 0

    WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd

    ELSE cost END

    -- extra columns for log_hhrsales_cos_change: oldcost and newcost

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    INSERT INTO log_hhrsales_cos_change

    SELECT PL_Order, PL_Txno, oldcost, newcost

    FROM tmp_hhrsales2

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Denesh Naidoo (7/20/2010)


    Have a look at the link below:

    http://searchsqlserver.techtarget.com/tip/Avoid-cursors-in-SQL-Server-with-these-methods-to-loop-over-records%5B/quote%5D

    Sorry but absolutely not. It's not the cursor that makes things slow... it's the WHILE LOOP. Converting a cursor to anything that has a WHILE LOOP is a futal effort because a WHILE LOOP is no faster than a well written Forward_Only, Read_Only, Static cursor.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Chris Morris-439714 (7/20/2010)


    The UPDATE of the table is a special kind called a Running Totals Update. Jeff Moden has written some excellent and easily consumable articles about them for this site, well worth a read, and you will need to invest a little time on them. Your whole batch could be replaced by an UPDATE and an INSERT ... FROM, something like this:

    UPDATE tmp_hhrsales2 SET

    @costAdd = CASE

    WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost

    WHEN net_income <> 0 and @costAdd <> 0 THEN 0

    ELSE @costAdd END,

    cost = CASE

    WHEN net_income = 0 and cost <> 0 THEN 0

    WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd

    ELSE cost END

    -- extra columns for log_hhrsales_cos_change: oldcost and newcost

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    INSERT INTO log_hhrsales_cos_change

    SELECT PL_Order, PL_Txno, oldcost, newcost

    FROM tmp_hhrsales2

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    Oh, be careful, Chris...

    I might be missing something but I believe you need to make two minor changes to the code to guarantee that nothing interferes with the calculations including parallelism...

    UPDATE tmp_hhrsales2 SET

    @costAdd = CASE

    WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost

    WHEN net_income <> 0 and @costAdd <> 0 THEN 0

    ELSE @costAdd END,

    cost = CASE

    WHEN net_income = 0 and cost <> 0 THEN 0

    WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd

    ELSE cost END

    -- extra columns for log_hhrsales_cos_change: oldcost and newcost

    FROM tmp_hhrsales2 WITH(TABLOCKX) --<<< HERE

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    OPTION (MAXDOP 1) --<<< AND HERE

    INSERT INTO log_hhrsales_cos_change

    SELECT PL_Order, PL_Txno, oldcost, newcost

    FROM tmp_hhrsales2

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    Also, there needs to be a clustered index on that table in the order that the update is supposed to be executed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • HAve attached the table attributes which includes indexes. Pls let me know your thoughts.

    What do the two changes you've made do, how do they affect the data??

  • Jeff Moden (7/20/2010)


    Chris Morris-439714 (7/20/2010)


    The UPDATE of the table is a special kind called a Running Totals Update. Jeff Moden has written some excellent and easily consumable articles about them for this site, well worth a read, and you will need to invest a little time on them. Your whole batch could be replaced by an UPDATE and an INSERT ... FROM, something like this:

    Oh, be careful, Chris...

    I might be missing something but I believe you need to make two minor changes to the code to guarantee that nothing interferes with the calculations including parallelism...

    Also, there needs to be a clustered index on that table in the order that the update is supposed to be executed.

    Thanks Jeff, you're missing nothing, those are important points and it does no harm to reiterate them here.

    The OP's query used to build the result set on which the cursor operates has no WHERE clause and yet it's a running totals update with partitions where variables are reset to zero. Even if there is a clustered index on the table in the order that the update is supposed to be executed, there's no guarantee that the rows will be FETCHed in the correct order...

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • LadyG (7/21/2010)


    HAve attached the table attributes which includes indexes. Pls let me know your thoughts.

    What do the two changes you've made do, how do they affect the data??

    The changes I've proposed will do the same as the existing code but considerably faster. The code is unlikely to work correctly as it stands - there's no sample data to test against - you will need to amend and test it yourself. Jeff's article[/url] will guide you in the right direction. Notice where he's put comments into my code in his post - these are important, you will find a full explanation in the article.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris,

    Does this means I will have to introduce a new column, newcost, to the tmp_hhrsales2 table?

  • LadyG (7/22/2010)


    Hi Chris,

    Does this means I will have to introduce a new column, newcost, to the tmp_hhrsales2 table?

    You will need at least one new column, probably oldcost. If you can describe the table as being offline from the production tables for the purposes of performing this update, then it shouldn't be an issue.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris,

    Script doesn't quite work. I'm not getting the values expects, getting 0s and nulls. What is wrong?? Have attached attributes for both tables.

    SET NOCOUNT ON

    DECLARE @income money,

    @cost money,

    @costAdd money,

    @order_id bigint,

    @order_value bigint,

    @order_line_no bigint,

    @promoID int,

    @PromoCheck int,

    @entCode bigint,

    @new_cost money

    UPDATE tmp_hhrsales2 SET

    @costAdd = CASE

    WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost

    WHEN net_income <> 0 and @costAdd <> 0 THEN 0

    ELSE @costAdd END,

    cost = CASE

    WHEN net_income = 0 and cost <> 0 THEN 0

    WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd

    ELSE cost END

    -- extra columns for log_hhrsales_cos_change: oldcost and newcost

    FROM tmp_hhrsales2 WITH(TABLOCKX)

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

    OPTION (MAXDOP 1)

    INSERT INTO log_hhrsales_cos_change

    SELECT PL_Order, PL_Txno, oldcost, new_cost

    FROM tmp_hhrsales2

    WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054

  • In the absence of scripts to create the tables and populate them with data, it's very difficult to tell.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The problem is with the last part of the script, insert & select. Obvioulsy, as new_cost doesn't exist in the sales table, the calculated data needs to go somewhere in order to extract into the log table. Or am I completely on the wrong train of thought?

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply