Solving the Running Total and Ordinal Rank Problems (Rewritten)

  • Hugo and Tony,

    Let me try to summarise my position on this.

    I don't in fact use the quirky update in permanent or routine production. I never have and probably never will for all the good reasons given by you guys, to various extents. I don't believe I have every posted a quirky-based solution on the forums either (here or elsewhere). I have, however, used it for one-off 'production' jobs where its speed really mattered, and where I could run it personally. That's where its utility lies for me. I (seriously) don't presume to influence others' decisions on the matter.

    I do consider myself a reasonably competent SQL person - and I don't think Quirky Update is for just any J. Random DBA out there who is relatively new to SQL or doesn't want to know how it works.

    Huge respect to Jeff for spreading the set-based mentality and encouraging people to learn more about the product, and all the other good stuff he does. Even if you find the Quirky Update morally abhorrent (lol) he doesn't deserve abuse for it. You guys sure could back it off a notch or two - you all have good points, but it is possible to over-egg this particular pudding. I'm just saying ๐Ÿ˜‰

    In routine production, I personally choose SQLCLR methods* - a subject about which I am occasionally prepared to be religious about. SQLCLR is also not for everyone - it's advanced stuff, which the Quirky Update method arguably is too. I really don't think any more warnings or caveats need adding to the article.

    * - I may also consider Hugo's running total method in future because it is kinda cool. I might just implement it in C# :laugh:

    @hugo: INDEX(0) was in the original article (as I think I mentioned) so it's not a new rule - just one Jeff left out on the rewrite. My ROW_NUMBER extension is new, however. I think the rules are pretty simple actually: (1) Don't do anything flash - keep it simple; (2) See rule one.

    Tony: There's no such thing as total disagreement - at least not as far as I am concerned! I'm just saying that your performance script doesn't compare with the three fastest methods available. The rest of our discussion can stay where it is - I'm happy that truth lies somewhere between black and white there. You may expect blog responses from me in future though :w00t:

    Paul

  • Some interesting posts there, folks. I have to delay participation until after I get home from work tonight.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Paul,

    Thanks for the clarification of your position. I'll clarify mine too.

    I, too, have an enormous respect for Jeff. I always read his articles here with great interest and I totally understand his popularity on this site. But with that popularity comes responsibility. People with far less SQL Server experience than we have come to trust his word, to take his advice for granted. And that's where the danger lies - they might not always distinguish the good advice from the bad.

    You also mention that Jeff doesn't deserve the abuse we're giving him. Apparently, my responses come across as abusive. I am very sorry for that; it's definitely not my intention. I am a firm believer in attacking opinions, but never attacking the persons holding or expressing them. That is what I try to do in this discussion - convince Jeff that the quirky update is not the golden bullet he believes it is, and presents it as, but in fact a very dangerous method that has a large chance of ending up as a shot in one's own foot. And if that fails, then at least make as many other readers as possible see this, and prevent them from implementing this in their systems.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo,

    I have the greatest respect for your work too - I particularly enjoy your SQLBlog entries, for example, and as I mentioned the set-based iteration method is very impressive.

    I think perhaps I could have worded my previous post better - I think 'abuse' was far too strong, and wasn't actually directed at you. There have been a number of posts on threads I have been active on recently which have started to sound a bit personal. You have always been extremely good at targeting the subject matter rather than the speaker!

    So, anyway. The points you have made a good ones for sure. Personally, I don't attach quite so much importance to the risks of others misusing the method. The quirky update is certainly a more advanced topic, requiring a little care from the programmer - but I don't think that's a good enough reason to say it shouldn't be promoted or at least discussed.

    Overall, I think Jeff has done an excellent job of raising awareness of the method and promoting discussion. Seeking to somehow 'censor' the method in a attempt to prevent others from doing daft things doesn't seem like an approach that has been particularly successful, historically speaking. Much better, in my view, to clarify exactly what the method is, how it works, and what the attendant risks might be. I think Jeff has done a great job with that.

    On a slightly different note, have your considered extending you set-based iteration method to implement other analytic functions like LEAD, LAG, or SUM() OVER (x PRECEDING) for example?

    Cheers

    Paul

  • Paul White (2/4/2010)


    On a slightly different note, have your considered extending you set-based iteration method to implement other analytic functions like LEAD, LAG, or SUM() OVER (x PRECEDING) for example?

    No. Not yet, at least. Who knows what the future will bring... (But don't hold back on my account; feel free to beat me to it anytime!)


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (2/4/2010)


    No. Not yet, at least. Who knows what the future will bring... (But don't hold back on my account; feel free to beat me to it anytime!)

    Thanks! I may well do that - with full credit naturally.

  • --------------------------------------------------------------------------------------------------------------------------------------

    tony rogerson (2/4/2010)


    Also - the warning is just that other people have been bitching that this feature is undocumented rather than an up front warning that "never ever use this in a production environment as you are relying on undocumented behaviour that may change right under your feet on the next service pack or cumulative update you apply".

    Not quite true. You must have missed the part where I clearly stated in the article as the very first thing after the introduction...

    Quote from the article...


    Warning:

    ...

    ...

    ...

    In fact, even Microsoft has stated that there is no guarantee that this method will work correctly all the time.

    tony rogerson (2/4/2010)


    If you do that and put it in big red bold letters right at the start of your article then I'd be happy!

    Heh... fess up. Even that wouldn't make you happy because all you're concerned about is that someone might actually use the "Quirky" update. Except for the color not being to your satisfaction, what I wrote qualifies as a very strong warning that even Microsoft doesn't support the method. I wonโ€™t say that it shouldnโ€™t be used in production because I (quite obviously) donโ€™t believe that. I even included verification code in the article that can be executed immediately after using the โ€œQuirkyโ€ Update in production.

    The "bitching" that you refer to wasn't bitching... it was all a part of that same clearly marked warning section that, except for the 3 part SET statement being available, stated that all aspects of the method are undocumented and therefor unsupported. I even warned against the undocumented multi-row variable overlay.

    As for the behavior possibly changing at the next CU or Service Pack, so what? First, fallbacks to a cursor have been clearly identified in the article. Second, even fully documented features change without warning in CUโ€™s and Service Packs. Take what MS did in SQL Server 2000, for example. SP 4 made a change to the very-well-documented-in-Books-Online sp_MakeWebTask that broke a huge amount of code by changing the permissions fromโ€ฆ

    Permissions

    The user must have SELECT permissions to run the specified query and CREATE PROCEDURE permissions in the database in which the query will run.

    ... to ...

    To execute sp_makewebtask you must be a member of the sysadmin fixed server role.

    I can't speak for anyone else, but that instant overnight change to a very well documented feature could have killed a lot of very well established, tried and true, fully Microsoft supported code for me. Why didn't it? Because I do the same thing that everyone else should do before they push a CU or SP onto a production box... TEST THE HELL OUT OF IT! :hehe: Once we established that SP4 caused the break, we did the same thing that I'll do if the "Quirky" update or anything else (MS supported or not) breaks... write a work around and I included that workaround in the article; Revert to a cursor or write a CLR (heh.. knock it off Paulโ€ฆ it tickles. :-P).

    Except for possibly a CLR to do the same, a properly configured "Quirky" update blows all other methods away for performance, is just as accurate as a cursor, and has been since the early days of Sybase and certainly since SQL Server has been in existence. Paul White pointed out (and I implied in the article) that it has survived hot fixes (like the hot fix for when UPDATE ignored WHERE IS NULL in 2k SP 2), CUs, SPs, and even bad SPs (2k SP3 was replaced by 3A because of some nasty faults). If it ever does break and if the best practice of testing CUs and SPs is followed (and it should be in all cases), then there will be plenty of time to work around this incredibly useful, very high performance, undocumented feature just like I had to work around the code break on the fully documented and supported sp_MakeWebTask feature.

    tony rogerson (2/4/2010)


    But, to publically recommend this method is just irresponsible to be frank.

    Ok... I'm going to just go ahead and take that as a seriously unqualified and non-observant ad hominem attack on your part. I included several unmistakable warnings (although I may have to rethink that because you missed them :-P) about the method including the fact that Microsoft stated that it will not always work in a paragraph right after the Introduction. I repeated many of the warnings throughout the document and stressed some of them again in the conclusion. I gave workarounds for the method in case someone understandably felt uneasy about the method which included a rock solid (albeit slow, thanks for the correction on that Hugo) cursor alternative and a recommendation that a CLR could be used as an alternative. I warned that the reason for the rewrite is that some things that will break the method were discovered and unabashedly and frankly included what they were as well as the fixes. I wrote verification code that could be put into production to check the "Quirky" update and fully explained its purpose. Just to make sure that no one miscopied that and other code, I included it in a carefully marked attachment. I even stated the painfully obvious in Rule 10 that you MUST test the method to make sure it works after you've written it (even you couldnโ€™t miss that oneโ€ฆ it has RED lettering :hehe:). I'm pretty sure that doesn't qualify me or the article as "irresponsible". ๐Ÿ˜‰

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hugo... my sincere apologies for not addressing you tonight. I worked late and I wanted to address Tony's posts first. I'll try to address your good concerns tomorrow night. Since that will be a Friday night, I'll probably have a beer or two on the way home and beg you, in advance, to forgive any spelling errors I may make. ๐Ÿ™‚

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (2/4/2010)


    Hugo Kornelis (2/4/2010)


    Hi Jeff,

    As indicated in my previous message, I somehow lost sight of this discussion. Sorry for my sudden disappearance from this topic.

    First, thank you for optimizing my alternative running totals code even further. I hope you don't mind me using the optimized version of this code (with attribution, of course) for future presentations, blog posts, articles, etc.

    Absolutely not a problem. Thanks for asking. Glad I could make a contribution. Thanks, Hugo.

    When reading over all the new posts, I didn't find any reaction to one of the points I raised. It's about the claim that nobody has yet managed to break the quirky update. I did post at least two bits of code in this thread that did break it. You dismissed the first as using a JOIN; I objected to that, as an IN with a constant list rather than a subquery is not equivalent to a JOIN, but to a series of OR'ed conditions, and then posted another code snippet that breaks it without using IN or OR. If you ever responded to my objection or the last snippet, I overlooked the reaction - can you please tell me the page number to re-read? And if you never responded, than I hope you will do that this time.

    For your convenience, here is the last code I posted again:

    DECLARE @Counter INT;

    SET @Counter = 0;

    UPDATE dbo.TransactionDetail

    SET @Counter = AccountRunningCount = @Counter + 1

    FROM dbo.TransactionDetail WITH (TABLOCKX)

    WHERE TransactionDetailID BETWEEN 120000 AND 120010

    OPTION (MAXDOP 1);

    GO

    --===== Select all the rows in order by the clustered index

    SELECT *

    FROM dbo.TransactionDetail

    WHERE TransactionDetailID BETWEEN 120000 AND 120010

    ORDER BY AccountID, Date, TransactionDetailID;

    My apologies. You're correct. I didn't address that. I'll go back and look at your original post to make sure I get the context right.

    Hi Hugo,

    Sorry this took longer than expected. Ah... looking back at the original post, I remember. What this proves is that my explanation as to why it works sucks and is in need of some revamping. ๐Ÿ˜›

    In the testing I just did, it would appear that the "source" in the execution plan may be more important than the destination. I ran the following queries and only the ones that either had both a clustered index update and a clustered index scan or just a clustered index update with no additional source worked as advertised. Your good example uses a clustered index update but has a clustered index seek as a source and that's when things seem to go astray.

    I don't know if you have the time, Hugo, but if you do, I'd sure appreciate your good verification so I can change the article to match. I'll also add the caveat that I don't fully trust execution plans and that the correct order of the update should still (always) be verified by using something like the verification code which is included in the article. Here's the code I used...

    ------------------------------------------------------------------------------------------

    --===== This fails to work in the order of the clustered index.

    -- The "source" in the execution plan is a Clustered Index SEEK.

    DECLARE @Counter INT

    SELECT @Counter = 0

    UPDATE dbo.TransactionDetail

    SET @Counter = AccountRunningCount = @Counter + 1

    FROM dbo.TransactionDetail WITH (TABLOCKX)

    WHERE TransactionDetailID BETWEEN 120000 AND 120010

    OPTION (MAXDOP 1)

    GO

    ------------------------------------------------------------------------------------------

    --===== This works correctly in the order of the clustered index.

    -- It only uses a Clustered Index Update and has no separate "source".

    -- After running the verification code to prove that it actually worked,

    -- I would trust it in the future.

    DECLARE @Counter INT

    SELECT @Counter = 0

    UPDATE dbo.TransactionDetail

    SET @Counter = AccountRunningCount = @Counter + 1

    FROM dbo.TransactionDetail WITH (TABLOCKX)

    WHERE AccountID BETWEEN 50 AND 60

    OPTION (MAXDOP 1)

    GO

    ------------------------------------------------------------------------------------------

    --===== This works correctly in the order of the clustered index. HOWEVER,

    -- it uses a Clustered Index Update and a NonClustered Index Scan

    -- on a different index. Since it's a NonClustereded Index Scan,

    -- I wouldn't trust it even if the verification code said it worked

    -- correctly because of the potential for a "Merry-Go-Round" index

    -- in the future.

    DECLARE @Counter INT

    SELECT @Counter = 0

    UPDATE dbo.TransactionDetail

    SET @Counter = AccountRunningCount = @Counter + 1

    FROM dbo.TransactionDetail WITH (TABLOCKX)

    WHERE Date >= '20050701' AND Date < '20050801'

    OPTION (MAXDOP 1)

    GO

    ------------------------------------------------------------------------------------------

    --===== This works correctly in the order of the clustered index.

    -- It only uses a Clustered Index Update and a Clustered Index SCAN

    -- as the "source".

    -- After running the verification code to prove that it actually worked,

    -- I would trust it in the future.

    DECLARE @Counter INT

    SELECT @Counter = 0

    UPDATE dbo.TransactionDetail

    SET @Counter = AccountRunningCount = @Counter + 1

    FROM dbo.TransactionDetail WITH (TABLOCKX)

    WHERE Amount BETWEEN 10 AND 20

    OPTION (MAXDOP 1)

    GO

    ------------------------------------------------------------------------------------------

    As a side bar, there's no practical reason to do any of these "WHERE" type of updates except for the one that uses "WHERE AccountID ..." which would explain why I didn't test them before the article. That's not a good excuse... it's just the reason. Your query does just what it was supposed to do, though... show that just a Clustered Index Update is not a sufficient check to determine whether the "Quirky" Update had any chance of being successful or not.

    Thanks for the help and your testing, Hugo. I'm looking forward to your feedback. Of course, if you don't have the time or the inclination, I'll understand that, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hey does anyone know if it's possible to do this solution but insure that the fields/variables are updated in a specific order for each row?

    Like say I have a table with columns (A, B, C). The value of A depends on the values of the previous row B and C, and B and C depend on the current row's A. Whenever I try this strategy to perform this I get random results (sometimes it works, sometimes it doesn't)

  • Gabriel P (2/18/2010)


    Hey does anyone know if it's possible to do this solution but insure that the fields/variables are updated in a specific order for each row?

    Like say I have a table with columns (A, B, C). The value of A depends on the values of the previous row B and C, and B and C depend on the current row's A. Whenever I try this strategy to perform this I get random results (sometimes it works, sometimes it doesn't)

    Sounds like a great question to post on the appropriate forum to me :laugh:

    I see how your question relates to the article, but it's a separate question really.

    I'll look out for your thread.

    Paul

  • Hi Gabriel,

    I think you've hit why myself, Hugo and others have been saying don't use this Quirky Update technique.

    Look at another approach - seriously; an approach that works, is documented and does not rely on physical ordering of rows and isn't so reliant on many conditions being met in order for it to work - conditions that may change between builds of SQL Server.

    If you are stuck on your problem then start a new thread but stating what you are trying to do.

    Many thanks,

    Tony.

  • tony rogerson (2/18/2010)


    I think you've hit why myself, Hugo and others have been saying don't use this Quirky Update technique.

    I think you're making an unsound logical leap there Tony ๐Ÿ˜‰

    tony rogerson (2/18/2010)


    Look at another approach - seriously; an approach that works, is documented and does not rely on physical ordering of rows and isn't so reliant on many conditions being met in order for it to work - conditions that may change between builds of SQL Server.

    Physical ordering of rows? Really? Are you sure that is what you mean?

    There are very few conditions necessary to make the Quirky Update work, and they haven't changed.

    None of the huge changes in SQL Server from 6.5 to current builds have broken it either.

    I also hope that by 'another approach' you don't mean a CURSOR! :laugh:

    tony rogerson (2/18/2010)


    If you are stuck on your problem then start a new thread but stating what you are trying to do.

    Agreed. I look forward to an energetic discussion on that thread...

    Paul

  • Yes - I do mean a cursor; a static one and not a "update where current".

    I don't honestly know what your problem is here Paul - even Jeff has found other problems with his routine (see a couple of posts up).

    IF you are saying you are happy for this to be used in a production environment then frankly should be sacked because it is clear and proven that its not repeatable behaviour that is GUARENTEED to work.

    Hey - less just drawn the line; I believe people should be told they shouldn't be using the quirky update so when I see people trying to use it I'll be professional and tell them DON'T :w00t:

    Tony.

  • tony rogerson (2/18/2010)


    Yes - I do mean a cursor; a static one and not a "update where current".

    I don't honestly know what your problem is here Paul - even Jeff has found other problems with his routine (see a couple of posts up).

    IF you are saying you are happy for this to be used in a production environment then frankly should be sacked because it is clear and proven that its not repeatable behaviour that is GUARENTEED to work.

    Hey - less just drawn the line; I believe people should be told they shouldn't be using the quirky update so when I see people trying to use it I'll be professional and tell them DON'T :w00t:

    Tony.

    You take me too seriously Tony!

    Everyone is free to decide for themselves, and you and I are entitled to hold different opinions and advocate them enthusiastically.

    I don't have a problem with your point of view, per se, I just disagree about how big an issue it is.

    Quirky Update is an advanced technique, that true. You can't beat it for speed or correctness on a large data set, if you can follow a few simple and pretty intuitive rules. Cursor-based solutions promote bad habits and take geological time to complete on large data sets. I might argue that it is irresponsible to support that method when the set-iteration method is generally so much better.

    I do think you go too far with your employment-related remarks. Take a breath!

    Paul

Viewing 15 posts - 121 through 135 (of 307 total)

You must be logged in to reply to this topic. Login to reply