UPDATE using a self-join

  • Tom.Thomson (1/13/2011)


    OK, so Merge gives an error message, which is certainly a good thing (it surprised me, because reading the MS documentation I didn't spot that; but with hindsight it doesn't surprise me, becaude MS documentation often buries useful information where you are unlikely to find it). But what does that error message say? It says

    Error message


    Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

    And what does BoL say about modifying the ON clause to do that? It says

    BoL


    Caution It is important to specify only the columns from the target table that are used for matching purposes. That is, specify columns from the target table that are compared to the corresponding column of the source table. Do not attempt to improve query performance by filtering out rows in the target table in the ON clause, such as by specifying AND NOT target_table.column_x = value. Doing so may return unexpected and incorrect results.

    In other words doing what the error message suggests you do to avoid the problem will deliver incorrect results. Not exactly useful, is it?

    If the writer of the query is awake and avoids the multiple match, there's no problem with either syntax - except that the Merge description suggests that modifying the ON condition to avoid the multiple match will cause incorrect results; that strikes me as saying the Merge syntax is inferior to the alternative, where the multiple match issue can be avoided without introducing errors which seems to be beyond the Capability of Merge's ON clause. Again, as I said before, I don't really know what's going on here, it's more than 15 years since I stopped being payed to concern myself with (amongst other things) SQL definitions and getting clear and solid declarative semantics for the language, so I'm nowhere near up to date, I just get the feeling that something doesn't really work (because it's not sufficiently expressive, perhaps).

    The way to handle it isn't in the ON clause, it's in the definition of the data source. Use a CTE or inline derived table, regulate which rows you want in there, not in the ON clause.

    Yes, you can do it in the ON clause. It's just not recommended.

    And no, that doesn't contradict the error message. The note is specifically about trying to improve performance by making the engine do unnecessary/unpredictable computations, not about trying to make it give correct results. The problem they specifically are trying to avoid is using a NOT statement can result in a "no-match in target" situation that you didn't anticipate, possibly resulting in inserts or other operations you don't want. Until you're really comfortable with Merge, it's a little tough to predict everything it will do (the Output clause can really help with that, especially if you test while wrapped in a transaction that you can roll back).

    Is the error message and documentation less than clear? Absolutely. Was Joe's post less than clear? Definitely. But those are kind of givens, like gravity points down, and young men will do stupid things to impress women. But Joe is right that it is a somewhat better solution than Update From, for a technical reason. He's just really bad at expressing this in writing. That's why I don't buy his books - if all of his writing is this muddled and hyper-aggressive, why pay for it?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Damn! I thought I'd never say this but score 1 for Joe.

    Actually the FROM syntax is very useful as a previous poster mentioned in that the UPDATE can be commented out and replaced with a SELECT to see the count of rows. I can quickly eyeball what's about to be updated. Afte that I usually put a GROUP BY HAVING COUNT(*) > 1 in the select just to make sure there isn't something that I don't know about in the data. Not a perfect solution since it could pop up periodically.

    Todd Fifield

  • CELKO (1/14/2011)


    Since there is a move among MVPs to deprecate this in favor of MERGE and no other product uses it, I would clean up my code and get with the 21-sst Century.

    CELKO (1/14/2011)


    Jeff, the old stuff I used to post before MERGE did blow up in the proper fashion.

    UPDATE Orders

    SET some_col

    = (SELECT item_price

    FROM OrderDetails

    WHERE OrderDetails.order_nbr = Orders.order_nbr)

    WHERE EXISTS

    (SELECT *

    FROM OrderDetails

    WHERE OrderDetails.order_nbr = Orders.order_nbr);

    This will of course result in an error, and even the most junior of junior programmers will eventually figure out (probably by asking a senior) what's wrong. At that point, either the query is corrected to match the request, or a note is sent back to management asking for a clarification of the ambiguity in the request.

    Since there is a move among MVPs to deprecate this in favor of MERGE and no other product uses it, I would clean up my code and get with the 21-sst Century.

    Heh... leave it to you. I was mostly agreeing with you and you turn around and bad mouth me with your "rapier wit". How about you get with the 21st Century and realize you don't need to be a sarcastic steamer to everyone all the time to get your point across? 😉

    So far as a "move among MVPs to deprecate this in favor of MERGE", they need to take it one step further to actually do it right if they're going to do it at all. Deprecate INSERT and DELETE as well as UPDATE because MERGE does it all. After all, DELETE in T-SQL also has a FROM clause.

    So far as "no other product uses it", so what? I don't believe in the myth of truly portable SQL especially when it comes to writing high performance batch code. 😉 Are we getting closer to true portability becoming a reality? Yes but for true portability, everyone would have to have exactly the same features in their engines which would also stifle the benefits of competition between vendors.

    Speaking of "21st century", have you figured out that you don't need to use an archaic push stack to convert Adjacency Lists to Nested Sets, yet?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • CELKO (1/14/2011)


    Tom: MERGE gives an error because it has to by definition. Values in a column are scalars; this is part of First Normal Form. I cannot put multiple scalars into that column. An UPDATE has to be deterministic by definition, like any relational operator, so I cannot put in a random value. An UPDATE that has a match has to be completed, so I cannot skip it. The only solutionis to report the truth -- we have a problem, Houston!

    Yes, as I said before I'm happy with it giving an error - and incidentally I'd be happy is UPDATE...FROM did the same. Although the Merge syntax has some advantages over UPDATE...FROM it also has a disadvantage - it isn't backwards compatible with SQL 2005 or SQL 2000 and from my experience of trying to get customers to pay for SQL upgrades people will be stuck with maintaining systems using these versions long after the end of standard support.

    Ever work with PICK or other NFNF databases? They can keep multiple scalars in a column.

    NFNF databases? I once inherited an SQL database that contained an NFNF table! The first thing I did was to make all code that I could control that operated on that table convert anyrelevant rows into a FNF temporary table, do the required manipulations without any NFNF nonsense, and then update affected rows in the original table by converting back to SRF (Standard Rubbish Form). I am only able to account for the existence of NFNF relational databases in the same way as I account for people preferring COM to .NET (ie by attributing it to the indominitable stupidity of developers).

    Also, if you played with Sybase, their version of UPDATE. FROM.. was a little different at one time. They made a total out of the duplicate matches and over-wrote the old value with it.

    That's awful. Of course I can do that in T-SQL very easily using either "quirky update" (which may require safeguards if the aggregation required is non-commutative or non-associative) or by doing an aggregation using group by, whether I use Merge or Update...From it is extremely straight forward to do. But that it should be done behind the scenes withouut a clear indication that the author of the query required it - that really is awful.

    Jeff, the old stuff I used to post before MERGE did blow up in the proper fashion.

    UPDATE Orders

    SET some_col

    = (SELECT item_price

    FROM OrderDetails

    WHERE OrderDetails.order_nbr = Orders.order_nbr)

    WHERE EXISTS

    (SELECT *

    FROM OrderDetails

    WHERE OrderDetails.order_nbr = Orders.order_nbr);

    I like that - it's nice and clean.

    This will of course result in an error, and even the most junior of junior programmers will eventually figure out (probably by asking a senior) what's wrong. At that point, either the query is corrected to match the request, or a note is sent back to management asking for a clarification of the ambiguity in the request.

    Since there is a move among MVPs to deprecate this in favor of MERGE and no other product uses it, I would clean up my code and get with the 21-sst Century.

    Deprecating Update...From in favour of Merge would be crazy - the right thing to do is deprecate Update altogether, since it can't do anything Merge can't do. It would probably be a good idea to give it a few years for the development community out there to discover Merge before deprecating Update though.

    edit: getting the quote brackets right.

    Tom

  • siamak.s16 (1/12/2011)


    No but he has a book you can buy that proves it!

    Joe should just not bother actually making a different post for every thread he posts in and just write "Your code sucks, buy my book to improve it!" for everything. That's pretty much what he does anyway.

    Not necessarily true, siamak.s16.

    I understand your frustration over responses like this, but if you keep posting comments like this, then people are going to start responding to you the same way that you're responding to him.

    We all need a lesson in manners lately, as we've all been guilty of flaming certain people just because we've formed biased opinions about those people's posts. We all need to remember SSC is a professional forum. How we behave here is a reflection of how we act in the workplace. Given how many potential coworkers and employers frequent these forums, it is perhaps the better choice to avoid baiting and respond with facts and questions that allow for professional debates.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • tfifield (1/14/2011)


    Damn! I thought I'd never say this but score 1 for Joe.

    Actually the FROM syntax is very useful as a previous poster mentioned in that the UPDATE can be commented out and replaced with a SELECT to see the count of rows. I can quickly eyeball what's about to be updated. Afte that I usually put a GROUP BY HAVING COUNT(*) > 1 in the select just to make sure there isn't something that I don't know about in the data. Not a perfect solution since it could pop up periodically.

    Todd Fifield

    You can still get more than an idea of how the update will end up looking.

    begin transaction;

    merge ...

    output inserted.*;

    commit;

    Don't run the commit till you're happy with the data in the output grid.

    If you don't want to hold a transaction open, use rollback on the test run, then commit when you're sure it's giving you what you want.

    The Output clause has been possible since SQL 2005, and is better than eyeballing a Select version of an Update syntax, since you don't have to change any of the command in order to run it.

    Edit: For clarity.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Gus,

    You busted my chops on this one. I currently only have 1 client on 2008 so I don't actually user the MERGE statement much.

    The idea of dumping the output into a table variable that won't be affected by a ROLLBACK is very good. I'm going to use it next time.

    Todd Fifield

  • I'm with Jeff. The update gives me exactly what I would expect.

  • CELKO (1/31/2011)


    Actually, INSERT and UPDATE are single table operations, while MERGE requires a source and target.

    I haven't moved up to SQL 2k8 yet, so haven't had a chance to use MERGE. But that's good information to know that I didn't know before. Thanks for mentioning it.

    CELKO (1/31/2011)


    Successful projects start on lesser SQLs (even ACCESS ) then have to move to DB2, Oracle, Teradata or whatever.

    Including Access? What other lesser SQLs are there? I'm assuming you mean non-robust solutions, so I'm truly curious as Access (and maybe Lotus) is the only one I know for sure.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (2/1/2011)


    CELKO (1/31/2011)


    Actually, INSERT and UPDATE are single table operations, while MERGE requires a source and target.

    I haven't moved up to SQL 2k8 yet, so haven't had a chance to use MERGE. But that's good information to know that I didn't know before. Thanks for mentioning it.

    CELKO (1/31/2011)


    Successful projects start on lesser SQLs (even ACCESS ) then have to move to DB2, Oracle, Teradata or whatever.

    Including Access? What other lesser SQLs are there? I'm assuming you mean non-robust solutions, so I'm truly curious as Access (and maybe Lotus) is the only one I know for sure.

    Would FoxPro count?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Stefan Krzywicki (2/1/2011)


    Brandie Tarvin (2/1/2011)


    CELKO (1/31/2011)


    Actually, INSERT and UPDATE are single table operations, while MERGE requires a source and target.

    I haven't moved up to SQL 2k8 yet, so haven't had a chance to use MERGE. But that's good information to know that I didn't know before. Thanks for mentioning it.

    CELKO (1/31/2011)


    Successful projects start on lesser SQLs (even ACCESS ) then have to move to DB2, Oracle, Teradata or whatever.

    Including Access? What other lesser SQLs are there? I'm assuming you mean non-robust solutions, so I'm truly curious as Access (and maybe Lotus) is the only one I know for sure.

    Would FoxPro count?

    Would filemaker count?

  • Stefan Krzywicki (2/1/2011)


    Brandie Tarvin (2/1/2011)


    CELKO (1/31/2011)


    Actually, INSERT and UPDATE are single table operations, while MERGE requires a source and target.

    I haven't moved up to SQL 2k8 yet, so haven't had a chance to use MERGE. But that's good information to know that I didn't know before. Thanks for mentioning it.

    CELKO (1/31/2011)


    Successful projects start on lesser SQLs (even ACCESS ) then have to move to DB2, Oracle, Teradata or whatever.

    Including Access? What other lesser SQLs are there? I'm assuming you mean non-robust solutions, so I'm truly curious as Access (and maybe Lotus) is the only one I know for sure.

    Would FoxPro count?

    Only if you also count CSV files as "databases".

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (2/1/2011)


    Stefan Krzywicki (2/1/2011)


    Brandie Tarvin (2/1/2011)


    CELKO (1/31/2011)


    Actually, INSERT and UPDATE are single table operations, while MERGE requires a source and target.

    I haven't moved up to SQL 2k8 yet, so haven't had a chance to use MERGE. But that's good information to know that I didn't know before. Thanks for mentioning it.

    CELKO (1/31/2011)


    Successful projects start on lesser SQLs (even ACCESS ) then have to move to DB2, Oracle, Teradata or whatever.

    Including Access? What other lesser SQLs are there? I'm assuming you mean non-robust solutions, so I'm truly curious as Access (and maybe Lotus) is the only one I know for sure.

    Would FoxPro count?

    Only if you also count CSV files as "databases".

    I've never used FoxPro, but I know people who swear by it.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • GSquared (2/1/2011)


    Stefan Krzywicki (2/1/2011)


    Brandie Tarvin (2/1/2011)


    CELKO (1/31/2011)


    Successful projects start on lesser SQLs (even ACCESS ) then have to move to DB2, Oracle, Teradata or whatever.

    Including Access? What other lesser SQLs are there? I'm assuming you mean non-robust solutions, so I'm truly curious as Access (and maybe Lotus) is the only one I know for sure.

    Would FoxPro count?

    Only if you also count CSV files as "databases".

    Don't get me started. I know Business Users who swear Excel is a database. <*headdesk*>

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (2/1/2011)


    Don't get me started. I know Business Users who swear Excel is a database. <*headdesk*>

    It isn't? 😛

    I know I've seen people make the case that Vlookup is superior to JOIN because you do the join and get data back at the same time :cool::w00t:



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

Viewing 15 posts - 16 through 30 (of 37 total)

You must be logged in to reply to this topic. Login to reply