There Must Be 15 Ways To Lose Your Cursors… Part 2: Just Put It in a S

Question

There Must Be 15 Ways To Lose Your Cursors… Part 2: Just Put It in a S

Viewing 15 posts - 286 through 300 (of 316 total)

You must be logged in to reply to this topic. Login to reply

Mike C SSC-Insane Points: 23224 More actions · Answer 1

Jeff Moden (7/22/2010)
Mike C (7/21/2010)
Well... I've seen multirow string concatenations return completely incorrect results before based on simple things (like table indexes) at the whim of the optimizer. No need to intentionally try to break it, just an honest index change on a table to try to optimize for a particular query and suddenly the results are completely wrong. I believe Itzik posted a very simple example a while ago.
The main problem I have with the quirky update and multirow variable aggs is when the calculation depends on specific ordering, since ordering just isn't guaranteed... especially (but not limited to) when we throw parallel processing into the mix. That's why I wanted to look at your example a little deeper, to see if it depends on ordering of inputs to generate the correct output. It may not, in which case it may be perfectly safe without throwing itself on the mercy of the optimizer 🙂
In all fairness, in your article on Running Totals you list several rules just to keep the quirky update from breaking; it seems you have to be hell bent on avoiding unintentional breaks to make it work correctly 🙂 Check out Itzik's article (I'll see if I can find a link), you may think another new rule is in order 🙂
Yep.... kinda like a cursor or any other code. If you write it wrong, you get incorrect results. 😉 The rules are pretty simple for usage and I did include verification code the combination of which is still faster than a cursor. The key is, if you don't trust it, don't use it. I've been using if for many years and it's never gone haywire. If you follow the rules, it won't.

I agree with your comparison of this method to cursors. Like a cursor, this method requires you to override the optimizer's ability to put together an efficient plan to return your results. Unlike cursors, the result of the multirow concatenation method is subject to changes in indexes, hardware, the position of the sun in the sky, or any other internal or external factor that could cause the optimizer to produce a different plan 🙂 ...Then you have to come up with new rules 🙂

Personally I don't trust it because the result that is returned relies on the optimizer's judgment, and how badly you want to override the optimizer's ability to produce an efficient plan. I don't trust it because I don't believe the optimizer should have any control over what your end result actually is -- it should only have a vote in how to most efficiently deliver the result you asked for. 😉

Mike C SSC-Insane Points: 23224 More actions · Answer 2

peter-757102 (7/22/2010)
Mike C,
I understand you point, I hope you can see mine too. In fact I never used the querky update anywhere as I never needed it. But more demanding cases might and as such I don't write it off, it just does not have my preference.
Read the following and you will realize that querky update it the last thing you should worry about!
Recently I "discovered" some pretty insane behavior in what seems like perfectly fine code. This ** might ** not be new to you, but I was really surprised/annoyed. It was a case where expressions in the select clause were executed before all conditions in a matching where clause were met.
I always used to assume there where stages of processing, where filtering was first performed to reduce the size of the working set to a minimum as this makes perfect sense. It turns out this is not always happening and operations in the select clause can error on data that functionally should never get processed due to filtering in the where clause. In such cases you need to write protection code (case when ... else ... end) in the select clause, duplicating the relevant where conditions from the where clause!!! Derrived tables and other fancy constructions will not work and you never know beforehand when such a case turns up!
Something as simple as:
select
left( 'abc', T1.v )
from
T1
where
T1.v >= 0
;
Might go wrong when T1.v can be a negative number and you should therefore write the query as:
select
case when T1.v < 0 then null else left( 'abc', T1.v ) end
from
T1
where
T1.v >= 0
;
I find this sort of thing many more times disturbing/dangerous then using a quirky update deliberatly and with full knowledge and documentation. Why? Because I suspect hardly anyone will know they have to protect their expressions to such a silly extend and the world therefore must be full of non-working code. In fact I expect that ** nearly all ** string processing code in use today to be sensitive to this problem. I knew expression order in the where clause sometimes must be enforced, but never that this problem extended beyond the where clause and into the select clause. Nor have I ever read of this problem anywhere!
I find this a total mess and hope it is just an SQL Server 2008 implementation bug that can be fixed and not standard behavior. Because if it is standard, I think the people responsible for setting the standard are in dire need of some practical re-education. And I express myself quite mildly here.

Hi Peter,

I understand your point. I just don't like leaving the quality of my results up to the optimizer is all 🙂 My point is that cursors are simply a means to override the optimizer. If someone was intent on overriding the optimizer's ability to do it's job anyway then why not use the documented methods? I don't see any benefit to using this particular method, but I do see a lot of potential downside.

As for your other issue -- there is a logical processing order, and the WHERE clause logically comes early in the processing, but that doesn't mean it's performed that way in real time. By definition all that has to happen is that no matter what order the operations in the query are performed, the results generated must be the same as if it were executed in the prescribed logical order (errors/exceptions aside). There are very few areas in SQL in which order of operation within a statement is explicitly guaranteed.

In this case it sounds as if the optimizer decided that it would be more efficient to query the T1.v column once and use that one scan to fulfill both references to it in your query. It apparently thought this was more efficient than scanning the table once to fulfill the WHERE clause and then spooling an intermediate result set to temporary storage, only to scan those intermediate results again to fulfill the T1.v reference in your projection. Except where it's explicitly documented, relying on specific order of execution within a query is a recipe for disaster :w00t:

Thanks

Mike C

TomThomson SSC Guru Points: 104773 More actions · Answer 3

peter-757102 (7/22/2010)
I find this sort of thing many more times disturbing/dangerous then using a quirky update deliberatly and with full knowledge and documentation. Why? Because I suspect hardly anyone will know they have to protect their expressions to such a silly extend and the world therefore must be full of non-working code. In fact I expect that ** nearly all ** string processing code in use today to be sensitive to this problem. I knew expression order in the where clause sometimes must be enforced, but never that this problem extended beyond the where clause and into the select clause. Nor have I ever read of this problem anywhere!
I find this a total mess and hope it is just an SQL Server 2008 implementation bug that can be fixed and not standard behavior. Because if it is standard, I think the people responsible for setting the standard are in dire need of some practical re-education. And I express myself quite mildly here.

Are you sure this bug (if it's not a bug I will give up using T-SQL as it's not a useful language) is restricted to string processing code? Why not simple arithmetic too, so that I have to write

select case when T1.y <> 0.0

then T1.x/T1.y

else null

end as ratio

from T1

where IsNull(T1.y,0.0) <> 0.0

or something equally ridiculous?

There is an interesting statement in BoL at http://msdn.microsoft.com/en-us/library/ms190623.aspx

BoL
A SELECT statement defines only the following:
* The format of the result set. This is specified mostly in the select list. However, other clauses such as ORDER BY and GROUP BY also affect the final form of the result set.
* The tables that contain the source data. This is specified in the FROM clause.
* How the tables are logically related for the purposes of the SELECT statement. This is defined in the join specifications, which may appear in the WHERE clause or in an ON clause following FROM.
* The conditions that the rows in the source tables must satisfy to qualify for the SELECT statement. These are specified in the WHERE and HAVING clauses.

(similar statements are included in BoL for SQLS 2008, SQLS2005, SQLS 2000)

It's noteworthy that the select statement does not, according to this BoL entry, specify how the values in the result set relate to the values in the tables from which the results are derived! If that were true, the language would serve no useful function - but evidently someone at MS believes it to be true.

Perhaps the bug you describe is related to this bizarre belief ("qualify for the SELECT statement" is open to all sorts of interpretation, isn't it?). But I hope it's just an accidental bug.

Tom

Mike C SSC-Insane Points: 23224 More actions · Answer 4

Tom.Thomson (7/22/2010)
peter-757102 (7/22/2010)
I find this sort of thing many more times disturbing/dangerous then using a quirky update deliberatly and with full knowledge and documentation. Why? Because I suspect hardly anyone will know they have to protect their expressions to such a silly extend and the world therefore must be full of non-working code. In fact I expect that ** nearly all ** string processing code in use today to be sensitive to this problem. I knew expression order in the where clause sometimes must be enforced, but never that this problem extended beyond the where clause and into the select clause. Nor have I ever read of this problem anywhere!
I find this a total mess and hope it is just an SQL Server 2008 implementation bug that can be fixed and not standard behavior. Because if it is standard, I think the people responsible for setting the standard are in dire need of some practical re-education. And I express myself quite mildly here.
Are you sure this bug (if it's not a bug I will give up using T-SQL as it's not a useful language) is restricted to string processing code? Why not simple arithmetic too, so that I have to write
select case when T1.y <> 0.0
then T1.x/T1.y
else null
end as ratio
from T1
where IsNull(T1.y,0.0) <> 0.0
or something equally ridiculous?
There is an interesting statement in BoL at http://msdn.microsoft.com/en-us/library/ms190623.aspx
BoL
A SELECT statement defines only the following:
* The format of the result set. This is specified mostly in the select list. However, other clauses such as ORDER BY and GROUP BY also affect the final form of the result set.
* The tables that contain the source data. This is specified in the FROM clause.
* How the tables are logically related for the purposes of the SELECT statement. This is defined in the join specifications, which may appear in the WHERE clause or in an ON clause following FROM.
* The conditions that the rows in the source tables must satisfy to qualify for the SELECT statement. These are specified in the WHERE and HAVING clauses.
(similar statements are included in BoL for SQLS 2008, SQLS2005, SQLS 2000)
It's noteworthy that the select statement does not, according to this BoL entry, specify how the values in the result set relate to the values in the tables from which the results are derived! If that were true, the language would serve no useful function - but evidently someone at MS believes it to be true.
Perhaps the bug you describe is related to this bizarre belief ("qualify for the SELECT statement" is open to all sorts of interpretation, isn't it?). But I hope it's just an accidental bug.

Sorry to hear you're leaving T-SQL :crying: Unfortunately you probably won't find much love in any dialect of SQL. The problem (?) is that the SQL standard doesn't define physical processing steps. All that is left to the vendors. The standard simply says here's the keywords, here's the definition of the keywords, here's what you put in and here's what you get out. It defines logical processing order, but not the physical steps required to get from A to B. Every vendor uses their own physical implementation optimizations to get results back quickly, and that often means minimizing scans.

Let's walk through this briefly and see what steps would be required to implement the original query to perform it in logical order steps:

select

left( 'abc', T1.v )

from

T1

where

T1.v >= 0

;

Now let's say we have 10 million rows in T1 and 5 million of them match the where clause (T1.v >= 0). Here's what we need to do:

1. Scan the 10 million row table (look at every single row) and compare to 0.

2. Dump the 5 million rows that match the WHERE clause into a temp table or other intermediate storage.

3. Scan the 5 million rows that match the WHERE clause and perform the projection against them (left( 'abc', T1.v )).

We just scanned 15 million rows and stored 5 million rows in temporary storage to fulfill this query. Now let's look at one way we can optimize this:

1. Scan the 10 million row table.

2. Execute the WHERE clause and the projection at the same time on each row since both clauses are using the same columns.

Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs. Obviously if you wanted to force the physical order of execution you can always forcefully serialize the intermediate results yourself:

CREATE TABLE #t

(

v INT

);

INSERT INTO #t (v)

SELECT T1.v

FROM T1

WHERE T1.v >= 0;

SELECT LEFT( 'abc', #t.v)

FROM #t;

Mike C

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 5

Mike C (7/22/2010)
Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs.

Hmmm... sounds just like the rules for a quirky update. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 6

peter-757102 (7/22/2010)
Jeff Moden (7/22/2010)
Peter,
I've not tried this particular example but I'm pretty sure that you don't actually need the index hint when you use an ORDER BY to get the correct order. The MAXDOP 1 is the important part.
Check the link Jeff, you see that it is required to make it function as expected as the data is retrieved from a non-clustered covering index instead of the clustered key. In itself not a problem, where it not that the concatenation is processed BEFORE the order by takes effect, resulting in just one tiny string instead of a concatenation of all strings.

Thanks, Peter. I'll check.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Mike C SSC-Insane Points: 23224 More actions · Answer 7

Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs.
Hmmm... sounds just like the rules for a quirky update. 😉

What's that? If you divide two numbers make sure the dividend is not zero? :-). Most devs learned that rule in grade school 🙂

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 8

Mike C (7/22/2010)
Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs.
Hmmm... sounds just like the rules for a quirky update. 😉
What's that? If you divide two numbers make sure the dividend is not zero? :-). Most devs learned that rule in grade school 🙂

Nah... I was talking about the developer being careful. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Mike C SSC-Insane Points: 23224 More actions · Answer 9

Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs.
Hmmm... sounds just like the rules for a quirky update. 😉
What's that? If you divide two numbers make sure the dividend is not zero? :-). Most devs learned that rule in grade school 🙂
Nah... I was talking about the developer being careful. 😉

Ahhh, now I understand 😉 I suppose, like the quirky update, with the right combination of query hints, options, hardware, and indexes you could probably find a way to force the query optimizer to execute the WHERE clause first. At least until the optimizer comes up with a new plan. Knowing that the physical order of execution within a query is not guaranteed would make me hesitant to trust that solution though. I think I'd rather go with the slightly simpler solution, like this:

select

left( 'abc', NULLIF(T1.v, 0) )

from

T1

where

T1.v >= 0

;

Or this:

select T1.x / NULLIF(T1.y, 0) as ratio

from T1

where T1.y <> 0

But then again I tend to enjoy the simple things in life. 😀

TomThomson SSC Guru Points: 104773 More actions · Answer 10

Mike C (7/22/2010)
Sorry to hear you're leaving T-SQL :crying: Unfortunately you probably won't find much love in any dialect of SQL. The problem (?) is that the SQL standard doesn't define physical processing steps. All that is left to the vendors. The standard simply says here's the keywords, here's the definition of the keywords, here's what you put in and here's what you get out.

I think you are missing the point. The (silly) arithmetic example isn't what it's about, what it's about is the elimination of semantics from the T-SQL language. The BoL entry I quoted says quite clearly and plainly that the select statement + the state of the database when it is executed do NOT between them define what you get out. For example if I execute the following code

set transaction isolation level serializable

begin tran

create table #T (a int check (a=1) default (1), b int check (b=2) default(2))

insert #T (TABLOCKX) values (1,2)

select T.a+T.b as c from #T T

commit tran

I will get a result set of some rows with a single column, probably named c. It's not really clear that in the row(s) I get in that result set the single column c will have type int. It's far from clear that I'll get just one row. in fact the BoL page says that the select statement doesn't specify that. It is explicitly stated in that BoL page that this select statement is NOT guaranteed to return me the value 3 in column of the single (if I do get just one) row, since there the select statement specifies no relation between the values in the source tables and the values in the result set, except perhaps as affected by where and having clauses, if that is conceivable (unlikely, except maybe for aggregates): certainly not by the select clause or its output list, and it's the output list that matters here.

When this kind of idiocy appears in BoL we all should be terrified: that may be the MS philopsophy behind which parts of the select statement's apparent semantics (enshrined in the ANSI/ISO SQL standards, I believe, but apparently not in T-SQL) which the optimser (for SQL, if not for T-SQL) is required not to destroy. Or it may be just that someone who hasn't a clue has been writing chunks of BoL - and from your comment on this issue (and on that raised by another which prompted me to comment) I think you've failed to understand all of (1) just what that quotation says and (2) what most of us believe the semantics of SQL is (per ANSI and per ISO, despite what MS sometimes asserts to the contrary on silly BoL pages like the one I quoted).

I don't much care about the "quirky update". If one is adequately careful it works, provided one is careful to include adequate sanity checks in the code to ensure that it still works even if MS does something completely insane with the optimiser (as, according to Pete-<numbers> - sorry , I forgot which numbers - they have, although this particular insanity doesn't affect quirky update any more than it affects ordinary selects). I can't see any problem in using it (I'm a pessimist: SQL is such a defective language and no-one is going to fix it so MS will leave this rather silly extension in, with restrictions of course, instead of implementing a proper reduction operator; neither MS nor the standards bodies care about extended relational algebra operators, particularly not about the functional ones; but they will never be able to afford to destroy quirky update until they introduce some of those operators, so we can carry on using it with care). Of course "adequately careful" is fairly extreme - just as it is for fully specified, fully supported, and not even deprecated MS stuff (I've been bitten, sorry to sound sour about MS consistency).

Tom

Mike C SSC-Insane Points: 23224 More actions · Answer 11

Tom.Thomson (7/22/2010)
Mike C (7/22/2010)
Sorry to hear you're leaving T-SQL :crying: Unfortunately you probably won't find much love in any dialect of SQL. The problem (?) is that the SQL standard doesn't define physical processing steps. All that is left to the vendors. The standard simply says here's the keywords, here's the definition of the keywords, here's what you put in and here's what you get out.
I think you are missing the point. The (silly) arithmetic example isn't what it's about, what it's about is the elimination of semantics from the T-SQL language. The BoL entry I quoted says quite clearly and plainly that the select statement + the state of the database when it is executed do NOT between them define what you get out. For example if I execute the following code
set transaction isolation level serializable
begin tran
create table #T (a int check (a=1) default (1), b int check (b=2) default(2))
insert #T (TABLOCKX) values (1,2)
select T.a+T.b as c from #T T
commit tran
I will get a result set of some rows with a single column, probably named c. It's not really clear that in the row(s) I get in that result set the single column c will have type int. It's far from clear that I'll get just one row. in fact the BoL page says that the select statement doesn't specify that. It is explicitly stated in that BoL page that this select statement is NOT guaranteed to return me the value 3 in column of the single (if I do get just one) row, since there the select statement specifies no relation between the values in the source tables and the values in the result set, except perhaps as affected by where and having clauses, if that is conceivable (unlikely, except maybe for aggregates): certainly not by the select clause or its output list, and it's the output list that matters here.
When this kind of idiocy appears in BoL we all should be terrified: that may be the MS philopsophy behind which parts of the select statement's apparent semantics (enshrined in the ANSI/ISO SQL standards, I believe, but apparently not in T-SQL) which the optimser (for SQL, if not for T-SQL) is required not to destroy. Or it may be just that someone who hasn't a clue has been writing chunks of BoL - and from your comment on this issue (and on that raised by another which prompted me to comment) I think you've failed to understand all of (1) just what that quotation says and (2) what most of us believe the semantics of SQL is (per ANSI and per ISO, despite what MS sometimes asserts to the contrary on silly BoL pages like the one I quoted).
I don't much care about the "quirky update". If one is adequately careful it works, provided one is careful to include adequate sanity checks in the code to ensure that it still works even if MS does something completely insane with the optimiser (as, according to Pete-<numbers> - sorry , I forgot which numbers - they have, although this particular insanity doesn't affect quirky update any more than it affects ordinary selects). I can't see any problem in using it (I'm a pessimist: SQL is such a defective language and no-one is going to fix it so MS will leave this rather silly extension in, with restrictions of course, instead of implementing a proper reduction operator; neither MS nor the standards bodies care about extended relational algebra operators, particularly not about the functional ones; but they will never be able to afford to destroy quirky update until they introduce some of those operators, so we can carry on using it with care). Of course "adequately careful" is fairly extreme - just as it is for fully specified, fully supported, and not even deprecated MS stuff (I've been bitten, sorry to sound sour about MS consistency).

I agree, I am completely missing your point. It seems like you have more than one point to make in there, but I'm not getting any of them. Let me see if I can try to understand your points:

1. You're upset about code formatting by developers that makes the result of SQL unclear. Fortunately that's easily fixable, though it requires a commitment from the developers who write code. BTW, I believe the output from your code sample above will actually be an error message.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

CREATE TABLE #T

(

a INTEGER DEFAULT (1),

b INTEGER DEFAULT (2),

CONSTRAINT Allow_Nothing_But_1_In_a CHECK (a = 1),

CONSTRAINT Allow_Nothing_But_2_In_b CHECK (b = 2)

);

INSERT INTO #T (a, b) WITH (TABLOCKX)

VALUES (1,2);

SELECT (T.a + T.b) AS c -- This query returns 1 column; an INTEGER + an INTEGER yields an INTEGER

FROM #T T;

COMMIT TRANSACTION;

2. You're upset about mistakes and problematic documentation. I often find mistakes in technical documentation -- spent nearly a week picking apart the SQLCMD utility documentation in BOL when 2005 was first released, for instance. Fortunately there are a couple of ways to constructively deal with that issue. http://connect.microsoft.com and the Feedback button on the BOL and MSDN pages come to mind. There's also the web forums, which MS' documentation team often frequents soliciting feedback.

3. You don't like SQL in general. That's perfectly understandable -- there are a lot of people who don't like SQL, and that's a personal choice. Of course every query language, programming language, scripting language, markup language, or any other form of computer-based language has idiosyncracies and issues. Whether it's data type casting (implicit, explicit), lack of features, etc., there will always be room for improvement. MS accepts feature requests on their Connect website also.

peter-757102 SSCertifiable Points: 6877 More actions · Answer 12

I very much like SQL even while over the time I encounter or read about quite a few quirks. The problem I have with the case I brought forward is that it complicates code even for basic simple functionality which is the opposite of what I like so much about SQL. The real life case was a bit more complex, but my example demonstrated the principle quite nicely.

In its heart SQL is of course a data retrieval language and it does that excelent IMHO, often far beyond my expectations. But we frequently go out of this boundary and include all sorts of processing beyond simple test and include processing in select clauses as well. It is here that quirks show up that can spoil the party if you rightfully desire always perfectly working code. It can get complex and ugly (unreadable) real fast if you got many conditions to be met.

The SQL principle of describing what you want and let the server figure out how to get those results to you is perfectly fine. But it could be so much better if those exceptions where handled more graceful and never spoiled the end result at all. With this I mean that an exception in a row that does not contribute to the end result must be kept hidden IMHO. This would boost the usefulness and robustness of the language many times over without complicating work for the query writer.

Anyway, its an interesting sub-discussion and a bit off-topic here ;). I really have to make a simple as possible, working example case to demonstrate the issue. The code in question had run for over a year without any issue and suddenly there it was, broken for no obvious reason. Lucky for me, it was generic functionality and I am used to encapsulate this sort of logic in functions. I was able to fix this in one spot instead of re-writing and complicating many separate queries.

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 13

Mike C (7/22/2010)
Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs.
Hmmm... sounds just like the rules for a quirky update. 😉
What's that? If you divide two numbers make sure the dividend is not zero? :-). Most devs learned that rule in grade school 🙂
Nah... I was talking about the developer being careful. 😉
Ahhh, now I understand 😉 I suppose, like the quirky update, with the right combination of query hints, options, hardware, and indexes you could probably find a way to force the query optimizer to execute the WHERE clause first. At least until the optimizer comes up with a new plan. Knowing that the physical order of execution within a query is not guaranteed would make me hesitant to trust that solution though.

No... you don't understand, Mike. People jump through all sorts of hoops with indexes and whether or not to use include and whether or not to use covering indexes. A great amount of time is spent explaining clustered indexes and the fact that non clustered indexes are appended with the clustered index and how to defrag and rebuild indexes. Why? To trick the optimizer into doing things faster and that's all I've done with the quirky update. You say it's not quaranteed but even a SELECT with a WHERE clause isn't guaranteed. SQL SERVER 2000 had to have a hot fix after sp2 because if parallelism occurred during an UPDATE that had a WHERE clause with an IS NULL in it, the IS NULL was ignored and all rows were updated. I know this because it happened to the folks at work. NOTHING is guaranteed to work so use everything that you can. Or not... use a cursor if it makes you nervous. Just stop bad mouthing folks that have found things that work that you can't break if you follow some simple rules... guaranteed or not and whether they go against your grain or not.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Mike C SSC-Insane Points: 23224 More actions · Answer 14

Jeff Moden (7/23/2010)
Mike C (7/22/2010)
Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Jeff Moden (7/22/2010)
Mike C (7/22/2010)
Obviously the so-called "buggy" execution plan requires the developer to be more careful, but gracefully handling invalid function parameter and divide-by-zero errors are not new problems in programming and shouldn't come as a surprise to most devs.
Hmmm... sounds just like the rules for a quirky update. 😉
What's that? If you divide two numbers make sure the dividend is not zero? :-). Most devs learned that rule in grade school 🙂
Nah... I was talking about the developer being careful. 😉
Ahhh, now I understand 😉 I suppose, like the quirky update, with the right combination of query hints, options, hardware, and indexes you could probably find a way to force the query optimizer to execute the WHERE clause first. At least until the optimizer comes up with a new plan. Knowing that the physical order of execution within a query is not guaranteed would make me hesitant to trust that solution though.
No... you don't understand, Mike. People jump through all sorts of hoops with indexes and whether or not to use include and whether or not to use covering indexes. A great amount of time is spent explaining clustered indexes and the fact that non clustered indexes are appended with the clustered index and how to defrag and rebuild indexes. Why? To trick the optimizer into doing things faster and that's all I've done with the quirky update. You say it's not quaranteed but even a SELECT with a WHERE clause isn't guaranteed. SQL SERVER 2000 had to have a hot fix after sp2 because if parallelism occurred during an UPDATE that had a WHERE clause with an IS NULL in it, the IS NULL was ignored and all rows were updated. I know this because it happened to the folks at work. NOTHING is guaranteed to work so use everything that you can. Or not... use a cursor if it makes you nervous. Just stop bad mouthing folks that have found things that work that you can't break if you follow some simple rules... guaranteed or not and whether they go against your grain or not.

I'm in total agreement with you 99.9% of the time on just about every topic Jeff, and you know that well. The fact that people spend a "lot of time" tweaking indexes to improve performance is not the same as spending time tweaking indexes to "change the content of your end result"; at least not in my mind. The two concepts to me seem disconnected.

I'm not particularly interested in convincing you that indexes (or lack of them) should not change the content of your results. Or number of processors, etc. If you believe the content of your output should be determined by your indexes, index hints, etc. then go for it.

At any rate, I respect your opinion and since you feel I'm "bad mouthing folks", I'm outta here and done with this thread.

Out here.

Mike C

UMG Developer SSChampion Points: 13482 More actions · Answer 15

Tom.Thomson (7/22/2010)

Mike C (7/22/2010)
set transaction isolation level serializable begin tran create table #T (a int check (a=1) default (1), b int check (b=2) default(2)) insert #T (TABLOCKX) values (1,2) select T.a+T.b as c from #T T commit tran
I will get a result set of some rows with a single column, probably named c. It's not really clear that in the row(s) I get in that result set the single column c will have type int. It's far from clear that I'll get just one row. in fact the BoL page says that the select statement doesn't specify that. It is explicitly stated in that BoL page that this select statement is NOT guaranteed to return me the value 3 in column of the single (if I do get just one) row, since there the select statement specifies no relation between the values in the source tables and the values in the result set, except perhaps as affected by where and having clauses, if that is conceivable (unlikely, except maybe for aggregates): certainly not by the select clause or its output list, and it's the output list that matters here.

I don't understand, that is a basic SELECT statement with no quirky update or aggregation, so since you have one record in the table you will get one record returned. (Well if the syntax errors are fixed first.) If you have 2 records in the table you would get two records back with a single column that contained a value of 3. (Unless one of the records contains a NULL in either column in which case the corresponding row will return NULL for the value of the C column.)