Unexpected consequence of update from multiple rows

Question

Post reply

Unexpected consequence of update from multiple rows

Glen Harland

SSC Enthusiast

Points: 153
More actions
October 28, 2009 at 6:35 pm

#211475

I was surprised recently when a query that updates a record from a recordset with multiple rows succeeded without giving an error and seemed to arbitrarily select the row used in the update.
If you attempt to update a record from a subquery expression that returns multiple rows, you get an error, as you would expect. When the multiple rows are from a joined subquery however (using the FROM clause of the update statement), the update succeeds and one of the rows is used for the update.
My expectation was that both updates would fail. I'm pretty certain that some other database engines would fail in both situations (e.g. Ingres?). Does anyone have an explanation for why the update succeeds and which row gets used for the update?
Here is some code to demonstrate:
-- Create tables.
if exists (select name from [dbo].sysobjects
where name = 'temp_account' and type = 'U')
drop table temp_account
if exists (select name from [dbo].sysobjects
where name = 'temp_transaction' and type = 'U')
drop table temp_transaction
create table temp_account
(account_key varchar(10) not null,
account_name varchar(40) null,
balance01 money null,
balance02 money null)
create table temp_transaction
(tid integer identity (1,1) not null,
account_key varchar(10) not null,
trans_type varchar(1) not null,
trans_amount money not null)
-- Test data.
insert into temp_account
(account_key,
account_name,
balance01,
balance02)
values ('TEST001','Test Account 1',0,0)
insert into temp_account
(account_key,
account_name,
balance01,
balance02)
values ('TEST002','Test Account 2',0,0)
insert into temp_transaction
(account_key,
trans_type,
trans_amount
)
values ('TEST001','P',120)
insert into temp_transaction
(account_key,
trans_type,
trans_amount
)
values ('TEST001','P',222)
insert into temp_transaction
(account_key,
trans_type,
trans_amount
)
values ('TEST001','R',-40)
-- Update single record from multiple rows - succeeds and arbitrarily selects row to use in update.
-- Balance is set to 342 which is sum for P transaction type. R transactions are ignored.
-- Or is balance first updated by R transactions, then P (row-by-row)?
update temp_account
set balance01 = ta.balance01 + rs1.trans_amount
from temp_account as ta inner join
(select tt.account_key,
tt.trans_type,
sum(tt.trans_amount) as trans_amount
from temp_transaction as tt
where tt.trans_type in ('P','R')
group by tt.account_key, tt.trans_type
) as rs1
on ta.account_key = rs1.account_key
select *
from temp_account
-- Update from multiple rows - fails with error message, as expected.
-- Msg 512, Level 16, State 1, Line 1
-- Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
update temp_account
set balance02 = (select sum(tt.trans_amount) as trans_amount
from temp_transaction as tt
where tt.trans_type in ('P','R')
and tt.account_key = ta.account_key
group by tt.trans_type)
from temp_account as ta
select *
from temp_account

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 1

The problem is in your code. Please note the changes I have made in both versions.

update temp_account set

balance01 = ta.balance01 + rs1.trans_amount

from

temp_account as ta inner join

(select

tt.account_key,

-- tt.trans_type,

sum(tt.trans_amount) as trans_amount

from

temp_transaction as tt

where

tt.trans_type in ('P','R')

group by

tt.account_key

) as rs1

on ta.account_key = rs1.account_key

select *

from temp_account

update temp_account set

balance02 = isnull(

( select

sum(tt.trans_amount) as trans_amount

from

temp_transaction as tt

where

tt.trans_type in ('P','R')

and tt.account_key = ta.account_key

group by

tt.account_key), 0)

from

temp_account as ta

select *

from temp_account

The Dixie Flatline SSC Guru Points: 53353 More actions · Answer 2

If you attempt to update a record from a subquery expression that returns multiple rows, you get an error, as you would expect.

The situation you are describing here is one in which the subquery is being used as an expression. That is to say it is being used in a context where only a single value is expected to be returned, as if from another column, a function, a formula, a CASE expression, etc.

UPDATE OrderLog

SET OrderTotal = (select sum(amt) from OrderDetail where OrderID = 1075)

-- not -- SET OrderTotal = (select sum(amt), count(*) from OrderDetail where OrderID = 1075)

-- not -- SET OrderTotal = (select amt from OrderDetail where OrderID = 1075)

-- (in situations where any orders would have multiple amts)

WHERE OrderID = 1075

In this context, receiving multiple values from an expression is unacceptable, and the error results. An error would result if it were multiple columns being returned as well as multiple rows.

However, in the second case there are multiple rows in the input set. But the target column is only being set to a single value. Looking at the example below, it appears that SQL is arbitrarily deciding which row to get that single value from. I can't cite authority for this, but I would bet money that it is simply using the first valid source value it finds. This makes sense if you think about it, because waiting for the rest of the valid rows in the set would be just as arbitrary and would waste resources.

declare @Target table (rowID int identity(1,1), data varchar(100))

insert into @target

select 'Apples'

declare @Source table (data varchar(100))

insert into @Source

select 'Oranges' union all

select 'Bananas' union all

select 'Watermelons'

select 'Before' as [@source],* from @source

select 'Before' as [@target],* from @target

update @target

set data = src.data

output deleted.data as [from], inserted.data as [to]

from @target t

cross join @source src

where t.rowID =1

select 'After' as [@target],* from @target

It is the developer's responsibility to ensure that the correct value is used to update, and I can sympathize with your desire for an error message to keep unexpected multiple rows out. But this is the way it works.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Glen Harland SSC Enthusiast Points: 153 More actions · Answer 3

Lynn Pettis (10/28/2009)
The problem is in your code. Please note the changes I have made in both versions.

Hi Lynn, and thanks. Perhaps I wasn't clear about my purpose. I know the code is wrong and what the problems are. I was just surprised that any of the update statements in my example code work. I thought they would both cause an error and perhaps SQL Server is mis-behaving. The results of the update that works seem arbitrary - something that is unusual, in my experience, with SQL programming and set theory.

Glen Harland SSC Enthusiast Points: 153 More actions · Answer 4

It is the developer's responsibility to ensure that the correct value is used to update, and I can sympathize with your desire for an error message to keep unexpected multiple rows out. But this is the way it works.

Yes, I think you are right, it is taking the first value it comes across, which is surprising to me. Thanks for your response.

The Dixie Flatline SSC Guru Points: 53353 More actions · Answer 5

You're welcome. Best of luck to you, Glen.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 6

Glen Harland (10/28/2009)
It is the developer's responsibility to ensure that the correct value is used to update, and I can sympathize with your desire for an error message to keep unexpected multiple rows out. But this is the way it works.
Yes, I think you are right, it is taking the first value it comes across, which is surprising to me. Thanks for your response.

As with many other such subtle points, it IS unfortunately documented, but rather poorly. If you read through BOL's description of the UPDATE statement, some where buried in the middle is a statement to the effect that "if I row is going to be updated as a result of a one to many join, the update occurs based on the first row it encounters, and that this row is then locked and cannot be updated any further" by that same update statement.

I know that doesn't help your current conundrum.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Glen Harland SSC Enthusiast Points: 153 More actions · Answer 7

Thanks for everyone's comments.

Just to illustrate that the SQL Server behaviour is not necessarily obvious ...

With the test data I used above, the following SQL Server query results in 120 (first row found):

UPDATE temp_account

SET temp_account.balance01 = temp_account.balance01+temp_transaction.trans_amount

from temp_account INNER JOIN temp_transaction

ON temp_account.account_key=temp_transaction.account_key

The same query in Sybase ASA results in 302 - the sum of the three rows.

In Access 2003, the following equivalent query also results in 302 (sum of the three rows):

UPDATE temp_account INNER JOIN temp_transaction

ON temp_account.account_key=temp_transaction.account_key

SET temp_account.balance01 = temp_account.balance01+temp_transaction.trans_amount;

So SQL Server takes the first row, Sybase ASA and Access 2003 add the rows up, and from memory, Ingres reports an error (Ingres circa 1998 - I can't verify this since I don't have access to the software).

Is SQL Server the only database that arbitrarily takes the first row? Is there a standard? Does it matter? 😉