select statement with inner join returns more rows than equivalent update?

Question

select statement with inner join returns more rows than equivalent update?

petlock

Right there with Babe

Points: 717
More actions
February 24, 2010 at 11:29 am

#216011

I have one table in an accounting system that I need to update based on a spreadsheet. The spreadsheet has an identifier for the records that need to be touched. When I look at these two queries I consider them to be identical in scope but clearly SQL server does not. "Table1" is the target table in the accounting database to RECEIVE the update. "Table2" is the spreadsheet that was brought into SQL 2005 using the linked server method but then was selected into a table also located in the accounting database to keep the syntax easy.
select ACS.Contract_Start_Date, zzp.start
FROM table1 as ACS INNER JOIN
table2 zzp ON ACS.Alarm_Account = zzp.SG#
WHERE (ACS.Inactive = 'N')
This returns 560 rows.
However when I run
update ACS set ACS.Contract_Start_Date = zzp.start
FROM table1 as ACS INNER JOIN
table2 zzp ON ACS.Alarm_Account = zzp.SG#
WHERE (ACS.Inactive = 'N')
it touches only 346 rows and I don't understand why.
Table1's design (leaving only the relevant fields the query touches) is
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[table1](
[Customer_System_Id] [int] IDENTITY(1,1) NOT NULL,
[Customer_Id] [int] NOT NULL,
[Alarm_Account] [nvarchar](25) NULL,
[Contract_Start_Date] [datetime] NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[table2](
[PAS_AR] [float] NULL,
[PAS#] [nvarchar](255) NULL,
[SG#] [nvarchar](255) NULL,
[Customer Name] [nvarchar](255) NULL,
[Site Name] [nvarchar](255) NULL,
[Site Address] [nvarchar](255) NULL,
[Start] [datetime] NULL,
[Term] [float] NULL,
[Renewal] [float] NULL
) ON [PRIMARY]
I don't think it's due to a difference in data types since I'm setting a datetime with another datetime and the join would seem to be OK since the number of rows (560) referenced by the select is the number of rows in the spreadsheet. It's got to be some syntax I'm not understanding in the update.
Any ideas?

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 1

It sounds to me that table1 and table2 have a 1 to many relation. In other words - the "left" side of the join is providing 346 rows, and the "right" side sometimes has multiple "children" to the matching rows on the left.

The update is going to counted only the rows actually updated (i.e. just the left side). This brings up another question for you to handle: assuming a row in Table1 relates to more than one row in table2, which one of those rows in table2 should provide the data? As of right now - you likely are getting unpredictable results isnce it will use the first child value it finds (which isn't predictable and could change between execution runs.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

petlock Right there with Babe Points: 717 More actions · Answer 2

THANKS for looking at it and... I mis-spoke. My spreadsheet has 569 rows... saw the 0 as a 9. Yes, there's a 1-to-many relation between table1 and table2. I consider that table1's information is "more correct" than in "table2." There could be duplicates of each on both sides which is why I'm testing for invalid although i know the join doesn't take the where clause into account. I don't know how to include the formula in the join although I've seen it done. That still won't ensure lack of duplicates but it'll be close.

My problem is... I need to be able to identify "which" records got updated and which did not so that the user who created the spreadsheet can go back and manually do the ones they typo'd.

WayneS SSC Guru Points: 95405 More actions · Answer 3

Use the OUTPUT clause on the UPDATE statement to return back the rows being updated.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

petlock Right there with Babe Points: 717 More actions · Answer 4

Thanks... is there a join type that will support a one-to-many update, if only to mark the occurrences in the original file as rejects? So far google searches haven't yielded a clear answer 🙂

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 5

Sorry to say - but no. It's baked into the UPDATE (it's actually described, albeit badly in the BOL entry about UPDATE).

If you have any ability to add a column, I would consider adding the primary k from table2 in table1 ("where did I get the START column"). Otherwise a subsequent join back to table1 back on account_alarm AND the start shoul be able to tell you which row did the update.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

petlock Right there with Babe Points: 717 More actions · Answer 6

Where would an output clause fit into this query? BOL is, as usual, 99% incomprehensible. I want to output ACS.Alarm_Account into table3 ?

(table3 does not yet exist, I'm thinking like select into) This isn't contained in a procedure so I don't know that I can use a @temp table

update ACS set ACS.Contract_Start_Date = zzp.start

FROM table1 as ACS INNER JOIN

table2 zzp ON ACS.Alarm_Account = zzp.SG#

WHERE (ACS.Inactive = 'N')

I'm putting the output clause before the from:

update ACS set ACS.Contract_Start_Date = zzp.start output ACS.Alarm_Account

FROM table1 as ACS INNER JOIN

table2 zzp ON ACS.Alarm_Account = zzp.SG#

WHERE (ACS.Inactive = 'N')

Without the output the right number of rows is returned.

With output I get "The multi-part identifier ACS.Alarm_Account could not be bound"

Thanks for your help I'm soooo close to having this done.

petlock Right there with Babe Points: 717 More actions · Answer 7

I figured out where OUTPUT goes but it doesn't help on the production database.

update ACS set ACS.Contract_Start_Date = zzp.start output inserted.Alarm_Account

FROM table1 as ACS INNER JOIN

table2 zzp ON ACS.Alarm_Account = zzp.SG#

WHERE (ACS.Inactive = 'N')

returns "The target table table1 of the DML statement cannot have any enabled triggers if the statement contains an OUTPUT clause without INTO clause."

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 8

Create a table variable or temp table, and output into it.

declare table @outputtbl(alarm_account int);

update ACS set ACS.Contract_Start_Date = zzp.start

output inserted.Alarm_Account

into @outputtbl(alarm_account)

FROM table1 as ACS INNER JOIN

table2 zzp ON ACS.Alarm_Account = zzp.SG#

WHERE (ACS.Inactive = 'N');

Select * from @outputtbl;

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?