Very basic question - Insert or Update?

Question

Very basic question - Insert or Update?

rosedd

Mr or Mrs. 500

Points: 570
More actions
January 21, 2005 at 7:23 am

#61227

Assuming that I do not know if a record exists (and about a 50% chance that it does), what is the most efficient method of either inserting or updating?
Check first for existence.
Try the update. If nothing updated, insert.
Anything else?
TIA
David

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply

Wayne Lawton Hall of Fame Points: 3532 More actions · Answer 1

G'day,

Strictly as a matter of personal preference, I prefer to check first then either update or insert as appropriate. Why? Because tyring the update and handling a failure is essentially building an error handler for a "normal" condition. My preference, as a matter of style, is to use error handlers for errors. But I freely admit, this is largely a matter of personal preference and style.

Wayne

rosedd Mr or Mrs. 500 Points: 570 More actions · Answer 2

Wayne,

For some reason I thought that a failed update did not set up an error. I thought that it "succeeded", but the row count returned 0.

David

Edwin Frydendall SSCrazy Points: 2361 More actions · Answer 3

Edwin Frydendall

SSCrazy

Points: 2361

January 21, 2005 at 8:13 am

#538522

If Exists()

Update

Else

Insert

bdohmen SSCrazy Points: 2996 More actions · Answer 4

Why not check the rowcount after initiating the update statement. If nothing is updated, then issue an insert statement.

UPDATE ...

IF @@ROWCOUNT=0 -- No records updated, so it doesn't exist and needs to be inserted

INSERT ...

Brian

rosedd Mr or Mrs. 500 Points: 570 More actions · Answer 5

Brian,

That is what I was thinking - try to update and if the rowcount is 0 do the insert.

I guess that my question boils down to which is more efficient:

select count(*)... and then update or insert (whichever is appropriate)

or just try to update, inserting if no records are updated

If the update attempt is not much more costly than count(*), then it is probably better because the update will be successful a number of times.

David

PBirch SSCrazy Eights Points: 8483 More actions · Answer 6

It is probably more efficient to use IF EXISTS() instead of Count(). Depends on the size of your table. Exists checks for the values and stops when it finds the first one. Count will count everything whether it finds something or not.

Just a thought.

Quand on parle du loup, on en voit la queue

Frank Kalis SSC Guru Points: 111183 More actions · Answer 7

EXISTS() will almost certainly be faster than COUNT(). EXISTS can stop once it has detected one single row that matches the condition and returns TRUE, while COUNT needs to run through all rows. Even after it has found a matching row.

But, what I would like to know. It this a regular load process? Just asking, because it might be a chance that you can truncate the table first and than insert all rows rather than trying costly to determine if there are already existing rows or not.

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Schalk Lubbe SSC Veteran Points: 206 More actions · Answer 8

Hi

I agree with Wayne and Edwin. It might be a matter of taste, but it just looks neater (and better planned) to go the route of

if exists()

update

else

delete

Schalk

rosedd Mr or Mrs. 500 Points: 570 More actions · Answer 9

Thanks Everyone,

Frank, this is an Internet application where the insert/update happens one record a time, so I cannot truncate the table.

Looks like if exists() gets the nod. It does seem cleaner to me too.

David

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 10

On the other hand... it depends on how critical performance is, but actually performing an exists check before doing either an update or an insert is just a waste of time and resources

Why then, you may wonder?

Well, you are 100% certain that you want either an update (50% of the time) or an insert (50% of the time). Since it's a tie, it doesn't matter much if you attempt an update before an insert or the other way around, what you do know, is that you already have the PK for the table known... If you hadn't, you couldn't check for existance in the first place, right?

Anyway, instead of always doing two statements 100% of the time (ie always exists followed by either update or insert) you could instead do two statements 50% of the time (ie when there was nothing to update, an insert was required) and just one statement the other 50% of the time (ie it was an update, and then nothing more)

For that purpose, it would be the most efficient route to just attempt the update right away, without wasting time on other stuff. When the update is done, check @@rowcount as suggested earlier. 50% of the time, the proc ends here, the other 50% it moves on to do the insert. If you always check for existance first - you will always do 100% work - not 50/50 + 100/50

/Kenneth

noeld SSC Guru Points: 96590 More actions · Answer 11

And Adding to Kenneth the

if Exists()

Update

else

Insert

is a BAD assumption. Supposed you are on a heavily transactional environment (Web App is a good candidate) and the exists Fails for two users at the same time, that will meant that one of them will be able to insert and the other one won't unless you use higher isolation level lockings which will indeed reduce performance if you follow that route.

Please, do as Kenneth explained above (DIRECT UPDATE with Key or DIRECT INSERT TESTING FOR KEY NON EXISTENCE) and you will be OK

HTH

* Noel

rosedd Mr or Mrs. 500 Points: 570 More actions · Answer 12

Just when I thought that it was all settled...

This is not a heavily transactional environment and the only person likely to update or insert a record is the person who belongs to the ID affected, but this issue will come up again.

I had thought that the deciding factor would be how efficient/inefficient an attempted update is. i.e., if it is efficient, why not try the update (you'll succeed a percentage of the time). If it is not efficient, then it makes sense to test for existence first.

Noel brought up a good point about two exists() failing and both users then attempt to insert. But what happens if two users try to update a record that does not exist, at the same time? Wouldn't that be the same as both testing for existence? Wouldn't they both attempt to insert at the same time if their updates failed? How does record locking work here? Would it lock for the update, but not the exists()? Couldn't there still be a problem?

David

noeld SSC Guru Points: 96590 More actions · Answer 13

David, Let me simplify the scenario so that you understand what I mean:

Update TblName Set fld1 = x, Fld2 =y ... where pk =@pkey

if @@rowcount = 0

begin

insert into TblName (fld1, fld2)

select x, y -- no from clause, just a single record

where not exist (select * from TblName where pk = @pkey)

end

I am leaving @@error checking out of the equation for simplicity but have a look at it. If both updates fail one of the inserts is going to succeed before the other and then you still get atomicity in the process. Some designers opt for the Table lock mechanism to simulate pessimistic locking and others will simply perform 2 round trips with a single action but without a doubt this is the fastest (although not the most flexible) approach

Do you see it?

HTH

* Noel

Wayne Lawton Hall of Fame Points: 3532 More actions · Answer 14

hmmm. It seems we have made the original question a little broader by introducing the related issues of reliability and atomicity. Dare I suggest we might consider a transaction wrapping the actual insert/update statements? If we introduce transactions then I suspect the overhead of begin/commit/rollback may well make the original question moot.

Next question: how is "update followed by a row check" better or worse than an "if exists followed by an update"? In either case we will subsequently do an insert 50% of the time.

One possibility might be the actual server activity associated with implementing the "exists" check versus the server activity implementing the update to @@rowcount and the subsequent test of @@rowcount. I do not personally know the internals well enough to assess which is faster under the hood.

Thoughts?

Have a great day!

Wayne