SSIS Loading data from row number

Question

SSIS Loading data from row number

Jonathan Marshall

SSCrazy

Points: 2633
More actions
June 10, 2012 at 8:56 am

#257428

I started a load that ended up failing because of an SSIS error
SSIS Error Code DTS_E_PRIMEOUTPUTFAILED
This is just loading data from one table to another.
Wondering if I can start where it failed.
So if 1000 rows were loaded
Start a load from 1001.
Any suggestions!
Jonathan

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259015 More actions · Answer 1

I don't think you can.

That would require the SELECT statement to return results skipping certain rows, which is practically impossible without an ORDER BY clause and hence virtually unfeasible to implement.

I would just solve the error and rerun the package.

If data is already inserted before the error, you should consider transactions or a way to detect which rows were already inserted.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Jonathan Marshall SSCrazy Points: 2633 More actions · Answer 2

Forgot to lock out a certain group from the database when doing a dataload.

So your saying I cannot load data between a certain number of rows

Error 0xc0202009: Data Flow Task 1: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.

An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80004005 Description: "Transaction (Process ID 66) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.".

Jonathan

Jonathan Marshall SSCrazy Points: 2633 More actions · Answer 3

Is there a better way of loading data of a hundred million+ rows.

If it fails you have to start from the beginning and lose over a days worth of processing.

There isn't a way to start from a certain row and load the rest of the data?

I would believe there has to be some solution when loading lots of data.

Jonathan

Koen Verbeeck SSC Guru Points: 259015 More actions · Answer 4

Jonathan Marshall (6/11/2012)
Is there a better way of loading data of a hundred million+ rows.
If it fails you have to start from the beginning and lose over a days worth of processing.
There isn't a way to start from a certain row and load the rest of the data?
I would believe there has to be some solution when loading lots of data.
Jonathan

If a transaction fails, it fails.

However, if your data load is so huge, you'd be better of splitting it in manageable chunks and load them in parallel.

If one fails, you wont lose as much time.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 5

You can load between ranges, which is what you're looking for, but not by row#, that's a construction meant for our sanity, not for use. You can use ranges of Identity columns though.

One thing you might look into if this is a significant problem is using checkpoints and wrapping the chunks into one, so on restart you only pick up where the failure occurred. You'll need to double check what you need to do for cleanups but it's an option.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Lynn Pettis SSC Guru Points: 442447 More actions · Answer 6

Evil Kraig F (6/11/2012)
You can load between ranges, which is what you're looking for, but not by row#, that's a construction meant for our sanity, not for use. You can use ranges of Identity columns though.
One thing you might look into if this is a significant problem is using checkpoints and wrapping the chunks into one, so on restart you only pick up where the failure occurred. You'll need to double check what you need to do for cleanups but it's an option.

IIRC, checkpoints don't work on a loop. There may have even been an article written on this here on SSC.

Definitely something to look into before implementing.

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 7

Lynn Pettis (6/11/2012)
Evil Kraig F (6/11/2012)
You can load between ranges, which is what you're looking for, but not by row#, that's a construction meant for our sanity, not for use. You can use ranges of Identity columns though.
One thing you might look into if this is a significant problem is using checkpoints and wrapping the chunks into one, so on restart you only pick up where the failure occurred. You'll need to double check what you need to do for cleanups but it's an option.
IIRC, checkpoints don't work on a loop. There may have even been an article written on this here on SSC.
Definitely something to look into before implementing.

Hrm, swore they did... alright, ignore that advice for now! :hehe: I haven't done enough with checkpoints to know either way for sure.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Jarom Richardson SSC Enthusiast Points: 195 More actions · Answer 8

I am not sure exactly what you are trying to do here, but it looks like you are getting your error in the source of your data flow. Sometimes that error is caused by changing data types within the source query, but it could be something else as well.

As far as loading by number, depending on what you are trying to do and how the data works, I think there are some possibilities. If there is a identity key that that you are using that always auto increments and that is what you are keying off of, then you could try the following:

Control Flow: Execute SQL task

Result Set = Single Row

Variable = Some Integer variable to hold value (lets call it iKey)

SQL = SELECT MAX(IntegerKey) FROM Table

This should return the highest number and store it in your variable

Then you can use that iKey in your Source in your data flow if you use a variable to load your SQL from. In order to do that you will either need to use an Expression to construct your source SQL using the iKey in your WHERE clause so that you are only selecting records greater than the value of iKey. The other option is to use a script task to construct the SQL variable if you are not comfortable with Expressions.

Hope this helps.