Single Data flow containing parallel Source and Destination Flows

  • hi All,

    I am using a single data flow that populates OLE DB staging tables from their respective OLE DB data sources, so they all run in parallel. My question: the next task is an Execute SQL Task that has to run only when the Data Flow Task has completely finished, i.e. all staging tables have been populated. Can I assume the data flow is synchronous, in that the package will only proceed to the next control flow task when the data flow has completed, or do I have to do something else to check that it's done? If so, what is the de facto/best-practice way of doing this? Should I wrap the data flow in a Sequence container and look at precedence constraints? I'm in the early days of SSIS, so bear with me... I need a steer.

    thx

    Robin

  • The data flow will only finish when every component in it has finished, in other words, when every parallel path has finished. So you're good: just connect the Execute SQL Task to the data flow with a green (success) precedence constraint arrow.

    Be aware: having too many parallel flows in one data flow can hurt performance. Keep it to about five parallel flows.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP
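
    The completion guarantee described above can be modeled outside SSIS. Below is a minimal Python sketch (the `load_table` function is a hypothetical stand-in for one source-to-destination path inside the Data Flow Task, and the table names are made up): the downstream step only starts once every parallel branch has finished, which is the same guarantee a success precedence constraint gives the Execute SQL Task.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical stand-in for one source -> destination path
    # inside the single Data Flow Task.
    def load_table(name):
        # ... extract from an OLE DB source, write to a staging table ...
        return f"{name} loaded"

    staging_tables = ["Customers", "Orders", "Products"]

    # The Data Flow Task: all paths run in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(load_table, staging_tables))

    # This point is only reached after every branch above has completed --
    # analogous to the green arrow into the Execute SQL Task.
    def execute_sql_task():
        return "post-load SQL ran"

    assert len(results) == len(staging_tables)
    print(execute_sql_task())
    ```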

  • robinrai3 (6/28/2012) [quoted above]

    Not sure whether it is 'best practice', but I would say that having a single dataflow for multiple sources and destinations is likely to be difficult to maintain and troubleshoot.

    My preference would be to construct multiple data flows - one for each source/destination pair - and put them all in a Sequence container, maintaining your required degree of parallelism. Connect the Execute SQL Task to the Sequence container to enforce your required processing order.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.
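
    The Sequence-container design above can be sketched the same way. In this hypothetical Python model, `run_dataflow` stands in for one per-table data flow, the thread pool plays the role of the Sequence container, and `max_workers` caps concurrency much as you might tune the package's MaxConcurrentExecutables property in SSIS; the table names are invented for illustration.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical per-table data flow: one source -> one destination.
    def run_dataflow(table):
        return f"{table} staged"

    tables = ["Customers", "Orders", "Products", "Invoices", "Shipments"]

    # The Sequence container: runs its child data flows, with parallelism
    # capped (here at 2, analogous to limiting concurrent executables).
    with ThreadPoolExecutor(max_workers=2) as container:
        staged = list(container.map(run_dataflow, tables))

    # Precedence constraint from the container: this step runs only after
    # the whole container (all five data flows) has succeeded.
    print("Execute SQL Task runs now;", len(staged), "tables staged")
    ```

    The advantage over one big data flow is isolation: each data flow can be run, logged, and retried on its own, while the container still gives the Execute SQL Task a single completion point to wait on.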
