Single Data flow containing parallel Source and Destination Flows

  • hi All,

    I am using a single data flow that populates OLE DB staging tables from their respective OLE DB data sources, so they all run in parallel. My question: the next task is an Execute SQL Task that has to run only when the Data Flow Task has completely finished, i.e. all staging tables have been populated. Can I assume the data flow is synchronous, in that the package will only proceed to the next control flow task when the data flow has completed, or do I have to do something else to check that it's done? If so, what is the de facto/best-practice way of doing this? Should I wrap the data flow in a Sequence container and look at precedence constraints? I'm in the early days of SSIS, so bear with me... I need a steer.

    thx

    Robin

  • The data flow will only finish when every component in it has finished, in other words, when every parallel path has finished. So you're good: just connect the Execute SQL Task to the data flow with a green (success) precedence constraint arrow.

    Be aware: having too many parallel flows in one data flow can hurt performance. Keep it to about five parallel flows.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP
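
    The completion guarantee described above can be modeled outside SSIS. Below is a minimal Python sketch (the `load_table` function is a hypothetical stand-in for one source-to-destination path inside the Data Flow Task, and the table names are made up): the downstream step only starts once every parallel branch has finished, which is the same guarantee a success precedence constraint gives the Execute SQL Task.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical stand-in for one source -> destination path
    # inside the single Data Flow Task.
    def load_table(name):
        # ... extract from an OLE DB source, write to a staging table ...
        return f"{name} loaded"

    staging_tables = ["Customers", "Orders", "Products"]

    # The Data Flow Task: all paths run in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(load_table, staging_tables))

    # This point is only reached after every branch above has completed --
    # analogous to the green arrow into the Execute SQL Task.
    def execute_sql_task():
        return "post-load SQL ran"

    assert len(results) == len(staging_tables)
    print(execute_sql_task())
    ```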

  • robinrai3 (6/28/2012) [quoted above]

    Not sure whether it is 'best practice', but I would say that having a single dataflow for multiple sources and destinations is likely to be difficult to maintain and troubleshoot.

    My preference would be to construct multiple data flows - one for each source/destination pair - and put them all in a Sequence container, maintaining your required degree of parallelism. Connect the Execute SQL Task to the Sequence container to enforce your required processing order.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.
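
    The Sequence-container design above can be sketched the same way. In this hypothetical Python model, `run_dataflow` stands in for one per-table data flow, the thread pool plays the role of the Sequence container, and `max_workers` caps concurrency much as you might tune the package's MaxConcurrentExecutables property in SSIS; the table names are invented for illustration.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical per-table data flow: one source -> one destination.
    def run_dataflow(table):
        return f"{table} staged"

    tables = ["Customers", "Orders", "Products", "Invoices", "Shipments"]

    # The Sequence container: runs its child data flows, with parallelism
    # capped (here at 2, analogous to limiting concurrent executables).
    with ThreadPoolExecutor(max_workers=2) as container:
        staged = list(container.map(run_dataflow, tables))

    # Precedence constraint from the container: this step runs only after
    # the whole container (all five data flows) has succeeded.
    print("Execute SQL Task runs now;", len(staged), "tables staged")
    ```

    The advantage over one big data flow is isolation: each data flow can be run, logged, and retried on its own, while the container still gives the Execute SQL Task a single completion point to wait on.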
