Change multiple rows to Distinct in Data Flow Task

  • This is a bit of a weird one.

    I have SQL data being sent to a mainframe; it has to be unpivoted and sent as 8 rows per contract. In an error-checking task, I have to check 20+ fields for bad values before the final load. Because of this, my source SQL proc pulls back the 8 rows per contract in the Data Flow Task. However, when an error is found in one field, I want it pared down to 1 distinct row for the error, not 8 rows for the same contract and error.

    Example:

    ContractNum LastName FirstName

    1234 Tarvin NULL

    FirstName & LastName can't be blank. They are required. So I have a conditional split checking for that. Then it splits into two Derived Column transformations which add FieldName and FieldValue.

    My final "error" flat file consists of the following fields (so the business unit knows what they have to correct):

    ContractNum FieldName FieldValue

    1234 FirstName NULL

    However, because the initial source is pulling 8 rows, the same information gets repeated 8 times over in my flat file (as below).

    ContractNum FieldName FieldValue

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    1234 FirstName NULL

    I cannot change the source. There are distinct values in each of the 8 rows and those distinct values need to be error-checked. So I need a way of taking those 8 rows AFTER the error discovery and making it DISTINCT down to 1 row.

    I don't see a way of doing that other than the OLE DB Command transformation, but I can't figure out a good command to use for it.

    Any thoughts?

    Thanks in advance.

    Brandie Tarvin, MCITP Database Administrator. LiveJournal Blog: http://brandietarvin.livejournal.com/. On LinkedIn, Google+, and Twitter. Freelance Writer: Shadowrun. Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Assuming that all 8 error lines show the exact same output, you could perhaps create a staging table and send the errors into it. The table would then contain 8 rows per unique error. In a second data flow task, you could query the distinct records and output them to the error file.
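
    In case it helps, here is a rough T-SQL sketch of that approach. The staging table and column names are assumptions based on the error-file layout described above, not the actual package objects.

    ```sql
    -- Hypothetical staging table for the error rows (names are
    -- assumptions, not the actual package objects).
    CREATE TABLE dbo.ErrorStaging (
        ContractNum INT          NOT NULL,
        FieldName   VARCHAR(50)  NOT NULL,
        FieldValue  VARCHAR(255) NULL
    );

    -- Source query for the second data flow: collapse the 8
    -- repeated rows per contract/field down to a single row.
    SELECT DISTINCT ContractNum, FieldName, FieldValue
    FROM dbo.ErrorStaging;
    ```

    The DISTINCT happens in the database engine, so the second data flow only ever sees one row per unique error.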

  • All eight rows will show the same output. But package design was already completed a couple of weeks ago. I'm hoping to avoid major redesign given the dependencies on this particular Data Flow Task.

    Any other possibilities?


  • -- edit -- see below.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Add a process at the end to remove the dups from the flat file ... ?

    Not elegant, but avoids the major redesign.


  • After getting all the error rows, add an Aggregate transformation.

    Do a group by on all three fields in the Aggregate transformation (just check each column name and set the operation to Group by for each selected column). That will collapse all the repeating rows into a single distinct row, which you can then send to the OLE DB destination.
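
    For anyone following along: grouping on all three columns in the Aggregate transformation has the same effect as a GROUP BY over those columns in T-SQL. A rough equivalent, using the error-file column names from above and a hypothetical ErrorRows source standing in for the rows flowing out of the Derived Column transformations:

    ```sql
    -- GROUP BY over all three columns collapses identical rows.
    -- GROUP BY treats NULLs as equal, so the repeated
    -- "1234, FirstName, NULL" rows become a single row.
    SELECT ContractNum, FieldName, FieldValue
    FROM ErrorRows  -- hypothetical stand-in for the error pipeline's rows
    GROUP BY ContractNum, FieldName, FieldValue;
    ```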

  • sriram979 (4/14/2009)

    After getting all the error rows just keep an aggregate transformation task.

    *blink* I didn't even think of that one. Thanks! I'll give it a try.

