SSIS : Insert/update source to destination table

  • i'm agree for the SCD wizard, but i thought u wanted something different.

    i have a real pleasure to read all that post on this subject 🙂

    I didn't use the codeplex one, and the merge option as well, and actually for me, dimension should not contains millions rows 🙂

    but i accept the fact that sometime, we don't have choices.

    :hehe:

  • Brandon Carl Goodman (4/23/2010)


    Very nice. I have had to implement the check sum before. What I did was build a checksum value over the values of columns (in this case all of them) and performed and insert, update, delete based upon that value. I like your setup in your last post. However, I believe that we may be getting slightly off base of helping a fellow with his issue.:-) Pleasure discussing strategy with you.

    It's my pleasure as well 🙂

    Off topic, off topic... The tile is SSIS: Insert/Update soruce to destination.

    Exactly what we're discussing 😀

    How did you implement the checksum, I've never actually done it before.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Here is a link for Phil Brammer article on using the checksum component for ssis.

    http://www.ssistalk.com/2007/03/09/ssis-using-a-checksum-to-determine-if-a-row-has-changed/

    When I implemented the checksum, I did all of it from T-SQL. Allow me to explain -

    1. I have a base and a staging table identical in schema. Data from my import is loaded into my staging table. On the staging table I have added an additional column (checksum value) that is a computed column contain a checksum over all of the columns of the staging table, with the exception of a few columns. As data is loaded into staging table, checksum value is computed on the fly.

    2. Once data has been loaded into the staging table, I perform a comparison between the staging table and base table.

    a. Delete - delete from base table where business key not exist in staging table.

    b. Update - create a checksum value over the columns of base table and compare that checksum value and business key against the checksum value and the business key of the staging table. If not equal update, else ignore.

    c. Insert - where business key not exist in base table.

    If you like I could create a mockup of the syntax for you if that would make more sense. I am more of a visual person my self so sometimes this helps. A picture is worth a thousand words.

  • Thanks for the link, very informative. However, at my current project I'm not allowed to install custom components, so yeah 🙂

    Correct me if I'm wrong, but to me it seems that the set-up that you outlined just makes a perfect copy of the staging table in the base table. (rows that are not in the staging table are deleted, rows that are not in the base table are inserted, changed rows are updated. Hence, you end up with the staging table).

    Isn't it simpler to just truncate the base table, perform a load of the staging table and re-build indexes?

    (or did I miss something. That's possible, it's friday :-))

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Business requirements state that I am not allowed to truncate the base table as it needs to available at all times. I will work on a mockup for you and post it.

  • I have not forgotten about mock-up. Having a busy Friday.

  • CREATE TABLE [Database].[dbo].[Table_Stage]

    ([AccountNumber]char(16),

    [Name]varchar(26),

    [Address]varchar(26),

    [Checksum_Col]AS Checksum([AccountNumber], [Name], [Address]))

    CREATE [Database].[dbo].[Table_Base]

    ([AccountNumber]char(16),

    [Name]varchar(26),

    [Address]varchar(26))

    /*----------------------------------------------------------------------------------------------------------------------------

    Delete where not exists.

    ----------------------------------------------------------------------------------------------------------------------------*/

    DELETE

    FROM [Database].[dbo].[Table_Base]

    WHERE [AccountNumber] NOT IN(SELECT [AccountNumber] FROM [Database].[dbo].[Table_Stage]) --delete based upon business key

    /*----------------------------------------------------------------------------------------------------------------------------

    Update records that have a differing checksum value based upon business key (AccountNumber.

    ----------------------------------------------------------------------------------------------------------------------------*/

    UPDATE [Database].[dbo].[Table_Base]

    SET [AccountNumber]=stage.[AccountNumber],

    [Name]=stage.[Name],

    [Address]=stage.[Address]

    FROM [Database].[dbo].[Table_Base] base

    INNER JOIN [Database].[dbo].[Table_Stage] stage

    ON base.[AccountNumber] = stage.[AccountNumber]

    WHERE CHECKSUM(base.[AccountNumber],

    base.[Name],

    base.[Address]) <> stage.[Checksum_Col]

    /*----------------------------------------------------------------------------------------------------------------------------

    Insert where not exists.

    ----------------------------------------------------------------------------------------------------------------------------*/

    INSERT INTO [Database].[dbo].[Table_Base]

    ([AccountNumber],

    [Name],

    [Address])

    SELECT [AccountNumber],

    [Name],

    [Address]

    FROM [Database].[dbo].[Table_Stage]

    WHERE [AccountNumber] NOT IN (SELECT [AccountNumber] FROM [Database].[dbo].[Table_Base])

    As promised a mock-up. Enjoy the weekend.

  • Thanks!

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Seems I am not understanding something. If the idea is to use ETL tools like ssis to speedup development and to avoid hard coding of store procedures why do we need to go and type the sql for a update / insert / merge inside a ssis package to be able to do a simple task like a upsert / merge statement, It seems to me to defy the purpose.

    It would be faster just to type the same procedure and compile as a store procedure. Probably save you time as well.

  • The code that is listed is actually a compiled stored procedure that is called from the SSIS package. The code is merely for demonstration purposes.

    This post is also two years old.

  • Thanks, I am using Visual studio 2010 to develop the ssis. Doing a oracle to oracle elt job. However I have not been able to find a component that can handle a update into a table seamlessly, without writing some code. If that is the case I can just as well use PL/SQL for the ETL. Do you have any suggestions?

  • cbrown-779669 (7/3/2012)


    Thanks, I am using Visual studio 2010 to develop the ssis. Doing a oracle to oracle elt job. However I have not been able to find a component that can handle a update into a table seamlessly, without writing some code. If that is the case I can just as well use PL/SQL for the ETL. Do you have any suggestions?

    You're right, SSIS doesn't have a component to do updates efficiently.

    The best option is to write updates to a staging table and do the update with a good old SQL UPDATE statement.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • This was removed by the editor as SPAM

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply