best method for populating a fact table in DTS?

  • Hi All,

    I am new to dimesnional modeling and data warehousing. I was wondering if someone could just show me a real example method they use to populate a fact table with the dimension tables surrogate keys? Do you update the fact table keys, and then insert the data into fact table, use T-SQL and utilize LEFT JOINS to populate the Fact Table keys during the insert, do a lookup in your transformation task? What works best and is most effiecient? I would really appreciate any help on this topic ! Thanks !

    Pat

  • This site has many free articles that will go into depth and help you out.

    http://www.rkimball.com/html/designtips.html


    ------------------------------
    The Users are always right - when I'm not wrong!

  • Thanks for the reply Jonathan ! That is a good site.

    Would you happen to have a sample for populating the fact table with dimesnion surrogate keys in SQL?

  • ....just putting this one back to the top...anyone have any examples of their own method for this?....Thanks.

  • Apologies for delay in getting back to you.

    Unfortunately, I have never used surrogate keys before as I have never come across the need and think they add complexity to data warehouse modelling.


    ------------------------------
    The Users are always right - when I'm not wrong!

  • Use DTS features like Lookups and DataPumps as little as possible. Their performance is orders of magnitudes slower than a plain T-SQL INSERT INTO ... SELECT ... statement. Debuggging them is a royal p.i.t.a. Source code control and version comparison is next to impossible.

    Use plain T-SQL stored procs and use DTS for scheduling and workflow. Put your stored proc source in VSS for source code control.

    Using T-SQL, you load the dimensions first. Presumably each dimension has a native or natural key and a system generated surrogate key.

    You take your raw data that goes into the fact table, join it to the dimensions based on the native keys and subsitute surrogate keys.

    eg - a simple Fact table that summarizes sales amounts by account, product and date:

    Insert Into SalesFact

    (

       ProductKey,

       AccountKey,

       DateKey,

       TotalSales

    )

    Select

      p.ProductKey,

      a.AccountKey,

      d.DateKey,

      Sum(raw.SalesAmount)

    From Staging..RawSales As raw

    Inner Join ProductDimension As p

      On (raw.NativeProductIdentifier = p.NativeProductIdentifier)

    Inner Join AccountDimension As a

      On (raw.NativeAccountIdentifier = a.NativeAccountIdentifier)

    Inner Join DateDimension As d

      On (d.TheDate = system.LoadingDate)

     

     

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply