March 4, 2006 at 12:36 pm
Hi All,
I am new to dimesnional modeling and data warehousing. I was wondering if someone could just show me a real example method they use to populate a fact table with the dimension tables surrogate keys? Do you update the fact table keys, and then insert the data into fact table, use T-SQL and utilize LEFT JOINS to populate the Fact Table keys during the insert, do a lookup in your transformation task? What works best and is most effiecient? I would really appreciate any help on this topic ! Thanks !
Pat
March 6, 2006 at 2:03 am
This site has many free articles that will go into depth and help you out.
http://www.rkimball.com/html/designtips.html
March 6, 2006 at 8:25 pm
Thanks for the reply Jonathan ! That is a good site.
Would you happen to have a sample for populating the fact table with dimesnion surrogate keys in SQL?
March 9, 2006 at 11:23 am
....just putting this one back to the top...anyone have any examples of their own method for this?....Thanks.
March 16, 2006 at 4:06 am
Apologies for delay in getting back to you.
Unfortunately, I have never used surrogate keys before as I have never come across the need and think they add complexity to data warehouse modelling.
March 16, 2006 at 3:32 pm
Use DTS features like Lookups and DataPumps as little as possible. Their performance is orders of magnitudes slower than a plain T-SQL INSERT INTO ... SELECT ... statement. Debuggging them is a royal p.i.t.a. Source code control and version comparison is next to impossible.
Use plain T-SQL stored procs and use DTS for scheduling and workflow. Put your stored proc source in VSS for source code control.
Using T-SQL, you load the dimensions first. Presumably each dimension has a native or natural key and a system generated surrogate key.
You take your raw data that goes into the fact table, join it to the dimensions based on the native keys and subsitute surrogate keys.
eg - a simple Fact table that summarizes sales amounts by account, product and date:
Insert Into SalesFact
(
ProductKey,
AccountKey,
DateKey,
TotalSales
)
Select
p.ProductKey,
a.AccountKey,
d.DateKey,
Sum(raw.SalesAmount)
From Staging..RawSales As raw
Inner Join ProductDimension As p
On (raw.NativeProductIdentifier = p.NativeProductIdentifier)
Inner Join AccountDimension As a
On (raw.NativeAccountIdentifier = a.NativeAccountIdentifier)
Inner Join DateDimension As d
On (d.TheDate = system.LoadingDate)
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply