Conforming dimensions

  • Hi.

    I'm pretty new to dw and currently having some problems with building conformed dimensions. I'm trying to conform the payment code dimension from 2 different source systems.

    Below is an extract of some of the recs stored in the first source (A):

    paycodedescription
    10cheque
    11cash
    12direct
    22internet

    From another source system (B), the recs are as such:

    paycodedescription
    CHcheque
    CAcash
    AXaxs

    My conformed dimension on this payment code would consists of a surrogate key, paycode (natural key) and the payment description. However, seeing that the natural key from both sources are stored differently (one as numerals and one as alpha codes), what is the best way of doing this?

    Do i store natural key as numerals and for payment codes which are available in B but not in A, create a new natural key for that code?

    TIA!

     

     

     

  • Hi Stephanie,

    I would suggest having columns "PaycodeA" and "PaycodeB" in your dimension table, and therefore storing both. This is because there isn't really a 'natural' key - particularly if you have to create one in the warehouse (as in your example of creating new natural keys for codes that are in B but not in A). If you throw away the paycodes from source B, then you won't be able to match fact rows from source B to your dimension.

    What do you think?

    Matt.

  • I agree with Matt, especially because with the expansion in internet banking, it seems to me that neither set of codes is truly exhaustive. What if, down the road, the 'cash' option needs to be expanded into 'hard cash', 'internet cash', etc.? Management will change requirements

  • Ok, that means something like:

    PaycodeKeyPaycodeAPaycodeBDescription</TD

    110CHcheque
    211CAcash
    312direct
    422intranet
    5AXaxs

    Yes, I think that would seem better. At least would be able to match the codes from the conformed dimension back to the source system easily.

    Thks a lot for the suggestion!

     

  • Hi,

    how about converting your PaycodeA to Char and then concatenate it with PaycodeB, it will give you One Natural Key. This way, you save one column.

    How about it?

    Sami

  • Stephenie,

    Actually what you did makes good sence for the data warehouse.  You don't necessarily want to use natural keys in the data warehouse.  Conforming data means coming up with a single way to look at the same data from disparate data sources that use different codes for the same type of information; source system a: 10 cheque -- source system b: CH cheque.  In the data warehouse you don't want to use both codes, as you would need to know both codes for queries to pull out all information for payments by cheque.  What you came up with creates a mapping between the seperate systems to a common set of codes for the data warehouse.

  • Hi Stephanie,

    My view is that you should always think of the end users first. End users want consistency and a single view of the customer. They don't want to query different codes for different products - it requires too much knowledge. Therefore, you should ask them which of the two types they'd rather query - why would they want both codes? I think they should be steered towards the alpha codes as these are easy to check. It's not intuitive that code 10 is cheque for example.

    Then, assuming the users go for the character codes, convert the numeric codes into the same character codes in the ETL and use numeric surrogate keys as per the data warehousing norm. Hope this helps.

  • I agreed with Alan. It is the 'Transformation' piece in the ETL process. It should be in the meta data table. You should ask the users what they wanted to see in the Data warehouse.

    Meta data table should look something liked

    Paycode VARCHAR

    DWPaycode VARHCHAR

    Paycode DWPaycode

    10 Cheque

    CK

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply