Partitioning a DataWareHouse

  • Hi Is there a general guidelines for partitioning a datawarehouse ie:How to choose the partitioning column for the fact and dimension tables? most of the examples I have seen are partitioned on DATE,but in our case the DATE column on our FACT Table is updated frequently and not a good canditate for clustered index. Also does the Fact and Dimension talbes use the same partitioning scheme/Functions?

    We have three or four Fact Tables and about 10 dimension tables. We are expecting the DW to be about 1.5 TGB initially and Grow in size.

    Any information is appreciated.

    Thanks

  • The general guidance on partitioning is that you partition a table either to help during quering or to help during archiving/purging... or both.

    In general a date column is a good candidate providing a substantial number of queries include such date column in the predicate.

    I would say that a good partition-key candidate column should ...

    1- Be present in most user queries

    2- Ensure a somehow balanced number of rows per partition

    3- ideally help in a future archive and purge process

    Having said that, not all tables have to be partitioned. I only think about partitioning when a table is a candidate to hold a population in excess of 50M rows.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply