A hybrid design?

  • Hi All,

    I don't want to go into specifics here because i'm not after an answer as such, hopefully just recommended research material perhaps some articles, white papers or products, or your own experiences...

    If you have a single OLTP source, thats quite complex. Not a huge amount of data, 20 - 30 thousand rows a day and the reporting requirements change frequently, then what is the best solution?

    A couple of years ago i built a data mart, text book stuff really, star schema etc with a cube and the reports are delivered through reporting services. This has worked well.

    The business has undergone a lot of changes really and the reporting requirments are changing more frequently and the analysts are constantly looking for data from different area's of the system.

    The problem is that there is a significant bottle neck in development time when new requirments are raised in adding new data into the mart to meet these requirments.

    I'm thinking about a new solution, a sort of hybrid between a data mart and the OLTP schema, i think by retaining the OLTP Id's instead of using surrogates, leaving the granularity alone, storing attributes in the fact table (hierarchical data in dimensions) that i would reduce the complexity of the transformations and have a more flexible design.

    I could easilly write another few pages on this, hopefully this is enough for someone to give me some advice though..

    Many thanks

    Bob

     

     

     

     

     

  • How many dimension tables and fact tables do you have?

    What is the maximum number of dimensions in the fact table do you have?

    What kind of bottleneck do you have ? Do you have constantly changing the fact tables?

  • hi loner, thanks for taking an interest..

    currently at 3 fact tables and about 25 dimensions...

    the bottle neck comes from the data warehouse being so abstract and the OLTP source being so complicated, there is something like 800 tables in the source. What tends to happen is a change is made to the production system or somthing happens commercially that provokes new reporting requirements but because of the time it takes to go through the design process, add the tables develop the SSIS packages, test it and back date the data etc then alot of the time we end up querying straight from the oltp source which is not ideal. This is why i'm trying to come up with a solution that can react faster to change, have lots of detail and be easier to query that the oltp system...

     

     

     

  • 800 tables in the source system?  Do you have to load the sources everyday?  Also 3 facts tables and 25 dimensions, so on average each fact table has more than 8 dimensions ?  According to Kimball, that is not good.  It will slow down your query.

    How do you load your data?  totally refresh or incrementally?

    Maybe you need to have a new design.

  • don't take this the wrong way, the new design is kind of why i posted this topic.

    I'm by no means an expert, but i'm fairly well experienced with data warehouses and i've built this one correctly to the best of my knowledge, i agree with the more than 8 dimensions thing, but what's the alternative? split the fact table?? also the tables don't contain a massive amount of rows, so i'm not all that concerned about the width of the table as much...

    i have to do a refresh, not full, there are business rules that come into affect which means i have to refresh 3 months each night.

    My feeling is that the text book aproach to data warehousing is wrong for this system.

     

     

     

  • In my experience there is nothing wrong with using a hybrid method for reporting. Most often than not you will end up with a need for transactional reporting that doesn't require star schema databases.

    If you are not loading your mart straight from the source, which is not a good idea anyway, I would recommend to use the Staging database (which is a copy of the OLTP db) to leverage for some transactional level reporting. You can set up a backup/restore or attach/detach job to keep it in sync with production once a day or however often your refresh schedule is. This way you don't need to create any additional transformation packages. You can handle the business logic in views or special reporting table loads if necessary. If you have time, you can normalize the OLTP model into a third normal form on your reporting server bringing only the tables you need for reporting.

    Hope that helps.

    olga

  • Hi olga, thanks, that is useful, it gives me some confidence in my idea and some very useful tips, can you recommend any resource where i can read though some developers experiences perhaps in more detail? If not don't worry about it, i appreciate your response and it will come in very useful, thank you.

    Cheers,

    Bob.

     

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply