Split Huge table to several table, will be faster ?

  • Hi All,

     

    I have a huge of transaction data every day. My table will grow very fast.

    See below scenario, let say I split my table to each table per month.

    What is the advantage and disavantange ?

     

    Original table
    trx_tbl
    trxnotrxdatecustnametotal
    11000011/11/2005A1000
    11000022/11/2005B850
    11000033/11/2005D650
    12000011/12/2005A120
    12000032/12/2005E235
    12000025/12/2005A452
    10000013/10/2005B125
    10000038/10/2005W450
    09000101/9/2005C30
    split table by month
    trx_tbl_nov
    trxnotrxdatecustnametotal
    11000011/11/2005A1000
    11000022/11/2005B850
    11000033/11/2005D650
    trx_tbl_dec
    trxnotrxdatecustnametotal
    12000011/12/2005A120
    12000032/12/2005E235
    12000025/12/2005A452
    trx_tbl_oct
    trxnotrxdatecustnametotal
    10000013/10/2005B125
    10000038/10/2005W450
    trx_tbl_sept
    trxnotrxdatecustnametotal
    09000101/9/2005C30
    Create View trx_tbl_all 
    as
    select * from trx_tbl_sept
    union
    select * from trx_tbl_oct
    union
    select * from trx_tbl_nov
    union
    select * from trx_tbl_dec

     

     

    Thanks and Regards,

     

     

    Jonny

  • It will help in the fact that the indexes you build on each table will be smaller.

    But will mainly help if you query by the item you separate the tables by, in your case your separating by month, it will help queries you you query by month. but in order for the optimizer to figure out which table to select the data from, you need to create check constraints on the trxDate column.

    You will not be able to create an index on the view, because you are using a Union.

    If you do not usually query by Date, but more commonly by Customer, then you might want to create tables by smaller segment of customer. and not by date.

     

  • This is known as partitioning.

    See the BOL section titled "Partioned Views" for more info. You view needs to use UNION ALL, not UNION and as mentioned, you need a check constraint on the column being used to partition.

    Advantages: If partioned by date, historical partitions can be indexed to 100% FillFactor since they are static. Purging can be done by altering the view to drop partitions, rather than deleting. You can "federate" by placing some of the tables on a different linked server.

  • I would name the tables trx_MMYYYY as in trx_112005, we all know it is a tbl, depending on the number of rows you could also consider trx_QYYYY as in trx_42005. Easier to manage IMHO.

    The down side is when you need to union great numbers of tables to accomplish the query as in a 10 year AVG.

    Andy

  • Just being curious, how many rows will be inserted into this table per day? "Huge" is rather vague...

    Anyway, dividing the data into many tables is something I would try to avoid if possible. I would prefer to have one table (trx_tbl) that holds all current data and then one or more archive tables, where the data would be moved after some time - a month, quarter, year, whatever suits your needs. Hundred thousand rows per day could still work fine with archiving data only once in a year.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply