Data partition?

  • Hi All,

    I 've a some big tables and i want to cut-off data of back-years, what should I do now, cause i also wants to view as my user demands, how can i achive this, as my existing application is directly querying those tables, and don't want to change them all. How can I reduce 1 big table into multiple tables while getting the data with a single query (I don't want to change my existing application by implementing UNION ALL), is there is any way?

    GOVIND

     

  • You can separate the BIG_TABLE into SMALL_TABLE1, SMALL_TABLE2 etc. Then create a view called BIG_TABLE that does the UNION ALL of the smaller tables. Your application won't notice any difference.

  • Thanks, but I don't want to this.

    Cause doing this will cause lots code changes. Please give me some other alternate.

     

    Govind

  • There is no alternative, at least not until SQL Server 2005. But why will this cause code changes? The beauty of it is that SQL Server automatically handles reads and writes to the view and delegate them to the correct base table(s). What does your code do that will make it break?

  • Because I have queried those big tables directly by their name, I've not create any view as wrapper against them. So it will cause lots of code + stored-procedures changes. What you say?

    Thanks

    Govind

  • But that is exactly what the view solves.

    You have this now: SELECT * FROM BIGTABLE

    Now you create SMALLTABLE1 and SMALLTABLE2, move all the rows there and then drop BIGTABLE. Finally you create a view named BIGTABLE, with this definition:

    SELECT * FROM SMALLTABLE1 UNION ALL SELECT * FROM SMALLTABLE2

    You can now continue to use your old code (SELECT * FROM BIGTABLE). From the applications (and stored procedures) point of view it is exactly the same as it was before. The results are the same and nothing needs to be changed. By specifying CHECK constraints in the base tables you use a partitioning column that decides which data goes into each table. Under the hood, when you execute an INSERT into the view SQL Server will now use this information and determine which table to insert the data into.

    Look up Partitioned Views in Books Online for full information.

  • Yes, this will solve my problem but... My database is merge-replicated and It will take longer to re-create my subscribers. And dropping some table will be... but I'll try this.

    Thanks Chris!!

    Govind

     

  • Chris, please guide me about partition on replicated data? Now is it possible to segregate data after its being published?

    Govind

     

  • Seams really simple and interesting

    But what about performance issues?

    Using the view that makes the UNION instead of the BigTable would degrade performance or may instead boost it?

    How should the constraints be set?

    Regards

    Giovanni

  • Exactly, how constraints will gonna work? This I don't know at the moment, let me try this on some sample database to get some results, but about performance, as Chris said earlier that SQL Server is fine tunned to judge which way it has to look to gather the data, once a VIEW is created with CHECK statements.

    Waiting for replies here

    Govind

  • Partitioning is good in SQL Server.

    You follow the rules for PARTITIONED VIEWS in BOL and after a bit of playing around with them you will notice that probably it's exactly what you need. Doing queries that requires TABLE SCAN will probably be slightly slower but if you implement the partitioning solution probably SQL Server will decide that it only requires to do a TABLE SCAN of one of your partitions instead, depending of course on your queries.

    If you partition on a DateTime column, doing removal of data that has gone 'out of scope' (older then x months/x days or whatever) will be so easy and fast ( drop table  instead of lengthy deletes&nbsp that you will never look back

    One of my current tables are growing with around 20 millions of records per hour ( in reality per day but we only collect one hour of data per day ) and without partitioning this would require slightly more juice then the currently dedicated SQL Server has.

    Probably not all cases will benefit from partitioning ( because of the administration it requires and the time to set it up ) but in alot of cases it does. I definitly recommend it! AFIK there are lots of places that doesn't use it because they do not know enough about it and are trying ( in vain ) to solve their problems by throwing more hardware at it, when instead partitioning would make all their problems go away almost for free

    And if you mange to get vertical partitioning to work for you too, then you are golden

    Happy partitioning! 

  • Yes this is exactly I want, but should I wait for Sql2k5 due to easy administration and a new way to implement it - TABLE Partition including the consideration of physical locations? I think I should explore Sql2k5 more rather to work on Sql2k, What are your thoughts? About Vertical Partition leave it man, i have big replicated Database and now it'll not be so easy to do such things

    Thanks

    Govind

  • I recommend you to dig into the partitioning in SQL Server 2000 directly!

    Short Flow:

    Determine HOW you want it partitioned. Let's just talk horizontal partitioning now.

    Say you have something like a DW structure. Often the fact table has a DATETIME column. Let's hope you use it also in your queries. That would be the perfect solution.

    Say you want to partition monthly. And just keep 20 Months of data.

    Say you do not want to move or delete (as of yet) any data in your current (Big) table.

    Say it is enough with a READONLY partitioned view (I.e the View is not updateable but the underlying partitions are)

    1) Make your import process import data into a Staging Table.

    2) After daily import:

    2.1) Verify data in staging table

    2.2) If today is the first of the month; Create a table called dbo.myTable_YYYY_MM_DDm (ex: myTable_2005_07_01m (the 'm' stands for Month and shows that the data in the table is a month worth of data. Recommended standard))  

    2.2) Move the (valid) data from your staging table into the current month table

    2.3) If no indexes exist on your monthly table, create them (they do not have to be the same for all tables in the partitioned view)

    2.4) If no constraints on your monthly table. Create a constraint on your DateTime column that restricts the data to this month.

    2.5) ALTER your partitioned view so it SELECT <columns> FROM BigTable UNION ALL SELECT <columns> FROM myTable_ThisMonth UNION ALL SELECT <columns> FROM myTable_LastMonth and so on for the last 20 months worth of partitions.

    2.6) DROP any partitions that are not participating in you newly constructed view.

     

    //Hanslindgren

  • Thanks Hanslindgren, indeed your suggestion is the needful solution of my current problem. Now I'll will do these things:

    1) From now, I'll create small tables for each month (with m suffix) with UNIONed ALL my current (Big) table.

    2) Then I'll add that small table in my publication so that replication works forever!

    3) Then I'll modify / check my queries for DW structure.

    Thanks again.

    Govind

  • Sounds great!

    I hope it works. Please let me know if there are any problems.

    //Hans

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply