Design Advice - Partitioned Views or Partitioned Tables?

  • We are just moving to SQL Server 2005. I am developing a table which will hold several million rows per year, and I would like to partition it by year.

    Can anyone advise which would be best, a partitioned view where each year is a separate table with constraints allowing inserts to the view, or a single table partitioned by year?

    Data will never be dropped, so the advantage of "sliding windows" is not a consideration. I would be quite happy for the view to have empty tables set up for the next 25 years or so. Data is only added/updated nightly, and the speed of this is not critical. The important part is reading the data during the day.

    Partitioned views seem to do the job OK, but would this really be the recommended way? The data will never be distributed over multiple servers, and (unfortunately) all the disk space is on 1 physical drive, so there can be no I/O benefits from Distributed Views or FileGroups.

    I have seen conflicting articles, some saying partitioned views are only there for backward compatability and others saying that the best use for partitioned tables is "sliding windows".

    Any advise would be appreciated.

    Bob

  • Partitioned tables will result in much better performance than a single table. With a single table performance will degrade more and more as years go by, and then everyone will want to know who designed the system to only use one table for everything. You can always claim it was the guy that worked there before you, but a couple of people might know the truth.

    You don't need to use a view to combine the partition. Use a real partitioned table.

    Partitioned tables also have the advantage of being marked read-only and stored in a separate file that doesn't need to be backed up regularly.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • From BOL:

    Local partitioned views are included in SQL Server 2005 for backward compatibility purposes only, and are in the process of being deprecated. The preferred method for partitioning data locally is through partitioned tables. For more information, see Partitioned Tables and Indexes.

    MohammedU
    Microsoft SQL Server MVP

  • Adding more filegroups to the same physical drive will have an effect, but I'm not sure you can say it will always be a benefit or a penalty.  The I/O for each file, even in the same filegroup, is handled by a separate thread so more I/O work can be done in parallel, but there is probably going to be more latency from head movement to span the files.  If your single drive is really one overworked disk a single data file might be best, but if it is something more high-performance like a RAID group or a SAN then multiple files might be better.

    So you're planning the table structure for the next 25 years and you're sure they won't add another drive or two somewhere in that time?

  • If each partition is many GB's, then not having to back up the previous 10 years inactive data of 500 GB in order to backup the active 2 GB data for this year would be highly advantageous particularly on slower drives.

    The read-only files could even be moved to cheaper drives on Raid 5 since you would only be reading, not writing.

    Definitely, the log file should be on a different drive path so that the logging doesn't compete with the data file for IO.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Thanks for the advice. I actually mis-stated the disk, although it appears as 1 physical disk it is actually 5 disks configured as a single raid-5 disk.

    On the point about partitioned views only being included for backward compatability, various MS article now have removed that, so I assume they are here to stay.

     

    Does anyone having any performance comparisions?

     

    Thanks again

     

     

     

  • You shouldn't even consider partitioned views since SQL 2005 has excellent support for real partitioned tables!!

    Can the disk be reconfigured as RAID 10? RAID 10 is much better.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Why worry about partitioned views?  You can create a partitioning scheme that maps the partitions for each year into the same filegroup (presumably PRIMARY).  Then you have one table, one partitioning function, and one partitioning scheme, instead of a partitioned view and a lot of tables.  There's probably no need to analyze I/O impact, the data pages are in the same place either way.  If you create the function and scheme with values for the next 25 years then you won't have to change anything for a long time.

    At a later date you might tackle improvements such as a read-only filegroup that you can move the historical partitions into, or filegroups on additional drives if you add hardware, but you don't have to create 30 separate filegroups right away.

  • >> There's probably no need to analyze I/O impact, the data pages are in the same place either way.

    Only if you do it that way. If you put them in separate filegroups, they can be put on separate drives (or not).


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • The only reason we're still talking about this is that the original poster seems to be unconvinced that partitioned tables are superior, and the only reason I can think of is that he would rather manage multiple tables than multiple filegroups.  I wanted to make the point that multiple filegroups aren't necessary, and if all the partitions are in the primary filegroup the distribution of data pages is probably exactly the same for either the partitioned table or a partitioned view.

    There are good reasons to use multiple filegroups, but they're not fundamental to choosing between partitioned table vs. partitioned view.  You could put the tables of a partitioned view in different filegroups and get the same performance and backup benefits.

    I think the relevant differences are that partitioned views are on the way out, and a partitioned table gives you fewer objects to manage.

  • I completely agree. The only reason I can think to even consider a partitioned view is if the data was stored in multiple databases or servers ... which I would not recommend in most cases anyway.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Again, thanks for the input.

    The bit in BOL about only retaining partioned views for backwards compatabilty has been taken out, so I suspect that is not an issue.

    I decided the best thing to do would be create both a partitioned view and a partitioned table to compare performance. Loaded with about 85 million rows the view consistently outperformed the table for selects - usually about 50% of the time.

    I suspect I may have got something wrong creating the partitioned table, so if anyone can spot what from the script below I would be most appreciative.

    Thanks once more.

    --Create Partition Function

    CREATE PARTITION FUNCTION [pflo_DateRangePF1] (datetime)

    AS RANGE RIGHT FOR VALUES ('19961231', '19971231', '19981231',

                   '19991231', '20001231', '20011231', '20021231',

                   '20031231', '20041231', '20051231', '20061231',

        '20071231', '20081231', '20091231', '20101231');

    GO

    --Create File Groups

    Alter Database Falcon Add Filegroup FG_1996

    Alter Database Falcon Add Filegroup FG_1997

    Alter Database Falcon Add Filegroup FG_1998

    Alter Database Falcon Add Filegroup FG_1999

    Alter Database Falcon Add Filegroup FG_2000

    Alter Database Falcon Add Filegroup FG_2001

    Alter Database Falcon Add Filegroup FG_2002

    Alter Database Falcon Add Filegroup FG_2003

    Alter Database Falcon Add Filegroup FG_2004

    Alter Database Falcon Add Filegroup FG_2005

    Alter Database Falcon Add Filegroup FG_2006

    Alter Database Falcon Add Filegroup FG_2007

    Alter Database Falcon Add Filegroup FG_2008

    Alter Database Falcon Add Filegroup FG_2009

    Alter Database Falcon Add Filegroup FG_2010

    Alter Database Falcon Add Filegroup FG_2011

    --Add files to file group

    Alter Database Falcon

    Add FILE (Name = FG_1996File,

    Filename = 'e:\MSSQL.1\MSSQL\Data\Falcon_FG_1996File.ndf',Size = 100MB,Maxsize = 500MB,Filegrowth = 100MB)To FileGroup FG_1996

    etc ...

    Alter Database Falcon

    Add FILE (Name = FG_2011File,

    Filename = 'e:\MSSQL.1\MSSQL\Data\Falcon_FG_2011File.ndf',Size = 100MB,Maxsize = 500MB,Filegrowth = 100MB)To FileGroup FG_2011

    --Create Scheme

    CREATE PARTITION SCHEME pflo_DateRangePS1

    AS PARTITION [pflo_DateRangePF1]

    TO (FG_1996, FG_1997, FG_1998, FG_1999,FG_2000,

        FG_2001,FG_2002,FG_2003,FG_2004,FG_2005,

     FG_2006,FG_2007,FG_2008,FG_2009,FG_2010,FG_2011);

    go

    --Create Table

    CREATE TABLE [dbo].[pflo_Position_P](

     [ValuationPoint] [varchar](20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

     [ValuationVersion] [int] NOT NULL,

     [ValuationDate] [datetime] NOT NULL,

     [AccountId] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

     [InstrumentId] [varchar](30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

     [PositionId] [int] IDENTITY(1,1) NOT NULL,

    etc....

     [LatestVersion] [char](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL)

    ON pflo_DateRangePS1 ([ValuationDate]) ;

    go

    CREATE UNIQUE INDEX PK_pflo_Position_p ON pflo_position_p (

     [ValuationPoint] ASC,

     [ValuationVersion] ASC,

     [ValuationDate] ASC,

     [AccountId] ASC,

     [InstrumentId] ASC,

     [PositionId] ASC

    )

    go

    CREATE NONCLUSTERED INDEX [IX_pflo_Position_p_date] ON dbo.pflo_position_p

    (

     [ValuationDate] ASC,

     [LatestVersion] ASC

    )

    GO

    CREATE NONCLUSTERED INDEX [pflo_Position_p_ValuationPoint_Date_Acc_LatestVersion] ON [dbo].[pflo_Position_p]

    (

     [ValuationPoint] ASC,

     [ValuationDate] ASC,

     [AccountId] ASC,

     [LatestVersion] ASC

    )

    GO

    CREATE NONCLUSTERED INDEX [pflo_Position_p_ValuationPoint_Date_InstId_Acc_LatestVersion] ON [dbo].[pflo_Position_p]

    (

     [ValuationPoint] ASC,

     [ValuationDate] ASC,

     [InstrumentId] ASC,

     [AccountId] ASC,

     [LatestVersion] ASC

    )

    go

  • The partitioned table is spread all over your disk, while the tables from the partitioned view are much more localized in the primary filegroup.  The only way to get a valid comparison is either put all the partitions in the same filegroup, or distribute the tables from the partitioned view into all the new filegroups.

  • One of my concerns was adding filegroups and files, having now done some more performance tests, for the requirements of my project, I can happily put all the partitions in the primary filegroup and still have acceptable performance.

    Many thanks to those who spent time making suggestions.

    Bob

  • There is an advantage to partitioned views which in my opinion is frequently overlooked, and that is the ability to specify multiple "partition keys", by adding multiple constraints to the physical tables. For example - in my data warehouse my largest fact table is partitioned by year. Using a partitioned table I would create a partition on year_id. A query that selects on year_id (WHERE year_id = 1) will scan only the relevant partition. If I do a query like 'WHERE month_id = 11', the optimizer has no idea that month 11 is in year 1 and will scan all partitions.

    With a partitioned view, I can add constraints for all the time segments and the optimizer will take any/all of them into account. If I add 3 check constraints - form year_id = 1, month_id between 1 and 12, week_id between 1 and 52 the optimizer will know which physical table to scan whether I select by year, month or week.

    Beth

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply