Large Data Files - Filegroups - Best Practice?

  • Hi all. I am new to this website, but I can already tell that it's very interesting, and I shall be using it often.

    My question is this: I have a database with a large 10 GB data file, and once I migrate to SQL Server 2005 I want to use filegroups. Which configuration gives the best performance?

    A) Having one filegroup (other than PRIMARY) with multiple secondary data files on a second hard drive (not the C drive), and having all tables on this filegroup

    OR

    B) Having multiple filegroups with several data files each, and spreading different tables across different filegroups?

    In other words, is it this...

    HD 1 - PRIMARY FILEGROUP - PRIMARY DATAFILE

    HD 2 - SECONDARY FILEGROUP - DATAFILE 1 , DATAFILE 2, DATAFILE 3 (all tables here)

    OR

    HD 1 - PRIMARY FILEGROUP - PRIMARY DATAFILE

    HD 2 - SECONDARY FILEGROUP - DATAFILE 1 , DATAFILE 2 (some tables here)

    HD 2 - THIRD FILEGROUP - DATAFILE 3 , DATAFILE 4 (some tables here)

    Also, what should I keep on the Primary Filegroup?

    Thanks for your response.

    PS: Both HDs will be in a RAID 10 configuration. Also, I use the simple recovery model, so log files shouldn't be a problem.
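    For reference, here is a rough sketch of how I imagine layout B would be declared; the database name, logical file names, and paths are just placeholders:

        -- create the two extra filegroups
        ALTER DATABASE MyDB ADD FILEGROUP SecondaryFG;
        ALTER DATABASE MyDB ADD FILEGROUP ThirdFG;

        -- add two data files to each, on the second drive
        ALTER DATABASE MyDB ADD FILE
            (NAME = MyDB_Data1, FILENAME = 'D:\Data\MyDB_Data1.ndf'),
            (NAME = MyDB_Data2, FILENAME = 'D:\Data\MyDB_Data2.ndf')
        TO FILEGROUP SecondaryFG;

        ALTER DATABASE MyDB ADD FILE
            (NAME = MyDB_Data3, FILENAME = 'D:\Data\MyDB_Data3.ndf'),
            (NAME = MyDB_Data4, FILENAME = 'D:\Data\MyDB_Data4.ndf')
        TO FILEGROUP ThirdFG;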

    Mario

  • Well Mario, you should split this 10 GB file. Normally this is done to reduce I/O or eliminate I/O bottlenecks by taking advantage of the multiple read heads of the disks. I would suggest the following:

    HD 1 - PRIMARY FILEGROUP - PRIMARY DATAFILE (all tables with their clustered indexes here only; its growth should be restricted)

    HD 2 - SECONDARY FILEGROUP - SECONDARY DATAFILE 1 (all nonclustered indexes and image data should go here; some tables with heavy insert or update activity should also be placed here so that I/O on the database is split)
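    A minimal sketch of that layout, assuming the secondary filegroup is called SecondaryFG (table, index, and file names are hypothetical):

        -- the table and its clustered index (the data itself) stay on PRIMARY
        CREATE TABLE dbo.Orders
        (
            OrderID int NOT NULL,
            CustomerID int NOT NULL,
            OrderDate datetime NOT NULL,
            CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
        ) ON [PRIMARY];

        -- the nonclustered index goes to the secondary filegroup
        CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
            ON dbo.Orders (CustomerID)
            ON SecondaryFG;

        -- restrict the primary file's growth
        ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Primary, MAXSIZE = 12GB);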

     

    If you do all of this, you should also review your backup/restore policy and scripts, if any.

    Regards

    Ahamd Drshen

    adrshen@yahoo.com

  • Thanks Ahamd for your reply.

    All the tables have clustered indexes on the primary key.

    The heaviest transaction we have is a SELECT statement loop on a view, and this view gets data from the four largest tables in the db. It takes hours to compute.

    Shall I place these four tables in one filegroup with several data files, or is it better to split them across separate filegroups on separate HDs?

    ie something like...

    Table 1 - Filegroup 1 - Datafile 1 - HD1

    Table 2 - Filegroup 2 - Datafile 2 - HD1

    Table 3 - Filegroup 3 - Datafile 3 - HD2

    Table 4 - Filegroup 4 - Datafile 4 - HD2

    and the rest of the tables in the Primary Datafile?
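    From what I've read, moving an existing table to a new filegroup means rebuilding its clustered index there, e.g. (index, table, and filegroup names hypothetical):

        -- DROP_EXISTING rebuilds the clustered index (and the table data with it)
        -- on the target filegroup; this also works for an index that backs a
        -- primary key constraint
        CREATE UNIQUE CLUSTERED INDEX PK_Table1
            ON dbo.Table1 (ID)
            WITH (DROP_EXISTING = ON)
            ON FileGroup1;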

    Thanks

    Mario

  • Although there is a potential performance benefit to using multiple files and filegroups, the main reason you would consider implementing them is easier administration. When you use multiple files for a db, whether data or log, they are filled proportionally. If you add a new file to an existing filegroup, it may show heavy I/O until it catches up with the other files. On the other hand, files and filegroups allow you to deal with smaller portions of your db when doing maintenance, such as backups/restores. The point of files and filegroups is not to split up I/O bandwidth but to combine it, by placing the files of each filegroup on separate physical disks. Scan density is the key here.
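    For example, a single filegroup can be backed up on its own; a sketch with hypothetical names:

        BACKUP DATABASE MyDB
            FILEGROUP = 'SecondaryFG'
            TO DISK = 'E:\Backup\MyDB_SecondaryFG.bak';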

    There are catches as well; for example, ALTER INDEX (SQL 2005) only works on a per-file basis (pages do not migrate between files).

    However, whether implementing files and filegroups improves an app's performance really depends on the type of SQL transactions your app is running. For example, if you have a BI app running on a star-schema data warehouse db, you may want to keep your nonclustered indexes on a different disk (.ndf file, different filegroup). If you have historical data as well, you may want to implement partitioning; in that case it's recommended to keep historical data separate from the current reporting period. Please keep in mind that these are just examples.
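    A rough sketch of SQL 2005 partitioning, assuming filegroups HistoryFG and CurrentFG already exist (all names and boundary dates hypothetical):

        -- two boundary dates give three partitions
        CREATE PARTITION FUNCTION pfByYear (datetime)
            AS RANGE RIGHT FOR VALUES ('20060101', '20070101');

        -- map the three partitions to filegroups: old data to HistoryFG,
        -- the current period to CurrentFG
        CREATE PARTITION SCHEME psByYear
            AS PARTITION pfByYear TO (HistoryFG, HistoryFG, CurrentFG);

        CREATE TABLE dbo.Sales
        (
            SaleID int NOT NULL,
            SaleDate datetime NOT NULL
        ) ON psByYear (SaleDate);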

    Hope this helps.

     

     

  • Placing different filegroups on different drives will enhance performance, but on the same drive you will not get as much as expected.

    Cheers,
    Sugeshkumar Rajendran
    SQL Server MVP
    http://sugeshkr.blogspot.com

  • Or, to simplify the layout from my previous post, something like...

    Table 1 & 2 - Filegroup 1 - Datafile 1 - HD1

    Table 3 & 4 - Filegroup 2 - Datafile 2 - HD2

    and the rest of the tables in the Primary Datafile?

    Thanks

    Mario

  • I would say to optimise the query in the view first, then build an index on your view (use the DTA for this); after that you can think about spreading the data across multiple files. And always update the statistics!
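    A bare-bones sketch of an indexed view, assuming the view's query qualifies (schema-bound, two-part names, COUNT_BIG(*) with GROUP BY, and here the Amount column is assumed NOT NULL); all object names are hypothetical:

        CREATE VIEW dbo.vSalesSummary
        WITH SCHEMABINDING
        AS
        SELECT CustomerID,
               COUNT_BIG(*) AS OrderCount,
               SUM(Amount) AS TotalAmount
        FROM dbo.Orders
        GROUP BY CustomerID;
        GO

        -- the unique clustered index is what materialises the view
        CREATE UNIQUE CLUSTERED INDEX IX_vSalesSummary
            ON dbo.vSalesSummary (CustomerID);

        -- and refresh the statistics on the underlying tables
        UPDATE STATISTICS dbo.Orders WITH FULLSCAN;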

    Good luck.

     

  • First of all thank you all for your replies.

    Michaela, that is what I did: I created the indexes using the Index Tuning Wizard (in SQL 2000, as the db is currently on 2000). Performance improved, but the transaction is still heavy, so I need to organise the data files.

    Re statistics... I need to look into that, as I have never used them.
