Moving Large Table to Different File Group

  • Simon Facer (12/6/2013)


    (1) Clustered indexes causes page splits. Yes, you get Page Splits with a Clustered Index, but Clustered Indexes do not cause Page Splits. Page splits are caused by Insert or Update activity in a table with a Clustered index, or even switching to Snapshot isolation. In the case of a Heap, you get forwarded records instead, which are worse for performance.

    (2) Clustered Index Columns are added to all secondary indexes. True. There is storage and IOPs overhead involved with adding the columns to secondary indexes, but the performance benefits of a Clustered Index far outweigh the overhead. This does mean that you have to design your indexes, rather than just using a 'this looks good' approach. A wide Clustered Index is not normally a good choice, design your indexes with care, and keep the Clustered index as narrow as possible.

    These two points are the reasoning behind the best practice that one should always cluster on a sequential identity column; it offers the best paging performance (see below) and is always the narrowest possible clustering index.

    1. Instead of page splits in the middle of a clustered index (requiring a fill factor < 100 to minimize the effects), inserts always go to the end of the clustered index. This fills pages neatly and opens new pages at the end, which is a much lower cost operation than page splits.

    2. The index column is, if it is a bigint, 8 bytes. This reduces data space consumption in the nonclustered indexes. consequentially, index performance is improved.

    3. A SQL Primary Key can then be natural, or it can be the identity. If the latter, then the amount of data in the child rows referring to the parent may be smaller than if a natural PK is used. Of course, this has ramifications to the underlying application, and must be planned together with the developers. If the natural PK is still used a it can be nonclustered, and the performance will still be very good.

    Thanks

    John

    The identity column does not have to be informative to the application, and the extra 4-8 bytes per row is negligible in the big picture, since it empowers the DBA to effectively performance-tune the database.

  • I think the solution is more complicated then it needs to be. You don't really need to do the with move as recreating (in this case with drop_existing) the clustered index will move the data. Note since the clustered index is a PK this "new" index will be created as a PK as well.

    -- also you probably want to do online = on so you don't lock the table

    -- Secondery should be spelled secondary.

    CREATE UNIQUE CLUSTERED INDEX [INVOICE_PK] ON [INVOICE]

    ([column name] ASC)

    WITH (ONLINE=ON, DROP_EXISTING=ON) ON [SECONDERYDATA];

    Thanks.

    Kirby

  • Hi,

    in an replication i can't drop the PK, what ist the best way around?

    Thx.

  • kay.wohlfarth (12/7/2013)


    Hi,

    in an replication i can't drop the PK, what is the best way around?

    Thx.

    Is your PK presently clustered? If not then you can create a clustered index on the new file group.

    Is yes then you need downtime because you will need to:

    1. Quiesce the applications on both publisher and subscribers.

    2. Remove the table as a replication article.

    3. Drop PK

    4. Recreate PK clustered on new file group or Consider adding an integer or bigint identity column as an unique clustered index (see above).

    5. Re-add table as a new replication article.

    6. Snapshot if this is a transactional publication.

    7. Come back up.

    There are likely more steps depending on environment. However this is the gist of it.

    Thanks

    John.

  • Uwe Ricken (12/6/2013)


    jayant-479472 (10/16/2008)


    I think my question is little confusing, let me make it more clear

    If we have a table with cluster index then by simply recreating the clustered index on different file group will move the table also on that file group

    now I have a table with no index (for simplicity) and I would like to move that table to a newly created filegroup, how would I do that?

    Yayant,

    this is nonsence - the clustered index IS the table.

    If you don't have a clustered index you have a HEAP.

    Get familiar with the concepts of tables here:

    http://technet.microsoft.com/en-us/library/ms190457.aspx%5B/quote%5D

    It isn't nonsense. He's saying that you move a table by moving the clustered index, which you didn't disagree with. He then asks what you should do if there is no clustered index.

    John

  • victor.girling 17919 (12/6/2013)


    Please note that log(m) mathematically proves that a clustered index is quicker with any table over 14 rows.

    That sounds interesting. Do you have any further reading on that, please? The link you posted subsequently doesn't even mention clustered indexes.

    John

  • John,

    Thanks for the detailed list for replication.

    Kay

  • bobbums (12/6/2013)


    In the article, your suggestion is to drop the PK and then re-add it.

    This is correct as in this case the clustered index column is also the PK constraint.

    bobbums (12/6/2013)


    1. I think that will be tedious with foreign keys on the table, because we will have to drop all of the foreign keys and re-add them as well.

    exactly why you should design correctly from the outset ๐Ÿ˜‰

    bobbums (12/6/2013)


    2. I think that will have a lot of overhead because it could cause all of the nonclustered indexes to be rebuilt twice (once when you drop the PK, and again when you re-add the PK).

    see above

    bobbums (12/6/2013)


    If we instead used CREATE INDEX WITH (DROP EXISTING)

    This is not valid in this case. You cant specify to create a clustered index with drop existing when the PK constraint is in force, you need to remove it first which doing so drops the clustered index anyway. Likewise creating the PK clustered re instates the primary key with a clustered index on the filegroup you specify.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" ๐Ÿ˜‰

  • Perry Whittle (12/9/2013)


    bobbums (12/6/2013)


    1. I think that will be tedious with foreign keys on the table, because we will have to drop all of the foreign keys and re-add them as well.

    exactly why you should design correctly from the outset ๐Ÿ˜‰

    You do know that some of us are maintaining databases created by other people, right? Based on your advice of "design correctly from the outset", we don't even need this discussion. In fact there should be no discussions about fixing anything. Just one giant article that says "design correctly from the outset". This contributes nothing to the current discussion. ๐Ÿ˜‰

    Perry Whittle (12/9/2013)


    bobbums (12/6/2013)


    If we instead used CREATE INDEX WITH (DROP EXISTING)

    This is not valid in this case. You cant specify to create a clustered index with drop existing when the PK constraint is in force, you need to remove it first which doing so drops the clustered index anyway. Likewise creating the PK clustered re instates the primary key with a clustered index on the filegroup you specify.

    I am pretty sure that your entire statement (except the last sentence) is totally false. I had already tested my recommendation before I posted in the first place. There is too much pollution on the internet in the form of unverified expertise. Today is a great day for all of the experts on the internet to check everything they say before they post it. (I apologize for my frustrated tone.)

    Anyways, the following runs without any errors on SQL Server 2008 R2:

    create table bob_test_2 (b varchar(10), c varchar(10) not null, constraint bob_test_2_pk primary key (b), constraint bob_test_2_ak_1 unique (c));

    create table bob_test_1 (a int, b varchar(10) not null, constraint bob_test_1_pk primary key (a) on fg_data1, constraint bob_test_1_ak_1 unique (b,a), constraint bob_test_2_fk_1 foreign key (b) references bob_test_2(b));

    insert into bob_test_2 (b,c) values ('one','first'),('two','second'),('three','third');

    insert into bob_test_1 (a,b) values (1,'one'),(2,'two'),(3,'three')

    go

    create unique clustered index bob_test_1_pk on bob_test_1 (a) with (drop_existing = on) on fg_index1;

    go

    drop table bob_test_1;

    drop table bob_test_2;

  • John Mitchell-245523 (12/9/2013)


    victor.girling 17919 (12/6/2013)


    Please note that log(m) mathematically proves that a clustered index is quicker with any table over 14 rows.

    That sounds interesting. Do you have any further reading on that, please? The link you posted subsequently doesn't even mention clustered indexes.

    Yes, great. Another mathematics genious. And, I can point you at a mathemtical proof that 1 + 2 + 3 + 4 + ยท ยท ยท to inifinity = -1/12. So what? Stick to the topic.

    This DBA says - "It depends".

  • bobbums (12/9/2013)


    You do know that some of us are maintaining databases created by other people, right?

    Yes, as do I.

    bobbums (12/6/2013)


    Anyways, the following runs without any errors on SQL Server 2008 R2:

    create table bob_test_2 (b varchar(10), c varchar(10) not null, constraint bob_test_2_pk primary key (b), constraint bob_test_2_ak_1 unique (c));

    create table bob_test_1 (a int, b varchar(10) not null, constraint bob_test_1_pk primary key (a) on fg_data1, constraint bob_test_1_ak_1 unique (b,a), constraint bob_test_2_fk_1 foreign key (b) references bob_test_2(b));

    insert into bob_test_2 (b,c) values ('one','first'),('two','second'),('three','third');

    insert into bob_test_1 (a,b) values (1,'one'),(2,'two'),(3,'three')

    go

    create unique clustered index bob_test_1_pk on bob_test_1 (a) with (drop_existing = on) on fg_index1;

    go

    drop table bob_test_1;

    drop table bob_test_2;

    I stand corrected

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" ๐Ÿ˜‰

  • Hi,

    Thanks for the great stuff. I have a table which is part of the transactional replication. As per the article, I cannot drop cluster index as expected. Please let me know what are all the steps need to be taken care. Thanks in advance.

  • I know this is old post but this is what I am looking for . I recreated the index specifying new filegroup but the data is still in Primary Filegroup the column which stored data is Varbinary max. Do I missed anything?

Viewing 13 posts - 46 through 57 (of 57 total)

You must be logged in to reply to this topic. Login to reply