non clustered primary key on a partitioned taable

  • ....

    Hello gurus,

    I need your comments on the table partitioning we implemented in our environment.

    we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns

    We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.

    When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.

    Please clarify my misconception.

    Thanks
    Jagan K

  • jvkondapalli (1/24/2012)


    ....

    Hello gurus,

    I need your comments on the table partitioning we implemented in our environment.

    we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns

    We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.

    When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.

    Please clarify my misconception.

    jvkondapalli, what is the relationship between your PK column (presumably an ID) and your partitioning column?

    Do you expect data from the past to show up later? (Meaning, a date say one month back being assigned a new ID as the current top ID +1?)

  • Revenant,

    ....

    Hello gurus,

    I need your comments on the table partitioning we implemented in our environment.

    we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns

    We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.

    When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.

    Please clarify my misconception.

    jvkondapalli, what is the relationship between your PK column (presumably an ID) and your partitioning column?

    Do you expect data from the past to show up later? (Meaning, a date say one month back being assigned a new ID as the current top ID +1?)

    1. PK is a running number and partitioning column will get updated when that record is processed. Practically, in our system, that same record could

    be processed a month later. This is the reason why we are having 3 months of data. on the 4th month, we will switch out the oldest parttion and

    merge it.

    2. I dont want that data to show up with the new ID. the running number should be same.

    Thanks
    Jagan K

  • jvkondapalli (1/24/2012)


    Revenant,

    ....

    Hello gurus,

    I need your comments on the table partitioning we implemented in our environment.

    we have a pk column (INT), Partitioning column(Datetime), fkcolumn (INT), few other varchar columns

    We created a clustered index on a Pk and partitioning column. Reason: our partitioning column is updated freequently. so inorder to avoid page splits, we created a clustered index on pk and partitioning column. please comment on this approach. Also, let me know if i have any misconception here.

    When I try to alteer table to create a non clustered primary key on a fkcolumn it is complaining that partitioning column should be part of the non clustered pk. As per my understanding, every non-clustered index will and should have the clustered index column/s as part of the non clustered index. If that understanding is correct, then, it should not complain for the partitioning column not being part of the non clustered pk.

    Please clarify my misconception.

    jvkondapalli, what is the relationship between your PK column (presumably an ID) and your partitioning column?

    Do you expect data from the past to show up later? (Meaning, a date say one month back being assigned a new ID as the current top ID +1?)

    1. PK is a running number and partitioning column will get updated when that record is processed. Practically, in our system, that same record could

    be processed a month later. This is the reason why we are having 3 months of data. on the 4th month, we will switch out the oldest parttion and

    merge it.

    2. I dont want that data to show up with the new ID. the running number should be same.

    Any thoughts from the experts? your valuable suggestions/comments are deeply appreciated.

    Thanks

    Jagan K

    Thanks
    Jagan K

  • I think the first question is "does having the primary key and clustered index on an identity column and the partitioned column of datetime eliminate page splits when new data is entered?"

    From my understanding, I don't see how this prevents page splits. (Feel free to jump in gurus) In fact, I think with frequent updates on the partitioned column it will create MORE page splits. I see no need to have this column as part of the key as it does not describe any uniqueness at this point of my understanding.

    Jared
    CE - Microsoft

  • Per my understanding, partitioning column should either be part of the clustered index or the Primary key and it should not be null. Else, it will reside left most partition.

    Again, I might be wrong, please comment.

    Thanks
    Jagan K

  • The problem is you are saying that that column gets updated frequently. So every time you update that value, it has to move the data. That may be to another page and even another file. I would hesitate partitioning on a column that gets frequent updates.

    Jared
    CE - Microsoft

  • This script demonstrates some concepts to answer some of your questions:

    USE tempdb

    GO

    CREATE PARTITION FUNCTION PF(date) AS

    RANGE RIGHT FOR VALUES ('2012-02-01', '2012-03-01', '2012-04-01');

    GO

    CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]);

    GO

    -- Partitioned heap

    CREATE TABLE dbo.Heap

    (

    pk integer IDENTITY NOT NULL,

    the_date date NOT NULL,

    fk integer NULL,

    data varchar(50) NULL

    ) ON PS (the_date);

    GO

    -- Sample data

    INSERT dbo.Heap

    (the_date, fk, data)

    VALUES

    ('2012-01-04', 1001, 'a'),

    ('2012-01-08', 1002, 'b'),

    ('2012-02-21', 1003, 'c'),

    ('2012-03-07', 1004, 'd'),

    ('2012-03-15', 1005, 'e'),

    ('2012-03-24', 1006, 'f');

    GO

    -- Show partitions and row counts

    SELECT

    partition_id = $PARTITION.PF(the_date),

    row_count = COUNT_BIG(*)

    FROM dbo.Heap AS h

    GROUP BY

    $PARTITION.PF(the_date)

    GO

    -- Error: Partition columns for a *unique* index must be

    -- a subset of the index *key* (included column does not count)

    CREATE TABLE dbo.Clustered1

    (

    pk integer IDENTITY NOT NULL,

    the_date date NOT NULL,

    fk integer NULL,

    data varchar(50) NULL,

    CONSTRAINT [PKC dbo.Clustered1 pk P:the_date]

    PRIMARY KEY (pk) ON PS (the_date)

    );

    GO

    -- Same error, for non-clustered *unique* index (PK)

    CREATE TABLE dbo.Clustered2

    (

    pk integer IDENTITY NOT NULL,

    the_date date NOT NULL,

    fk integer NULL,

    data varchar(50) NULL,

    CONSTRAINT [PKN dbo.Clustered2 pk P:the_date]

    PRIMARY KEY NONCLUSTERED (pk) ON PS (the_date)

    );

    GO

    -- Start as an ordinary heap with a nonclustered PK

    CREATE TABLE dbo.Clustered3

    (

    pk integer IDENTITY NOT NULL,

    the_date date NOT NULL,

    fk integer NULL,

    data varchar(50) NULL,

    CONSTRAINT [PKN dbo.Clustered3 pk]

    PRIMARY KEY NONCLUSTERED (pk)

    );

    GO

    -- Partitioned non-unique clustered index is allowed

    -- Note: the existing nonclustered PK stays NOT partitioned

    CREATE CLUSTERED INDEX [CX dbo.Clustered3 fk P:the_date]

    ON dbo.Clustered3 (fk) ON PS(the_date);

    GO

    DROP TABLE dbo.Heap;

    DROP TABLE dbo.Clustered2;

    DROP TABLE dbo.Clustered3

    DROP PARTITION SCHEME PS;

    DROP PARTITION FUNCTION PF;

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply