what's the best column for clustered indexes

  • GilaMonster (9/23/2011)


    Gianluca Sartori (9/23/2011)


    Also a uniqueidentifier can be a good clustering key.

    Can be, maybe, if the person designing the table understands the complexities. Usually however it's about the worst clustered index possible (wideish and completely random)

    Agreed. It really depends.

    In databases designed for merge replication it's often used as primary key clustered, with default NEWSEQUENTIALID(). Still a bit wide, at least ever-increasing.

    -- Gianluca Sartori

  • Corrections on my poor advice are welcome.

    However, it seems to me that this thread is turning into arguing on each other's suggestions.

    I'm not sure this will help the OP in any way.

    -- Gianluca Sartori

  • Hi Gianluca Sartori,

    Your valauble helps me a lot. Thanks for all your advices

    Thanks

  • Since every column except id$ allows NULL, there's no choice for the primary key but id$ - and of course that would only be valid if it is unique.

    It's somewhat amazing to see a creation date column allowing null, instead of having a default of getdate() and forbidding null, but then it's also rather unusual to see so many columns permitting NULLs. But the example queries don't incude any operations on a range of creation dates (or indeed on a range in any column, and don't even include a select on creation date, so if that were the cluster key it would act only to organise the table instead of leaving it as a heap, would have no function at all as an index.

    Using id$ as clustered primary key will lead to some fragmentation because id$ is not stable - it gets updated in one of the example queries shown; but how often does that happen? If it's not very often, it doesn't matter, and id could be the clustered primary key (at least the sample queries do filter on individual values of id$, so it will have some use as an index unless it is insufficiently selective).

    On what we've been told I think the best advice we can give is to review the table definition and work out which - if any - of the columns other than id$ should forbid NULL values and make them NOT NULL (perhaps providing sensible defaults for them); then review the workload and see if there are any range filters on any of the columns in the table; review business rules and see what combinations of columns (if any) are unique and add unique constraints; and then review the table seriously and see what normalisation should be done. Only after all that has been done is it possible to examine sensibly what should be primary keys, what should be clustered indexes, and so on.

    Tom

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply