add primary key constraint but use existing clustered index

  • We have many tables in our database that have both a clustered index and non-clustered index on the same column. For example:

    ACCIDENT

    KEYS

    PK__ACCIDENT__435F9C4B

    INDEXES

    PK__ACCIDENT__435F9C4B(Unique,Non-clustered)

    primaryKey(clustered)

    This occurred because the clustered index was added first, and then a primary key constraint was added which creates the second non-clustered index (on the same column - usually ID ).

    It seems like a lot of extra overhead to have two indexes on the same column so I would like to clean them up. The only solution that I could find was to do a

    DROP INDEX primaryKey ON ACCIDENT

    ALTER TABLE ACCIDENT DROP CONSTRAINT PK__ACCIDENT__435F9C4B

    to get rid of the indexes and then add back the Primary Key Constraint which recreates a clustered index.

    ALTER TABLE ACCIDENT ADD CONSTRAINT PK_ACCIDENT PRIMARY KEY CLUSTERED (ID)

    The problem with this solution is that several of our tables have millions of rows and dropping and rebuilding the clustered index takes an hour and fills up the transaction log. Doing this over many tables creates a major headache.

    Does anyone have a solution where I can create the Primary Key Constraint to use the existing primaryKey index (which is clustered) without having to DROP and ADD it back?

    thanks,

    Keith Battles

  • You will have to drop+recreate. It IS disk intensive ( especially if there are NC indexes in addition to the clustered)

    To minimize impact you can run the rebuild using SORT_IN_TEMPDB option and take transaction log backups as frequently as necessary.

    Just My $0.02


    * Noel

  • I guess the initial questions should be:

    - And why is it so bad having clustered and nonclustered indexes on the same column?

    - Is it possible that they were added not by mistake, but by design?

    - What is it that you are trying to achieve by dropping them? Cut the HDD space or achieve any performance gain?

    Kalen Delaney blog and interesting article on the topic:

    http://sqlblog.com/blogs/kalen_delaney/archive/2008/04/24/clustered-or-nonclustered-why-not-both.aspx

  • The non-clustered indexes on the ID fields were not added by design but by a side effect of adding a Primary Key Constraint after a clustered index on the ID column was already set up.

    Our application currently is at 160GB data and 60GB of indexes and growing every day. By eliminating a bunch of redundant indexes, I hope to slow the growth of the disk space requirements.

    We do nightly full backups and weekly index rebuild/reindex maintenance jobs which should also run faster but that is secondary to the space issue. As far as user response time on inserts/deletes goes, I don't think the extra overhead of updating the redundant indexes is noticeable.

    Basically my thought was that having a non-clustered index on a column that already has a clustered index doesn't gain us anything and is just extra overhead that could be eliminated.

  • You have two options:

    1. Follow your initial intension and spend a lot of time on rebuilding indexes just to find out that your intension would lead to a moderate if not minimal impact on combined index size (nonclustered indexes are generally smaller than clustered).

    2. Review Calen's blog again and find out that SQL server in the case of a filtered SELECT will use nonclustered index. Since nonclustered index is smaller, you will get a performance gain. In the case of the big tables (millions of records) and a multiuser environment this performance gain in conjunction with the fact that you will free a very limited amount of HDD space, might lead you to reestimation your original approach.

    Combining these two approaches together you can build a test case based on one of your biggest tables with both indexes.

    Build a set of SELECT statements against this table; review their execution plans and time of execution + performance metrics.

    Delete and recreate indexes (as suggested in original response).

    Run the same test again. Compare your metrics.

    BTW, posting results here might lead you to writing a very interesting article 🙂

    BTW2. HDD space is cheap. Adding additional RAID and moving indexes on its own filgroup might be cheaper and less process and LOE intensive than your original intension...

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply