Ditch the Clustered index on a table?

  • We currently have a repository of monitoring data provided by an outside vendor (Tivoli Datawarehouse from IBM to be specific). It stores data for different metrics from systems all over the company into tables, and at different time levels. It stores timestamps, machine names, and the data specific for that metric.

    So for example, we would have a table called LogicalDisk, LogicalDisk_Hourly, LogicalDisk_Daily, LogicalDisk_Monthly, and LogicalDisk_Yearly. All data is written to the lowest table LogicalDisk. Once a night, a summarization and pruning process kicks off to take the "live" data from the lowest table and summarize it for the table above, hourly, then take the hourly and summarize it into the daily, and so on up the chain. The person managing this application has kept data for a year total so the yearly table at the top of the chain doesn't grow forever.

    As you can guess, there would be lots of inserts and deletes going on. When I was configuring this about a year ago, we would always be getting delays in retrieving the data. All the tables had nonclustered indexes, but not a single one had a primary key or clustered index at all. I set about creating clustered indexes on most of the tables to cover the timestamp and machine name. This behaved for awhile until recently. The pruning process was taking a long time on a Warehouse Log table, a summary of all the transactions for the other tables that gets purged to a length of time we set (2 weeks). So I modified our nightly index rebuild job to use a small fillfactor (from 80 to 60) and it behaved again somewhat. But the pruning process is still taking longer than before. I'll give it a few more days to catch up from a couple of cancelled prunes we did.

    If it doesn't behave, and if this Warehouse Log is a point of excessive blocking, my question is this: Is it better to drop the clustered index on the warehouselog and leave the nonclustered index as before to improve performance on this table that gets written to and purged constantly? Users typically do not access this table, as far as I know, and it was designed without clustered indexes in the first place.

    Thanks.

    Gaby
    ________________________________________________________________
    "In theory, theory and practice are the same. In practice, they are not."
    - Albert Einstein

  • Its good to have clustered indexes, but depends what kind of column you choose. To keep it short, its better if the column picked as Cluster is a column which always grows forward smthng like identity or datetime.

    I couldnt get which column you picked for cluster. if it is timestamp/machine then its not the greatest of choices.

  • Not too sure whether I got your question correct, But anyways thought of asking this.

    Did you try disabling the Non clustered index during the pruning process and then enabling it again, just to make the inserts faster.

  • arr.nagaraj (3/2/2010)


    Not too sure whether I got your question correct, But anyways thought of asking this.

    Did you try disabling the Non clustered index during the pruning process and then enabling it again, just to make the inserts faster.

    Hmmm....that's a possibility. We do a rebuild/reorganize of the indexes a couple of hours after the pruning. I'll have to look into that. Data Warehousing has a different focus/way of looking at things than I'm used to.

    Thanks.

    Gaby
    ________________________________________________________________
    "In theory, theory and practice are the same. In practice, they are not."
    - Albert Einstein

  • Did you mean timestamp/machine name, or datetime/machine name.? If not the latter, then step one, make it the latter.

    You could then partition by day and just drop the relevant partition.

  • peter.j.hanlon (5/12/2010)


    Did you mean timestamp/machine name, or datetime/machine name.? If not the latter, then step one, make it the latter.

    You could then partition by day and just drop the relevant partition.

    Datetime/machine name (not timestamp datatype) It's behaving again, so cross our fingers on this one. Thanks everyone for the ideas.

    Gaby
    ________________________________________________________________
    "In theory, theory and practice are the same. In practice, they are not."
    - Albert Einstein

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply