Is this a duplicate index?

Question

Post reply

Is this a duplicate index?

Lori B

Ten Centuries

Points: 1149
More actions
February 18, 2005 at 7:23 am

#85476

I have a database that gets indexes built from an application based on search screens that are developed. The developer many times does not know or research if an index already existing that could be used for their search query so there are several indexes on some tables that are very similar.
The clustered index for the table looks like this:
clustered index name columnid column name length precision scale datatype
PO_LINE_SHIP 1 BUSINESS_UNIT 5 5 NULL char
2 PO_ID 10 10 NULL char
3 LINE_NBR 4 10 0 int
4 SCHED_NBR 2 5 0 smallint

Here is index #2:
non-clustered index name columnid column name length precision scale datatype keyno
PO_LINE_SHIP1 13 DUE_DT 8 23 3 datetime 1
1 BUSINESS_UNIT 5 5 NULL char 2
2 PO_ID 10 10 NULL char 3
3 LINE_NBR 4 10 0 int 4
4 SCHED_NBR 2 5 0 smallint 5

And here is index #3:
non-clustered index name columnid column name length precision scale datatype keyno
PO_LINE_SHIP2 13 ACCT_DT 8 23 3 datetime 1
1 BUSINESS_UNIT 5 5 NULL char 2
2 PO_ID 10 10 NULL char 3
3 LINE_NBR 4 10 0 int 4
4 SCHED_NBR 2 5 0 smallint 5
I am doing an index review and will be checking to see if these indexes are actually used. But, I need to know what to do with indexes that I encounter that are so very similar.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

David.Poole SSC Guru Points: 75656 More actions · Answer 1

Looking at index 3.

If you search for records with just an ACCT_DT field in the WHERE clause the index will be used.

If you search for records based on ACCT_DT and BUSINESS_UNIT the index will be used.

I would be a bit concerned over the use of the clustered index because a clustered index contains data in physical order so if you insert Business_UNIT 'BBBBB' after Business_UNIT 'CCCCC' the penalty of maintaining the index is going to be high.

Personally I would save the clustered index for a naturally ordered column such as an ORDERDATE. If your DUE_DATE column is naturally ordered then I would choose that column for a clustered index.

You really need to know what the common searches are that are performed against these tables. If you are not sure then perhaps the best way is to use the index wizard.

LinkedIn Profile

PW-201837 SSC-Insane Points: 20805 More actions · Answer 2

I would also take a look at the size of the clustered index - those 21 bytes get replicated in all non-clustered indexes (for doing bookmark lookups), so in addition to trying to identify a naturally increasing value like a date, I'd try to reduce it in size.

The non-clustered indexes look as if they've been designed to be 'covering' indexes, so that a large percentage of queries are satisfied by the index columns themselves without having to lookup the rest of the data in the row - that will require confirmation, and if so, they may be valid despite the apparent overlap.

This is really a tough call, there are so many variables.

Lori B Ten Centuries Points: 1149 More actions · Answer 3

David,

I agree with you that the clustered index with it's 4 columns is a bad thing. Unfortunately, I have been advised to be VERY careful should I want to change the clustered index. Actually, there are clustered indexes in this database that are 14 columns wide, YUCK!!! And, we are paying the price because we have to constantly rebuild them since they get fragmented really easily. But, I have to try to tune and maintain what I have, thus my question.

I guess that I am checking to see what people think because I think that we should drop one of the indexes and modify the remaining one to look like this:

non-clustered index name	columnid	column name	length	precision	scale	datatype	keyno
PO_LINE_SHIP3	13	DUE_DT	8	23	3	datetime	1
	15	ACCT_DT	8	23	3	datetime	2
	1	BUSINESS_UNIT	5	5	NULL	char	3
	2	PO_ID	10	10	NULL	char	4
	3	LINE_NBR	4	10	0	int	5
	4	SCHED_NBR	2	5	0	smallint	6

Would that get me anything?

PW-201837 SSC-Insane Points: 20805 More actions · Answer 4

The problem with that, is that any WHERE clause that doesn't contain DUE_DT, and where the SELECT contains 1 or more columns not in the index, may tablescan.

One other observation - is it really necessary to record DUE_DT and ACCT_DT right down to the millisecond level ? It appears as if the datatypes are twice as large as they need to be and could be smalldatetime's if the precise time component isn't necessary.

noeld SSC Guru Points: 96590 More actions · Answer 5

Lori,

On index analysis column-order matters! and not only that the first column should be the most selective

The indexes you show are deinetly being defined to account for different queries. It is usually the case when they seem that close but actually behave quite different

HTH

* Noel

Lori B Ten Centuries Points: 1149 More actions · Answer 6

Noel,

I agree that column order matters on idexes, but I guess I am trying to go for a covering index. My thought is why should I have to maintain 2 indexes when 1 covering index might do just as well. This kind of thing happens all over the place in this database. If this is not the way to go, then the indexes will stay. So far, it looks like the thing to do is to check for queries that would specifically use those indexes and see what happens if I design a covering index. I will have to check execution plans with both scenarios in place. The usual stuff.

I am just a little confused and looking for some other opinions besides mine. Perhaps someone has seen this kind of thing before and maybe has a rule of thumb.

hoo-t SSChampion Points: 12713 More actions · Answer 7

If you only consider acct_dt, in your current scenario, a query that searched based on acct_dt would use index PO_LINE_SHIP2. In your suggested scenario, a query that searched on acct_dt (or acct_dt business_unit, or acct_dt business_unit po_id, …) would have no index to use, because your new PO_LINE_SHIP3 index has as its first column due_dt, and all indexed acct_dts are sorted by due_dt. Even if they occur that way naturally, a query with acct_dt and/or the remaining columns in the index (not including due_dt) would not use the index.

Secondly, if your search was based on due_dt business_unit… it might use the index (sorting by due_dt ONLY), but the index isn’t sorted by due_dt business_unit, its sorted by due_dt, ACCT_DT, business_unit… So the index becomes much less useful for either purpose that the original indexes were intended for, except in the case of a covering index for searches only on due_dt.

Steve

Johan Bijnens SSC Guru Points: 134873 More actions · Answer 8

If you are doing an index-review ...

run a trace that captures all sql and sp activity during a typical timeframe and have that analyzed by ITW, leaving it the option to do its stuff to the full, but reduce the number of columns to be used in one index.

I hope you've documented your "covering" indexes, because you might want to keep them (at least for second level support).

If you have to do a bunch of indexmaintenance to your clustering index, check out fillfactor and pad_index in bol.

... The developer many times ...

Does this mean you're performing these actions on a development server ? If yes, don't hesitate to remove clustering indexes if you feel the overhead might be to much and want to try another one ! IMO there is a common misinterpretation regarding clusering indexes, as if these would guarantee ordering query-results. That's what order-by must be used for !

check qa.sqlservercentral.com for index-articles !

or

http://www.sql-server-performance.com/sql_server_performance_audit7.asp

http://www.sql-server-performance.com/sql_server_performance_audit11.asp

btw : is this OLTP or OLAP pollicy ?

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

Jeff W SSCertifiable Points: 5207 More actions · Answer 9

This is easy...are they duplicates?

NO...just similar!

"Keep Your Stick On the Ice" ..Red Green

clustered index name	columnid	column name	length	precision	scale	datatype
PO_LINE_SHIP	1	BUSINESS_UNIT	5	5	NULL	char
	2	PO_ID	10	10	NULL	char
	3	LINE_NBR	4	10	0	int
	4	SCHED_NBR	2	5	0	smallint