Is this a duplicate index?

  • I have a database that gets indexes built from an application based on search screens that are developed.  The developer many times does not know or research if an index already existing that could be used for their search query so there are several indexes on some tables that are very similar. 

    The clustered index for the table looks like this:

    clustered index namecolumnidcolumn namelengthprecisionscaledatatype
    PO_LINE_SHIP1BUSINESS_UNIT55NULLchar
    2PO_ID1010NULLchar
    3LINE_NBR4100int
    4SCHED_NBR250smallint

     

    Here is index #2:

    non-clustered index namecolumnidcolumn namelengthprecisionscaledatatypekeyno
    PO_LINE_SHIP113DUE_DT8233datetime1
    1BUSINESS_UNIT55NULLchar2
    2PO_ID1010NULLchar3
    3LINE_NBR4100int4
    4SCHED_NBR250smallint5

     

    And here is index #3:

    non-clustered index namecolumnidcolumn namelengthprecisionscaledatatypekeyno
    PO_LINE_SHIP213ACCT_DT8233datetime1
    1BUSINESS_UNIT55NULLchar2
    2PO_ID1010NULLchar3
    3LINE_NBR4100int4
    4SCHED_NBR250smallint5

    I am doing an index review and will be checking to see if these indexes are actually used.  But, I need to know what to do with indexes that I encounter that are so very similar.

     

  • Looking at index 3.

    If you search for records with just an ACCT_DT field in the WHERE clause the index will be used.

    If you search for records based on ACCT_DT and BUSINESS_UNIT the index will be used.

    I would be a bit concerned over the use of the clustered index because a clustered index contains data in physical order so if you insert Business_UNIT 'BBBBB' after Business_UNIT 'CCCCC' the penalty of maintaining the index is going to be high.

    Personally I would save the clustered index for a naturally ordered column such as an ORDERDATE. If your DUE_DATE column is naturally ordered then I would choose that column for a clustered index.

    You really need to know what the common searches are that are performed against these tables. If you are not sure then perhaps the best way is to use the index wizard.

  • I would also take a look at the size of the clustered index - those 21 bytes get replicated in all non-clustered indexes (for doing bookmark lookups), so in addition to trying to identify a naturally increasing value like a date, I'd try to reduce it in size.

    The non-clustered indexes look as if they've been designed to be 'covering' indexes, so that a large percentage of queries are satisfied by the index columns themselves without having to lookup the rest of the data in the row - that will require confirmation, and if so, they may be valid despite the apparent overlap.

    This is really a tough call, there are so many variables.

     

  • David,

    I agree with you that the clustered index with it's 4 columns is a bad thing.  Unfortunately, I have been advised to be VERY careful should I want to change the clustered index.  Actually, there are clustered indexes in this database that are 14 columns wide, YUCK!!!  And, we are paying the price because we have to constantly rebuild them since they get fragmented really easily.  But, I have to try to tune and maintain what I have, thus my question.

    I guess that I am checking to see what people think because I think that we should drop one of the indexes and modify the remaining one to look like this:

    non-clustered index namecolumnidcolumn namelengthprecisionscaledatatypekeyno
    PO_LINE_SHIP313DUE_DT8233datetime1
    15ACCT_DT8233datetime2
    1BUSINESS_UNIT55NULLchar3
    2PO_ID1010NULLchar4
    3LINE_NBR4100int5
    4SCHED_NBR250smallint6

    Would that get me anything?

  • The problem with that, is that any WHERE clause that doesn't contain DUE_DT, and where the SELECT contains 1 or more columns not in the index, may tablescan.

    One other observation - is it really necessary to record DUE_DT and ACCT_DT right down to the millisecond level ? It appears as if the datatypes are twice as large as they need to be and could be smalldatetime's if the precise time component isn't necessary.

  • Lori,

    On index analysis column-order matters! and not only that the first column should be the most selective

    The indexes you show are deinetly being defined to account for different queries. It is usually the case  when they seem that close but actually behave quite different

    HTH


    * Noel

  • Noel,

    I agree that column order matters on idexes, but I guess I am trying to go for a covering index.  My thought is why should I have to maintain 2 indexes when 1 covering index might do just as well.  This kind of thing happens all over the place in this database.  If this is not the way to go, then the indexes will stay.  So far, it looks like the thing to do is to check for queries that would specifically use those indexes and see what happens if I design a covering index.  I will have to check execution plans with both scenarios in place.  The usual stuff.

    I am just a little confused and looking for some other opinions besides mine.  Perhaps someone has seen this kind of thing before and maybe has a rule of thumb.

     

  • If you only consider acct_dt, in your current scenario, a query that searched based on acct_dt would use index PO_LINE_SHIP2.  In your suggested scenario, a query that searched on acct_dt (or acct_dt business_unit, or acct_dt business_unit po_id, …) would have no index to use, because your new PO_LINE_SHIP3 index has as its first column due_dt, and all indexed acct_dts are sorted by due_dt.  Even if they occur that way naturally, a query with acct_dt and/or the remaining columns in the index (not including due_dt) would not use the index.

     

    Secondly, if your search was based on due_dt business_unit… it might use the index (sorting by due_dt ONLY), but the index isn’t sorted by due_dt business_unit, its sorted by due_dt, ACCT_DT, business_unit…  So the index becomes much less useful for either purpose that the original indexes were intended for, except in the case of a covering index for searches only on due_dt.

     

    Steve

  • If you are doing an index-review ...

    run a trace that captures all sql and sp activity during a typical timeframe and have that analyzed by ITW, leaving it the option to do its stuff to the full, but reduce the number of columns to be used in one index.

    I hope you've documented your "covering" indexes, because you might want to keep them (at least for second level support).

    If you have to do a bunch of indexmaintenance to your clustering index, check out fillfactor and pad_index in bol.

    ... The developer many times ...

    Does this mean you're performing these actions on a development server ? If yes, don't hesitate to remove clustering indexes if you feel the overhead might be to much and want to try another one ! IMO there is a common misinterpretation regarding clusering indexes, as if these would guarantee ordering query-results. That's what order-by must be used for !

    check qa.sqlservercentral.com for index-articles !

    or

    http://www.sql-server-performance.com/sql_server_performance_audit7.asp

    http://www.sql-server-performance.com/sql_server_performance_audit11.asp

     

    btw : is this OLTP or OLAP pollicy ?

     

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • This is easy...are they duplicates?

     

    NO...just similar!


    "Keep Your Stick On the Ice" ..Red Green

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply