Create Indexes on Computed Columns

  • Hi everyone

    Hi have a doubt about create indexes on computed columns.

    Microsoft provide in its web with comments about this subject:
    https://msdn.microsoft.com/en-us/library/ms189292(v=sql.120).aspx

    Days ago, I watched a video explaining that if you want to have an index in a computed column, those computed columns have to be deterministic (because normally we have always the same data and the same calculation)
    But, I have a question. If we have to put the index in that computed column, do we have to use always the PERSISTED function?

    Microsfot tells this about: PERSISTED
    *PERSISTED Specifies that the Database Engine will physically store the computed values in the table, and update the values when any other columns on which the computed column depends are updated. Marking a computed column as PERSISTED allows an index to be created on a computed column that is deterministic, but not precise. For more information, see Indexes on Computed Columns. Any computed columns used as partitioning columns of a partitioned table must be explicitly marked PERSISTED. computed_column_expression must be deterministic when PERSISTED is specified.*

    Or we can do indexes in computed columns without the PERSISTED option?

    Thank you in advance!!

    Regards

  • Jorgexspain - Thursday, March 2, 2017 10:47 AM

    Hi everyone

    Hi have a doubt about create indexes on computed columns.

    Microsoft provide in its web with comments about this subject:
    https://msdn.microsoft.com/en-us/library/ms189292(v=sql.120).aspx

    Days ago, I watched a video explaining that if you want to have an index in a computed column, those computed columns have to be deterministic (because normally we have always the same data and the same calculation)
    But, I have a question. If we have to put the index in that computed column, do we have to use always the PERSISTED function?

    Microsfot tells this about: PERSISTED
    *PERSISTED Specifies that the Database Engine will physically store the computed values in the table, and update the values when any other columns on which the computed column depends are updated. Marking a computed column as PERSISTED allows an index to be created on a computed column that is deterministic, but not precise. For more information, see Indexes on Computed Columns. Any computed columns used as partitioning columns of a partitioned table must be explicitly marked PERSISTED. computed_column_expression must be deterministic when PERSISTED is specified.*

    Or we can do indexes in computed columns without the PERSISTED option?

    Thank you in advance!!

    Regards

    Even if you COULD do it, I wouldn't want to.  That would be like adding a scalar function to EVERY access to the table, instead of only to inserts or updates.  For really small tables, it might not matter that much, but for any sizable table, I'd much rather PERSIST the column so I could index it.   If you're computing something random that would never be deterministic, I wouldn't want that in a computed column to begin with.

  • I believe that to index the column, it must be persisted.  If it's not persisted already, SQL will persist it when you add the index on it.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Once Persisted, won't this also start causing fragmentation in the index if the values change?

  • nsadams87xx - Tuesday, March 7, 2017 11:47 AM

    Once Persisted, won't this also start causing fragmentation in the index if the values change?

    If that's a concern, then why compute the column in the first place?   Think about the alternative.   If this value was pre-computed before each new record was inserted, and provided to the table as a value in the insert, then wouldn't you still have the same issue with the index when this value gets updated?   What you may need is just a regular index rebuild, but without more info I have too little to go on.

  • sgmunson - Tuesday, March 7, 2017 12:03 PM

    nsadams87xx - Tuesday, March 7, 2017 11:47 AM

    Once Persisted, won't this also start causing fragmentation in the index if the values change?

    If that's a concern, then why compute the column in the first place?   Think about the alternative.   If this value was pre-computed before each new record was inserted, and provided to the table as a value in the insert, then wouldn't you still have the same issue with the index when this value gets updated?   What you may need is just a regular index rebuild, but without more info I have too little to go on.

    Right, exactly point.  Just sounds like too many potential problems.

  • nsadams87xx - Tuesday, March 7, 2017 12:05 PM

    sgmunson - Tuesday, March 7, 2017 12:03 PM

    nsadams87xx - Tuesday, March 7, 2017 11:47 AM

    Once Persisted, won't this also start causing fragmentation in the index if the values change?

    If that's a concern, then why compute the column in the first place?   Think about the alternative.   If this value was pre-computed before each new record was inserted, and provided to the table as a value in the insert, then wouldn't you still have the same issue with the index when this value gets updated?   What you may need is just a regular index rebuild, but without more info I have too little to go on.

    Right, exactly point.  Just sounds like too many potential problems.

    That potential is something only you can assess at this point.   We don't have enough detail.  If you care to post the formula for this field, and the frequency with which data coming into the formula is updated, we can venture some guesses...

  • ScottPletcher - Thursday, March 2, 2017 11:30 AM

    I believe that to index the column, it must be persisted.  If it's not persisted already, SQL will persist it when you add the index on it.

    Thank you for the answer!!

    Yes, more or less It was the idea that I had. I understand that to create an index in a column, the normal thing is to be index; but I have not found the "official" answer.

    Thank you again, guy!

  • sgmunson - Thursday, March 2, 2017 10:59 AM

    Jorgexspain - Thursday, March 2, 2017 10:47 AM

    Hi everyone

    Hi have a doubt about create indexes on computed columns.

    Microsoft provide in its web with comments about this subject:
    https://msdn.microsoft.com/en-us/library/ms189292(v=sql.120).aspx

    Days ago, I watched a video explaining that if you want to have an index in a computed column, those computed columns have to be deterministic (because normally we have always the same data and the same calculation)
    But, I have a question. If we have to put the index in that computed column, do we have to use always the PERSISTED function?

    Microsfot tells this about: PERSISTED
    *PERSISTED Specifies that the Database Engine will physically store the computed values in the table, and update the values when any other columns on which the computed column depends are updated. Marking a computed column as PERSISTED allows an index to be created on a computed column that is deterministic, but not precise. For more information, see Indexes on Computed Columns. Any computed columns used as partitioning columns of a partitioned table must be explicitly marked PERSISTED. computed_column_expression must be deterministic when PERSISTED is specified.*

    Or we can do indexes in computed columns without the PERSISTED option?

    Thank you in advance!!

    Regards

    Even if you COULD do it, I wouldn't want to.  That would be like adding a scalar function to EVERY access to the table, instead of only to inserts or updates.  For really small tables, it might not matter that much, but for any sizable table, I'd much rather PERSIST the column so I could index it.   If you're computing something random that would never be deterministic, I wouldn't want that in a computed column to begin with.

    Thank you sgmunson! I had the same feeling; Maybe you can use in a small table; but in a big one it will be slow in every query to the table, isn't it?

    PD: I was "Jorgexspain", I have changed my user 🙂
    Thanks again!

  • jorge_vicente,

    How much slower will depend heavily on the nature of the calculation.  How much that ends up costing overall will then further depend on the number of records that might need to change due to calculation dependencies.   It could be anywhere from a mere pittance to a total disaster.

  • jorge_vicente - Tuesday, March 7, 2017 2:47 PM

    ScottPletcher - Thursday, March 2, 2017 11:30 AM

    I believe that to index the column, it must be persisted.  If it's not persisted already, SQL will persist it when you add the index on it.

    Thank you for the answer!!

    Yes, more or less It was the idea that I had. I understand that to create an index in a column, the normal thing is to be index; but I have not found the "official" answer.

    Thank you again, guy!

    Officially you don't need to persist the computed column in the table, I've done it before in a covering non-clustered index, since the computed column is stored in the index it isn't recalculated each time it's queried if it's only used through that index.

  • nsadams87xx - Tuesday, March 7, 2017 11:47 AM

    Once Persisted, won't this also start causing fragmentation in the index if the values change?

    Fragmentation usually doesn't matter so much.  If you look at an execution plan, you'll see that the optimizer really doesn't care about fragmentation.  The only time fragmentation really becomes a concern is if there's a whole lot of free space in each page due to splitting.  However, if you wait a bit, a "natural fill factor" will take care of most of that.

    I've not rebuilt indexes on any of my boxes since 17 Jan 2016 (more than a year now).  Performance only got better over the first 3 months and hasn't degraded.  It IS, however, uber important to keep statistics up to date.  And, if you do end up not trusting what I just said and continue to defrag your indexes, then do it based on the average percent of page used rather than on percent of fragmentation.

    Also remember that an index on a persisted computed column isn't any different than any other NCI.  Unless the lead column is ever increasing, your going to get fragmentation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply