Tally OH! An Improved SQL 8K “CSV Splitter” Function

  • Classic article, Jeff. I have loved it (and used it extensively) since it first came out five years ago (and the discussion is still going strong!)

    With SQL 2016 just released this month, we now have the new STRING_SPLIT() function. How does that stack up against the various string splitters you have tested?

    Thank you, again, for your many contributions to our community. 🙂

  • David Moutray (6/22/2016)


    Classic article, Jeff. I have loved it (and used it extensively) since it first came out five years ago (and the discussion is still going strong!)

    With SQL 2016 just released this month, we now have the new STRING_SPLIT() function. How does that stack up against the various string splitters you have tested?

    Thank you, again, for your many contributions to our community. 🙂

    Wayne Sheffield wrote an article on just that topic: http://qa.sqlservercentral.com/articles/STRING_SPLIT/139338/.

    Note that the new function doesn't return (or even guarantee) ordinal position. Without it, the new function is s a bit limited.

  • David Moutray (6/22/2016)


    Classic article, Jeff. I have loved it (and used it extensively) since it first came out five years ago (and the discussion is still going strong!)

    With SQL 2016 just released this month, we now have the new STRING_SPLIT() function. How does that stack up against the various string splitters you have tested?

    Thank you, again, for your many contributions to our community. 🙂

    Wow. It HAS been that long, hasn't it? Thank you for the very kind feedback, David. Thank YOU for your participation in all of this.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff,

    Thanks for the article.

    Could you explain the relevance and importance of using 'WITH SCHEMA BINDING' please?

    Thanks,

  • Hi,

    Thanks for the article Jeff.

    Would anyone be able to explain the relevance and importance of using 'WITH SCHEMA BINDING' with this function please?

    Thanks,

  • carl.thompson (11/22/2016)


    Hi,

    Thanks for the article Jeff.

    Would anyone be able to explain the relevance and importance of using 'WITH SCHEMA BINDING' with this function please?

    Thanks,

    It means that the function is bound to the objects it references. It enforces the dependency and doesn't allow the bound objects to be altered without first removing them from the function itself. Because this is enforced, SQL doesn't have to check the referenced objects to make sure the schema is still okay. Any time the query optimizer can do less work, it can result in better performance.

    MSDN's create function page is located at https://msdn.microsoft.com/en-us/library/ms186755.aspx and the SCHEMABINDING is just over halfway down.

  • Hi SSCrazyEights,

    Thank you for your reply.

    That makes perfect sense and I think confirmed what we thought to be the reason for the inclusion in the function. Can you confirm that this has been used in the function for the purpose of speed alone? As the function does not appear to be referencing any schema bound objects?

    Our infrastructure team have advised that a system upgrade failed due to the split function and a view both using WITH SCHEMA BINDING. Other for than performance improvement we advised that the SCHEMA BINDING option was not required within the split function. Have we provided correct advice?

    Thanks in advance for any reply.

    Carl

  • carl.thompson (11/22/2016)


    Hi SSCrazyEights,

    Thank you for your reply.

    That makes perfect sense and I think confirmed what we thought to be the reason for the inclusion in the function. Can you confirm that this has been used in the function for the purpose of speed alone? As the function does not appear to be referencing any schema bound objects?

    Our infrastructure team have advised that a system upgrade failed due to the split function and a view both using WITH SCHEMA BINDING. Other for than performance improvement we advised that the SCHEMA BINDING option was not required within the split function. Have we provided correct advice?

    Thanks in advance for any reply.

    Carl

    The schemabinding is always optional. Specifying it means that the query optimizer knows that it doesn't have to check anything, so it saves a step at run-time. This is where the performance improvement comes from.

    The other advantage it can provide is safety. Let's say you have a function that references a table and relies on that table to do what it needs to do. Someone comes along and, not knowing about your function, alters the table in a way that breaks your function. With schemabinding specified on the function, the table can't be altered. The function has to be altered, then the table altered, then the function altered again. In short, it saves you from yourself...and others.

    I must admit I'm curious why your infrastructure team said that a system upgrade failed due to a schemabinding option being present on a view and function. If you're talking about an upgrade to an application, I can see it if the upgrade altered a base object, but I would think this would have been discovered during development or testing. I can't see it for an upgrade of the system - either OS or database version.

  • I've never had any issues upgrading with the splitter using SCHEMABINDING. Is it your view? In any case, if SCHEMABINDING is causing issues, just remove SCHEMABINDING on the function and view, do the system upgrade, and then put SCHEMABINDING back on the function and view. If you have SCHEMABINDING on the view for indexing, you will also need to make sure that you recreate the index.

  • Hi,

    Thanks SSCrazyEights and UPD Broadcaster.

    It was an upgrade to an application and has been picked up as part of testing. I thought that the culprit would be the schema binding used in the view.

    I fed back that the Schema binding was optional and explained the use of Schema binding with regard to referential integrity. In this case we were specifically talking about the DelimitedSplit8K function and as there was no referential integrity required I just wanted to check it's use in this function. As we had taken this from SQLServerCentral I wanted to check that the Schema Binding option was there only for performance.

    Many thanks for your help with this both.

    Carl

  • carl.thompson (11/22/2016)


    Hi SSCrazyEights,

    Thank you for your reply.

    That makes perfect sense and I think confirmed what we thought to be the reason for the inclusion in the function. Can you confirm that this has been used in the function for the purpose of speed alone? As the function does not appear to be referencing any schema bound objects?

    Our infrastructure team have advised that a system upgrade failed due to the split function and a view both using WITH SCHEMA BINDING. Other for than performance improvement we advised that the SCHEMA BINDING option was not required within the split function. Have we provided correct advice?

    Thanks in advance for any reply.

    Carl

    I want details about this failure that was blamed on WITH SCHEMABINDING. I advise all my clients, forum posters and conference session attendees to ALWAYS use it. For UDFs it can remove a table spool for UPDATE Halloween protection. For views it keeps developers from BREAKING CODE by not doing a thorough impact analysis. Others covered additional benefits.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Are they advocating that SCHEMABINDING is permanently removed? If so, I would seriously question why. I might temporarily remove it as necessary for certain modifications but there would have to be a very good reason why I would permanently remove it. It's usually there for a reason, in this case, performance for the splitter. I don't have enough information why you have it on the view but that's something for you to investigate. As always, it is good to know why something is being done before you mess with it.

    Also, SSCrazyEights and UPD Broadcaster are just categories based on how much we post. Our avatar names are Ed Wagner and JTSash. It's a minor thing but the categories will change over time and it is better to use our names as that way others know whom you are talking about.

    Always happy to help out others as I have received much help in these forums myself.

  • I also use WITH SCHEMABINDING on my functions, for both safety and performance. There are some cases where it can't be used, but I use it everywhere I can.

    If you have a table being altered and a view WITH SCHEMABINDING that references it, that would completely explain why your application update failed.

  • Hi,

    Yes, Ed, our 3rd party supplier is updating the application and some of the tables are being altered. This view is referencing some of those tables and has Schema Binding applied.

    The 3rd party supplier advised that the upgrade was 'failing' due to schema binding and highlighted the views and functions where schema binding was being used. I advised that I did not think that the Split function was having an impact and that it would be the view that was causing the problem. However, as part of our change process I was questioned further and as I didn't write the Split function and this was taken from SQLSeverCentral I wanted to check it's use within the function.

    With the help of this forum I have been able to provide further feedback and confirmation so that we can act accordingly.

    Thanks for the advice on the use of schema binding where possible also. I'll definitely keep it in mind when creating views and functions. At present we mostly use Stored Procedures and reference the base tables within our reporting database but we are beginning to realise the benefits of views also and will start to utilise these more in the future. Schema binding sounds like something we should ensure we are using.

    Thanks for all your help.

    Carl

  • Ed Wagner (11/22/2016)


    carl.thompson (11/22/2016)


    Hi SSCrazyEights,

    Thank you for your reply.

    That makes perfect sense and I think confirmed what we thought to be the reason for the inclusion in the function. Can you confirm that this has been used in the function for the purpose of speed alone? As the function does not appear to be referencing any schema bound objects?

    Our infrastructure team have advised that a system upgrade failed due to the split function and a view both using WITH SCHEMA BINDING. Other for than performance improvement we advised that the SCHEMA BINDING option was not required within the split function. Have we provided correct advice?

    Thanks in advance for any reply.

    Carl

    The schemabinding is always optional. Specifying it means that the query optimizer knows that it doesn't have to check anything, so it saves a step at run-time. This is where the performance improvement comes from.

    The other advantage it can provide is safety. Let's say you have a function that references a table and relies on that table to do what it needs to do. Someone comes along and, not knowing about your function, alters the table in a way that breaks your function. With schemabinding specified on the function, the table can't be altered. The function has to be altered, then the table altered, then the function altered again. In short, it saves you from yourself...and others.

    I must admit I'm curious why your infrastructure team said that a system upgrade failed due to a schemabinding option being present on a view and function. If you're talking about an upgrade to an application, I can see it if the upgrade altered a base object, but I would think this would have been discovered during development or testing. I can't see it for an upgrade of the system - either OS or database version.

    Ah... you said "always". 😉 Just for those that might run into it, it's not optional if you're using it as a source for a persisted computed column.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 901 through 915 (of 981 total)

You must be logged in to reply to this topic. Login to reply