Updating Comma Delimited Field

  • dwain.c (5/23/2012)


    I've seen (dramatically) how WITH SCHEMABINDING improves the performance of a Scalar Valued Function. I'm wondering if the same is true for TVFs and what is the downside?

    No performance benefits. In-line functions are always expanded into the host query (like a view) and optimized as a whole. The downside is the usual one: you bind to any schema objects it references.

  • SQL Kiwi (5/23/2012)


    dwain.c (5/23/2012)


    I've seen (dramatically) how WITH SCHEMABINDING improves the performance of a Scalar Valued Function. I'm wondering if the same is true for TVFs and what is the downside?

    No performance benefits. In-line functions are always expanded into the host query (like a view) and optimized as a whole. The downside is the usual one: you bind to any schema objects it references.

    schemabinding can improve performance in functions - but not on it's own

    in order for a function to be deterministic (http://msdn.microsoft.com/en-us/library/ms178091.aspx) the function must be schemabound - amongst other things

    Deterministic functions always return the same result any time they are called with a specific set of input values and given the same state of the database. Nondeterministic functions may return different results each time they are called with a specific set of input values even if the database state that they access remains the same. - this results in fewer function calls in a query result set

    There are several properties of user-defined functions that determine the ability of the SQL Server Database Engine to index the results of the function, either through indexes on computed columns that call the function, or through indexed views that reference the function. The determinism of a function is one such property. For example, a clustered index cannot be created on a view if the view references any nondeterministic functions. For more information about the properties of functions, including determinism, see User-Defined Functions

    MVDBA

  • I was curious how a TVF would compare performance-wise to a SVF for this solution, so I built the following two functions cobbled together from examples provided. Sorry Lynn, but I wasn't able to include yours because apparently there's a function missing somewhere along the way.

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE FUNCTION dbo.tvf_ValidateDelimitedString

    (

    @MRN NCHAR(10),

    @Additional_Procedure_Location varchar(max)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS RETURN (

    SELECT @MRN AS MRN, STUFF((SELECT ','+

    CASE WHEN Item NOT IN ('1','2') OR Item = ''

    THEN '1'

    ELSE Item END AS Item

    FROM dbo.DelimitedSplit8k(@additional_procedure_Location,',') split

    ORDER BY ItemNumber

    FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),1,1,'') AS new_additional_procedure_Location

    )

    GO

    CREATE FUNCTION dbo.svf_ValidateDelimitedString

    (

    @Additional_Procedure_Location varchar(max)

    )

    RETURNS VARCHAR(MAX) WITH SCHEMABINDING

    AS

    BEGIN

    DECLARE @new_Additional_Procedure_Location VARCHAR(MAX)

    ;WITH Nbrs_3(n) AS (SELECT 1 UNION SELECT 0)

    ,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)

    ,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)

    ,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)

    ,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)

    ,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs),

    Parser (new_additional_procedure_Location, r) AS (

    SELECT CASE WHEN n = 1 AND SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' THEN '1,'

    WHEN n = 1 AND SUBSTRING(@Additional_Procedure_Location, n, 1) NOT IN ('1','2',',') THEN '1'

    WHEN n = 1 THEN SUBSTRING(@Additional_Procedure_Location, n, 1)

    WHEN n <> LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) <> ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) <> ',' THEN ''

    WHEN n <> LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) = ',' THEN '1,'

    WHEN n <> LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) NOT IN ('1','2',',') THEN '1'

    WHEN n <> LEN(@Additional_Procedure_Location) THEN SUBSTRING(@Additional_Procedure_Location, n, 1)

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) <> ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) <> ',' THEN ''

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) = ',' THEN '1,'

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) NOT IN ('1','2',',') THEN '1'

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' THEN ',1'

    ELSE SUBSTRING(@Additional_Procedure_Location, n, 1) END

    ,ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM Tally

    WHERE n BETWEEN 1 and LEN(@Additional_Procedure_Location))

    SELECT @new_Additional_Procedure_Location =

    (SELECT new_additional_procedure_location

    FROM Parser

    FOR XML PATH(''), root('M'), type).value('/M[1]','varchar(max)' )

    FROM Parser p1

    RETURN @new_Additional_Procedure_Location

    END

    I then built the following test harness (based on Jeff Moden's suggestion) for 10000 rows. I could have taken it further but it wasn't necessary because the timings were already clear at this time.

    CREATE TABLE #temp (MRN nchar(10) null,

    additional_procedure_Location varchar(max))

    ;WITH Tally (n) AS (

    SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2)

    INSERT INTO #temp

    SELECT RIGHT('0000000'+CAST(n AS VARCHAR(7)),7)

    ,SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1)

    FROM Tally

    SELECT * FROM #Temp

    SET STATISTICS TIME ON

    SET STATISTICS IO ON

    SELECT t.MRN, additional_procedure_Location, new_additional_procedure_Location

    FROM #temp t

    CROSS APPLY dbo.tvf_ValidateDelimitedString(t.MRN, t.additional_procedure_Location) x

    SELECT MRN, additional_procedure_Location

    ,dbo.svf_ValidateDelimitedString(Additional_Procedure_Location) AS new_additional_procedure_Location

    FROM #temp t

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    DROP TABLE #temp

    And here are the results.

    (10000 row(s) affected)

    Table '#temp___0000000000C0'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 639 ms, elapsed time = 741 ms.

    (10000 row(s) affected)

    Table '#temp___0000000000C0'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3511 ms, elapsed time = 3599 ms.

    So TVF beats SVF, at least in this case. Anyone else is welcome to pit their submission against these (or improve upon them).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (5/23/2012)


    I was curious how a TVF would compare performance-wise to a SVF for this solution, so I built the following two functions cobbled together from examples provided. Sorry Lynn, but I wasn't able to include yours because apparently there's a function missing somewhere along the way.

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE FUNCTION dbo.tvf_ValidateDelimitedString

    (

    @MRN NCHAR(10),

    @Additional_Procedure_Location varchar(max)

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS RETURN (

    SELECT @MRN AS MRN, STUFF((SELECT ','+

    CASE WHEN Item NOT IN ('1','2') OR Item = ''

    THEN '1'

    ELSE Item END AS Item

    FROM dbo.DelimitedSplit8k(@additional_procedure_Location,',') split

    ORDER BY ItemNumber

    FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),1,1,'') AS new_additional_procedure_Location

    )

    GO

    CREATE FUNCTION dbo.svf_ValidateDelimitedString

    (

    @Additional_Procedure_Location varchar(max)

    )

    RETURNS VARCHAR(MAX) WITH SCHEMABINDING

    AS

    BEGIN

    DECLARE @new_Additional_Procedure_Location VARCHAR(MAX)

    ;WITH Nbrs_3(n) AS (SELECT 1 UNION SELECT 0)

    ,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)

    ,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)

    ,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)

    ,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)

    ,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs),

    Parser (new_additional_procedure_Location, r) AS (

    SELECT CASE WHEN n = 1 AND SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' THEN '1,'

    WHEN n = 1 AND SUBSTRING(@Additional_Procedure_Location, n, 1) NOT IN ('1','2',',') THEN '1'

    WHEN n = 1 THEN SUBSTRING(@Additional_Procedure_Location, n, 1)

    WHEN n <> LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) <> ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) <> ',' THEN ''

    WHEN n <> LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) = ',' THEN '1,'

    WHEN n <> LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) NOT IN ('1','2',',') THEN '1'

    WHEN n <> LEN(@Additional_Procedure_Location) THEN SUBSTRING(@Additional_Procedure_Location, n, 1)

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) <> ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) <> ',' THEN ''

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' AND

    SUBSTRING(@Additional_Procedure_Location, n-1, 1) = ',' THEN '1,'

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) NOT IN ('1','2',',') THEN '1'

    WHEN n = LEN(@Additional_Procedure_Location) AND

    SUBSTRING(@Additional_Procedure_Location, n, 1) = ',' THEN ',1'

    ELSE SUBSTRING(@Additional_Procedure_Location, n, 1) END

    ,ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM Tally

    WHERE n BETWEEN 1 and LEN(@Additional_Procedure_Location))

    SELECT @new_Additional_Procedure_Location =

    (SELECT new_additional_procedure_location

    FROM Parser

    FOR XML PATH(''), root('M'), type).value('/M[1]','varchar(max)' )

    FROM Parser p1

    RETURN @new_Additional_Procedure_Location

    END

    I then built the following test harness (based on Jeff Moden's suggestion) for 10000 rows. I could have taken it further but it wasn't necessary because the timings were already clear at this time.

    CREATE TABLE #temp (MRN nchar(10) null,

    additional_procedure_Location varchar(max))

    ;WITH Tally (n) AS (

    SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2)

    INSERT INTO #temp

    SELECT RIGHT('0000000'+CAST(n AS VARCHAR(7)),7)

    ,SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1) +

    SUBSTRING('01234,', ABS(CHECKSUM(NEWID()))%6+1, 1)

    FROM Tally

    SELECT * FROM #Temp

    SET STATISTICS TIME ON

    SET STATISTICS IO ON

    SELECT t.MRN, additional_procedure_Location, new_additional_procedure_Location

    FROM #temp t

    CROSS APPLY dbo.tvf_ValidateDelimitedString(t.MRN, t.additional_procedure_Location) x

    SELECT MRN, additional_procedure_Location

    ,dbo.svf_ValidateDelimitedString(Additional_Procedure_Location) AS new_additional_procedure_Location

    FROM #temp t

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    DROP TABLE #temp

    And here are the results.

    (10000 row(s) affected)

    Table '#temp___0000000000C0'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 639 ms, elapsed time = 741 ms.

    (10000 row(s) affected)

    Table '#temp___0000000000C0'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3511 ms, elapsed time = 3599 ms.

    So TVF beats SVF, at least in this case. Anyone else is welcome to pit their submission against these (or improve upon them).

    I think that is because I modified the OPs original code. This is the function missing probably: dbo.Validate_2_DelimitedFields_id. Is the the split function the OP was using?

  • Lynn Pettis (5/23/2012)


    I think that is because I modified the OPs original code. This is the function missing probably: dbo.Validate_2_DelimitedFields_id. Is the the split function the OP was using?

    Hi Lynn - I certainly wasn't trying to imply you had done something wrong. Yes I believe it was that FUNCTION that was missing. Whether it does a SplitString or not, who's to say. You certainly wouldn't be able to tell from the name.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply