Upper Case Lower Case

  • Jeff Moden - Wednesday, May 2, 2018 4:47 PM

    I'm seriously missing it in the thread above.  What "small change" did Luis make?

    Not to worry, it's easy to miss. He just made one small alteration...

    My original...
    FOR XML PATH(''), TYPE).value('.', 'varchar(8000)')

    Luis' modification...
    FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(8000)')

  • Jason A. Long - Wednesday, May 2, 2018 7:35 PM

    Jeff Moden - Wednesday, May 2, 2018 4:47 PM

    I'm seriously missing it in the thread above.  What "small change" did Luis make?

    Not to worry, it's easy to miss. He just made one small alteration...

    My original...
    FOR XML PATH(''), TYPE).value('.', 'varchar(8000)')

    Luis' modification...
    FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(8000)')

    Thanks.  That does make for a better looking execution plan but, in practice, I've found no particular performance advantage.  Maybe it was just for the one test I did when Wayne Sheffield came out with his article on the subject of using FOR XML PATH to do concatenation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Here is the first round of testing...
    Since I'm running 2017 on my home machine, I went ahead and threw in a STRING_AGG version as well...

    To begin, the 3 current functions being tested...

    CREATE FUNCTION dbo.tfn_ProperCase_JL
    /* ===================================================================
    05/02/2018 JL, Created:
                    This function will capitolize the first letter
                    in a string plusr any alpha character that follows a
                    non alpha character or apostrophe
    =================================================================== */
    --===== Define I/O parameters
    (
        @_string VARCHAR(8000)
    )
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
        WITH
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
        cte_Tally (n) AS (
            SELECT TOP (LEN(@_string))
                ROW_NUMBER() OVER (ORDER BY b.n)
            FROM
                cte_n2 a CROSS JOIN cte_n2 b
            )
        SELECT
            CasedString = ((
                SELECT
                    CONCAT('', cv.cased_value)
                FROM
                    cte_Tally t
                    CROSS APPLY ( VALUES (
                            CASE
                                WHEN SUBSTRING(@_string, t.n - 1, 1) NOT LIKE '[''a-Z]'
                                THEN UPPER(SUBSTRING(@_string, t.n, 1))
                                ELSE LOWER(SUBSTRING(@_string, t.n, 1))
                            END
                        ) ) cv (cased_value)
                FOR XML PATH(''), TYPE).value('.', 'varchar(8000)')
                );
    GO

    CREATE FUNCTION dbo.tfn_ProperCase_LC
    /* ===================================================================
    05/02/2018 JL, Created:
                    This is an exact copy of the dbo.tfn_ProperCase_JL
                    function except for a slight alteration in the FOR XML PATH line
                    that was suggested by Luis Cazares
    =================================================================== */
    --===== Define I/O parameters
    (
        @_string VARCHAR(8000)
    )
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
        WITH
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
        cte_Tally (n) AS (
            SELECT TOP (LEN(@_string))
                ROW_NUMBER() OVER (ORDER BY b.n)
            FROM
                cte_n2 a CROSS JOIN cte_n2 b
            )
        SELECT
            CasedString = ((
                SELECT
                    CONCAT('', cv.cased_value)
                FROM
                    cte_Tally t
                    CROSS APPLY ( VALUES (
                            CASE
                                WHEN SUBSTRING(@_string, t.n - 1, 1) NOT LIKE '[''a-Z]'
                                THEN UPPER(SUBSTRING(@_string, t.n, 1))
                                ELSE LOWER(SUBSTRING(@_string, t.n, 1))
                            END
                        ) ) cv (cased_value)
                FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(8000)')
                );
    GO

    CREATE FUNCTION dbo.tfn_ProperCase_SA
    /* ===================================================================
    05/02/2018 JL, Created:
                    This returns the same results as the dbo.tfn_ProperCase_JL
                    function but replaces the "FOR XML PATH" syntax with the
                    newer STRING_AGG function
    =================================================================== */
    --===== Define I/O parameters
    (
        @_string VARCHAR(8000)
    )
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
        WITH
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
        cte_Tally (n) AS (
            SELECT TOP (LEN(@_string))
                ROW_NUMBER() OVER (ORDER BY b.n)
            FROM
                cte_n2 a CROSS JOIN cte_n2 b
            )
        SELECT
            CasedString = STRING_AGG(cv.cased_value, '')
        FROM
            cte_Tally t
            CROSS APPLY ( VALUES (
                    CASE
                        WHEN SUBSTRING(@_string, t.n - 1, 1) NOT LIKE '[''a-Z]'
                        THEN UPPER(SUBSTRING(@_string, t.n, 1))
                        ELSE LOWER(SUBSTRING(@_string, t.n, 1))
                    END
                ) ) cv (cased_value);
    GO

    The test harness... (being executed with "Discard results after execution" turned on)

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NULL
    BEGIN    -- DROP TABLE #TestData;
        CREATE TABLE #TestData (
            rid INT NOT NULL PRIMARY KEY CLUSTERED,
            string_val VARCHAR(1000)
            );

    WITH
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
        cte_Tally (n) AS (
            SELECT TOP (10000)
                ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
            FROM
                cte_n2 a CROSS JOIN cte_n2 b
            )
    INSERT #TestData(rid, string_val)
    SELECT
        t.n,
        x.string_val    
    FROM
        cte_Tally t
        CROSS APPLY ( VALUES ((
            SELECT TOP (ABS(CHECKSUM(NEWID())) % 999 + 1)
                CONCAT('', CHAR(a2.ascii_val))
            FROM
                cte_Tally t2
                CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 58 + 65) ) av (ascii_val)
                CROSS APPLY ( VALUES (
                    CASE av.ascii_val
                        WHEN 91 THEN 32
                        WHEN 92 THEN 39
                        WHEN 93 THEN 44
                        WHEN 94 THEN 46
                        ELSE av.ascii_val
                    END)
                    ) a2 (ascii_val)
            WHERE
                t.n > 0
            FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(8000)')
            ) ) x (string_val);            
    END;

    --================================================================================================================================================
    --================================================================================================================================================
    --================================================================================================================================================

    SET NOCOUNT ON;
    GO
    DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
    PRINT(CONCAT(CHAR(13), CHAR(10), N'   wait a moment...', CHAR(13), CHAR(10)));
    WAITFOR DELAY '00:00:01';
    GO
    -- SET STATISTICS XML ON;
    GO
    DECLARE @_clock_start DATETIME2(7) = SYSDATETIME(), @_test_name NVARCHAR(70) = N'tfn_ProperCase_JL ';
    PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(CONCAT(N'    Start Time: ', @_clock_start, N'   Test Name: ', ISNULL(NULLIF(@_test_name, N' '), N'Test Query 1'),
    REPLICATE(N' ', 100)), 148), N'¦', CHAR(13), CHAR(10), N'+', REPLICATE(N'-', 148), N'+')); SET STATISTICS IO ON;
    -- ____________________________________________________________________________________________________________________________________________
    -- ?????? place tsql here ????????????????????????????????????????????????????????????

    SELECT
        td.rid,
        td.string_val,
        pc.CasedString
    FROM
        #TestData td
        CROSS APPLY dbo.tfn_ProperCase_JL(td.string_val) pc;

    -- ?????? place tsql here ????????????????????????????????????????????????????????????
    -- ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
    DECLARE @_clock_stop DATETIME2(7) = SYSDATETIME(); SET STATISTICS IO OFF; PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(STUFF(CONCAT(N'    Finish Time: ', @_clock_stop, N'   Duration: ', DATEDIFF(mcs, @_clock_start, @_clock_stop)
    / 1000000.0, N' secs.  ', DATEDIFF(mcs, @_clock_start, @_clock_stop) / 1000.0, N' ms.', REPLICATE(N' ', 100)), 76, 3, N''), 148), N'¦
    +', REPLICATE(N'-', 148), N'+'));
    GO
    -- SET STATISTICS XML OFF;
    GO
    DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
    PRINT(CONCAT(CHAR(13), CHAR(10), N'   wait a moment...', CHAR(13), CHAR(10)));
    WAITFOR DELAY '00:00:01';
    GO
    -- SET STATISTICS XML ON;
    GO
    DECLARE @_clock_start DATETIME2(7) = SYSDATETIME(), @_test_name NVARCHAR(70) = N'tfn_ProperCase_LC ';
    PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+                              
    ¦', LEFT(CONCAT(N'    Start Time: ', @_clock_start, N'   Test Name: ', ISNULL(NULLIF(@_test_name, N' '), N'Test Query 2'),
    REPLICATE(N' ', 100)), 148), N'¦', CHAR(13), CHAR(10), N'+', REPLICATE(N'-', 148), N'+')); SET STATISTICS IO ON;    
    -- ____________________________________________________________________________________________________________________________________________
    -- ?????? place tsql here ????????????????????????????????????????????????????????????

    SELECT
        td.rid,
        td.string_val,
        pc.CasedString
    FROM
        #TestData td
        CROSS APPLY dbo.tfn_ProperCase_LC(td.string_val) pc;

    -- ?????? place tsql here ????????????????????????????????????????????????????????????
    -- ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
    DECLARE @_clock_stop DATETIME2(7) = SYSDATETIME(); SET STATISTICS IO OFF; PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(STUFF(CONCAT(N'    Finish Time: ', @_clock_stop, N'   Duration: ', DATEDIFF(mcs, @_clock_start, @_clock_stop)
    / 1000000.0, N' secs.  ', DATEDIFF(mcs, @_clock_start, @_clock_stop) / 1000.0, N' ms.', REPLICATE(N' ', 100)), 76, 3, N''), 148), N'¦
    +', REPLICATE(N'-', 148), N'+'));
    GO
    -- SET STATISTICS XML OFF;
    GO
    DBCC FREEPROCCACHE WITH NO_INFOMSGS;
    DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
    PRINT(CONCAT(CHAR(13), CHAR(10), N'   wait a moment...', CHAR(13), CHAR(10)));
    WAITFOR DELAY '00:00:01';
    GO
    -- SET STATISTICS XML ON;
    GO
    DECLARE @_clock_start DATETIME2(7) = SYSDATETIME(), @_test_name NVARCHAR(70) = N'tfn_ProperCase_SA ';
    PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+                              
    ¦', LEFT(CONCAT(N'    Start Time: ', @_clock_start, N'   Test Name: ', ISNULL(NULLIF(@_test_name, N' '), N'Test Query 2'),
    REPLICATE(N' ', 100)), 148), N'¦', CHAR(13), CHAR(10), N'+', REPLICATE(N'-', 148), N'+')); SET STATISTICS IO ON;    
    -- ____________________________________________________________________________________________________________________________________________
    -- ?????? place tsql here ????????????????????????????????????????????????????????????

    SELECT
        td.rid,
        td.string_val,
        pc.CasedString
    FROM
        #TestData td
        CROSS APPLY dbo.tfn_ProperCase_SA(td.string_val) pc;

    -- ?????? place tsql here ????????????????????????????????????????????????????????????
    -- ????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
    DECLARE @_clock_stop DATETIME2(7) = SYSDATETIME(); SET STATISTICS IO OFF; PRINT(CONCAT(N'+', REPLICATE(N'-', 148), N'+
    ¦', LEFT(STUFF(CONCAT(N'    Finish Time: ', @_clock_stop, N'   Duration: ', DATEDIFF(mcs, @_clock_start, @_clock_stop)
    / 1000000.0, N' secs.  ', DATEDIFF(mcs, @_clock_start, @_clock_stop) / 1000.0, N' ms.', REPLICATE(N' ', 100)), 76, 3, N''), 148), N'¦
    +', REPLICATE(N'-', 148), N'+'));
    GO
    -- SET STATISTICS XML OFF;
    GO

    The results...

          wait a moment...
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦    Start Time: 2018-05-02 23:11:35.6157441   Test Name: tfn_ProperCase_JL                     
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    Table '#TestData___________________________________________________________________________________________________________000000000077'. Scan count 1, logical reads 676, physical reads 0, read-ahead reads 465, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦    Finish Time: 2018-05-02 23:11:38.2096084   Duration: 2.593864 secs.  2593.864000 ms.                
    +----------------------------------------------------------------------------------------------------------------------------------------------------+

       wait a moment...
    +----------------------------------------------------------------------------------------------------------------------------------------------------+                              
    ¦    Start Time: 2018-05-02 23:11:39.2327236   Test Name: tfn_ProperCase_LC                     
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    Table '#TestData___________________________________________________________________________________________________________000000000077'. Scan count 1, logical reads 676, physical reads 0, read-ahead reads 465, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦    Finish Time: 2018-05-02 23:11:41.7875599   Duration: 2.554836 secs.  2554.836000 ms.                
    +----------------------------------------------------------------------------------------------------------------------------------------------------+

       wait a moment...
    +----------------------------------------------------------------------------------------------------------------------------------------------------+                              
    ¦    Start Time: 2018-05-02 23:11:42.8179316   Test Name: tfn_ProperCase_SA                     
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    Table '#TestData___________________________________________________________________________________________________________000000000077'. Scan count 1, logical reads 676, physical reads 0, read-ahead reads 465, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    +----------------------------------------------------------------------------------------------------------------------------------------------------+
    ¦    Finish Time: 2018-05-02 23:11:44.7733370   Duration: 1.955406 secs.  1955.406000 ms.                
    +----------------------------------------------------------------------------------------------------------------------------------------------------+

    Actual execution plan attached...

  • Jeff Moden - Wednesday, May 2, 2018 8:51 PM

    Jason A. Long - Wednesday, May 2, 2018 7:35 PM

    Jeff Moden - Wednesday, May 2, 2018 4:47 PM

    I'm seriously missing it in the thread above.  What "small change" did Luis make?

    Not to worry, it's easy to miss. He just made one small alteration...

    My original...
    FOR XML PATH(''), TYPE).value('.', 'varchar(8000)')

    Luis' modification...
    FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(8000)')

    Thanks.  That does make for a better looking execution plan but, in practice, I've found no particular performance advantage.  Maybe it was just for the one test I did when Wayne Sheffield came out with his article on the subject of using FOR XML PATH to do concatenation.

    I'll make the switch because a) the impact on estimated costing seems more in line with reality and b) based on the very limited testing I've done today, it does appear to be marginally faster, most of the time.

  • There is also a version that allows for exceptions such as names with apostrophes etc. here:
    http://www.wisesoft.co.uk/scripts/tsql_proper_case_udf.aspx
    P.S. I have no connection with the Author of the above function.

  • Jeff Moden - Wednesday, May 2, 2018 7:37 AM

    jonathan.crawford - Wednesday, May 2, 2018 6:14 AM

    Jeff Moden - Initial creation and unit test (http://qa.sqlservercentral.com/Forums/Topic530630-8-2.aspx)

    Wow... that was a while back.  I'm humbled that you kept a link to it.  It also reminded me that I was going to try something a bit different so that it would work in an iTVF using a reverse order Tally table.  I guess I never got back to it because I never needed to use such a thing.

    I also like the Nested Replaces that Chris did alot.  Should be nasty fast especially with the binary collation that was used.

    I just pulled the link out of the comments in our system function that we stole from you. 🙂

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • jonathan.crawford - Thursday, May 3, 2018 8:33 AM

    Jeff Moden - Wednesday, May 2, 2018 7:37 AM

    jonathan.crawford - Wednesday, May 2, 2018 6:14 AM

    Jeff Moden - Initial creation and unit test (http://qa.sqlservercentral.com/Forums/Topic530630-8-2.aspx)

    Wow... that was a while back.  I'm humbled that you kept a link to it.  It also reminded me that I was going to try something a bit different so that it would work in an iTVF using a reverse order Tally table.  I guess I never got back to it because I never needed to use such a thing.

    I also like the Nested Replaces that Chris did alot.  Should be nasty fast especially with the binary collation that was used.

    I just pulled the link out of the comments in our system function that we stole from you. 🙂

    Heh... SEE???!!!  Documentation works!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply