Split sting by hypens

  • Hi Guys,

    I have one task to derived column values by splitting the string by hypens

    below is the DDL,


    CREATE TABLE #TABLE
    (COL VARCHAR (100) )

    INSERT INTO #TABLE VALUES ('Sam Capali - Test1 - 12526 - 1 - Period 2')
    INSERT INTO #TABLE VALUES ('Mike M Capela')
    INSERT INTO #TABLE VALUES ('Natalie Simon - Test1/Test2/Test3 Gr 5 - 25143 - 3 - Period 3')
    INSERT INTO #TABLE VALUES ('James Germeny - Test1 & Test2 Ed - 62549 - 2 - Period 2')

    desired output is,

    column1        column2                       column3    column4    column5
    Sam Capali      Test1                         12526       1        Period 2
    Mike M Capela
    Natalie Simon    Test1/Test2/Test3 Gr 5       25143       3        Period 3
    James Germeny    Test1 & Test2 Ed             62549       2        Period 2

    Please help me to build this, Thanks for your help in advance.

    Looking forward for any inputs from you guys.
    Thanks

  • This will split the data:
    SELECT col, ds.ItemNumber, ds.Item
    FROM #Table t
    CROSS APPLY Utilities.dbo.DelimitedSplit8K(col,'-') ds

    Then you would have to pivot it. This is most of the way there:
    SELECT x.col
         , MAX(x.col1) AS c1
         , MAX(x.col2) AS c2
         , MAX(x.col3) AS c3
         , MAX(x.col4) AS c4
         , MAX(x.col5) AS c5
    FROM
    (SELECT col
        , item
        , itemNumber
        , case when itemNumber = 1 then Item else NULL end AS col1
        , case when itemNumber = 2 then Item else NULL end AS col2
        , case when itemNumber = 3 then Item else NULL end AS col3
        , case when itemNumber = 4 then Item else NULL end AS col4
        , case when itemNumber = 5 then Item else NULL end AS col5    
    FROM #Table t
    CROSS APPLY Utilities.dbo.DelimitedSplit8K(col,'-') ds) x
    GROUP BY x.col

  • Will there always be 5 parts? Could there be more, less, an unknown number?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Check the function from this article
    😎

    The function (slightly modified)

    CREATE FUNCTION [dbo].[DelimitedSplit8K_LEAD]
    --===== Define I/O parameters
       (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    --===== "Inline" CTE Driven "Tally Table” produces values from 0 up to 10,000...
      -- enough to cover VARCHAR(8000)
    WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
           -- for both a performance gain and prevention of accidental "overruns"
          SELECT 0 UNION ALL
          SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
          FROM T T1,T T2,T T3,T T4
          ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
          SELECT t.N+1
           FROM NUMS t
          WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
          )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
       Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))
     FROM cteStart s
    ;

    The application code

    USE TEEST;
    GO
    SET NOCOUNT ON;

    DECLARE @TABLE TABLE
    (COL VARCHAR (100) )

    INSERT INTO @TABLE VALUES
    ('Sam Capali - Test1 - 12526 - 1 - Period 2')
    , ('Mike M Capela')
    , ('Natalie Simon - Test1/Test2/Test3 Gr 5 - 25143 - 3 - Period 3')
    , ('James Germeny - Test1 & Test2 Ed - 62549 - 2 - Period 2')
    ;
    ;WITH BASE_DATA(COL_ID,COL) AS
    (
      SELECT
       ROW_NUMBER() OVER
        (
          PARTITION BY @@VERSION
          ORDER BY  @@VERSION
        ) AS COL_ID
       ,T.COL
      FROM @TABLE T
    )
    SELECT
      BD.COL_ID
     ,MAX(CASE WHEN X.ItemNumber = 1 THEN LTRIM(RTRIM(X.Item)) END) AS COL_01
     ,MAX(CASE WHEN X.ItemNumber = 2 THEN LTRIM(RTRIM(X.Item)) END) AS COL_02
     ,MAX(CASE WHEN X.ItemNumber = 3 THEN LTRIM(RTRIM(X.Item)) END) AS COL_03
     ,MAX(CASE WHEN X.ItemNumber = 4 THEN LTRIM(RTRIM(X.Item)) END) AS COL_04
     ,MAX(CASE WHEN X.ItemNumber = 5 THEN LTRIM(RTRIM(X.Item)) END) AS COL_05
     ,MAX(CASE WHEN X.ItemNumber = 6 THEN LTRIM(RTRIM(X.Item)) END) AS COL_06
     ,MAX(CASE WHEN X.ItemNumber = 7 THEN LTRIM(RTRIM(X.Item)) END) AS COL_07
     ,MAX(CASE WHEN X.ItemNumber = 8 THEN LTRIM(RTRIM(X.Item)) END) AS COL_08
    FROM BASE_DATA BD
    CROSS APPLY dbo.DelimitedSplit8K_LEAD(BD.COL,CHAR(45)) X
    GROUP BY BD.COL_ID
    ORDER BY BD.COL_ID ASC;

    And the output

    COL_ID  COL_01          COL_02                  COL_03  COL_04  COL_05    COL_06  COL_07  COL_08
    ------- --------------- ----------------------- ------- ------- --------- ------- ------- -------
    1       Sam Capali      Test1                   12526   1       Period 2  NULL    NULL    NULL
    2       Mike M Capela   NULL                    NULL    NULL    NULL      NULL    NULL    NULL
    3       Natalie Simon   Test1/Test2/Test3 Gr 5  25143   3       Period 3  NULL    NULL    NULL
    4       James Germeny   Test1 & Test2 Ed        62549   2       Period 2  NULL    NULL    NULL

  • Eirikur Eiriksson - Thursday, March 28, 2019 9:46 AM

    Check the function from this article
    😎

    The function (slightly modified)

    CREATE FUNCTION [dbo].[DelimitedSplit8K_LEAD]
    --===== Define I/O parameters
       (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    --===== "Inline" CTE Driven "Tally Table†produces values from 0 up to 10,000...
      -- enough to cover VARCHAR(8000)
    WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
           -- for both a performance gain and prevention of accidental "overruns"
          SELECT 0 UNION ALL
          SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
          FROM T T1,T T2,T T3,T T4
          ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
          SELECT t.N+1
           FROM NUMS t
          WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
          )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
       Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))
     FROM cteStart s
    ;

    The application code

    USE TEEST;
    GO
    SET NOCOUNT ON;

    DECLARE @TABLE TABLE
    (COL VARCHAR (100) )

    INSERT INTO @TABLE VALUES
    ('Sam Capali - Test1 - 12526 - 1 - Period 2')
    , ('Mike M Capela')
    , ('Natalie Simon - Test1/Test2/Test3 Gr 5 - 25143 - 3 - Period 3')
    , ('James Germeny - Test1 & Test2 Ed - 62549 - 2 - Period 2')
    ;
    ;WITH BASE_DATA(COL_ID,COL) AS
    (
      SELECT
       ROW_NUMBER() OVER
        (
          PARTITION BY @@VERSION
          ORDER BY  @@VERSION
        ) AS COL_ID
       ,T.COL
      FROM @TABLE T
    )
    SELECT
      BD.COL_ID
     ,MAX(CASE WHEN X.ItemNumber = 1 THEN LTRIM(RTRIM(X.Item)) END) AS COL_01
     ,MAX(CASE WHEN X.ItemNumber = 2 THEN LTRIM(RTRIM(X.Item)) END) AS COL_02
     ,MAX(CASE WHEN X.ItemNumber = 3 THEN LTRIM(RTRIM(X.Item)) END) AS COL_03
     ,MAX(CASE WHEN X.ItemNumber = 4 THEN LTRIM(RTRIM(X.Item)) END) AS COL_04
     ,MAX(CASE WHEN X.ItemNumber = 5 THEN LTRIM(RTRIM(X.Item)) END) AS COL_05
     ,MAX(CASE WHEN X.ItemNumber = 6 THEN LTRIM(RTRIM(X.Item)) END) AS COL_06
     ,MAX(CASE WHEN X.ItemNumber = 7 THEN LTRIM(RTRIM(X.Item)) END) AS COL_07
     ,MAX(CASE WHEN X.ItemNumber = 8 THEN LTRIM(RTRIM(X.Item)) END) AS COL_08
    FROM BASE_DATA BD
    CROSS APPLY dbo.DelimitedSplit8K_LEAD(BD.COL,CHAR(45)) X
    GROUP BY BD.COL_ID
    ORDER BY BD.COL_ID ASC;

    And the output

    COL_ID  COL_01          COL_02                  COL_03  COL_04  COL_05    COL_06  COL_07  COL_08
    ------- --------------- ----------------------- ------- ------- --------- ------- ------- -------
    1       Sam Capali      Test1                   12526   1       Period 2  NULL    NULL    NULL
    2       Mike M Capela   NULL                    NULL    NULL    NULL      NULL    NULL    NULL
    3       Natalie Simon   Test1/Test2/Test3 Gr 5  25143   3       Period 3  NULL    NULL    NULL
    4       James Germeny   Test1 & Test2 Ed        62549   2       Period 2  NULL    NULL    NULL

    Thanks,

    It works as i need it.

    Thanks for your help.

  • Thanks Guys for helping out to build this.
    Appreciate your effort.

  • yogi123 - Thursday, March 28, 2019 10:41 AM

    Thanks Guys for helping out to build this.
    Appreciate your effort.

    You are very welcome
    😎

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply