parse comma-delimited list into seperate columns

  • WayneS (1/11/2010)


    ...

    PS. Jason - just how do you go about embedding the execution plans inside the messages? I've seen you do this many times...

    Wayne,

    set statistics xml on

    Or Click show actual execution plan.

    On the execution plan, right click, save as - name it. Then you upload the plan just the same as the jpgs you have attached. It is something that Gail has covered in an article.

    http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Has anyone played with this yet?

    SELECT s.ID, x.*

    FROM #Sample s

    CROSS APPLY (

    SELECT

    MAX(CASE d.ColNo WHEN 1 THEN SUBSTRING(d.DodgyStringData, d.b, d.e-d.b) END) AS [Account],

    MAX(CASE d.ColNo WHEN 2 THEN SUBSTRING(d.DodgyStringData, d.b, d.e-d.b) END) AS [Year],

    MAX(CASE d.ColNo WHEN 3 THEN SUBSTRING(d.DodgyStringData, d.b, d.e-d.b) END) AS [Month],

    MAX(CASE d.ColNo WHEN 4 THEN SUBSTRING(d.DodgyStringData, d.b, d.e-d.b) END) AS [Scenario],

    MAX(CASE d.ColNo WHEN 5 THEN SUBSTRING(d.DodgyStringData, d.b, d.e-d.b) END) AS [Version],

    MAX(CASE d.ColNo WHEN 6 THEN SUBSTRING(d.DodgyStringData, d.b, d.e-d.b) END) AS [Entity]

    FROM (

    SELECT s.DodgyStringData,

    ColNo = ROW_NUMBER() OVER(ORDER BY n.n), -- string segment number

    n.n AS b, -- delimiter at beginning of string

    ISNULL(NULLIF(CHARINDEX(',', s.DodgyStringData, n.n+1), 0), LEN(s.DodgyStringData)+1) AS e -- delimiter at end of string

    FROM (SELECT TOP 100 [n] = ROW_NUMBER() OVER(ORDER BY [name]) FROM master.dbo.syscolumns) n

    WHERE SUBSTRING(','+s.DodgyStringData+',', n.n, 1) = ','

    ) d

    ) x

    It's fast - 5s for over a million rows on a dev box here.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • CirquedeSQLeil (1/11/2010)[hrWayne, I found a really slow point in your second script (the million records). The table population stuff. You will probably notice throughout the site that there are some really fast methods to do this. Not that that section really affects the split script any - just a thought though.;-)

    Yeah, I noticed that. I decided to take the lazy way on that though, since we already had a couple of rows of test data, to just duplicate them until I had a million.

    I guess I should have done this:

    ;with CTE (N) AS

    (

    select s1.object_id

    from sys.objects s1, sys.objects s2, sys.objects s3, sys.objects s4

    ),CTE2 (N) AS

    (

    select top 1000000 ROW_NUMBER() OVER (ORDER BY N)

    FROM CTE

    )

    select RowData = 'xyz ' + convert(varchar(7), N) + ', FY11, Apr, Budget, Version_' + CONVERT(varchar(7), N) + ', 0160117'

    INTO #TEST

    FROM CTE2

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • CirquedeSQLeil (1/11/2010)


    WayneS (1/11/2010)


    ...

    PS. Jason - just how do you go about embedding the execution plans inside the messages? I've seen you do this many times...

    Wayne,

    set statistics xml on

    Or Click show actual execution plan.

    On the execution plan, right click, save as - name it. Then you upload the plan just the same as the jpgs you have attached. It is something that Gail has covered in an article.

    http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/

    Hmm. And since they are XML, it just displays it. Interesting... I'll have to try it out pretty soon.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (1/11/2010)


    Try this.

    -- if temp table already exists (failed previous run), drop it

    if OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test

    -- simulate the table with the data in it.

    -- NOTE how your sample data was put into a table

    -- to make it easier for us volunteers to work with it!

    DECLARE @test-2 TABLE (RowData varchar(75))

    INSERT INTO @test-2

    SELECT 'xyz 54050, FY11, Apr, Budget, Version_1, 0160117' UNION ALL

    SELECT 'abc 54050, FY11, May, Budget, Version_1, 0160117'

    -- get the data from the table and put it into a temporary work table

    SELECT RowData

    INTO #TEST

    FROM @test-2

    -- add some columns to hold the comma positions

    ALTER TABLE #TEST

    ADD Col1EndPos int,

    Col2EndPos int,

    Col3EndPos int,

    Col4EndPos int,

    Col5EndPos int

    -- need some variables to hold the comma positions for each row

    DECLARE @Col1EndPos int,

    @Col2EndPos int,

    @Col3EndPos int,

    @Col4EndPos int,

    @Col5EndPos int

    -- update the columns to hold the comma positions

    UPDATE #Test

    SET @Col1EndPos = Col1EndPos = CharIndex(',', RowData),

    @Col2EndPos = Col2EndPos = CharIndex(',', RowData, @Col1EndPos + 1),

    @Col3EndPos = Col3EndPos = CharIndex(',', RowData, @Col2EndPos + 1),

    @Col4EndPos = Col4EndPos = CharIndex(',', RowData, @Col3EndPos + 1),

    @Col5EndPos = Col5EndPos = CharIndex(',', RowData, @Col4EndPos + 1)

    -- now, get the data for each column

    SELECT [ID] = ROW_NUMBER() OVER (ORDER BY RowData),

    [Account] = LEFT(RowData, Col1EndPos-1),

    [Year] = SUBSTRING(RowData, Col1EndPos+1, Col2EndPos-Col1EndPos-1),

    [Month] = SUBSTRING(RowData, Col2EndPos+1, Col3EndPos-Col2EndPos-1),

    [Scenario] = SUBSTRING(RowData, Col3EndPos+1, Col4EndPos-Col3EndPos-1),

    [Version] = SUBSTRING(RowData, Col4EndPos+1, Col5EndPos-Col4EndPos-1),

    [Entity] = SUBSTRING(RowData, Col5EndPos+1, len(RowData)-Col5EndPos-1)

    FROM #Test

    -- clean up

    if OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test

    For a thorough description of how to use this form of the update statement, as well as ALL of the rules for utilizing it, read this article[/url].

    Edit: Replaced variables with column names in the select statement.

    I've decided to go with this script. Thanks Wayne. However I did have to make one minor change. On the following line...

    [Entity] = SUBSTRING(RowData, Col5EndPos+1, len(RowData)-Col5EndPos-1)

    It stripped off the last 2 characters in the final column, so I change it to.....

    [Entity] = SUBSTRING(RowData, Col5EndPos+1, len(RowData))

  • Thanks for letting us know how you decided.

    Is there any particular reason for going this method - specially since there is a way further on down that is faster, and avoids the use of the controversial "quirky update"?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (1/12/2010)


    Thanks for letting us know how you decided.

    Is there any particular reason for going this method - specially since there is a way further on down that is faster, and avoids the use of the controversial "quirky update"?

    Inquiring minds want to know. I am interested in knowing the reasoning too.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • WayneS (1/12/2010)


    Thanks for letting us know how you decided.

    Is there any particular reason for going this method - specially since there is a way further on down that is faster, and avoids the use of the controversial "quirky update"?

    If you are referring to the CTE one you posted then no i havent had time to test since I already started with your original and needed to get this report out asap. The report returns about 5000 rows so its so fast already.

  • is250sp (1/12/2010)


    WayneS (1/12/2010)


    Thanks for letting us know how you decided.

    Is there any particular reason for going this method - specially since there is a way further on down that is faster, and avoids the use of the controversial "quirky update"?

    If you are referring to the CTE one you posted then no i havent had time to test since I already started with your original and needed to get this report out asap. The report returns about 5000 rows so its so fast already.

    Fair enough. Thanks for the feedback.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • CirquedeSQLeil (1/12/2010)


    is250sp (1/12/2010)


    WayneS (1/12/2010)


    Thanks for letting us know how you decided.

    Is there any particular reason for going this method - specially since there is a way further on down that is faster, and avoids the use of the controversial "quirky update"?

    If you are referring to the CTE one you posted then no i havent had time to test since I already started with your original and needed to get this report out asap. The report returns about 5000 rows so its so fast already.

    Fair enough. Thanks for the feedback.

    Yes, thank you.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Thank you for posting the script. It is exactly what I am looking for and works beautifully.

Viewing 11 posts - 31 through 40 (of 40 total)

You must be logged in to reply to this topic. Login to reply