Text to Columns in SQL

  • Hi

    I have a problem at work (a hospital). We have data stored in a DWH that is taken from an overnight snapshot from the live sysytem and resides in a view in SQL server.

    I have a column from the live system which contains multiple data items seperated by a '/'.

    i need to show this field as seperate columns (up to a max of 10)

    Below is an example of my data and also an expected results table.

    Any help or advice would be greatly appreciated

    --Data I have to work with --

    CREATE TABLE PS_TestForOnline

    (

    crn NVARCHAR (10),

    all_codes_in_one_field VARCHAR (500),

    );

    INSERT INTO PS_TestForOnline

    VALUES('11111','W83.1, Endoscopic drilling of lesion of articular cartilage / Y76.7, Arthroscopic approach to joint / W74.2, Reconstruction of intra-articular ligament NEC / Y69.8, Other specified harvest of other tissue' );

    INSERT INTO PS_TestForOnline

    VALUES('22222','W28.3, Removal of internal fixation from bone NEC / W31.9, Unspecified other autograft of bone / W62.9, Unspecified other primary fusion of other joint' );

    INSERT INTO PS_TestForOnline

    VALUES('33333','L62.1, Repair of femoral artery NEC' );

    INSERT INTO PS_TestForOnline

    VALUES('44444','E13.6, Puncture of maxillary antrum / E08.1, Polypectomy of internal nose' );

    select * from PS_TestForOnline

    drop table PS_TestForOnline

    ----------------------------------------------------------

    --Expected results--

    CREATE TABLE PS_TestForOnline_expected_answer

    (

    crn int,

    CODE_POS_1 NVARCHAR (500),

    CODE_POS_2 NVARCHAR (500),

    CODE_POS_3 NVARCHAR (500),

    CODE_POS_4 NVARCHAR (500),

    CODE_POS_5 NVARCHAR (500),

    CODE_POS_6 NVARCHAR (500),

    CODE_POS_7 NVARCHAR (500),

    CODE_POS_8 NVARCHAR (500),

    CODE_POS_9 NVARCHAR (500),

    CODE_POS_10 NVARCHAR (500),

    );

    INSERT INTO PS_TestForOnline_expected_answer

    VALUES('11111','W83.1, Endoscopic drilling of lesion of articular cartilage','Y76.7, Arthroscopic approach to joint','W74.2, Reconstruction of intra-articular ligament NEC','Y69.8, Other specified harvest of other tissue','','','','','','');

    INSERT INTO PS_TestForOnline_expected_answer

    VALUES('22222','W28.3, Removal of internal fixation from bone NEC','W31.9, Unspecified other autograft of bone','W62.9, Unspecified other primary fusion of other joint','','','','','','','' );

    INSERT INTO PS_TestForOnline_expected_answer

    VALUES('33333','L62.1, Repair of femoral artery NEC','','','','','','','','','' );

    INSERT INTO PS_TestForOnline_expected_answer

    VALUES('44444','E13.6, Puncture of maxillary antrum','E08.1, Polypectomy of internal nose','','','','','','','','' );

    select * from PS_TestForOnline_expected_answer

    drop table PS_TestForOnline_expected_answer

  • I guess you need a splitter function, such as the great one desribed by Jeff Moden in his article here[/url].

    If you like his code, please consider voting for him in the Exceptional DBA 2011 Awards. 😉

    -- Gianluca Sartori

  • Gianluca Sartori (8/4/2011)


    I guess you need a splitter function, such as the great one desribed by Jeff Moden in his article here[/url].

    If you like his code, please consider voting for him in the Exceptional DBA 2011 Awards. 😉

    Well.. for all the articles/suggestions Jeff has given, he is already an Exceptional DBA.... Iv cast my vote 😀

    [font="Times New Roman"]For better assistance in answering your questions
    Click Here[/url][/font]

  • Hi

    I've had a look at the function you suggested and am running the following.

    SELECT CRN, Item = QUOTENAME(PS_TestForOnline.all_codes_in_one_field,'"')

    FROM PS_TestForOnline

    CROSS APPLY dbo.DelimitedSplit8k(PS_TestForOnline.all_codes_in_one_field,'/') split

    it gives me the following...

    CRNItem

    11111NULL

    11111NULL

    11111NULL

    11111NULL

    22222NULL

    22222NULL

    22222NULL

    33333"L62.1, Repair of femoral artery NEC"

    44444"E13.6, Puncture of maxillary antrum / E08.1, Polypectomy of internal nose"

    44444"E13.6, Puncture of maxillary antrum / E08.1, Polypectomy of internal nose"

    For starters I am requiring the field to be split into columns and not rows and also I havent a clue why the items for 11111 and 22222 are showing as NULL.

    Maybe i got this totally wrong, I dont know.

    Thanks

  • You need to pivot results:

    SELECT crn, [1] as CODE_POS_1,[2] as CODE_POS_2,[3] as CODE_POS_3

    ,[4] as CODE_POS_4,[5] as CODE_POS_5,[6] as CODE_POS_6,[7] as CODE_POS_7

    ,[8] as CODE_POS_8,[9] as CODE_POS_9,[10] as CODE_POS_10

    from

    (

    SELECT t.crn, f.ItemNumber, f.Item

    FROM dbo.PS_TestForOnline t

    CROSS APPLY dbo.DelimitedSplit8K(t.all_codes_in_one_field, '/') f

    ) ts

    pivot

    (

    MAX(Item)

    FOR ItemNumber IN ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10])

    ) as pt

    Also, you will probably want to LTRIM resulted strings...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • You're looking for the "pivot" operator:

    SELECT t.*

    ,x.*

    FROM PS_TestForOnline t

    CROSS APPLY (

    select pvt.[1]

    ,pvt.[2]

    ,pvt.[3]

    ,pvt.[4]

    ,pvt.[5]

    ,pvt.[6]

    ,pvt.[7]

    ,pvt.[8]

    ,pvt.[9]

    ,pvt.[10]

    from dbo.DelimitedSplit8k(t.all_codes_in_one_field,'/') src

    pivot (

    max(src.Item)

    for src.ItemNumber in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10])

    ) pvt

    ) x

    edit: sigh: too late..., again 🙂



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • Errr...Jeff Moden strikes again? Try a Crosstab instead of PIVOT. You'll be happy you did.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns[/url]

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Ok, a cross-tab implementation. It is faster and enables more possibilities than the Pivot implementation. Enough reasons to choose the cross-tab version. Even though I personally don't share the ideas that it reads and looks better. Judge for yourself:

    SELECT convert(int, t.crn) as crn

    ,x.CODE_POS_1

    ,x.CODE_POS_2

    ,x.CODE_POS_3

    ,x.CODE_POS_4

    ,x.CODE_POS_5

    ,x.CODE_POS_6

    ,x.CODE_POS_7

    ,x.CODE_POS_8

    ,x.CODE_POS_9

    ,x.CODE_POS_10

    FROM PS_TestForOnline t

    CROSS APPLY (

    select max(case src.ItemNumber when 1 then src.Item end) as CODE_POS_1

    ,max(case src.ItemNumber when 2 then src.Item end) as CODE_POS_2

    ,max(case src.ItemNumber when 3 then src.Item end) as CODE_POS_3

    ,max(case src.ItemNumber when 4 then src.Item end) as CODE_POS_4

    ,max(case src.ItemNumber when 5 then src.Item end) as CODE_POS_5

    ,max(case src.ItemNumber when 6 then src.Item end) as CODE_POS_6

    ,max(case src.ItemNumber when 7 then src.Item end) as CODE_POS_7

    ,max(case src.ItemNumber when 8 then src.Item end) as CODE_POS_8

    ,max(case src.ItemNumber when 9 then src.Item end) as CODE_POS_9

    ,max(case src.ItemNumber when 10 then src.Item end) as CODE_POS_10

    from dbo.DelimitedSplit8k(t.all_codes_in_one_field,'/') src

    ) x;

    Just because I felt like it, here's an example of some things you can't do with the pivot version. I've split each items into a code and a description, then I've sorted the items by their code before putting them in columns. i.e the 1st column has the item with the smallest code.

    SELECT convert(int, t.crn) as crn

    ,x.*

    FROM PS_TestForOnline t

    CROSS APPLY (

    select max(case t.ItemNumber when 1 then t.Code end) as CODE_POS_1

    ,max(case t.ItemNumber when 1 then t.Description end) as Description_POS_1

    ,max(case t.ItemNumber when 2 then t.Code end) as CODE_POS_2

    ,max(case t.ItemNumber when 2 then t.Description end) as Description_POS_2

    ,max(case t.ItemNumber when 3 then t.Code end) as CODE_POS_3

    ,max(case t.ItemNumber when 3 then t.Description end) as Description_POS_3

    ,max(case t.ItemNumber when 4 then t.Code end) as CODE_POS_4

    ,max(case t.ItemNumber when 4 then t.Description end) as Description_POS_4

    ,max(case t.ItemNumber when 5 then t.Code end) as CODE_POS_5

    ,max(case t.ItemNumber when 5 then t.Description end) as Description_POS_5

    ,max(case t.ItemNumber when 6 then t.Code end) as CODE_POS_6

    ,max(case t.ItemNumber when 6 then t.Description end) as Description_POS_6

    ,max(case t.ItemNumber when 7 then t.Code end) as CODE_POS_7

    ,max(case t.ItemNumber when 7 then t.Description end) as Description_POS_7

    ,max(case t.ItemNumber when 8 then t.Code end) as CODE_POS_8

    ,max(case t.ItemNumber when 8 then t.Description end) as Description_POS_8

    ,max(case t.ItemNumber when 9 then t.Code end) as CODE_POS_9

    ,max(case t.ItemNumber when 9 then t.Description end) as Description_POS_9

    ,max(case t.ItemNumber when 10 then t.Code end) as CODE_POS_10

    ,max(case t.ItemNumber when 10 then t.Description end) as Description_POS_10

    from (

    select row_number() over (order by flds.Code) as ItemNumber

    ,flds.Code

    ,flds.Description

    from dbo.DelimitedSplit8k(t.all_codes_in_one_field,'/') src

    cross apply (

    select ltrim(left(src.Item, charindex(',', src.Item) - 1)) as Code

    ,ltrim(substring( src.Item, charindex(',', src.Item) + 1, 8000)) as Description

    ) flds

    ) t

    ) x;



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • Many Thanks to everyone

  • R.P.Rozema (8/4/2011)


    Even though I personally don't share the ideas that it reads and looks better.

    BWAA-HAAA!!!! When leading commas are present, neither do I! 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Sriram.RM (8/4/2011)


    Gianluca Sartori (8/4/2011)


    I guess you need a splitter function, such as the great one desribed by Jeff Moden in his article here[/url].

    If you like his code, please consider voting for him in the Exceptional DBA 2011 Awards. 😉

    Well.. for all the articles/suggestions Jeff has given, he is already an Exceptional DBA.... Iv cast my vote 😀

    Thank you both for the kind words and the votes! :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • :-D, still there is a reason for them being in this example: I generate a lot of my code, and leading comma's happen to be a lot easier to generate (using T-SQL). Plus they allow you to comment a line while debugging, testing etc without having to alter the line before the excluded line(s). I'm not actively advocating the use of comma's in front of lines, but it does have some advantages, so I do use it sometimes.



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • R.P.Rozema (8/25/2011)


    :-D, still there is a reason for them being in this example: I generate a lot of my code, and leading comma's happen to be a lot easier to generate (using T-SQL). Plus they allow you to comment a line while debugging, testing etc without having to alter the line before the excluded line(s). I'm not actively advocating the use of comma's in front of lines, but it does have some advantages, so I do use it sometimes.

    Understood. I don't have any problems generating code with trailing commas but that's just me. I also don't understand why people think it makes debugging easier. Leading commas only make commenting out the last item easier and trailing commas make it easier to comment out the first item. Everything in between is just as easy either way. But, to each their own. 🙂

    Shifting gears, I do have an example of doing a "double pivot" in the article that opc.three cited, but it gets ugly in a hurry.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply