Percolating Data

  • I have the following table:

    create table #address (address_id int, addr1 char(30), addr2 char(30), addr3 char(30), addr4 char(30), addr5 char(30), addr6 char(30))

    insert into #address values (1, '', '111 Main Street', '', 'Main City', 'CA', '92222')

    insert into #address values (2, '', '', '', 'HB', 'CA', '92222')

    insert into #address values (3, 'Blue', '', '', 'Costa Mesa', 'CA', '92222')

    I would like to get back:

    1, '111 Main Street','Main City', 'CA', '92222', '', ''

    2, 'HB', 'CA', '92222', '' , '' , ''

    3, 'Blue', 'Costa Mesa', 'CA', '92222', '',''

    Notice, I have moved field data over to drop blank fields...

    Thank you.

    Mike

  • select

    address_id,

    case SUBSTRING(addr_flags, 1, 1)

    when '1' then addr1

    when '2' then addr2

    when '3' then addr3

    when '4' then addr4

    when '5' then addr5

    when '6' then addr6

    else '' end as addr1,

    case SUBSTRING(addr_flags, 2, 1)

    when '2' then addr2

    when '3' then addr3

    when '4' then addr4

    when '5' then addr5

    when '6' then addr6

    else '' end as addr2,

    case SUBSTRING(addr_flags, 3, 1)

    when '3' then addr3

    when '4' then addr4

    when '5' then addr5

    when '6' then addr6

    else '' end as addr3,

    case SUBSTRING(addr_flags, 4, 1)

    when '4' then addr4

    when '5' then addr5

    when '6' then addr6

    else '' end as addr4,

    case SUBSTRING(addr_flags, 5, 1)

    when '5' then addr5

    when '6' then addr6

    else '' end as addr5,

    case SUBSTRING(addr_flags, 6, 1)

    when '6' then addr6

    else '' end as addr6

    from #address

    cross apply (

    select case when addr1 > '' then '1' else '' end +

    case when addr2 > '' then '2' else '' end +

    case when addr3 > '' then '3' else '' end +

    case when addr4 > '' then '4' else '' end +

    case when addr5 > '' then '5' else '' end +

    case when addr6 > '' then '6' else '' end as addr_flags

    ) as ca1

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Here's another option...

    CREATE TABLE #address (

    address_id INT,

    addr1 CHAR(30),

    addr2 CHAR(30),

    addr3 CHAR(30),

    addr4 CHAR(30),

    addr5 CHAR(30),

    addr6 CHAR(30)

    )

    insert into #address values (1, '', '111 Main Street', '', 'Main City', 'CA', '92222')

    insert into #address values (2, '', '', '', 'HB', 'CA', '92222')

    insert into #address values (3, 'Blue', '', '', 'Costa Mesa', 'CA', '92222')

    -- The actual solution --

    SELECT

    x.address_id,

    MAX(CASE WHEN x.PartNum = 1 THEN x.PartName END) AS Col1,

    MAX(CASE WHEN x.PartNum = 2 THEN x.PartName END) AS Col2,

    MAX(CASE WHEN x.PartNum = 3 THEN x.PartName END) AS Col3,

    MAX(CASE WHEN x.PartNum = 4 THEN x.PartName END) AS Col4,

    MAX(CASE WHEN x.PartNum = 5 THEN x.PartName END) AS Col5,

    MAX(CASE WHEN x.PartNum = 6 THEN x.PartName END) AS Col6

    FROM (

    SELECT

    a.address_id,

    ROW_NUMBER() OVER (PARTITION BY a.address_id ORDER BY p.PartNum) AS PartNum,

    p.PartName

    FROM

    #address a

    CROSS APPLY (

    VALUES (1, a.addr1), (2, a.addr2), (3, a.addr3), (4, a.addr4), (5, a.addr5), (6, a.addr6)

    ) p (PartNum, PartName)

    WHERE

    p.PartName <> ''

    ) x

    GROUP BY

    x.address_id

    The results...

    address_id Col1 Col2 Col3 Col4 Col5 Col6

    ----------- ------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------------------

    1 111 Main Street Main City CA 92222 NULL NULL

    2 HB CA 92222 NULL NULL NULL

    3 Blue Costa Mesa CA 92222 NULL NULL

    HTH,

    Jason

  • Just an FYI... Jason's is faster because it quickly eliminates the blank "fields" but it also means that it will NOT return any row where all the ADDRx columns are all blank or null.

    If the WHERE clause in Jason's code is removed to fix that nuance, it runs at the same speed as Scott's, which still isn't bad considering what this code is doing.

    There might be something that maintains the speed as well as returning rows with all blank ADDRx columns but 17 seconds for a million rows is probably better than "good enough".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • If keeping those rows is important... This doesn't seem to beat it up too bad...

    IF OBJECT_ID('tempdb..#address') IS NOT NULL

    DROP TABLE #address;

    CREATE TABLE #address (

    address_id INT,

    addr1 CHAR(30),

    addr2 CHAR(30),

    addr3 CHAR(30),

    addr4 CHAR(30),

    addr5 CHAR(30),

    addr6 CHAR(30)

    )

    insert into #address values (1, '', '111 Main Street', '', 'Main City', 'CA', '92222')

    insert into #address values (2, '', '', '', 'HB', 'CA', '92222')

    insert into #address values (3, '', '', '', '', NULL, '')

    insert into #address values (6, 'Blue', '', '', 'Costa Mesa', 'CA', '92222')

    -- The actual solution --

    SELECT

    x.address_id,

    MAX(CASE WHEN x.PartNum = 1 THEN NULLIF(x.PartName, '') END) AS Col1,

    MAX(CASE WHEN x.PartNum = 2 THEN NULLIF(x.PartName, '') END) AS Col2,

    MAX(CASE WHEN x.PartNum = 3 THEN NULLIF(x.PartName, '') END) AS Col3,

    MAX(CASE WHEN x.PartNum = 4 THEN NULLIF(x.PartName, '') END) AS Col4,

    MAX(CASE WHEN x.PartNum = 5 THEN NULLIF(x.PartName, '') END) AS Col5,

    MAX(CASE WHEN x.PartNum = 6 THEN NULLIF(x.PartName, '') END) AS Col6

    FROM (

    SELECT

    a.address_id,

    ROW_NUMBER() OVER (PARTITION BY a.address_id ORDER BY p.PartNum) AS PartNum,

    p.PartName

    FROM

    #address a

    CROSS APPLY (

    VALUES (1, a.addr1), (2, a.addr2), (3, a.addr3), (4, a.addr4), (5, a.addr5), (6, a.addr6)

    ) p (PartNum, PartName)

    WHERE

    p.PartName <> '' OR CONCAT(a.addr1, a.addr2, a.addr3, a.addr4, a.addr5, a.addr6) = ''

    ) x

    GROUP BY

    x.address_id

    address_id Col1 Col2 Col3 Col4 Col5 Col6

    ----------- ------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------------------

    1 111 Main Street Main City CA 92222 NULL NULL

    2 HB CA 92222 NULL NULL NULL

    3 NULL NULL NULL NULL NULL NULL

    6 Blue Costa Mesa CA 92222 NULL NULL

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply