Extracting a specific word???

  • hi all,

    I want to get the specific word from a string. but i dont know how.

    example:

    20 Kenvil Ave Succasunna NJ

    327 Aspen Ct stanhope NJ

    22 Woodland Dr Long Valley NJ

    4 Brittin St Madison NJ

    18 North 1 Ave Kenvil NJ

    C/o Abe Mishkin  22 Swamp Fox Dr Carolinea Shore NC

    Paragon Village 425 Rte 46 E.#314 Hackettstown NJ

    1 Chestnut St. Netcong NJ

    .

    .

    please give me a solution to get these bold words from a column. there are thausands of records..

    Thanx

    Noman

  • here's a little bit of help. since you cannot determine whether a cityname has 1,2, or more words, i can't see of any way to not do at least a portion of this manually.

    i would hope that something like this might get you 90% of the way done automatically.

    as i see it, you need to pull bot cityname and state out as separate columns.

     

    create table #tmp(addr1 varchar(80))

    insert into #tmp (addr1) values('20 Kenvil Ave Succasunna NJ')

    insert into #tmp (addr1) values('327 Aspen Ct stanhope NJ')

    insert into #tmp (addr1) values('22 Woodland Dr Long Valley NJ')

    insert into #tmp (addr1) values('4 Brittin St Madison NJ')

    insert into #tmp (addr1) values('18 North 1 Ave Kenvil NJ')

    insert into #tmp (addr1) values('C/o Abe Mishkin  22 Swamp Fox Dr Carolinea Shore NC')

    insert into #tmp (addr1) values('Paragon Village 425 Rte 46 E.#314 Hackettstown NJ')

    insert into #tmp (addr1) values('1 Chestnut St. Netcong NJ')

    alter table #tmp add cityname varchar(80),state varchar(2)

    --find the space that is  second  from the end of the string

    --extract the state:

    select REVERSE(SUBSTRING(REVERSE(addr1),1,charindex( ' ',reverse(addr1))-1)) from #tmp

    update #tmp set state =  REVERSE(SUBSTRING(REVERSE(addr1),1,charindex( ' ',reverse(addr1))-1))

    --single word citys/towns:

    select REVERSE(SUBSTRING(REVERSE(addr1),3,charindex( ' ',reverse(substring(addr1, 1, len(addr1) -3)))+0)) from #tmp --where count of spaces =4?

    update #tmp set cityname = REVERSE(SUBSTRING(REVERSE(addr1),3,charindex( ' ',reverse(substring(addr1, 1, len(addr1) -3)))+0))

    --doubleword city's towns?

    --triple word cities like New York City?

    select * from #tmp

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Actually, looking at the sample data provided, you do have a couple of identifiers.

    Identifier 1) ST, DR, CR, etc...

    There are a finite amount of street identifiers, even if you have E.#314.

    Identifier 2) Reversing the identifier to the END of the string, being the 2 letter state, plus one space.

    Personally depending on the amount of records involved, it may just be as easy to export this to excel, and work with find, and mid functions.

    If you want to explore this option, reply to me.

  • here's a couple of other ideas that might help:

    if we can assume that single-word city/towns have all address elements, so there is always 4 spaces in the document,something like this might help:

    select len(addr1) - Len(replace(addr1,' ','')) as NumSpaceOccurances, #tmp.* from #tmp

    select REVERSE(SUBSTRING(REVERSE(addr1),3,charindex( ' ',reverse(substring(addr1, 1, len(addr1) -3)))+0)) from #tmp  where  len(addr1) - Len(replace(addr1,' ','')) =4

    --using this instead of the previous example leaves the cityname null for later updates:

    update #tmp set cityname = REVERSE(SUBSTRING(REVERSE(addr1),3,charindex( ' ',reverse(substring(addr1, 1, len(addr1) -3)))+0))    where  len(addr1) - Len(replace(addr1,' ','')) =4

     

    select len(addr1) - Len(replace(addr1,' ','')) as NumSpaceOccurances, #tmp.* from #tmp

    NumSpaceOccurancesaddr1citynamestate
    420 Kenvil Ave Succasunna NJSuccasunna NJ
    4327 Aspen Ct stanhope NJstanhope NJ
    522 Woodland Dr Long Valley NJNULLNJ
    44 Brittin St Madison NJMadison NJ
    518 North 1 Ave Kenvil NJNULLNJ
    10C/o Abe Mishkin  22 Swamp Fox Dr Carolinea Shore NCNULLNC
    7Paragon Village 425 Rte 46 E.#314 Hackettstown NJNULLNJ
    41 Chestnut St. Netcong NJNetcong NJ

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • yet another thought: if you have the zipcode you could just look up the city and state from a separate table and ignore parsing the addr field ...plenty of free resources of zipcodes with around 42K of zipcodes in em; bigger data sources cost real money. that might be a possibility too.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanx to u all, i have solved my problem

    Noman

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply