String question

  • I have a string that contains names delaminated by a '&'.

    There can be any number of names in this field.

    I want to exclude names that either have a (X) or a (Y) in them

    So if we had

    "Mr ab cdef & Mrs gh ijklm & Ms opqr stuvw"

    then I want to display

    "Mr ab cdef & Mrs gh ijklm & Ms opqr stuvw"

    If we had

    "Mr ab cdef & (X)Mrs gh ijklm & Ms opqr stuvw"

    then I want to display

    "Mr ab cdef & Ms opqr stuvw"

    If we had

    "Mr ab cdef & (X)Mrs gh ijklm & (Y)Ms opqr stuvw"

    then I want to display

    "Mr ab cdef"

    If we had

    "(Y)Mr ab cdef & (X)Mrs gh ijklm"

    then display null

    there can be any combination of names number of names.

    Any ideas as to how I should approach this?

    Thanks

  • huw.pickrell (5/22/2014)


    I have a string that contains names delaminated by a '&'.

    There can be any number of names in this field.

    I want to exclude names that either have a (X) or a (Y) in them

    So if we had

    "Mr ab cdef & Mrs gh ijklm & Ms opqr stuvw"

    then I want to display

    "Mr ab cdef & Mrs gh ijklm & Ms opqr stuvw"

    If we had

    "Mr ab cdef & (X)Mrs gh ijklm & Ms opqr stuvw"

    then I want to display

    "Mr ab cdef & Ms opqr stuvw"

    If we had

    "Mr ab cdef & (X)Mrs gh ijklm & (Y)Ms opqr stuvw"

    then I want to display

    "Mr ab cdef"

    If we had

    "(Y)Mr ab cdef & (X)Mrs gh ijklm"

    then display null

    there can be any combination of names number of names.

    Any ideas as to how I should approach this?

    Thanks

    Seems like a homework question 🙂

    Can you please post your query which you have tried so far.

    --rhythmk
    ------------------------------------------------------------------
    To post your question use below link

    https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help
    🙂

  • Certainly not a homework question. That was a long time ago. 🙂

    I don't have any SQL to give at the moment.

    I was just seeking ideas as to how others might approach this.

    I was sort of thinking of doing a char index to find the first instance of "&" and doing another char index to to find the first instance of either the Y or X and then doing 2 substrings based on that information to cut out the text that I don't want. Then looping through it to seek for any more that I might need to take out.

    I seem to think that this will get quite messy though

  • Why are people teaching that you have to loop through a string by character by character to solve a problem like this? Sigh. If you go through the whole string one character at a time, what happens to performance if you have to do it against a 1M-row table on a varchar(1000) column?

    Instead of thinking about it character by character, I would think of it in sets. If you first split it by word, you can then simply select only the rows you want and then concatenate the rows back together. To get you started, read Jeff Moden's article on string splitting: http://qa.sqlservercentral.com/articles/Tally+Table/72993/. If you haven't read it, it is well worth your time. It will change the way you view data and it'll also change your expectations of performance.

    It sounds like a fun exercise and learning experience to me, so have some fun with it.

  • well, it's kind of hard, because you have to split the string into parts based on hte ampersand, and then reconstruct the string all over again.

    this would normally be fixed by normalizign the data, then the (X) or (Y) would bew flags or delete operations, right?

    the solution is going to require a string splitter like DelimitedSplit8K, as well as a FOR XML operation to put it all back into a single string again.

    here's the first part:

    ;WITH MyCTE([StringVal])

    AS

    (

    SELECT 'Mr ab cdef & Mrs gh ijklm & Ms opqr stuvw' UNION ALL

    SELECT 'Mr ab cdef & (X)Mrs gh ijklm & Ms opqr stuvw' UNION ALL

    SELECT 'Mr ab cdef & (X)Mrs gh ijklm & (Y)Ms opqr stuvw' UNION ALL

    SELECT '(Y)Mr ab cdef & (X)Mrs gh ijklm'

    )

    SELECT * FROM MyCTE

    CROSS APPLY DelimitedSplit8K(StringVal,'&') fn

    WHERE fn.Item NOT LIKE '%(X)%' AND fn.Item NOT LIKE '%(Y)%'

    is that enough to get you started? do you know about splitting strings?

    can you cange the data so you don't have to jump through hoops like this to manipulate strings?

    because you did not provide a real table definition, any FOR XML example i post is just not going to be accurate enough.

    post a decent example of the data, like idid, along with some key to relate the strigns to when you put them back together, and i could help with the next part.

    here's a generic FOR XML concatenation example:

    SELECT DISTINCT

    t.name,

    sq.Columns

    FROM sys.tables t

    JOIN (

    SELECT OBJECT_ID,

    Columns = STUFF((SELECT ',' + name

    FROM sys.columns sc

    WHERE sc.object_id = s.object_id

    FOR XML PATH('')),1,1,'')

    FROM sys.columns s

    ) sq ON t.object_id = sq.object_id

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks for your reply I will go off and experiment with that.

    I learned SQL by myself so bear with me.

    Thanks again

  • SQL_Novice23 (5/22/2014)


    Certainly not a homework question. That was a long time ago. 🙂

    I don't have any SQL to give at the moment.

    I was just seeking ideas as to how others might approach this.

    I was sort of thinking of doing a char index to find the first instance of "&" and doing another char index to to find the first instance of either the Y or X and then doing 2 substrings based on that information to cut out the text that I don't want. Then looping through it to seek for any more that I might need to take out.

    I seem to think that this will get quite messy though

    While you are thinking in right direction but as suggested by others you will be hit by performance for larger string.So give it a shot by Jeff Moden's tried and tested split function.Though you need to modify as per your requirement 🙂

    --rhythmk
    ------------------------------------------------------------------
    To post your question use below link

    https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help
    🙂

  • I think I've got it to work. 🙂

    Thanks for your help on this one - a nice one to learn about.

    My SQL skill are improving slowly !!!

  • SQL_Novice23 (5/22/2014)


    I think I've got it to work. 🙂

    Thanks for your help on this one - a nice one to learn about.

    My SQL skill are improving slowly !!!

    Please post the code you ended up with. There might be some suggestions for further improvement. At the very least, it might help someone with a similar type of question. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply