Matching the Char

  • Hi,

    I have 2 columns:

col1      col2
abc\aaa   aaa\abc
ax\pq     pqr\aaa
prq\aaa   prt\arz

I want to query out the rows with matching characters:

    Result

col1      col2
abc\aaa   aaa\abc
prq\aaa   pqr\aaa

    Is there any way of doing this using SQL?

    Regards,

    Ankit

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

You can use a character splitter function such as the one described by Jeff Moden in his excellent article (http://www.sqlservercentral.com/articles/Tally+Table/72993/).

    Please note that if you're doing this, you're probably violating 1NF. Consider re-designing your table.
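
For illustration, a minimal sketch of how that might look once the splitter from the article is installed (assuming it is created as dbo.DelimitedSplit8K, and with dbo.YourTable as a placeholder for your own table):

-- Match rows whose col1 and col2 contain the same set of parts, regardless
-- of order: set equality expressed as two NOT EXISTS / EXCEPT checks
SELECT t.col1, t.col2
FROM dbo.YourTable t
WHERE NOT EXISTS (SELECT s1.Item FROM dbo.DelimitedSplit8K(t.col1, '\') s1
                  EXCEPT
                  SELECT s2.Item FROM dbo.DelimitedSplit8K(t.col2, '\') s2)
  AND NOT EXISTS (SELECT s2.Item FROM dbo.DelimitedSplit8K(t.col2, '\') s2
                  EXCEPT
                  SELECT s1.Item FROM dbo.DelimitedSplit8K(t.col1, '\') s1)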

    Hope this helps

    Gianluca

    -- Gianluca Sartori

Hi Gianluca,

Thanks for your reply. The data is normalized, but I have concatenated fields separated by "\". I now want to populate an ID for these fields, matching fields where, e.g., [abc\pqr] = [pqr\abc].

    I wanted to know if it was possible without splitting.

    Eagerly waiting for your response.

    Ankit

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\".

Oxymoron. It's either the former or the latter; it can't be both at the same time.

    I wanted to know if it was possible without splitting.

No, not possible. Why don't you want to split? You can do it on the fly in your query; you don't need to apply changes to your existing data.
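
A rough sketch of the on-the-fly approach, assuming every value holds exactly two parts around a single "\" (dbo.YourTable is a placeholder name):

-- Break each column into its two halves at query time,
-- then accept a match in either order
SELECT t.col1, t.col2
FROM dbo.YourTable t
CROSS APPLY (SELECT LEFT(t.col1, CHARINDEX('\', t.col1) - 1) AS part1,
                    STUFF(t.col1, 1, CHARINDEX('\', t.col1), '') AS part2) s1
CROSS APPLY (SELECT LEFT(t.col2, CHARINDEX('\', t.col2) - 1) AS part1,
                    STUFF(t.col2, 1, CHARINDEX('\', t.col2), '') AS part2) s2
WHERE (s1.part1 = s2.part1 AND s1.part2 = s2.part2)
   OR (s1.part1 = s2.part2 AND s1.part2 = s2.part1)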

    -- Gianluca Sartori

The data is normalized, but I have concatenated fields separated by "\".

    Only one question. Why???

  • SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\"

    Hope that CELKO doesn't see that comment 😉

    SQL_By_Chance (11/25/2011)


    I wanted to know if it was possible without splitting.

    So you want to compare split parts of a string without splitting? I don't think you'll find a way to do that.

DECLARE @TABLE AS TABLE (col1 VARCHAR(7), col2 VARCHAR(7))

INSERT INTO @TABLE
SELECT 'abc\aaa', 'aaa\abc'
UNION ALL SELECT 'ax\pq', 'pqr\aaa'
UNION ALL SELECT 'prq\aaa', 'prt\arz'
UNION ALL SELECT 'axi\pq', 'prq\aaa'

;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM @TABLE a
    CROSS JOIN @TABLE b)
SELECT col1, col2
FROM CTE a
CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                    PATINDEX('%\%', col2) AS patternCol2,
                    LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
             FROM CTE
             WHERE a.col1 = col1 AND a.col2 = col2) b
WHERE ((SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, 1, patternCol2 - 1))
    OR (SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))
  AND ((SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, 1, patternCol2 - 1))
    OR (SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))

The above, which is not as efficient as the string splitter you were pointed at, will compare the string parts with each other: e.g. abc\aaa = aaa\abc, but prq\aaa != pqr\aaa.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Thanks Gianluca

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

Because the client requirement is like that 🙂.

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • Thanks Cadavre,

    Wish you many more centuries for providing the solution.

    Thanks,

    Ankit

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • SQL_By_Chance (11/28/2011)


    Thanks Cadavre,

    Wish you many more centuries for providing the solution.

    Thanks,

    Ankit

    Bear in mind that my solution is going to die a horrible death on a real table with even a few thousand rows. I'll set up a test to prove how poor the performance is later on.

    This can be avoided by normalising the data.
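
As a sketch of what a normalised layout might look like (the table and column names here are invented for illustration):

-- One row per part instead of a delimited string per column
CREATE TABLE dbo.StringPart (
    RowID INT NOT NULL,      -- key of the original row
    ColNo TINYINT NOT NULL,  -- 1 = col1, 2 = col2
    Part  VARCHAR(10) NOT NULL,
    PRIMARY KEY (RowID, ColNo, Part)
)
-- row 1 ('abc\aaa', 'aaa\abc') becomes:
-- (1, 1, 'abc'), (1, 1, 'aaa'), (1, 2, 'aaa'), (1, 2, 'abc')

-- Rows whose two columns hold the same set of parts: every distinct part
-- must appear under both columns, so row count = 2 * distinct part count
SELECT p.RowID
FROM dbo.StringPart p
GROUP BY p.RowID
HAVING COUNT(*) = COUNT(DISTINCT p.Part) * 2

With that layout the match is a simple aggregate over indexed columns instead of a cross join over string manipulation.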


    --EDIT--

Tests, as promised:

BEGIN TRAN
SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
    DROP TABLE #testEnvironment
END

--1000 Random rows of data
SELECT TOP 1000 IDENTITY(INT,1,1) AS ID,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col1,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col2
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT COUNT(*)
FROM #testEnvironment

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)

PRINT '========== CTE MATCH =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM #testEnvironment a
    CROSS JOIN #testEnvironment b)
SELECT COUNT(*)
FROM (SELECT col1, col2
      FROM CTE a
      CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                          PATINDEX('%\%', col2) AS patternCol2,
                          LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
                   FROM CTE
                   WHERE a.col1 = col1 AND a.col2 = col2) b
      WHERE ((SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, 1, patternCol2 - 1))
          OR (SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))
        AND ((SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, 1, patternCol2 - 1))
          OR (SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))
     ) a

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)

ROLLBACK

    First, the above script tests 1000 rows.

========== BASELINE ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000026'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.
================================================================================
========== CTE MATCH ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000026'. Scan count 19, logical reads 42, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 13320, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 2235 ms, elapsed time = 619 ms.
================================================================================

    OK, not too terrible. Let's try 5,000 rows.

(The script is identical to the one above, except that it selects TOP 5000 rows instead of TOP 1000.)

========== BASELINE ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000027'. Scan count 1, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.
================================================================================
========== CTE MATCH ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000027'. Scan count 19, logical reads 154, physical reads 0, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 157464, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 74110 ms, elapsed time = 19701 ms.
================================================================================

    Ack!! Not scaling well at all, as promised 😉

    Try it yourself, but you'll find that this sort of problem has to be sorted out by normalisation rather than methods like this.

    If you insist on sticking with this sort of design, then I think you'll always have massive performance degradation.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • SQL_By_Chance (11/28/2011)


Because the client requirement is like that 🙂.

How often I hear this! In reality, it is not that the client requires the data to be stored a certain (and horrible) way; it is that the client wants it presented a certain way. VERY rarely do clients even want to be involved at the data level, let alone make system design decisions about how the data is stored. The reason they are hiring this work out is that they lack the expertise to do it themselves. If the client is mandating how you write your code, it may be time to fire the client. 😛

Splitting the strings, as Gianluca suggested, is by far the best approach to this.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Sean, you had me... The data is very well normalized, but I wanted to know some logic for matching such data other than by normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET)

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • SQL_By_Chance (11/28/2011)


Sean, you had me... The data is very well normalized, but I wanted to know some logic for matching such data other than by normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET)

    There is nothing like normalized data, and your description is nothing like normalized data.

    SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\".

    Data in that type of format is not normalized at all and is a total PITA to deal with.

...the power of SQL vs .NET

    SQL is incredibly powerful at doing what it does best. .NET is incredibly powerful at doing what it does best. They do NOT do the same thing. That is why they are frequently used together to provide a complete solution. There are some things that can be done in .NET that should be done in SQL, and some things you can do in SQL should be left to .NET. The challenge is in knowing which technology is best suited for the task at hand.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (11/28/2011)


    SQL_By_Chance (11/28/2011)


Sean, you had me... The data is very well normalized, but I wanted to know some logic for matching such data other than by normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET)

    There is nothing like normalized data, and your description is nothing like normalized data.

    SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\".

    Data in that type of format is not normalized at all and is a total PITA to deal with.

...the power of SQL vs .NET

    SQL is incredibly powerful at doing what it does best. .NET is incredibly powerful at doing what it does best. They do NOT do the same thing. That is why they are frequently used together to provide a complete solution. There are some things that can be done in .NET that should be done in SQL, and some things you can do in SQL should be left to .NET. The challenge is in knowing which technology is best suited for the task at hand.

    +100 for it 😀

What I mean is that the data is normalized in the table, but I have created a derived column by concatenating the columns 🙂.

I just wanted a way to match [pq\ab] = [ab\pq], purely out of curiosity, other than using an array in .NET. I apologise for my inability to express my intent clearly. But Cadavre's solution (no matter how badly it performed) gave me a chance to think in that direction.

I can't express exactly why I asked the question. I think in weird directions 😛

    http://www.mercurialneophyte.blogspot.com/
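
For what it's worth, when every value holds exactly two parts, the swap can be written inline with no splitter function at all; a rough sketch, assuming exactly one "\" per value and a placeholder table name:

-- col2 matches col1 when it is identical, or is col1 with its halves swapped
SELECT t.col1, t.col2
FROM dbo.YourTable t
WHERE t.col2 = t.col1
   OR t.col2 = STUFF(t.col1, 1, CHARINDEX('\', t.col1), '')
               + '\'
               + LEFT(t.col1, CHARINDEX('\', t.col1) - 1)

For 'pq\ab' the expression rebuilds 'ab\pq', so [pq\ab] = [ab\pq] reduces to a single string comparison.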

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”
