Matching the Char

  • Hi,

    I have 2 columns:

col1      col2
abc\aaa   aaa\abc
ax\pq     pqr\aaa
prq\aaa   prt\arz

I want to query out the rows with matching characters:

    Result

col1      col2
abc\aaa   aaa\abc
prq\aaa   pqr\aaa

    Is there any way of doing this using SQL?

    Regards,

    Ankit

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

You can use a character splitter function such as the one described by Jeff Moden in his excellent article (http://www.sqlservercentral.com/articles/Tally+Table/72993/).

    Please note that if you're doing this, you're probably violating 1NF. Consider re-designing your table.
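
For illustration, a minimal sketch of how that might look once the splitter from the article is installed (assuming it is created as dbo.DelimitedSplit8K, and with dbo.YourTable as a placeholder for your own table):

-- Match rows whose col1 and col2 contain the same set of parts, regardless
-- of order: set equality expressed as two NOT EXISTS / EXCEPT checks
SELECT t.col1, t.col2
FROM dbo.YourTable t
WHERE NOT EXISTS (SELECT s1.Item FROM dbo.DelimitedSplit8K(t.col1, '\') s1
                  EXCEPT
                  SELECT s2.Item FROM dbo.DelimitedSplit8K(t.col2, '\') s2)
  AND NOT EXISTS (SELECT s2.Item FROM dbo.DelimitedSplit8K(t.col2, '\') s2
                  EXCEPT
                  SELECT s1.Item FROM dbo.DelimitedSplit8K(t.col1, '\') s1)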

    Hope this helps

    Gianluca

    -- Gianluca Sartori

Hi Gianluca,

Thanks for your reply. The data is normalized, but I have concatenated fields separated by "\". I now want to populate an ID for these fields, matching fields where, e.g., [abc\pqr] = [pqr\abc].

    I wanted to know if it was possible without splitting.

    Eagerly waiting for your response.

    Ankit

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\".

Oxymoron. It's either the former or the latter; it can't be both at the same time.

    I wanted to know if it was possible without splitting.

No, not possible. Why don't you want to split? You can do it on the fly in your query; you don't need to apply changes to your existing data.
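
A rough sketch of the on-the-fly approach, assuming every value holds exactly two parts around a single "\" (dbo.YourTable is a placeholder name):

-- Break each column into its two halves at query time,
-- then accept a match in either order
SELECT t.col1, t.col2
FROM dbo.YourTable t
CROSS APPLY (SELECT LEFT(t.col1, CHARINDEX('\', t.col1) - 1) AS part1,
                    STUFF(t.col1, 1, CHARINDEX('\', t.col1), '') AS part2) s1
CROSS APPLY (SELECT LEFT(t.col2, CHARINDEX('\', t.col2) - 1) AS part1,
                    STUFF(t.col2, 1, CHARINDEX('\', t.col2), '') AS part2) s2
WHERE (s1.part1 = s2.part1 AND s1.part2 = s2.part2)
   OR (s1.part1 = s2.part2 AND s1.part2 = s2.part1)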

    -- Gianluca Sartori

The data is normalized, but I have concatenated fields separated by "\".

    Only one question. Why???

  • SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\"

    Hope that CELKO doesn't see that comment 😉

    SQL_By_Chance (11/25/2011)


    I wanted to know if it was possible without splitting.

    So you want to compare split parts of a string without splitting? I don't think you'll find a way to do that.

DECLARE @TABLE AS TABLE (col1 VARCHAR(7), col2 VARCHAR(7))

INSERT INTO @TABLE
SELECT 'abc\aaa', 'aaa\abc'
UNION ALL SELECT 'ax\pq', 'pqr\aaa'
UNION ALL SELECT 'prq\aaa', 'prt\arz'
UNION ALL SELECT 'axi\pq', 'prq\aaa'

;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM @TABLE a
    CROSS JOIN @TABLE b)
SELECT col1, col2
FROM CTE a
CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                    PATINDEX('%\%', col2) AS patternCol2,
                    LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
             FROM CTE
             WHERE a.col1 = col1 AND a.col2 = col2) b
WHERE ((SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, 1, patternCol2 - 1))
    OR (SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))
  AND ((SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, 1, patternCol2 - 1))
    OR (SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))

The above, which is not as efficient as the string splitter you were pointed at, will compare the string parts with each other: e.g. abc\aaa = aaa\abc, but prq\aaa != pqr\aaa.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Thanks Gianluca

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

Because the client requirement is like that 🙂.

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • Thanks Cadavre,

    Wish you many more centuries for providing the solution.

    Thanks,

    Ankit

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • SQL_By_Chance (11/28/2011)


    Thanks Cadavre,

    Wish you many more centuries for providing the solution.

    Thanks,

    Ankit

    Bear in mind that my solution is going to die a horrible death on a real table with even a few thousand rows. I'll set up a test to prove how poor the performance is later on.

    This can be avoided by normalising the data.
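
As a sketch of what a normalised layout might look like (the table and column names here are invented for illustration):

-- One row per part instead of a delimited string per column
CREATE TABLE dbo.StringPart (
    RowID INT NOT NULL,      -- key of the original row
    ColNo TINYINT NOT NULL,  -- 1 = col1, 2 = col2
    Part  VARCHAR(10) NOT NULL,
    PRIMARY KEY (RowID, ColNo, Part)
)
-- row 1 ('abc\aaa', 'aaa\abc') becomes:
-- (1, 1, 'abc'), (1, 1, 'aaa'), (1, 2, 'aaa'), (1, 2, 'abc')

-- Rows whose two columns hold the same set of parts: every distinct part
-- must appear under both columns, so row count = 2 * distinct part count
SELECT p.RowID
FROM dbo.StringPart p
GROUP BY p.RowID
HAVING COUNT(*) = COUNT(DISTINCT p.Part) * 2

With that layout the match is a simple aggregate over indexed columns instead of a cross join over string manipulation.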


    --EDIT--

Tests, as promised:

BEGIN TRAN
SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
    DROP TABLE #testEnvironment
END

--1000 Random rows of data
SELECT TOP 1000 IDENTITY(INT,1,1) AS ID,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col1,
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65)
       + '\' +
       CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 5) + 65) AS col2
INTO #testEnvironment
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

PRINT '========== BASELINE =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT COUNT(*)
FROM #testEnvironment

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)

PRINT '========== CTE MATCH =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON

;WITH CTE AS (
    SELECT a.col1, b.col2
    FROM #testEnvironment a
    CROSS JOIN #testEnvironment b)
SELECT COUNT(*)
FROM (SELECT col1, col2
      FROM CTE a
      CROSS APPLY (SELECT PATINDEX('%\%', col1) AS patternCol1,
                          PATINDEX('%\%', col2) AS patternCol2,
                          LEN(col1) AS lengthCol1, LEN(col2) AS lengthCol2
                   FROM CTE
                   WHERE a.col1 = col1 AND a.col2 = col2) b
      WHERE ((SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, 1, patternCol2 - 1))
          OR (SUBSTRING(col1, 1, patternCol1 - 1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))
        AND ((SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, 1, patternCol2 - 1))
          OR (SUBSTRING(col1, patternCol1 + 1, lengthCol1) = SUBSTRING(col2, patternCol2 + 1, lengthCol2)))
     ) a

SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)

ROLLBACK

    First, the above script tests 1000 rows.

========== BASELINE ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000026'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.
================================================================================
========== CTE MATCH ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000026'. Scan count 19, logical reads 42, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 13320, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 2235 ms, elapsed time = 619 ms.
================================================================================

    OK, not too terrible. Let's try 5,000 rows.

(The script is identical to the one above, except that it selects TOP 5000 rows instead of TOP 1000.)

========== BASELINE ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000027'. Scan count 1, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 0 ms, elapsed time = 0 ms.
================================================================================
========== CTE MATCH ==========
Table '#testEnvironment____________________________________________________________________________________________________000000000027'. Scan count 19, logical reads 154, physical reads 0, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 157464, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 74110 ms, elapsed time = 19701 ms.
================================================================================

    Ack!! Not scaling well at all, as promised 😉

    Try it yourself, but you'll find that this sort of problem has to be sorted out by normalisation rather than methods like this.

    If you insist on sticking with this sort of design, then I think you'll always have massive performance degradation.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • SQL_By_Chance (11/28/2011)


Because the client requirement is like that 🙂.

How often I hear this! In reality, it is not that the client requires the data to be stored a certain (and horrible) way; it is that the client wants it presented a certain way. VERY rarely do clients even want to be involved at the data level, let alone make system design decisions about how the data is stored. The reason they are hiring this work out is that they lack the expertise to do it themselves. If the client is mandating how you write your code, it may be time to fire the client. 😛

Splitting the strings, as Gianluca suggested, is by far the best approach to this.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Sean, you had me... The data is very well normalized, but I wanted to know some logic for matching such data other than by normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET)

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”

  • SQL_By_Chance (11/28/2011)


Sean, you had me... The data is very well normalized, but I wanted to know some logic for matching such data other than by normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET)

    There is nothing like normalized data, and your description is nothing like normalized data.

    SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\".

    Data in that type of format is not normalized at all and is a total PITA to deal with.

...the power of SQL vs .NET

    SQL is incredibly powerful at doing what it does best. .NET is incredibly powerful at doing what it does best. They do NOT do the same thing. That is why they are frequently used together to provide a complete solution. There are some things that can be done in .NET that should be done in SQL, and some things you can do in SQL should be left to .NET. The challenge is in knowing which technology is best suited for the task at hand.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (11/28/2011)


    SQL_By_Chance (11/28/2011)


Sean, you had me... The data is very well normalized, but I wanted to know some logic for matching such data other than by normalizing it (out of curiosity :hehe: and to see the power of SQL vs .NET)

    There is nothing like normalized data, and your description is nothing like normalized data.

    SQL_By_Chance (11/25/2011)


The data is normalized, but I have concatenated fields separated by "\".

    Data in that type of format is not normalized at all and is a total PITA to deal with.

...the power of SQL vs .NET

    SQL is incredibly powerful at doing what it does best. .NET is incredibly powerful at doing what it does best. They do NOT do the same thing. That is why they are frequently used together to provide a complete solution. There are some things that can be done in .NET that should be done in SQL, and some things you can do in SQL should be left to .NET. The challenge is in knowing which technology is best suited for the task at hand.

    +100 for it 😀

What I mean is that the data is normalized in the table, but I have created a derived column by concatenating the columns 🙂.

I just wanted a way to match [pq\ab] = [ab\pq], purely out of curiosity, other than using an array in .NET. I apologise for my inability to express my intent clearly. But Cadavre's solution (no matter how badly it performed) gave me a chance to think in that direction.

I can't express exactly why I asked the question. I think in weird directions 😛

    http://www.mercurialneophyte.blogspot.com/
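
For what it's worth, when every value holds exactly two parts, the swap can be written inline with no splitter function at all; a rough sketch, assuming exactly one "\" per value and a placeholder table name:

-- col2 matches col1 when it is identical, or is col1 with its halves swapped
SELECT t.col1, t.col2
FROM dbo.YourTable t
WHERE t.col2 = t.col1
   OR t.col2 = STUFF(t.col1, 1, CHARINDEX('\', t.col1), '')
               + '\'
               + LEFT(t.col1, CHARINDEX('\', t.col1) - 1)

For 'pq\ab' the expression rebuilds 'ab\pq', so [pq\ab] = [ab\pq] reduces to a single string comparison.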

    ______________________________________________________________________

    Ankit
    MCITP Database developer SQL 2008,MCTS Database Administration SQL 2008
    "Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
    You are already naked. There is no reason not to follow your heart.”
