Difference between Like operator and Contains operator

  • I am trying to do a full text search on a text field. I have added the neccesary full text index on the field.

    I am a little disapointed in the full text search capabilities of SQL server, or maybe I am missing something really basic?

    As far as I can tell there is little difference between these two statements:

    Select * from myTable where textField like '%Cats and Dogs%'

    Select * from myTable where CONTAINS(textField, '"Cats and Dogs"')

    Both options will only return rows that contain the exact phrase "Cats and Dogs", neither will return records that contain either "Cats" or "Dogs" in them.

    If I type cats and dogs in google I will get a real full text search, meaning that I will get results that contain either words. How can i get SQL server 2005 to do a real full text search?

  • garethmann101 (11/18/2008)


    I am trying to do a full text search on a text field. I have added the neccesary full text index on the field.

    I am a little disapointed in the full text search capabilities of SQL server, or maybe I am missing something really basic?

    As far as I can tell there is little difference between these two statements:

    Select * from myTable where textField like '%Cats and Dogs%'

    Select * from myTable where CONTAINS(textField, '"Cats and Dogs"')

    Both options will only return rows that contain the exact phrase "Cats and Dogs", neither will return records that contain either "Cats" or "Dogs" in them.

    If I type cats and dogs in google I will get a real full text search, meaning that I will get results that contain either words. How can i get SQL server 2005 to do a real full text search?

    First, examine the execution plans between these two statements. You should see quite a difference.

    Second, using CONTAINS(textField, '"Cats and Dogs"') is the bare minimum format. For a "google-like" search that returns both words try Contains( textField, "Cat" OR "Dog").

    Finally, full text searching allows much deeper type searches such as proximity, inflection, and weights.

    Please read BOL for examples.

    DAB

  • Your Contains example using the Or operator is still no different from a Like using the or operator.

    Select * from MyTable where textfield like '%Cat%' or textfield like '%Dog%'

    Is essentially the same as

    Select * from MyTable where contains (textfield, '"Cat" or "Dog"')

    Both queries first require some elaborate code to split seperate words of a sentence entered by a user, which as far as I can tell would involve looping through each character in a string and checking to see if that character is " " and if it is put a breakpoint in the string and seperate it off.

    Is there no simpler way to split the string into seperate words so that either of these code examples will work?

  • Look at the execution plans for contains (textField, '"Cat" or "Dog"') and textField like '%Cat%' or textField like '%Dog%'

    One should contain an index seek and the latter an index scan. FOr a small table that may not be a big deal but as that table grows the index scan will perform like a table scan. That's why code similar to LIKE '%cat%' is not good. Say you have an index (regular, not fulltext) on that field. By putting the wildcard character first you eliminate the possibility of using that in a lookup (compared to LIKE 'Cat%').

    DAB

  • Yes, I have no doubt that the fulltext indexing is far more efficient, but efficiency is not the problem I am having, see my last post.

  • As far as I can tell there is little difference between these two statements:

    Select * from myTable where textField like '%Cats and Dogs%'

    Select * from myTable where CONTAINS(textField, '"Cats and Dogs"')

    That's what I saw in the original post and thus my answers. As far as your last post is concerned, it's not really a good use of SQL Server to parse a string. Several languages have built in tokenizers that can be used to pass a proper serach string to SQL Server for use in the CONTAINS clause

  • Yes, since my application is web-based, I have written mine in Vb Script. In case anyone else is also stuck on this, here it is:

    [Code]

    Function Parse2Array(sBuffer)

    Dim lEnd

    Dim i

    Dim sData

    Dim sTempArray()

    lEnd = InStr(sBuffer, " ") ' Gets Starting position of " "

    Do ' Starts the loop

    If lEnd > 0 Then ' If > 0 then we have an " " in the buffer

    sData = Left(sBuffer, lEnd - 1)

    ' sData will be all characters before lEnd

    sBuffer = Mid(sBuffer, lEnd + 1)

    ' We want to delete the data that we just got from the buffer including

    ReDim Preserve sTempArray(i)

    sTempArray(i) = sData ' Display the data

    i = i + 1

    lEnd = InStr(sBuffer, " ")

    ' Gets Starting position of " " in the new buffer

    End If

    Loop While lEnd > 0 ' Loop while " " is still present in the buffer

    'finaly add the last word to the array:

    ReDim Preserve sTempArray(i)

    sTempArray(i) = sBuffer ' Display the data

    'return:

    Parse2Array = sTempArray

    End Function

    [/CODE]

  • Since you're using VBScript, why not use the Split() function?

    DAB

  • Oh does that do the same thing? I did not know about the split function.

  • Yes it does...

    strContains = split( "cats dogs mice fish") will return an array of 4 values

    strContains(0) = "cats", etc.

    In addition, you can specify different delimiter values (the above sample uses the default delimiter value of a space).

    [code strContains = split ("cats, dogs, mice, fish", ",")

    DAB

  • One more thing on the full-text index. Is that fully contained within the database? I noticed when I created it that I first had to create a catalog. When I back-up my database to post to my remote host, will that catalog and all that is needed for the full text index to work be contained within that standalone back-up?

    There is also the added complication of noise words, I am aware that there is only one collection of noise words per instance of SQL Server, not per database. The remote host will not want to add my noise words to their entire server. So I will probably have to remove all my personal noise words from queries using VB script before they are passed to SQL Server. Is that the most sensible way to do it?

  • I want to know the difference between the working of these 2 opertors. LIKE and CONTAINS

  • garethmann101 (11/20/2008)


    One more thing on the full-text index. Is that fully contained within the database? I noticed when I created it that I first had to create a catalog. When I back-up my database to post to my remote host, will that catalog and all that is needed for the full text index to work be contained within that standalone back-up?

    There is also the added complication of noise words, I am aware that there is only one collection of noise words per instance of SQL Server, not per database. The remote host will not want to add my noise words to their entire server. So I will probably have to remove all my personal noise words from queries using VB script before they are passed to SQL Server. Is that the most sensible way to do it?

    Both items are addressed in SQL Server 2008.

    1. The full-text catalog is stored in the database vs. the file system

    2. You can customize the noise words per database.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply