Inequality!

  • Comments posted to this topic are about the item Inequality!

  • In interesting corollary to this question (and, I have to admit, the question I thought it was asking) is "what happens if the number of spaces is different?", eg:

    declare @v varchar(20)

    select @v = ' '

    if isnull(@v,' ') ' ' select 'A'

    else

    select 'B'

    (interesting, at least to me, because SQL server truncates trailing spaces when doing string comparisons, for long-time backwards-compatibility with "char()" datatypes I assume - the answer is the same regardless of whether the number of spaces is the same or not)

    http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.

  • Tao Klerks (7/6/2009)


    (interesting, at least to me, because SQL server truncates trailing spaces when doing string comparisons, for long-time backwards-compatibility with "char()" datatypes I assume - the answer is the same regardless of whether the number of spaces is the same or not)

    Hi Tao,

    Actually, SQL Server does not truncate trailing spaces for string comparison; instead, it pads the length of the shorter string with spaces to match the path of the longer string. (The end result will always be the same, though). And this is not for backwards compatibility with CHAR (see below), but because the ANSI standard says that this is how strings have to be compared.

    (*) There are in fact two reasons that this is not for backward compatibility with CHAR. One is that CHAR is not an older feature; it's just as much part of the product as VARCHAR. The other is that CHAR doesn't truncate either, but also pads - CHAR does in fact already pad with spaces upon assignment. (Which makes sense, for how else would one store a three-character string in a fixed-length CHAR(10) column, if not by adding 7 spaces?)


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thanks Hugo, as instructive as always!

    (I guess I probably should dig through those ANSI standards at some point... :Whistling:)

    http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.

  • Good and not difficult question. I'm pleased.

  • good question.

    Why is it "always a better practice to be checking for equality in conditions rather than inequality"?

    I would have thought that this rather depends on the check you're interested in. If, for example, you are only interested in the case where the two items are unequal, then you are introducing a redundant (empty) "then" clause...

    Kelsey Thornton
    MBCS CITP

  • Kelsey Thornton (7/6/2009)


    good question.

    Why is it "always a better practice to be checking for equality in conditions rather than inequality"?

    It's not.

    The question is good, but the explanation is utter nonsense.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • 😀

    In general, I would agree that testing for equality often makes for more easily readable code, but there are always exceptions...

    Kelsey Thornton
    MBCS CITP

  • Better how? It is always best to express the test in the most natural method possible to improve understanding. If a task is exclusionary test for inequality, if inclusive test for equality (or more accurately membership). The question is just a contrived example to make a point that is itself invalid.

    Running low on decent questions?

  • I'm not sure there's a reason to pick one over the other for simple tests, but as you get into queries, testing for equality can be more efficient. People often look for a column not equal to some value, which means every row must be read to determine if it is not equal. Equality gets a match quickly (or not) using indexes if they are around.

    The question was one submitted, and we ran it. I shall look over the explanation and change it.

  • Hi Steve,

    Steve Jones - Editor (7/6/2009)


    People often look for a column not equal to some value, which means every row must be read to determine if it is not equal. Equality gets a match quickly (or not) using indexes if they are around.

    Do you have any examples demonstrating this behaviour?

    This possible issue is interesting to me because we do have code that relies on indexes being used when appropriate, and does use the SQL server inequality operator. I have never seen cause for concern, but your statement piqued my curiosity.

    I just tested (with a completely trivial example, of course), but found that the indexes are used equally (and with efficient SEEKs) in both cases. Interestingly, the one "bad" case seemed to be "NOT IN()" with multiple entries. This caused a WHERE operation to appear in the plan.

    EG test:

    CREATE TABLE ZZ_DummyTable1 (

    RecordID INT NOT NULL,

    Type INT NOT NULL,

    Data NVarChar(255),

    CONSTRAINT PK_ZZ_DummyTable1 PRIMARY KEY CLUSTERED (

    RecordID

    )

    )

    GO

    CREATE INDEX IX_ZZ_DummyTable1_Type

    ON ZZ_DummyTable1 (Type)

    GO

    INSERT INTO ZZ_DummyTable1 SELECT 1, 1, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 2, 1, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 3, 3, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 4, 2, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 5, 1, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 6, 3, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 7, 1, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 8, 1, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 9, 3, 'This is just padding really, nothing useful to add...'

    INSERT INTO ZZ_DummyTable1 SELECT 10, 1, 'This is just padding really, nothing useful to add...'

    --Compare inequality to alternative - matching all the good alternatives

    -- It looks like the cost is the same, both Seek directly.

    SELECT Count(*)

    FROM ZZ_DummyTable1

    WHERE Type IN (2,3)

    SELECT Count(*)

    FROM ZZ_DummyTable1

    WHERE Type 1

    --Test discarding all the alternatives vs looking for the one you want

    -- Here looking for the one you want IS more efficient

    -- discarding multiple causes WHERE portion of operation to appear in query plan.

    SELECT Count(*)

    FROM ZZ_DummyTable1

    WHERE Type NOT IN (2,3)

    SELECT Count(*)

    FROM ZZ_DummyTable1

    WHERE Type = 1

    GO

    DROP TABLE ZZ_DummyTable1

    Thanks for any input!

    Tao

    http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.

  • I don't have time to test, and perhaps this works better in 2005/2008, but I know that a NOT IN() or a has resulted in a table scan before with indexes on that column. Perhaps it's a result of table size as well, but it's something I'd like to see tested by someone if I don't get to it.

  • With an ordered index, there is no difference between equal and not equal since both can easily be determined by the index with the not equal requiring (at most) one extra look-up. Without a usable index, the not equals would require a full table scan. However, without an index, the equals could also require a full table scan to find the last row. The only advantage for equals is that ON AVERAGE it wouldn't require a full scan while a not equal ALWAYS will.

  • Interesting... I was testing in 2000, it did not occur to me to check 2005/2008.

    Based on the results above, NOT IN() could well turn into a table scan rather than an index seek, because the "WHERE" part of the index seek is more expensive (and when combined with bookmark lookups, might well cause SQL Server to stick to the clustered index or heap).

    Expanding on the example above, SQL server (2000) is happy to use SEEKs to:

    - locate any number of individual values ("WHERE Type IN (1, 2)" or "WHERE Type = 1 OR Type = 2")

    - locate any number of "inner" ranges ("WHERE (Type > 3 AND Type 6 AND Type < 8")

    - locate a single inequality or "outer" range ("WHERE Type 1" or "WHERE Type > 1 OR Type < 1")

    But it is not happy to handle multiple inequalities or "outer" ranges. The following each end up with a WHERE component in the index seek:

    - "WHERE Type 1 AND Type 10"

    - "WHERE (Type > 1 OR Type 10 OR Type < 6)

    Does this make sense, does anyone know whether it generally holds true and/or why?

    (sorry Steve, I went off-topic, but I think I now understand where you were coming from :-))

    Barry McConnell (7/6/2009)


    With an ordered index, there is no difference between equal and not equal since both can easily be determined by the index with the not equal requiring (at most) one extra look-up. Without a usable index, the not equals would require a full table scan. However, without an index, the equals could also require a full table scan to find the last row. The only advantage for equals is that ON AVERAGE it wouldn't require a full scan while a not equal ALWAYS will.

    Not sure I understand. When does a not equals require a full table scan and an equals not require it? Can you provide an example?

    http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.

  • Steve Jones - Editor (7/6/2009)


    I shall look over the explanation and change it.

    Hi Steve,

    I see that you now have made a small change to the explanation, but I still do not like it.

    "It is sometimes a better practice to check for equality in conditions rather than inequality."

    Why? When? (i.e. under what conditions)

    "A simple change that would be better: (...)"

    What makes this change better? I fail to see any reaason to prefer one over the other. Unless you know something about the requirements documentation that I failed to find... 😉


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply