Does the order of a CASE statement improve performance

  • Hi all,

    I've been given a rather meaty CASE statement which is of the form:

    UPDATE Table SET x =

    CASE

    WHEN y LIKE '%EFG%' THEN 3

    WHEN y LIKE 'ABC%' THEN 1

    WHEN y LIKE 'DEF%' THEN 2

    END

    I know that %EFG% will perform the worst as it won't be able to use the index on Column y, so if that was at the bottom of the CASE statement, would performance improve (let's assume all three WHENs have equal significance), and the ordering doesn't affect the logic.

    Thanks

    Singhster

  • Cool question and kind of fun. I tested it out this way:

    IF (SELECT OBJECT_ID('Test1')

    ) IS NOT NULL

    DROP TABLE dbo.Test1;

    GO

    CREATE TABLE dbo.Test1 (C1 VARCHAR(50),C2 INT, C3 INT IDENTITY);

    SELECT TOP 1500

    IDENTITY( INT,1,1 ) AS n

    INTO #Nums

    FROM Master.dbo.SysColumns sC1,

    Master.dbo.SysColumns sC2;

    INSERT INTO dbo.Test1

    (C1,C2)

    SELECT n, n

    FROM #Nums;

    DROP TABLE #Nums;

    CREATE NONCLUSTERED INDEX i1 ON dbo.Test1 (C1) ;

    UPDATE dbo.test1

    SET C2 =

    CASE

    WHEN C1 LIKE '%42%' THEN 3

    WHEN C1 LIKE '24%' THEN 2

    WHEN C1 LIKE '36%' THEN 1

    END

    DBCC FREEPROCCACHE()

    UPDATE dbo.test1

    SET C2 =

    CASE

    WHEN C1 LIKE '19%' THEN 3

    WHEN C1 LIKE '25%' THEN 2

    WHEN C1 LIKE '37%' THEN 1

    END

    Both queries ended up with identical execution plans, right down to the query hash and plan hash. The issue is, the CASE statement itself is a function that forces a scan on the table. You won't, ever, see a performance improvement by modifying the order of the CASE statements or even eliminating the wild card as I did because the CASE itself is a function on the column, requiring a scan.

    And thanks for the blog post, writing it up now.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Great work Grant, thanks for the response. I suspected as such, but didn't have the nous to test it myself the way you have. Thanks again.

  • Nice analysis, Grant! While this hypothetical assumes the order of the WHENs don't have any logical significance, it should be emphasized for any folks new to CASE statements that the order usually does have logical significance. The first WHEN that matches your data will be used. So if we had a string 'DEFG', this version

    UPDATE Table SET x =

    CASE

    WHEN y LIKE '%EFG%' THEN 3

    WHEN y LIKE 'ABC%' THEN 1

    WHEN y LIKE 'DEF%' THEN 2

    END

    Will code it as 3. If we change the order of WHENs like this:

    UPDATE Table SET x =

    CASE

    WHEN y LIKE 'ABC%' THEN 1

    WHEN y LIKE 'DEF%' THEN 2

    WHEN y LIKE '%EFG%' THEN 3

    END

    It will be coded as 2. Just a pedantic point to keep in mind. 😀

  • Grant,

    My understanding was the short circuit capabilities of the CASE statement would improve performance, but in a limited variety.

    Instead of a simple data change on the table with no where clause (so permanent scanning), something like this would definately be different depending on what the data looked like:

    ...

    CASE WHEN fielda = 1

    THEN 2

    WHEN fielda = (SELECT somevalue FROM otherTable WHERE keyfield = tablea.keyfield)

    THEN NULL

    ELSE 3

    END

    ...

    If you don't have anything that doesn't have a fielda = 1, that subquery will never process... or so I understood. If this isn't your understanding I'll go setup a test process to see where I'm confused.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (2/13/2014)


    Grant,

    My understanding was the short circuit capabilities of the CASE statement would improve performance, but in a limited variety.

    Instead of a simple data change on the table with no where clause (so permanent scanning), something like this would definately be different depending on what the data looked like:

    ...

    CASE WHEN fielda = 1

    THEN 2

    WHEN fielda = (SELECT somevalue FROM otherTable WHERE keyfield = tablea.keyfield)

    THEN NULL

    ELSE 3

    END

    ...

    If you don't have anything that doesn't have a fielda = 1, that subquery will never process... or so I understood. If this isn't your understanding I'll go setup a test process to see where I'm confused.

    What you said is correct.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Evil Kraig F (2/13/2014)


    Grant,

    My understanding was the short circuit capabilities of the CASE statement would improve performance, but in a limited variety.

    Instead of a simple data change on the table with no where clause (so permanent scanning), something like this would definately be different depending on what the data looked like:

    ...

    CASE WHEN fielda = 1

    THEN 2

    WHEN fielda = (SELECT somevalue FROM otherTable WHERE keyfield = tablea.keyfield)

    THEN NULL

    ELSE 3

    END

    ...

    If you don't have anything that doesn't have a fielda = 1, that subquery will never process... or so I understood. If this isn't your understanding I'll go setup a test process to see where I'm confused.

    I tried this:

    SELECT CASE WHEN t1.C2 = 1 THEN 2

    WHEN t1.C3 = (SELECT t2.C3

    FROM dbo.Test1 AS t2

    WHERE t2.C1 = '666'

    ) THEN NULL

    ELSE 3

    END AS SomeValue

    FROM dbo.Test1 t1

    WHERE t1.c1 = '1';

    And for the data loaded, that returns one row and one row only, the initial CASE value. But, if you look at the execution plan, the other SELECT statement is defined. Not saying it gets executed, but it's there, directly impacting the plan created.

    If I then run this:

    SELECT CASE WHEN t1.C2 = 1 THEN 2

    --WHEN t1.C3 = (SELECT t2.C3

    -- FROM dbo.Test1 AS t2

    -- WHERE t2.C1 = '666'

    -- ) THEN NULL

    ELSE 3

    END AS SomeValue

    FROM dbo.Test1 t1

    WHERE t1.c1 = '1';

    I get exactly the same number of reads between the two plans, but the execution plans are radically different, and the execution time of the first plan is a whole lot higher, even though, execution does short circuit. So... I don't know. Based on what I'm seeing, the short-circuit is only for execution, not plan creation, and plan creation kind of doesn't care about the short circuit, so you can end up with a performance loss, that I attribute to the extra stuff in the plan (which doesn't seem to get executed).

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Great question. Great analysis. Great follow up.

  • Grant,

    that's a very interesting result you found there, but I guess I should have restated my question. The order does matter, but not for the reasons you checked. You've compared two different sets of query logic.

    Whereas something like this would be the same, but just checking the effect of case order:

    CASE field<> 2 AND (SELECT...)

    WHEN 2 THEN...

    ELSE...

    END

    vs.

    CASE field = 2

    WHEN (Select...)

    ELSE...

    END

    This would be the same execution plan (basically) but with different timings, I believe. If i can ever get 30 minutes to myself today I'll build out a test structure.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply