Nested loops join operator estimated rows is 10000 in compatibility level 2014.

  • A simple query joining 2 tables:

    select * from t1 join t2 on t1.id = t2.id

    t1 is a table with a single column, id, on which there is a user-defined statistic created WITH NORECOMPUTE. t1 is very volatile and stores temporary IDs. The NORECOMPUTE is there to prevent recompilations (the actual query is much larger; I've trimmed it down to the essential part).

    t2 is a table with id as pk and clustered index.

    For my test I have put 5 IDs in t1 and a million in t2.

    The execution plan shows a table scan on t1 and a clustered index seek on t2. The nested loop operator probes t2 using the results from t1.

    The nested loops operator shows a high number for the estimated number of rows. Clearly it can never be higher than 5, yet it says 10000.

    And when I set the compatibility level to 2012 the estimate is 5!

    The problem is that these volatile tables holding temporary IDs with NORECOMPUTE statistics are used intensively in many queries. The wrong estimate propagates through the rest of the query plan, which turns out to be far from optimal. Query duration goes from 5 ms at the 2012 compatibility level to 200 ms at 2014, and this will hurt performance as execution frequency goes up. And we are not too keen on putting databases in older compatibility levels.

    Exec plans for 2014 and 2012 attached.

    if object_id('dbo.t1') is not null drop table dbo.t1

    go

    if object_id('dbo.t2') is not null drop table dbo.t2

    go

    create table dbo.t1(id int not null)

    go

    create statistics stat_id on dbo.t1(id) with norecompute

    go

    --insert 5 integers into t1

    insert into dbo.t1 values(1),(2),(3),(4),(5)

    go

    create table dbo.t2(id int not null,

    constraint pk_demotable primary key nonclustered (id asc)

    )

    go

    --insert 1000000 integers into t2

    ;with e1(n) as

    (

    select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all

    select 1 union all select 1 union all select 1 union all select 1 union all select 1

    ), -- 10

    e2(n) as (select 1 from e1 cross join e1 as a), -- 10*10

    e3(n) as (select 1 from e1 cross join e2), -- 10*100

    e6(n) as (select 1 from e3 cross join e3 as b), -- 1000*1000

    ints(n) as (select n = row_number() over (order by n) from e6)

    insert into t2 (id)

    select n

    from ints

    go

    --Examine exec plan in 2012 and 2014 compatibility mode

    select * from t1 join t2 on t1.id = t2.id

    go
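
    For illustration, a minimal sketch of how the two estimators can be compared side by side (ALTER DATABASE CURRENT assumes you are connected to the test database; any database name can be substituted):

    -- SQL Server 2012 behaviour (legacy cardinality estimator)
    alter database current set compatibility_level = 110
    go
    select * from t1 join t2 on t1.id = t2.id
    go
    -- SQL Server 2014 behaviour (new cardinality estimator)
    alter database current set compatibility_level = 120
    go
    select * from t1 join t2 on t1.id = t2.id
    go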

  • Bouke Bruinsma (11/17/2015)


    [quotes the original post in full]

    OK, good summary. What is your question, exactly?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Issue seems to be down to stale statistics on t1.

    The cardinality estimator had its first rewrite since SQL 2000 in the 2014 version.

    The following document may help, especially the troubleshooting section

    https://msdn.microsoft.com/en-us/library/dn673537.aspx

  • You could try using OPTIMIZE FOR ...

    You could also consider adding trace flag 9481 to your problematic queries, at least in the short term.
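
    As a sketch (not tested against your workload), trace flag 9481 can also be scoped to a single statement with QUERYTRACEON, which, as far as I know, requires sysadmin rights unless it is wrapped in a plan guide:

    -- Force the legacy (pre-2014) cardinality estimator for this statement only
    select * from t1 join t2 on t1.id = t2.id
    option (querytraceon 9481)
    go
    -- Or enable it instance-wide (affects every query):
    -- dbcc traceon (9481, -1)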

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • There is a void statistic on t1 with norecompute to prevent recompilations. It cannot be stale by design.

    The issue here is that the estimates for the nested loops join are much higher at the 2014 compatibility level, and this causes different, worse execution plans.

    The question is therefore: does anyone recognize this behaviour? Is 10000 a default estimate?

    I have tried OPTIMIZE FOR UNKNOWN. The query is not parameterized, so OPTIMIZE FOR <parameter values> will not work. Interestingly, OPTION (FAST 5) improves things a lot, since it more or less forces the estimate (see the sketch at the end of this post).

    Good suggestion on the CE document; I will read it.

    Trace flag 9481 might be a good short-term workaround.
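
    For reference, a minimal sketch of the FAST hint mentioned above (the value 5 simply matches the expected number of rows in t1 in this repro):

    -- Ask the optimizer for a plan that returns the first 5 rows quickly;
    -- in this repro that also pulls the estimates toward t1's real cardinality
    select * from t1 join t2 on t1.id = t2.id
    option (fast 5)
    go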

  • From the CE document

    Page 39

    Stale Statistics

    You may determine that the statistics are outdated. This is not necessarily an issue if the data distribution within the statistics object histogram and density vector still represent the current state of the data. Stale statistics can be problematic, however, if they no longer accurately represent the current data distribution. Check the statistics object using DBCC SHOW_STATISTICS to verify if the lack of recent updates is actually causing bad estimates. Additionally, verify the histogram and density vector. If stale statistics are an issue, verify that automatic statistics updates are enabled for the database using sys.databases and the is_auto_update_stats_on column. If these are disabled, verify that manual statistics maintenance is scheduled to occur on an appropriate schedule. Check if individual statistics objects are set to “no recompute”; you can verify this by querying sys.stats and the no_recompute column.

    If your table is very large, the statistics update threshold and the frequency of the associated automatic update of statistics may not be enough to maintain useful statistics. Consider a more frequent manual scheduling of statistics maintenance or consider activating trace flag 2371. Trace flag 2371 uses a decreasing, dynamic statistics update threshold, calculated as the square root of 1,000 multiplied by the table cardinality.

    With NORECOMPUTE set on the statistic, the auto-update-stats process doesn't happen, which can lead to stale statistics if you're not maintaining the statistic manually. As the data is as volatile as you describe, if the statistic isn't updated in line with how the data changes it will be stale and can produce an incorrect estimate, because you have told SQL Server not to rebuild the histogram for that particular stat.

    Just out of interest, if you do a manual UPDATE STATISTICS on t1, is the estimate different?
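
    A minimal sketch of those checks against the repro objects (object and statistic names taken from the script earlier in the thread):

    -- Inspect the header, density vector and histogram of the statistic on t1
    dbcc show_statistics ('dbo.t1', 'stat_id')
    go
    -- Check the NORECOMPUTE flag and the date of the last update
    select s.name, s.no_recompute, stats_date(s.object_id, s.stats_id) as last_updated
    from sys.stats as s
    where s.object_id = object_id('dbo.t1')
    go
    -- Manual update, as asked above; note that omitting WITH NORECOMPUTE
    -- re-enables automatic updates for this statistic
    update statistics dbo.t1 stat_id
    go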

  • Is t1 a #temp table?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Updating the statistic manually does indeed produce a better estimate. I'm not sure why it's 1, though, as I would expect 5.

    But it is good to know that we have some means of forcing better estimates besides the already mentioned query trace flag.

    t1 is not a #temp table, meaning it's not stored in tempdb, but its purpose is to hold temporary data briefly. The table is even re-created when our software is restarted.
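
    If the manual update ran while t1 happened to be empty (which is easy to hit given how briefly the table holds data), an estimate of 1 would not be surprising, since the optimizer rounds row estimates up to a minimum of one row. A sketch of a pattern that refreshes the statistic after t1 has been loaded, while keeping the NORECOMPUTE flag in place (untested against the real workload):

    -- Refresh the statistic once t1 contains the IDs for this cycle,
    -- keeping NORECOMPUTE so automatic updates stay disabled
    update statistics dbo.t1 stat_id with fullscan, norecompute
    go
    select * from t1 join t2 on t1.id = t2.id
    go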

  • Bouke Bruinsma (11/17/2015)


    There is a void statistic on t1 with norecompute to prevent recompilations. It cannot be stale by design.

    And that, right there, is the cause. Useless statistics resulting in a row estimation error. The fact that you're intentionally making the statistic stale doesn't change the situation.

    Why avoid recompiles? It's just the statement that will recompile. Unless you have such a high rate of compiles that it's either causing CPU pressure or contention on the plan cache itself, it's probably better to recompile. It's not like it was in SQL 2000 any longer.
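
    A minimal sketch of what that could look like for the repro objects, assuming auto-create and auto-update statistics are enabled for the database:

    -- Drop the never-updated statistic and let the auto-created, auto-updated
    -- statistics (and statement-level recompilation) take over
    drop statistics dbo.t1.stat_id
    go
    select * from t1 join t2 on t1.id = t2.id
    go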

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The software refreshes data to subscribers. Various sources can generate a change. These changes are identified by a set of IDs. The IDs are first inserted into t1, queried against, and then deleted again. My example is quite simple, but in practice t2 is a view touching several tables.

    There is a huge number of transactions. Recompile times would become a problem if t1 had a normal statistic that could be automatically updated.
