Co-worker Dispute

  • Say you are returning 200-500K records and the temp table part is the bottleneck. This SP is hit a lot; figure an average of 20 users. Which process is faster and better for longevity?

    CREATE TABLE #BlockMaterials (
        [BlockMaterialsID] uniqueidentifier,
        [BlockMaterialCategoriesId] uniqueidentifier,
        [MaterialsId] uniqueidentifier)

    CREATE CLUSTERED INDEX idx_#BlockMaterials_BlockMaterialsID
        ON #BlockMaterials ([BlockMaterialsID] ASC)

    SET @InsertSQL = 'INSERT INTO #BlockMaterials
    SELECT BM.[BlockMaterialsID], BM.[BlockMaterialCategoriesId], BM.[MaterialsId]
    FROM [dbo].[BlockMaterials] AS BM
    INNER JOIN [dbo].[BlockMaterialCategories] AS BMC
        ON BM.[BlockMaterialCategoriesId] = BMC.[BlockMaterialCategoriesId]
    WHERE BMC.[BlocksID] = ''' + CONVERT(NVARCHAR(100), @BlocksID) + ''' AND BM.[MaterialsID] IN (' + @MaterialsIDList + ')'

    EXEC(@InsertSQL)

    SELECT MC.[MaterialCostsID]
        ,MC.[CostEffectiveDatesID]
        ,BM.[BlockMaterialCategoriesId]
        ,BM.[MaterialsId]
        ,MC.[BlockMaterialsID]
        ,MC.[CostingMethod]
        ,MC.[Cost]
        ,MC.[BasedUnitsID]
        ,MC.[IsUsedForCosting]
        ,MC.[IsStocked]
        ,MC.[PartCode]
        ,MC.[StockLength]
        ,MC.[LengthInformation]
        ,MC.[TimeStamp]
    FROM [dbo].[MaterialCosts] AS MC
    INNER JOIN #BlockMaterials AS BM
        ON MC.[BlockMaterialsID] = BM.[BlockMaterialsID]
    WHERE MC.[CostEffectiveDatesID] = @CostEffectiveDatesID
    ORDER BY MC.[StockLength]

    VS.

    SELECT MC.[MaterialCostsID]
        ,MC.[CostEffectiveDatesID]
        ,BM.[BlockMaterialCategoriesId]
        ,BM.[MaterialsId]
        ,MC.[BlockMaterialsID]
        ,MC.[CostingMethod]
        ,MC.[Cost]
        ,MC.[BasedUnitsID]
        ,MC.[IsUsedForCosting]
        ,MC.[IsStocked]
        ,MC.[PartCode]
        ,MC.[StockLength]
        ,MC.[LengthInformation]
        ,MC.[TimeStamp]
    FROM [dbo].[MaterialCosts] AS MC
    INNER JOIN
    (
        SELECT BM.[BlockMaterialsID], BM.[MaterialsId], BMC.[BlockMaterialCategoriesId]
        FROM [dbo].[BlockMaterials] AS BM
        INNER JOIN [dbo].[BlockMaterialCategories] AS BMC
            ON BM.[BlockMaterialCategoriesId] = BMC.[BlockMaterialCategoriesId]
        WHERE BMC.[BlocksID] = @BlocksID AND BM.[MaterialsID] IN (@MaterialsIDList)
    ) BM ON MC.[BlockMaterialsID] = BM.[BlockMaterialsID]
    WHERE MC.[CostEffectiveDatesID] = @CostEffectiveDatesID
    ORDER BY MC.[StockLength]

  • I generally would prefer the subquery over the temp table. I'm not sure I would see any savings by using the temp table with an index, but based on how the indexes are set up in the base tables, I could be wrong.

    The true test is to run both and observe the query plans. Then run both with a trace running to see the true difference in CPU and reads. With the proper index I would vote for the subquery.

  • The second one for sure, since it does not have to populate the table with an insert every time you need the data. If 20 users use this procedure, 20 tables will be created and deleted every time a user needs the data; the second procedure just selects the data and uses it in the query.

    If I'm not mistaken, there might even be concurrency issues with the temp tables in the first one. Look it up in BOL, but if I remember correctly there are two types of temp tables: one that is shared with other connections and one that is not.

    Anyway, the best query is by far the second one.

    Cheers,

    J-F

  • Test both. Set IO and time stats on, and run each a few times to get an average run-time.

    In case you don't know:

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    I'm going to guess that the second one will possibly run faster, but the only way to be sure is to test both.

    As far as contention for tempdb goes, that's not really going to be an issue (shouldn't be, anyway) in SQL 2005. Plus, if the derived table is large enough, it'll get dumped into a worktable in tempdb anyway.
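
    To make that concrete, a minimal test harness along the lines Gus describes might look like this (a sketch only; run each variant a few times and compare the figures reported in the Messages tab):

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- paste the temp table version of the query here, run it a few times,
    -- and note the CPU time and logical reads reported for each run

    -- then paste the subquery version here and do the same

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;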

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • When I run the index wizard there aren't any suggested optimizations. Also, I think the temp table is only visible within the SP itself, since there is only one #; ## is global.
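
    For reference, a minimal sketch of that scoping difference (the table names here are made up for illustration):

    -- A local temp table (one #) is visible only to the session that created it;
    -- inside a stored procedure it is scoped to that procedure and its children,
    -- and it is dropped automatically when the procedure ends.
    CREATE TABLE #LocalScratch (ID int)

    -- A global temp table (##) is visible to all connections and persists until
    -- the creating session ends and no other session is still referencing it.
    CREATE TABLE ##GlobalScratch (ID int)

    Each session gets its own copy of a # table, which is why the name in the statistics output later in this thread is padded with underscores and a numeric suffix.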

  • The index wizard, DTA, is not the final authority on indexing, for sure. It is a guideline. I would do as GSquared stated and use SET STATISTICS IO and TIME ON to see where you are incurring your highest reads, and then take a look at your execution plan to ensure that you are getting the most out of the indexing strategy you have in place.

    As he also stated, you should check the output from the statistics to look at the differences in reads between the two queries. Since this is a pretty simple select, they will probably be pretty similar.

    You could also post the execution plans for review to see if there are other thoughts on this as well.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • JKSQL (1/21/2009)


    which process is faster and better for longevity?

    Depends. Test them both under load and see which one scales better. Sometimes the temp table is better, as it leads to a cleaner exec plan; sometimes the subquery is better, as it avoids the insert hit. Neither one is always better.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm inclined to say the second without any data. But I've seen instances where the first can be faster.

    Testing and examining the execution plans are the only way to be sure.

    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Grant Fritchey (1/21/2009)


    I'm inclined to say the second without any data. But I've seen instances where the first can be faster.

    If the indexes are good, the stats accurate and the plan optimal, the second should be faster. But that's a lot of ifs, especially on large rowsets.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Locking could be an issue if there is a lot of contention, in which case the temp table version "could" be better; on the other hand, the subquery version could skip some extra work. BUT I agree with the others: without a "typical load" it is hard to tell.


    * Noel

  • FWIW - the second option would be the better option if it works. However, it will not work the way it has been presented here.

    SELECT BM.[BlockMaterialsID], BM.[MaterialsId], BMC.[BlockMaterialCategoriesId]
    FROM [dbo].[BlockMaterials] AS BM
    INNER JOIN [dbo].[BlockMaterialCategories] AS BMC
        ON BM.[BlockMaterialCategoriesId] = BMC.[BlockMaterialCategoriesId]
    WHERE BMC.[BlocksID] = @BlocksID AND BM.[MaterialsID] IN (@MaterialsIDList)

    In the above, you are using the variable @MaterialsIDList, which I am assuming is a comma-delimited list of IDs. That structure does not work; it has to be modified to use a function that converts the delimited list into a table, or you can use dynamic SQL.
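
    For example, such a split function could look like this (a sketch only; the function name and trimming behavior are illustrative, not from the original procedure):

    -- Hypothetical helper: converts a comma-delimited list of GUIDs,
    -- e.g. 'guid1,guid2,guid3', into a single-column table.
    CREATE FUNCTION dbo.SplitIDList (@List NVARCHAR(MAX))
    RETURNS @IDs TABLE (ID UNIQUEIDENTIFIER)
    AS
    BEGIN
        DECLARE @Pos INT
        SET @Pos = CHARINDEX(',', @List)
        WHILE @Pos > 0
        BEGIN
            -- take everything before the next comma as one ID
            INSERT @IDs (ID)
            VALUES (CONVERT(UNIQUEIDENTIFIER, LTRIM(RTRIM(LEFT(@List, @Pos - 1)))))
            SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List))
            SET @Pos = CHARINDEX(',', @List)
        END
        -- last (or only) element after the final comma
        IF LEN(@List) > 0
            INSERT @IDs (ID)
            VALUES (CONVERT(UNIQUEIDENTIFIER, LTRIM(RTRIM(@List))))
        RETURN
    END

    The IN clause then becomes: AND BM.[MaterialsID] IN (SELECT ID FROM dbo.SplitIDList(@MaterialsIDList))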

    I should also point out that the second query, using a derived table, is no different from just joining the tables. SQL Server will expand the derived table into the appropriate joins as needed, so you might as well write the query that way in the first place (see the sketch below).
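
    For illustration, the derived-table version rewritten as straight joins (same tables and columns as in the original; the IN clause uses the hypothetical split function sketched above, since the raw @MaterialsIDList variable does not work):

    SELECT MC.[MaterialCostsID], MC.[CostEffectiveDatesID],
        BM.[BlockMaterialCategoriesId], BM.[MaterialsId], MC.[BlockMaterialsID],
        MC.[CostingMethod], MC.[Cost], MC.[BasedUnitsID], MC.[IsUsedForCosting],
        MC.[IsStocked], MC.[PartCode], MC.[StockLength], MC.[LengthInformation],
        MC.[TimeStamp]
    FROM [dbo].[MaterialCosts] AS MC
    INNER JOIN [dbo].[BlockMaterials] AS BM
        ON MC.[BlockMaterialsID] = BM.[BlockMaterialsID]
    INNER JOIN [dbo].[BlockMaterialCategories] AS BMC
        ON BM.[BlockMaterialCategoriesId] = BMC.[BlockMaterialCategoriesId]
    WHERE BMC.[BlocksID] = @BlocksID
        AND MC.[CostEffectiveDatesID] = @CostEffectiveDatesID
        AND BM.[MaterialsID] IN (SELECT ID FROM dbo.SplitIDList(@MaterialsIDList))
    ORDER BY MC.[StockLength]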

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • They say temp table, I say subquery.

    I say "Test it."

    As J. Hennessey, a founder of MIPS Technologies once said, "Don't argue. Measure."

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • So I have the results. Of course, I do not have actual DBs that big, so in this test the query did not return any records, since the seed data is not good. I still read this as the subquery winning. Am I correct?

    ----- The subquery

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 1 ms.

    (0 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'MaterialCosts'. Scan count 1, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'BlockMaterialCategories'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'BlockMaterials'. Scan count 1, logical reads 420, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 16 ms, elapsed time = 30 ms.

    Here is the #temp table approach:

    ----- Temp table

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 3 ms.

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 3 ms.

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server parse and compile time:
    CPU time = 15 ms, elapsed time = 40 ms.

    Table '#BlockMaterials_____________________________________________________________________________________________________000000000011'. Scan count 0, logical reads 81, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'BlockMaterialCategories'. Scan count 0, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'BlockMaterials'. Scan count 1, logical reads 420, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 94 ms, elapsed time = 226 ms.

    (40 row(s) affected)

    SQL Server Execution Times:
    CPU time = 109 ms, elapsed time = 266 ms.

    (0 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'MaterialCosts'. Scan count 1, logical reads 26, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table '#BlockMaterials_____________________________________________________________________________________________________000000000011'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
    CPU time = 0 ms, elapsed time = 5 ms.

    When I look at the time for the actual query, the temp table version is faster, but the prep is awful. I think the subquery wins.

  • In that (semi-valid) test, the sub-query definitely won.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • The only really valid test is with production levels of data in the tables.

    I have used temp tables for things like this when it prevented SQL Server from choosing a really bad query plan. When you have the temp table, SQL Server does not have the same options for picking query plans, and you may be able to use it to force a better plan than it would pick otherwise. However, I would only resort to that after I was sure it was picking a bad plan and there was nothing else I could do to prevent it. I did this more with older versions of SQL Server, especially 6.5, where there seemed to be more need for nudging the query optimizer in the right direction.
