Is the logic the same for these two queries?

  • Are these two logically the same? 


    declare @chrono_start datetime, @chrono_end datetime, @record_id int
    if exists(select * from reclocat where record_id = @record_id)
    begin
      select @chrono_start = min(t.tabdate), @chrono_end = max(t.tabdate)
      from idxtabs t inner join idx it on t.idxid = it.idxid
      where it.record_id = @record_id and it.active = 1 and it.submitted = 1 and year(t.tabdate) > 1901
      update reclocat
         set
            pdf_chrono_start = @chrono_start,
            pdf_chrono_end = @chrono_end
         where record_id = @record_id
    end

    vs


    declare @chrono_start datetime, @chrono_end datetime, @record_id int
      update r
         set
            r.pdf_chrono_start = DateValues.Minimum,
            r.pdf_chrono_end = DateValues.Maximum
        FROM reclocat r
        INNER JOIN
        (
            SELECT
                min(t.tabdate) as Minimum,
                max(t.tabdate) as Maximum,
                it.record_id
            FROM idxtabs t inner join idx it on t.idxid = it.idxid
            WHERE it.record_id = @record_id and it.active = 1 AND it.submitted = 1 AND t.tabdate > 1901
            GROUP BY it.record_id
        
        ) AS DateValues
        ON DateValues.record_id=r.record_id

  • No, the first one will update the reclocat table regardless of whether this criteria is met, it.active = 1 and it.submitted = 1 and year(t.tabdate) > 1901

  • ZZartin - Friday, June 9, 2017 10:08 AM

    No, the first one will update the reclocat table regardless of whether this criteria is met, it.active = 1 and it.submitted = 1 and year(t.tabdate) > 1901

    And if there are no rows that match the SELECT, it'll update pdf_chrono_start and pdf_chrono_end to NULL, whereas the second one won't update if there are no rows returned from the subquery.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ZZartin - Friday, June 9, 2017 10:08 AM

    No, the first one will update the reclocat table regardless of whether this criteria is met, it.active = 1 and it.submitted = 1 and year(t.tabdate) > 1901

    won't the update do its DML changes based on the dates collected by the 1st SELECT statement? On the 1st one, the whole purpose of the SELECT, in my opinion, is retrieve start and end based on the WHERE clause.

  • TheSQL_fan - Friday, June 9, 2017 11:03 AM

    ZZartin - Friday, June 9, 2017 10:08 AM

    No, the first one will update the reclocat table regardless of whether this criteria is met, it.active = 1 and it.submitted = 1 and year(t.tabdate) > 1901

    won't the update do its DML changes based on the dates collected by the 1st SELECT statement? On the 1st one, the whole purpose of the SELECT, in my opinion, is retrieve start and end based on the WHERE clause.

    Yes, but those will be NULL if the conditions in the WHERE clause aren't met. The update only filters by the recordid, and hence the dates get set to NULL for any recordid where the conditions in the SELECT aren't met

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Set up a sandbox with the tables and data involved and try them both.  Be sure to include test data that will result in null data for the start/end datetime variables and see for yourself.
    A cursory look at the code confirms, at least for me, what both Martin and Gail have said.

  • Thanks Gail

    So let me see if I understand. On the 1st case, the UPDATE will always run. And if the dates are null for both, it will UPDATE the records (only) based on record_id ?

  • TheSQL_fan - Friday, June 9, 2017 11:27 AM

    Thanks Gail

    So let me see if I understand. On the 1st case, the UPDATE will always run. And if the dates are null for both, it will UPDATE the records (only) based on record_id ?

    That is exactly correct.  SQL Server doesn't just stop processing if a statement doesn't return any rows, so even if the SELECT doesn't actually set anything it will continue on to the UPDATE which will simply update the values to NULL.

    The second query will not do that because it is inner joined to the sub query so if there are no rows found the update won't run on any rows.

  • TheSQL_fan - Friday, June 9, 2017 11:27 AM

     And if the dates are null for both, it will UPDATE the records (only) based on record_id ?

    It always updates only based on the record_id


    where record_id = @record_id

    That's the only filter on that UPDATE statement

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ok, I checked with the developer, and while the logic sounded weird to me, it seems is doing what it's intended for.

    My problem is that I am having some performance issues with it. It ran thousands of times and it is slowing down the app, due locks and the long time it takes sometimes to run.

    I got an estimated plan (attached to this email) I can see that the function can be eliminated from the date attribute but an Index there did not help either. Any other tsql performance improvement I should take into consideration?

    EDIT:
    I can't attach the plan. I am assuming is because my user is new to the forum? Anyway, there is a keylookup against the table that contains the dates and that's the most expensive operator there.

  • Based on the fact that someone used the YEAR() function in one of the renditions of the code, can we safely assume that the t.tabdate column is a DATETIME column?  If so, then all versions of the code that use ">1901" are incorrect (but possibly still acceptable) because that's actually comparing the date of 1905-03-17, which is 1901 days after the first of January, 1900.  It does this because there are no quotes and the integer value is being used as a "Date Serial Number" rather than a year.  You need to put single quotes around the 1901 to have it be interpreted as the first of January, 1901.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • t.tabdate is datetime

  • Jeff Moden - Monday, June 12, 2017 12:24 PM

    Based on the fact that someone used the YEAR() function in one of the renditions of the code, can we safely assume that the t.tabdate column is a DATETIME column?  If so, then all versions of the code that use ">1901" are incorrect (but possibly still acceptable) because that's actually comparing the date of 1905-03-17, which is 1901 days after the first of January, 1900.  It does this because there are no quotes and the integer value is being used as a "Date Serial Number" rather than a year.  You need to put single quotes around the 1901 to have it be interpreted as the first of January, 1901.

     

    In addition Year(field) > 1901 and > '1901' are also not equal.  As Jeff mentioned, '1901' would be interpreted as > '01/01/1901', whereas Year(field)> 1901 would be interpreted as >= '01/01/1902'

    WHERE YEAR(datetimefield) > 1901 --next smallest date value = '1/1/1902'

    WHERE datetimefield  > 1901 --implicit DATEADD(DAY,1901,'1/1/1900'); next smallest date value = '3/17/1905 00:00:00.003'

    WHERE datetimefield > '1901' --implicit conversion to '1/1/1901';  next smallest date value = '1/1/1901 00:00:00.003'

    • This reply was modified 5 years, 5 months ago by  whenriksen.
    • This reply was modified 4 years, 6 months ago by  whenriksen.

    Wes
    (A solid design is always preferable to a creative workaround)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply