Modulo operator problem

  • Cadavre (12/12/2011)


    Be aware, this code is different to Jim's because I think he was hitting the tables more than he needed to.

    Thank you for your insight. There's a problem though with your code Cadavre, it doesn't meet one of critireas that was set.

    That is, get users that only have old jobs. You might have overlooked that value 9999 represent ongoing job in timeendyear and timeendmonth.

    So if Noah would get an ongoing job "Cattle Driver", he'd still get selected in your script. Where Jims script excludes Noah, which is the right way.

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (4, 'Cattle Driver', 2006, 1, 9999, 9999)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (4, 'Farmer', 2006, 1, 2009, 7)

    Is it possible to select only a single job if a user have same date for several jobs. Preferably the job with highest alfabetic value. That is select first jobname that is ordered by ASC.

    Here's the whole code:

    BEGIN TRAN

    SET NOCOUNT ON

    CREATE TABLE #workexperience (id INT identity(1, 1), userid INT, jobname VARCHAR(20) NULL, timestartyear INT, timestartmonth INT, timeendyear INT, timeendmonth INT)

    CREATE TABLE #users (id INT identity(1, 1), firstname VARCHAR(20), lastlogin DATE, inactive INT)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Peter', '20111201', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Robert', '20111202', 1)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Johan', '20111203', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Noah', '20111204', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('David', '20001205', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Evil Twin', '20111204', 0)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (1, 'Manager', 2011, 1, 9999, 9999)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (2, 'Pilot', 2011, 2, 2011, 5)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (3, 'Clown', 2011, 3, 2011, 11)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (3, 'Librarian', 2010, 5, 2010, 12)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (3, 'Accountant', 2010, 6, 2010, 7)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (4, 'Cattle Driver', 2006, 1, 9999, 9999)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (4, 'Farmer', 2006, 1, 2009, 7)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (5, 'Driver', 2007, 4, 2008, 5)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (6, 'Emperor', 2006, 1, 2009, 7)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (6, 'Conqueror', 2006, 1, 2009, 7)

    insert into #workexperience(userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth)VALUES (6, 'Sailor', 2003, 1, 2004, 7)

    insert into #workexperience(userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth)VALUES (6, 'Captain', 2002, 1, 2005, 7)

    PRINT '========== Jim-720070 version =========='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT id, firstname, jobname, DaysSinceLastJob, lastJob

    FROM (SELECT u.id, u.firstname, we.jobname, DATEDIFF(DD, lastJob, GETDATE()) DaysSinceLastJob, --The below ROW_NUMBER function lets you get the most recent job

    ROW_NUMBER() OVER (ORDER BY u.id, we.timeendyear DESC, we.timeendmonth DESC) AS RecentJob, lastJob

    FROM (SELECT id, MAX(JobEnd) AS lastJob

    FROM (SELECT u.id, we.id AS WE_Id, we.jobname, --Here we're combining timestartyear and timestartmonth into a more useable DATETIME.

    DATEADD(MONTH, ((we.timestartyear - 1900) * 12) + we.timestartmonth - 1, 1 - 1) AS JobStart, --Here we're converting timeendyear and timeendmonth into a DATETIME

    --while creating a NULL if either of the values are 9999

    CASE WHEN we.timeendyear = 9999 OR we.timeendmonth = 9999

    THEN NULL

    ELSE DATEADD(MONTH, ((we.timeendyear - 1900) * 12) + we.timeendmonth - 1, 1 - 1) END AS JobEnd

    FROM #users u

    INNER JOIN #workexperience we ON we.userid = u.id

    WHERE u.lastlogin >= DATEADD(MM, - 6, GETDATE()) --Only users that have logged in in the last 6 months

    AND u.inactive <> 1 --Only users that are active

    ) job

    GROUP BY job.id --We need to group by the #users.id field to get the user's last job.

    --VV Here we need to find all users have a last job more than 6 months ago.

    HAVING MAX(ISNULL(JobEnd, GETDATE())) < DATEADD(M, - 6, GETDATE()) --VV Here we need to figure out the difference between their last job and today and make sure %2 = 0

    AND DATEDIFF(DD, MAX(JobEnd), GETDATE()) % 2 = 0

    ) oldJobs

    INNER JOIN #users u ON u.id = oldJobs.id --Now join back to the #users table to find the names of all the users that fit the criteria.

    INNER JOIN #workexperience we ON we.userid = u.id

    ) FinalJobs

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',80)

    PRINT '========== Cadavre version =========='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT userid, firstname, lastEndDate AS lastJob, DATEDIFF(dd, lastEndDate, GETDATE()) AS daysSinceLastJob,

    MAX(jobname) AS jobname

    FROM (SELECT a.id AS userid, a.firstname, a.lastlogin, a.inactive, b.jobname,

    DATEADD(m, CASE WHEN timestartmonth = 9999

    THEN NULL

    ELSE timestartmonth-1 END, CONVERT(DATETIME, CASE WHEN timestartyear = 9999

    THEN NULL

    ELSE CONVERT(VARCHAR(4),timestartyear) END)) AS startDate,

    DATEADD(m, CASE WHEN timeendmonth = 9999

    THEN NULL

    ELSE timeendmonth-1 END, CONVERT(DATETIME, CASE WHEN timeendyear = 9999

    THEN NULL

    ELSE CONVERT(VARCHAR(4),timeendyear) END)) AS endDate,

    MAX(DATEADD(m, CASE WHEN timeendmonth = 9999

    THEN NULL

    ELSE timeendmonth-1 END, CONVERT(DATETIME, CASE WHEN timeendyear = 9999

    THEN NULL

    ELSE CONVERT(VARCHAR(4),timeendyear) END))) OVER (PARTITION BY a.id) AS lastEndDate

    FROM #users a

    INNER JOIN #workexperience b ON a.id = b.userid

    WHERE a.inactive <> 1 AND lastlogin >= DATEADD(mm, - 6, GETDATE())) workQuery

    WHERE lastEndDate = endDate

    AND DATEDIFF(DD, endDate, GETDATE()) % 2 = 0

    GROUP BY userid, firstname, lastEndDate, DATEDIFF(dd, lastEndDate, GETDATE())

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',80)

    ROLLBACK

    Ty for your time

  • You're right, I did miss that requirement.

    Here's the corrected version for my code: -

    SELECT userid, firstname, lastEndDate AS lastJob, DATEDIFF(dd, lastEndDate, GETDATE()) AS daysSinceLastJob,

    MAX(jobname) AS jobname

    FROM (SELECT a.id AS userid, a.firstname, a.lastlogin, a.inactive, b.jobname,

    CASE WHEN timestartmonth = 9999 OR timestartyear = 9999

    THEN NULL

    ELSE DATEADD(m, timestartmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timestartyear)))

    END AS startDate,

    CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN NULL

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END AS endDate,

    MAX(CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN GETDATE()

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END) OVER (PARTITION BY a.id) AS lastEndDate

    FROM #users a

    INNER JOIN #workexperience b ON a.id = b.userid

    WHERE a.inactive <> 1 AND lastlogin >= DATEADD(mm, - 6, GETDATE())) workQuery

    WHERE lastEndDate = endDate

    AND DATEDIFF(DD, endDate, GETDATE()) % 2 = 0

    GROUP BY userid, firstname, lastEndDate, DATEDIFF(dd, lastEndDate, GETDATE())

    And this is Jim's code corrected for your new requirement

    SELECT id, firstname, MAX(jobname) AS jobname, DaysSinceLastJob, lastJob

    FROM (SELECT u.id, u.firstname, we.jobname, DATEDIFF(DD, lastJob, GETDATE()) DaysSinceLastJob,

    ROW_NUMBER() OVER (ORDER BY u.id, we.timeendyear DESC, we.timeendmonth DESC) AS RecentJob, lastJob

    FROM (SELECT id, MAX(JobEnd) AS lastJob

    FROM (SELECT u.id, we.id AS WE_Id, we.jobname,

    DATEADD(MONTH, ((we.timestartyear - 1900) * 12) + we.timestartmonth - 1, 1 - 1) AS JobStart,

    CASE WHEN we.timeendyear = 9999 OR we.timeendmonth = 9999

    THEN NULL

    ELSE DATEADD(MONTH, ((we.timeendyear - 1900) * 12) + we.timeendmonth - 1, 1 - 1) END AS JobEnd

    FROM #users u

    INNER JOIN #workexperience we ON we.userid = u.id

    WHERE u.lastlogin >= DATEADD(MM, - 6, GETDATE())

    AND u.inactive <> 1

    ) job

    GROUP BY job.id

    HAVING MAX(ISNULL(JobEnd, GETDATE())) < DATEADD(M, - 6, GETDATE())

    AND DATEDIFF(DD, MAX(JobEnd), GETDATE()) % 2 = 0

    ) oldJobs

    INNER JOIN #users u ON u.id = oldJobs.id

    INNER JOIN #workexperience we ON we.userid = u.id

    ) FinalJobs

    GROUP BY id, firstname, DaysSinceLastJob, lastJob

    And some stats to compare: -

    ========== Jim version ==========

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#workexperience'. Scan count 2, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#users'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ================================================================================

    ========== Cadavre version ==========

    Table 'Worktable'. Scan count 3, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#workexperience'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#users'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    ================================================================================

    Test both on your real data to see which is better suited.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • It seems like it works! In a single SQL, thats amazing...

    Ty, you guys rock!

    *christmas hug*

  • memymasta (12/12/2011)


    It seems like it works! In a single SQL, thats amazing...

    Ty, you guys rock!

    *christmas hug*

    No problem.

    Out of curiosity, which performs better on your real data? (I won't be offended if it isn't mine :-P)


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Thanks for looking into a bit more deeply Cadavre.. I've been working with SQL Server for 5 years now and have literally never had anyone else look at my code to review.

    I was feeling like I was hitting the tables too many times as well.. Def worth another look! Haha..

    SQL SERVER Central Forum Etiquette[/url]

  • Elliott Berkihiser (12/7/2011)


    OK, this looks like as good time as any to ask about DateTime data. The advice to the OP was to use datetime datatypes for date data. What's bugging me is that Microsoft sometimes uses other data types to represent time. See the system table "sysjobhistory" (sorry I just realized that I'm looking at SQL 2005). Run_date and Run_time are integer data types. Can anyone tell me why?

    Heh... yep. Stupid mistake on their part. Go look at sp_spaceused or nearly any sp_help* procs they wrote if you want proof that even the folks at MS can make serious mistakes. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Oh one more thing!

    I compared both queries to view the performance, and realized that Cadavres query return 1/4 more results then Jims...

    Checking the results it showed that Cadavres does include jobs that are newer then 6 months. Which does not follow the criterea of:

    select users that only have jobs that are older then 6 months.

    Think you got tricked by the modulo... which excludes those users today, but shows them tomorrow.

    If you add a new user and new job for that user:

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Nofun', '20111204', 0)

    INSERT INTO #workexperience(userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth)VALUES (7, 'Jailor', 2005, 1, 2011, 9)

    The query returns:

    6Evil Twin2009-07-01 00:00:00.000894Emperor

    7Nofun 2011-09-01 00:00:00.000102Jailor

    Where Nofun has 102 days SinceLastJob, less then (6 months).

    Here's the whole code:

    BEGIN TRAN

    SET NOCOUNT ON

    CREATE TABLE #workexperience (id INT identity(1, 1), userid INT, jobname VARCHAR(20) NULL, timestartyear INT, timestartmonth INT, timeendyear INT, timeendmonth INT)

    CREATE TABLE #users (id INT identity(1, 1), firstname VARCHAR(20), lastlogin DATE, inactive INT)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Peter', '20111201', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Robert', '20111202', 1)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Johan', '20111203', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Noah', '20111204', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('David', '20001205', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Evil Twin', '20111204', 0)

    INSERT INTO #users (firstname, lastlogin, inactive) VALUES ('Nofun', '20111204', 0)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (1, 'Manager', 2011, 1, 9999, 9999)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (2, 'Pilot', 2011, 2, 2011, 5)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (3, 'Clown', 2011, 3, 2011, 11)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (3, 'Librarian', 2010, 5, 2010, 12)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (3, 'Accountant', 2010, 6, 2010, 7)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (4, 'Cattle Driver', 2006, 1, 9999, 9999)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (4, 'Farmer', 2006, 1, 2009, 7)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (5, 'Driver', 2007, 4, 2008, 5)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (6, 'Emperor', 2006, 1, 2009, 7)

    INSERT INTO #workexperience (userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth) VALUES (6, 'Conqueror', 2006, 1, 2009, 7)

    insert into #workexperience(userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth)VALUES (6, 'Sailor', 2003, 1, 2004, 7)

    insert into #workexperience(userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth)VALUES (6, 'Captain', 2002, 1, 2005, 7)

    insert into #workexperience(userid, jobname, timestartyear, timestartmonth, timeendyear, timeendmonth)VALUES (7, 'Jailor', 2005, 1, 2011, 9)

    PRINT '========== Jim-720070 version =========='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT id, firstname, MAX(jobname) AS jobname, DaysSinceLastJob, lastJob

    FROM (SELECT u.id, u.firstname, we.jobname, DATEDIFF(DD, lastJob, GETDATE()) DaysSinceLastJob,

    ROW_NUMBER() OVER (ORDER BY u.id, we.timeendyear DESC, we.timeendmonth DESC) AS RecentJob, lastJob

    FROM (SELECT id, MAX(JobEnd) AS lastJob

    FROM (SELECT u.id, we.id AS WE_Id, we.jobname,

    DATEADD(MONTH, ((we.timestartyear - 1900) * 12) + we.timestartmonth - 1, 1 - 1) AS JobStart,

    CASE WHEN we.timeendyear = 9999 OR we.timeendmonth = 9999

    THEN NULL

    ELSE DATEADD(MONTH, ((we.timeendyear - 1900) * 12) + we.timeendmonth - 1, 1 - 1) END AS JobEnd

    FROM #users u

    INNER JOIN #workexperience we ON we.userid = u.id

    WHERE u.lastlogin >= DATEADD(MM, - 6, GETDATE())

    AND u.inactive <> 1

    ) job

    GROUP BY job.id

    HAVING MAX(ISNULL(JobEnd, GETDATE())) < DATEADD(M, - 6, GETDATE())

    AND DATEDIFF(DD, MAX(JobEnd), GETDATE()) % 2 = 0

    ) oldJobs

    INNER JOIN #users u ON u.id = oldJobs.id

    INNER JOIN #workexperience we ON we.userid = u.id

    ) FinalJobs

    GROUP BY id, firstname, DaysSinceLastJob, lastJob

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',80)

    PRINT '========== Cadavre version =========='

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT userid, firstname, lastEndDate AS lastJob, DATEDIFF(dd, lastEndDate, GETDATE()) AS daysSinceLastJob,

    MAX(jobname) AS jobname

    FROM (SELECT a.id AS userid, a.firstname, a.lastlogin, a.inactive, b.jobname,

    CASE WHEN timestartmonth = 9999 OR timestartyear = 9999

    THEN NULL

    ELSE DATEADD(m, timestartmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timestartyear)))

    END AS startDate,

    CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN NULL

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END AS endDate,

    MAX(CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN GETDATE()

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END) OVER (PARTITION BY a.id) AS lastEndDate

    FROM #users a

    INNER JOIN #workexperience b ON a.id = b.userid

    WHERE a.inactive <> 1 AND lastlogin >= DATEADD(mm, - 6, GETDATE())) workQuery

    WHERE lastEndDate = endDate

    AND DATEDIFF(DD, endDate, GETDATE()) % 2 = 0

    GROUP BY userid, firstname, lastEndDate, DATEDIFF(dd, lastEndDate, GETDATE())

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',80)

    ROLLBACK

  • Next time, I might even read the whole thread before I post!!

    Corrected.

    SELECT userid, firstname, lastEndDate AS lastJob, DATEDIFF(dd, lastEndDate, GETDATE()) AS daysSinceLastJob,

    MAX(jobname) AS jobname

    FROM (SELECT a.id AS userid, a.firstname, a.lastlogin, a.inactive, b.jobname,

    CASE WHEN timestartmonth = 9999 OR timestartyear = 9999

    THEN NULL

    ELSE DATEADD(m, timestartmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timestartyear)))

    END AS startDate,

    CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN NULL

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END AS endDate,

    MAX(CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN GETDATE()

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END) OVER (PARTITION BY a.id) AS lastEndDate

    FROM #users a

    INNER JOIN #workexperience b ON a.id = b.userid

    WHERE a.inactive <> 1 AND lastlogin >= DATEADD(mm, - 6, GETDATE())) workQuery

    WHERE lastEndDate = endDate

    AND DATEDIFF(DD, endDate, GETDATE()) % 2 = 0

    AND endDate < DATEADD(mm, - 6, GETDATE())

    GROUP BY userid, firstname, lastEndDate, DATEDIFF(dd, lastEndDate, GETDATE())


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Here's the result with real data:

    ========== Jim-720070 version ==========

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (3336 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 2, logical reads 11822, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'users'. Scan count 1, logical reads 16838, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 203 ms, elapsed time = 486 ms.

    ================================================================================

    ========== Cadavre version ==========

    (3336 row(s) affected)

    Table 'users'. Scan count 5, logical reads 7321, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 5, logical reads 6539, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 12, logical reads 89034, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 482 ms, elapsed time = 371 ms.

    ================================================================================

    The time seem to vary a bit inbetween the runs, but overall it seems that Jims has alot less "CPU time", but Cadavres have less "elapsed time".

    Does it mean that Jims version is faster?

  • memymasta (12/12/2011)


    Here's the result with real data:

    ========== Jim-720070 version ==========

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (3336 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 2, logical reads 11822, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'users'. Scan count 1, logical reads 16838, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 203 ms, elapsed time = 486 ms.

    ================================================================================

    ========== Cadavre version ==========

    (3336 row(s) affected)

    Table 'users'. Scan count 5, logical reads 7321, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 5, logical reads 6539, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 12, logical reads 89034, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 482 ms, elapsed time = 371 ms.

    ================================================================================

    The time seem to vary a bit inbetween the runs, but overall it seems that Jims has alot less "CPU time", but Cadavres have less "elapsed time".

    Does it mean that Jims version is faster?

    The "elapsed" time is the time taken, however the IO points to it being quite bad.

    Try these changes and test again (it's a little difficult without an actual execution plan)

    SELECT userid, firstname, endDate AS lastJob, DATEDIFF(dd, endDate, GETDATE()) AS daysSinceLastJob,

    MAX(jobname) AS jobname

    FROM (SELECT a.id AS userid, a.firstname, a.lastlogin, a.inactive, b.jobname,

    CASE WHEN timestartmonth = 9999 OR timestartyear = 9999

    THEN NULL

    ELSE DATEADD(m, timestartmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timestartyear)))

    END AS startDate,

    CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN NULL

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END AS endDate,

    ROW_NUMBER() OVER (PARTITION BY a.id ORDER BY CASE WHEN timeendmonth = 9999 OR timeendyear = 9999

    THEN GETDATE()

    ELSE DATEADD(m, timeendmonth-1, CONVERT(DATETIME, CONVERT(VARCHAR(4),timeendyear)))

    END DESC) AS rn

    FROM #users a

    INNER JOIN #workexperience b ON a.id = b.userid

    WHERE a.inactive <> 1 AND lastlogin >= DATEADD(mm, - 6, GETDATE())) workQuery

    WHERE rn = 1

    AND DATEDIFF(DD, endDate, GETDATE()) % 2 = 0

    AND endDate < DATEADD(mm, - 6, GETDATE())

    GROUP BY userid, firstname, endDate


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (12/12/2011)


    Try these changes and test again (it's a little difficult without an actual execution plan)

    ========== Jim-720070 version ==========

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (3336 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 2, logical reads 11822, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'users'. Scan count 1, logical reads 16838, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 467 ms.

    ================================================================================

    ========== Cadavre version ==========

    (3336 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 1, logical reads 5911, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'users'. Scan count 1, logical reads 6613, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 561 ms.

    ================================================================================

    Overall this values seems to be recurring.

    However, reading the execution plan the following values are stated:

    Query1 (Jims)

    Query cost(relative to the batch): 57%

    Query2 (Cadavres)

    Query cost(relative to the batch): 43%

    Congratz :')

  • memymasta (12/12/2011)


    Cadavre (12/12/2011)


    Try these changes and test again (it's a little difficult without an actual execution plan)

    ========== Jim-720070 version ==========

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (3336 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 2, logical reads 11822, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'users'. Scan count 1, logical reads 16838, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 467 ms.

    ================================================================================

    ========== Cadavre version ==========

    (3336 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'userworkexperience'. Scan count 1, logical reads 5911, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'users'. Scan count 1, logical reads 6613, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 561 ms.

    ================================================================================

    Overall this values seems to be recurring.

    However, reading the execution plan the following values are stated:

    Query1 (Jims)

    Query cost(relative to the batch): 57%

    Query2 (Cadavres)

    Query cost(relative to the batch): 43%

    Congratz :')

    The cost relative to the batch is a red-herring, it very rarely represents actual cost (I've had two versions of code where 1 executed in 100ms and the other in 15000ms. The 100ms code showed as 98% of the cost of the batch compared to 2% for the 15000ms code - which is obviously untrue!).

    Looking at the times, Jim's appears better. However, looking at the IO, mine should be. So, I'm guessing that Jim's query is cached on your system so producing better timings. If you're testing on a test server, clear the cache and test again (I normally test 10 times then average out the timings and IO to compare). If this is a production server, do not clear the cache!


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Running on production server, but yes, alot less reads.

    The changes you did made a huge impact on reads in your query.

  • Jim-720070 (12/12/2011)


    Thanks for looking into a bit more deeply Cadavre.. I've been working with SQL Server for 5 years now and have literally never had anyone else look at my code to review.

    I was feeling like I was hitting the tables too many times as well.. Def worth another look! Haha..

    I'm the same, although less experienced!

    I've worked with SQL Server for 2 years, but the way I work is a little different to everyone I've worked with so far.

    This is how I perform a task: -

    1. I start with pen and paper and jot down a few algorithms. Normally, I end up with between 3 and 10 algorithms written down.

    2. I'll then code up the 3 that I think are most likely to perform the best, as well as 1 RBAR algorithm.

    3. I set-up a test environment of 100 rows based on the table structures and data set and run them all against it comparing the result-sets by inserting each one into temp tables. I repeat this test 10 times, assuming that each run produces the same results.

    4. Once all of my algorithms have passed the first stage of testing, I recreate my test environment based on the table structures and data set but this time add 1,000,000 rows and test again. I repeat this test 5 times, assuming that each run produces the same results.

    5. Once I have the two fastest, I take a sub-set of the actual data and test again, this is not repeated.

    6. Assuming the results are the same, I do a final test on the actual tables that will be used. I repeat this test 5 times.

    7. Finally, I do a logic check of the fastest algorithm, then submit the code into the correct branch of our software.

    It makes development take slightly longer for me than for some others, but it also means that when I submit code I can be confident that it will scale well and performs as expected.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (12/12/2011)


    This is how I perform a task: -

    ...

    Your checklist is now mine 😀

Viewing 15 posts - 16 through 30 (of 36 total)

You must be logged in to reply to this topic. Login to reply