Selecting differences between datasets

  • I hope this is a quick one for someone as it's flumoxed me.

    I have 2 SQL statements that output 496 (top statement) and 494 records respectively. The SQL is as follows:

    -Top select

    SELECT ClientID,ReferralNumber,AppointmentDate,ActualTime--,Count(*)

    FROM dbo.vwBRSAppointmentsNPD

    WHERE SpecialtyReferredTo='ICNR'

    AND AppointmentContactCancellationDateTime IS NULL

    AND AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    ORDER BY ClientID,ReferralNumber,AppointmentDate,ActualTime;

    --Bottom Select

    SELECT

    APP.ClientID,APP.ReferralNumber,APP.AppointmentDate,ActualTime

    --,GenHCPCode,RT.TeamCode,RT.SequenceID--,Count(*)

    FROM dbo.vwBRSAppointmentsNPD As APP

    LEFT OUTER JOIN SchemaBRS.AmsReferralTeam As RT on APP.ClientID=RT.ClientID and APP.ReferralNumber=RT.ReferralID

    WHERE SpecialtyReferredTo='ICNR'

    AND APP.AppointmentContactCancellationDateTime IS NULL

    AND APP.AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND RT.Sequenceid = (SELECT TOP 1 RT2.Sequenceid

    FROM SchemaBRS.AmsReferralTeam As RT2

    WHERE RT2.ClientID = RT.ClientID

    AND RT2.ReferralID = RT.ReferralID

    AND AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND AppointmentDate>=StartDate

    AND (AppointmentDate<=EndDate OR EndDate IS NULL)

    ORDER BY RT2.Startdate)

    ORDER BY APP.ClientID,APP.ReferralNumber,APP.AppointmentDate,ActualTime;

    There is no primary key to this data, but what I want to select is the 2 extra records selected by the top select that aren't included in the bottom select.

    I have tried the 'Not In' predicate based on the ClientID, but got no-where (zero records output). I have also tried the 'Not Exists' predicate with the same result, although I am pretty unfamiliar with the latter and I'm pretty obviously doing something wrong.

    If anyone can help, that would be great and save me an immense amount of time as I'm going to have to carry out this exercise across a lot of different datasets.

    Cheers!

  • Try this:-

    --Modified Bottom Select

    SELECT

    APP.ClientID,

    APP.ReferralNumber,

    APP.AppointmentDate,

    ActualTime,

    OneOfMissingRows = CASE WHEN f.Sequenceid IS NULL THEN 1 ELSE 0 END

    --,GenHCPCode,RT.TeamCode,RT.SequenceID--,Count(*)

    FROM dbo.vwBRSAppointmentsNPD As APP

    LEFT OUTER JOIN SchemaBRS.AmsReferralTeam As RT on APP.ClientID=RT.ClientID and APP.ReferralNumber=RT.ReferralID

    LEFT JOIN (

    SELECT Sequenceid, ClientID, ReferralID

    FROM SchemaBRS.AmsReferralTeam As RT2

    WHERE AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND AppointmentDate>=StartDate

    AND (AppointmentDate<=EndDate OR EndDate IS NULL)

    GROUP BY Sequenceid, ClientID, ReferralID

    ) f ON f.Sequenceid = RT.Sequenceid AND f.ClientID = RT.ClientID AND F.ReferralID = RT.ReferralID

    WHERE SpecialtyReferredTo='ICNR' -- common

    AND APP.AppointmentContactCancellationDateTime IS NULL -- common

    AND APP.AppointmentDate between '30-aug-2010' AND '05-sep-2010' -- common

    ORDER BY APP.ClientID, APP.ReferralNumber, APP.AppointmentDate, ActualTime;

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • You might also consider using the EXCEPT keyword to keep your code neater and more concise. I don't know whether there are any performance differences between that and Chris's example - you'd have to test that.

    John

  • Thanks Chris - looks a lot neater - the only problem I have is that the innermost select fails as the AppointmentDate field where I'm setting some date parameters only occurs in the vwBRSAppointmentsNPD table and I'm having problems joining this back to the AmsReferralTeam table (not sure where to put the join or what sort of join to use).

    Cheers

  • Thanks John - I've never heard of 'Except' but I'll look it up.

    Cheers

  • richard.kirby (9/7/2010)


    Thanks Chris - looks a lot neater - the only problem I have is that the innermost select fails as the AppointmentDate field where I'm setting some date parameters only occurs in the vwBRSAppointmentsNPD table and I'm having problems joining this back to the AmsReferralTeam table (not sure where to put the join or what sort of join to use).

    Cheers

    What are the join columns between them?

    --Modified Bottom Select

    SELECT

    APP.ClientID,

    APP.ReferralNumber,

    APP.AppointmentDate,

    APP.ActualTime,

    OneOfMissingRows = CASE WHEN f.Sequenceid IS NULL THEN 1 ELSE 0 END

    FROM dbo.vwBRSAppointmentsNPD As APP

    LEFT JOIN (

    SELECT

    Sequenceid = MAX(Sequenceid),

    ClientID,

    ReferralID

    FROM SchemaBRS.AmsReferralTeam As RT2

    GROUP BY ClientID, ReferralID

    ) f ON f.ClientID = APP.ClientID AND F.ReferralID = APP.ReferralNumber

    WHERE APP.SpecialtyReferredTo = 'ICNR'

    AND APP.AppointmentContactCancellationDateTime IS NULL

    AND APP.AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    ORDER BY APP.ClientID, APP.ReferralNumber, APP.AppointmentDate, APP.ActualTime

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris - it's ClientID.

    Cheers

  • John Mitchell-245523 (9/7/2010)


    You might also consider using the EXCEPT keyword to keep your code neater and more concise. I don't know whether there are any performance differences between that and Chris's example - you'd have to test that.

    John

    If it's just to see which 2 lines are the difference, except is really the way to go. All I had to do to get from the query in your question to the query with except is comment the 1st order by clause and drop the except keyword in between your 2 queries:

    --Top select

    SELECT ClientID,ReferralNumber,AppointmentDate,ActualTime--,Count(*)

    FROM dbo.vwBRSAppointmentsNPD

    WHERE SpecialtyReferredTo='ICNR'

    AND AppointmentContactCancellationDateTime IS NULL

    AND AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    --ORDER BY ClientID,ReferralNumber,AppointmentDate,ActualTime;

    except

    --Bottom Select

    SELECT

    APP.ClientID,APP.ReferralNumber,APP.AppointmentDate,ActualTime

    --,GenHCPCode,RT.TeamCode,RT.SequenceID--,Count(*)

    FROM dbo.vwBRSAppointmentsNPD As APP

    LEFT OUTER JOIN SchemaBRS.AmsReferralTeam As RT on APP.ClientID=RT.ClientID and APP.ReferralNumber=RT.ReferralID

    WHERE SpecialtyReferredTo='ICNR'

    AND APP.AppointmentContactCancellationDateTime IS NULL

    AND APP.AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND RT.Sequenceid = (SELECT TOP 1 RT2.Sequenceid

    FROM SchemaBRS.AmsReferralTeam As RT2

    WHERE RT2.ClientID = RT.ClientID

    AND RT2.ReferralID = RT.ReferralID

    AND AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND AppointmentDate>=StartDate

    AND (AppointmentDate<=EndDate OR EndDate IS NULL)

    ORDER BY RT2.Startdate)

    ORDER BY APP.ClientID,APP.ReferralNumber,APP.AppointmentDate,ActualTime;

    Except syntax is very similar to union's. Basically all it does is described by "return all rows from the 1st resultset that are not in the 2nd".



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • richard.kirby (9/7/2010)


    Thanks John - I've never heard of 'Except' but I'll look it up.

    Cheers

    Richard,

    Here's an example, using your provided code:

    --Top select

    SELECT ClientID,ReferralNumber,AppointmentDate,ActualTime

    --,Count(*)

    FROM dbo.vwBRSAppointmentsNPD

    WHERE SpecialtyReferredTo='ICNR'

    AND AppointmentContactCancellationDateTime IS NULL

    AND AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    --ORDER BY ClientID,ReferralNumber,AppointmentDate,ActualTime;

    EXCEPT --<<<<<< Added this >>>>>>--

    --Bottom Select

    SELECT

    APP.ClientID,APP.ReferralNumber,APP.AppointmentDate,ActualTime

    --,GenHCPCode,RT.TeamCode,RT.SequenceID--,Count(*)

    FROM dbo.vwBRSAppointmentsNPD As APP

    LEFT OUTER JOIN SchemaBRS.AmsReferralTeam As RT on APP.ClientID=RT.ClientID and APP.ReferralNumber=RT.ReferralID

    WHERE SpecialtyReferredTo='ICNR'

    AND APP.AppointmentContactCancellationDateTime IS NULL

    AND APP.AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND RT.Sequenceid = (SELECT TOP 1 RT2.Sequenceid

    FROM SchemaBRS.AmsReferralTeam As RT2

    WHERE RT2.ClientID = RT.ClientID

    AND RT2.ReferralID = RT.ReferralID

    AND AppointmentDate between '30-aug-2010' AND '05-sep-2010'

    AND AppointmentDate>=StartDate

    AND (AppointmentDate<=EndDate OR EndDate IS NULL)

    ORDER BY RT2.Startdate)

    --ORDER BY APP.ClientID,APP.ReferralNumber,APP.AppointmentDate,ActualTime;

    The only other thing I did besides adding the EXCEPT operator was to remark out the ORDER BY lines.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Cheers and thanks for this - I didn't know until now that there was a 'reverse' of union so to speak.

    -Richard

  • Hi Wayne - works as well!

    Such a simple thing has will increase the efficiency of myself and a few colleagues immensely.

    Great stuff all round.

    -Richard

  • richard.kirby (9/8/2010)


    Hi Wayne - works as well!

    Such a simple thing has will increase the efficiency of myself and a few colleagues immensely.

    Great stuff all round.

    -Richard

    It's worthwhile to get to know the new features added into sql. Play with this a little, and you'll see lots of possibilities for things.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply