Using TOP Clauses within Exists Statement

  • Hi everyone

    I recently read an excellent article by Greg Larsen entitled Creative ways to Use the TOP Clause. In the article he suggests a technique to get the second set of 4 records. I thought I might use a similar technique to dynamically retrive records each page of a report. So I set about to test my Idea.

    The original query from the article was.

    select top 4 * from TopDemo a

    where PaymentID in (select top 8 PaymentID from TopDemo)

    order by PaymentID desc

    I set up a test table and populated some records.

    /*

    CREATE TABLE [dbo].[Widgets]([WidgetID] [int] NOT NULL PRIMARY KEY, )

    Declare @WidgetID int

    Set @WidgetID = 1

    While @WidgetID < 500

    Begin

    Insert into dbo.Widgets (WidgetID)

    Values(@widgetID)

    SET @widgetID = @widgetID + 1

    End

    */

    --Next I used the tecnique and wrote the query

    --declare a page size

    Declare @RecordsPerPage int

    Declare @PageNbr int

    set @recordsPerPage = 5

    set @PageNbr = 2

    --example 1 should get page 2 (records 6-10)

    SELECT q1.* FROM

    (SELECT Top (@RecordsPerPage) * FROM dbo.Widgets T1

    WHERE T1.WidgetID IN

    (SELECT Top (@RecordsPerPage * @PageNbr) T2.WidgetID From dbo.Widgets T2

    Order By T2.WidgetID ASC)

    Order by T1.WidgetID Desc) as q1

    Order by q1.WidgetID Asc

    --Results Were great!

    /*WidgetID

    -----------

    6

    7

    8

    9

    10

    */

    --But Never being satisfied with perfection I altered the query to use an exist statement

    --example 2

    SELECT q1.* FROM

    (SELECT Top (@RecordsPerPage) * FROM dbo.Widgets T1

    WHERE EXISTS

    (SELECT Top (@RecordsPerPage * @PageNbr) * From dbo.Widgets T2

    WHERE T2.WidgetID = T1.WidgetID

    Order By T2.WidgetID ASC)

    Order by T1.WidgetID Desc) as q1

    Order by q1.WidgetID Asc

    --Catastrophe!!

    /*WidgetID

    -----------

    495

    496

    497

    498

    499

    */

    /*

    It seems the where clause (T2.WidgetID = T1.WidgetID) in the sub query is being applied to t2 and not filtering t1. I thought the two statements would be eqivelent and I am hoping some one can shed some light on how the TOP and Exist Clause are being applied.

    I altered the second statement

    */

    --Example 3

    SELECT q1.* FROM

    (SELECT Top (@RecordsPerPage) * FROM dbo.Widgets T1

    WHERE EXISTS

    (SELECT q2.* FROM

    (SELECT Top (@RecordsPerPage * @PageNbr) * From dbo.Widgets T2 Order By T2.WidgetID ASC) as q2

    WHERE q2.WidgetID = T1.WidgetID)

    Order by T1.WidgetID Desc) as q1

    Order by q1.WidgetID Asc

    --Thankfully I got the desired results

    /*

    WidgetID

    -----------

    6

    7

    8

    9

    10

    */

    I looked at the execution plan (something I know very little about ) of all three statements and notice that the second uses a Left Semi Join (For each row in the top (outer) input, scan the bottom (inner) input, and output matching rows.) And the first and third use a Merge Join (Match rows from two suitably sorted input tables exploiting their sort order.) Can anyone shed some more light on the situation?

  • Exists doesn't care about the number of records. It's just checking to see if one does, or doesn't exist. It will return true as soon as it hits the first qualifying record.

    btw, since you're on SQL 2005, look at using the row_number function to do your paging. If may turn out more efficient.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks, I think I get it now. I thought the Top filter of the subquery would be applied first and therefore the search for a match would be limited to the first 10 records of T2. But If I had read bol more carefully I would have realized that.

    When a subquery is introduced with the keyword EXISTS, the subquery functions as an existence test. The WHERE clause of the outer query tests whether the rows that are returned by the subquery exist.

    So In example 2 the Search for a match is performed against each row of T1. Something akin to

    ((SELECT Count (*) FROM (Select top 10 * from Widgets T2 Where T2.WidgetID = T1.WidgetID order by T2.widgetID asc)) > 0) is run against each row and every row of T1 qualifies. The TOP and the ORDER BY are meaningless because EXISTS will return true as long as the subquery returns records, It doesn't care how they are sorted or how many records are returned so long as there is at least one.

    The Query engine doesn't run the sub query (Select Top (10) * from widgets T2 Order By t2.WidgetID asc) and then Look for the matches between the two result sets. Which the more I think about it... It makes sense that it shouldn't. I had always imagined it that way to my mistake.

    In example 3 the search for a match is performed against each row of T1 something is akin to

    ((SELECT Count(*) FROM (Select q2.* from (SELECT Top 10 * from Widgets Order by widgetID asc) as q2 WHERE q2.widgetID = T1.widgetID) ) > 0) and only the first 10 rows match. The TOP and ORDER BY now are very meaningful because the WHERE Clause is applied against the filtered result set of the Top 10 records (q2) and the results are as desired

    And you are right about row number, but I think I understand the Exists clause a little better.

    I ran modified examples of 2 and 3

    SELECT q1.* FROM

    (SELECT Top 100 * FROM dbo.Widgets T1

    WHERE EXISTS

    (SELECT Top 10 * From dbo.Widgets T2

    WHERE T2.WidgetID = T1.WidgetID

    Order By T2.WidgetID ASC)

    Order by T1.WidgetID Desc) as q1

    Order by q1.WidgetID Asc

    Returns 100 records

    --example 3

    SELECT q1.* FROM

    (SELECT Top 100 * FROM dbo.Widgets T1

    WHERE EXISTS

    (SELECT q2.* FROM

    (SELECT Top 10 * From dbo.Widgets T2 Order By T2.WidgetID ASC) as q2

    WHERE q2.WidgetID = T1.WidgetID)

    Order by T1.WidgetID Desc) as q1

    Order by q1.WidgetID Asc

    Returns 10 records

  • Sure. Thing is with the rownumber, you'll only be reading the table once. With exists, you're reading it at least twice.

    -- returns 10 rows for the second page of a report

    DECLARE @PageSize Smallint, @PageNumber smallint

    SET @PageSize = 10

    SET @PageNumber = 2

    SELECT q1.* FROM

    (SELECT *, ROW_NUMBER OVER (ORDER BY q1.WidgetID Asc) RowNo FROM dbo.Widgets T1) q1

    WHERE RowNo between (@PageNumber-1)*@PageSize +1 and (@PageNumber*@PageSize)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply