Top and Group By

  • Here's my scenario, I have customers that can have 2 items assigned to them at the same time. Some of them, though, have more than 2, I need to delete/enddate the oldest ones so they only have 2 active at once.

    I can do a:

    Select CustomerNumber

    From TableA

    Group By CustomerNumber

    Having Count(1) > 2

    and that will give me the list of customers that have more than 2. But now here's my problem, how do I retain the top 2, for each customer, based on my IssuedDate [desc]? I can't throw in my IssuedDate into the above query because then it wouldn't do the grouping, but I need the IssuedDate to do something like Rank() or ROW_NUMBER(). Does anyone else have another suggestion? I realize it would be very easy if someone could just give me a hint, or point me in the right direction. Thanks!

  • Ok, I kind of figured out a way, but I don't like it.

    I did this:

    Select CustomerNumber, IssuedDate

    From

    (

    Select Top 100000 CustomerNumber, IssuedDate, Rank() Over (Partition By CustomerNumber Order By IssuedDate Desc) As [Rank]

    From TableA a

    Where Exists

    (

    Select CustomerNumber

    From TableA a1

    Where a1.CustomerNumber = a.CustomerNumber

    Group By CustomerNumber

    Having Count(1) > 2

    )

    Order By CustomerNumber, Rank

    ) temp

    Where temp.Rank <= 2

    The part I don't like is having to put the Top 100000 in my middle sub-select. How do I get around that, even though I need my order by clause in that middle sub-select so that my Rank is correct?

    BTW - SQL Server said I couldn't put an Order By clause in my sub-select statement unless it had "For XML", or "Top"

  • Hi Gregory, it's been a while ;-):cool:

    you could try something like

    WITH cte as

    (

    SELECT

    CustomerNumber,

    ROW_NUMBER() OVER(PARTITION BY CustomerNumber ORDER BY IssuedDate desc ) AS row

    FROM TableA

    )

    SELECT *

    -- DELETE

    FROM cte

    WHERE row >2



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Yes, it has been, Lutz 🙂

    Those darn CTE's just haven't clicked in my brain yet, but i'm getting better.

    So it appears as though you don't need to have the Group By clause to do a group by as long as you use Rank() or Row_Number()....you guys keep showing me new stuff on here everytime I visit. Thanks again!

  • The PARTITION BY argument of the ranking functions can be seen as an "implicit GROUP BY".



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thanks again Lutz!

    Another completely different question for you if you don't mind answering.

    I have a stored procedure that needs to do a customer lookup based on some values, it's currently written as a CURSOR :w00t:. If it's possible to write the customer lookup part as a function, that would be much faster, right? (Because it would be set-based and there would be multiple function calls simultaneously instead of RBAR)

  • New question, new thread please.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Under almost any circumstances, a function will be faster than a c.u.r.s.o.r. * cough*.

    Especially, if it's an iTvF (inline-Table valued function).

    Depending on the size of the sproc, you could open a new thread and post it (together with the tables involved and some sample data, usually).

    Since a function can be easily called from a SELECT statement whereas a sproc would be "slightly more difficult", it might also help if you explain or demonstrate how you're planning to use the output of the function.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply