Count with Group By clause

  • I need to pull the number of donors and the organization ID's of those donors (I need the orgID's as inputs to an outer query).

    select orgID, count(distinct indID)

    from transactions

    group by orgID

    The problem is, my count is off. Some employees retired during the year and so they have transactions when they were part of 2 organizations (regular and retired). My aggregates are counting these individuals twice. I get the correct counts if I do:

    select count(distinct indID)

    from transactions

    But this does not give me the orgID's that I need. Any help would be appreciated!!!

  • This is just a shot in the dark, but if you are able to exlude the retired transactions it should take care of the problem. It seems like you should be able to join to some table that would give you that information. Just a guess.

    Ben

  • Yes, I could try to do this which would make my full query pretty complex.

    I was hoping there would be a more simple elegant solution using the OVER clause or something. I simply need to count the number of unique individuals without any influence from the group by clause.

    Perhaps a rank solution may work, but I have not used these functions before.

    Hopefully someone else can provide some guidance.

  • I think u can do it though a rank funtion, rank it via as per your indid,

    if u wanna to get help post your ddl and some sample data properly.

    if u wanna to know about rank function follow below link:

    http://blog.sqlauthority.com/2007/10/09/sql-server-2005-sample-example-of-ranking-functions-row_number-rank-dense_rank-ntile/

    http://www.sqlservercurry.com/2009/04/rank-vs-denserank-with-example-using.html

  • select orgID, count(distinct indID)

    from transactions

    group by orgID

    The problem is, my count is off. Some employees retired during the year and so they have transactions when they were part of 2 organizations (regular and retired). My aggregates are counting these individuals twice. I get the correct counts if I do:

    select count(distinct indID)

    from transactions

    Here is my schema, it is very simple:

    TRANSACTIONS(TranID PK, OrgID FK, IndID FK)

    INDIVIDUALS(IndID PK)

    ORGANIZATIONS(OrgID PK)

    Sample Data:

    Individuals

    1

    2

    3

    Organizations

    11

    12

    13

    Transactions

    20 | 11 | 1

    21 | 11 | 1

    22 | 12 | 2

    23 | 12 | 1

    Running the first query gives me:

    11 | 1

    12 | 2

    When summed, this gives me 3 unique donors, but there are only 2. If I run the 2nd query, I get the correct # of donors, 2.

    Thank you!

  • This should give you what you need. It's not tested.

    SELECT OrgID, Count(IndID)

    FROM Individuals AS i

    CROSS APPLY(

    SELECT TOP (1) OrgID

    FROM Transactions AS t

    WHERE t.IndID = i.IndID

    ORDER BY TranID

    ) AS t

    GROUP BY OrgID

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • If you're looking for the count of distinct donors per org, then your results are correct. If you're looking for the count of distinct donors, that's something else entirely. I doesn't seem like the problem is so much that your SQL isn't returning the right results so much as that you haven't clearly defined what you're after. If you only want to count a donor under one org, regardless of how many orgs they're associated with, that's fine, but you should define how to select the org you want to count; e.g. maybe only the org associated with a donor's most recent transaction.

    Something like this:

    SELECT y.OrgID, COUNT(DISTINCT y.IndID)

    FROM (SELECT x.OrgID, x.IndID, RANK() OVER (PARTITION BY x.IndID ORDER BY x.OrgID) OrgRank FROM Transactions x) y

    WHERE y.OrgRank = 1

    GROUP BY y.OrgID;

    Of course you'd change the "ORDER BY" clause of the RANK function to something that put the org you want in the first rank.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply