Help with Counts

  • It looks like you're looking for either one row or one column of aggregated data, right?

    Why not write each query on its own, either as inline sub-queries in a single Select statement, or as separate queries with Union operators between them?

    Either:

    SELECT

    (Select count(*) from dbo.MyTable) as TotalRecords,

    (Select count(distinct Datakey) from dbo.MyTable) as DistinctDataKeys,

    (select count(*) from (select distinct Datakey, Landline from dbo.MyTable) as A) as DistinctLandlineDatakeys,

    ... etc.

    or

    select 'Total Records' as [Measure], count(*) as Qty

    from dbo.MyTable

    union all

    select 'Datakeys', count(distinct Datakey)

    from dbo.MyTable

    ... etc.

    Can you query each item separately?

    On the more complex rules, like landlines, where you want distinct values, and each can only count if it has a datakey assigned, that becomes a simple matter of querying distinct landline values where Datakey is not null, or whatever the complexity of the rules is.

    I think the complexity of this task is coming from trying to bite it all off in one query. Break it up and it'll simplify itself.

    Or has that already been looked into and I missed it?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • L' Eomot Inversé (1/13/2012)


    stephen99999 (1/13/2012)


    I think the best way to do this is to just test on various sample data. Really change it up, and see if you can "break" the code. If not, then I guess it works (even if we don't fully understand it).

    Stephen

    That's a pretty dangerous approach - trusting code you don't understand to be correct after testing on a set of small samples. If this code doesn't work, how will you ever know it doesn't work, if the data combination that breaks it is so complex that it can only arise with a sample of at least 5000 rows?

    Never said "small." Also, I was just merely asking him to thoroughly test it, which in one of his very recent posts, he did, and found an error. So I would say my post was just fine.

    Besides, sometimes risks must be taken. I see your point, but at the same time were not testing some quirky code involving deletes, updates, or inserts. His data is safe 🙂

    Again... I see your point, I just like to live dangerously lol. hopefully, my comment does not extend this topic to 10 pages..

  • That's quite clear as far as it goes, but it leaves some questions open. You seem to think that it's something that can be written as a single SQL statement, but as I read it you have secified a constrained maximisation problem which may not be possible to express efficiently in such a way.

    For example, if I have

    datakey landline

    1 100

    1 200

    1 300

    2 200

    2 300

    2 100

    3 300

    3 100

    3 200

    in the data you presumably want the uniquelandline count to be 3 because it is possible to find 3 (datakey, landline) combinations no two of which have the same langline or the same datakey. And presumably you also want the count to be 3 is what I have is

    1 100

    1 200

    1 300

    2 100

    2 200

    2 300

    3 100

    because it is still possible to choose 3 combinations which are distinct in both fields: (1,300), (2,200),and (1,300).

    But of course if I pick (1,100) and (2,200) there is no landline value I can pick to go wioth datakey 3, because its only lanline has already been counted with datakey 1, so to get the right answer I have to maximise the number of combinations by not choosing (1,100) in this case. And also by not choosing (2,100) in this case.

    If that is the requirement (with a simlar requirement for each of the other fields) then none of the code suggested so far comes anywhere near doing what you require, because what you have is a constrained maximisation problem which probably needs trial and backtrack to resolve it, and the code so far makes no attempt at all at doing such maximisation.

    If the answer 2 (instead of 3) would be acceptable in the second case, things may be simpler - but you need to find a way of expressing clearly what the requirement is before we can be sure that it is simpler. For example if your count were defined as the smallest number such that a set with that number of combinations satisfying the distinctness rule can be found from your table is not a proper subset of any set obeying the distinctness rule than can be taken from the table you have an equally nasty constrained minimisation (then, of course, 2 would be the only acceptable answer in the second case). If both 2 and 3 are acceptable, maybe the answer can be any number which is the size of some set of comginations which satisfies the uniqueness rule and is not a proper subset of another such set; in that case you genuinely can have simpler code, but you may find that if you run the query twice on the same data you can get two different answers if for example an index has been defragmented between the two runs, or if an extra CPU has been added to the server.

    Tom

  • L' Eomot Inversé (1/13/2012)


    That's quite clear as far as it goes, but it leaves some questions open. You seem to think that it's something that can be written as a single SQL statement, but as I read it you have secified a constrained maximisation problem which may not be possible to express efficiently in such a way.

    For example, if I have

    datakey landline

    1 100

    1 200

    1 300

    2 200

    2 300

    2 100

    3 300

    3 100

    3 200

    in the data you presumably want the uniquelandline count to be 3 because it is possible to find 3 (datakey, landline) combinations no two of which have the same langline or the same datakey. And presumably you also want the count to be 3 is what I have is

    1 100

    1 200

    1 300

    2 100

    2 200

    2 300

    3 100

    because it is still possible to choose 3 combinations which are distinct in both fields: (1,300), (2,200),and (1,300).

    But of course if I pick (1,100) and (2,200) there is no landline value I can pick to go wioth datakey 3, because its only lanline has already been counted with datakey 1, so to get the right answer I have to maximise the number of combinations by not choosing (1,100) in this case. And also by not choosing (2,100) in this case.

    If that is the requirement (with a simlar requirement for each of the other fields) then none of the code suggested so far comes anywhere near doing what you require, because what you have is a constrained maximisation problem which probably needs trial and backtrack to resolve it, and the code so far makes no attempt at all at doing such maximisation.

    If the answer 2 (instead of 3) would be acceptable in the second case, things may be simpler - but you need to find a way of expressing clearly what the requirement is before we can be sure that it is simpler. For example if your count were defined as the smallest number such that a set with that number of combinations satisfying the distinctness rule can be found from your table is not a proper subset of any set obeying the distinctness rule than can be taken from the table you have an equally nasty constrained minimisation (then, of course, 2 would be the only acceptable answer in the second case). If both 2 and 3 are acceptable, maybe the answer can be any number which is the size of some set of comginations which satisfies the uniqueness rule and is not a proper subset of another such set; in that case you genuinely can have simpler code, but you may find that if you run the query twice on the same data you can get two different answers if for example an index has been defragmented between the two runs, or if an extra CPU has been added to the server.

    Thanks L' Eomot Inversé.

    From what you have described , my ideal answer would have been 3. (This may be over complicating the query though)

    I do not want a query where if I run it again on the same dataset it could potentially return different results. So what I need to do is define a rule to stop this from happening. What if there was a rule which ordered the results by each field (oder by landline asc)? So looking at your dataset, I would end up with (1,100), then (2,200) as 200 is the next smallest landline available? Then (3,300) - resulting in a count of 3.Would this be possible and would this make the query any easier?

  • bicky1980 (1/13/2012)


    Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)

    We need to know more about the data distribution. What do you want to do if there are two rows (and only two out of 40M) with the same datakey, and each row has a unique (unique in 40M rows) landline value? According to this rule, one or both would have to be excluded.

    I've added a single row to your data set to explain:

    DROP table test2

    create table test2(indkey nvarchar(2),datakey nvarchar(4),landline nvarchar(11),mobile nvarchar(11),email nvarchar(20))

    insert into test2 values

    ('1','0001','01234567890','0712345679','1@test.co.uk'),

    ('2','0001','01234567890','','1@test.co.uk'),

    ('3','0002','01234567890','','2@test.co.uk'),

    ('4','0002','01234567891','','2@test.co.uk'),

    ('5','0002','','07123456789',''),

    ('6','0003','01234567892','07123456791',''), -- unique landline, same datakey

    ('7','0003','01234567897','07123456791',''), -- unique landline, same datakey

    ('8','0004','01234567893','07123456792',''),

    ('9','0005','01234567894','07123456793','2@test.co.uk'),

    ('10','0008','01234567895','07123456793','9@test.co.uk')

    SELECT

    indkey,

    datakey,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline),

    landline,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2

    WHERE landline <> ''

    ORDER BY datakey, landline

    The description of the problem is much better - thanks - but there are gaps as Tom has pointed out and as this test shows. Also, a much larger sample data set would go a long way.

    Cheers

    ChrisM


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • The order of the rows in a dataset doesn't really matter to set-based SQL operations on it.

    I just need a clarification on this statement, "Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)"

    If a landline exists multiple times in the data, and has multiple datakey values, don't count it all, or count it only one time?

    E.g.:

    datakey landline pairs:

    1 100

    2 100

    Would that count as 1 or 0? Clarification of "If the same landline is anywhere else in the results its needs to be excluded from the figure". Does that mean exclude it entirely, or exclude the duplicates?

    If that should count as 1, then this seems to work:

    IF OBJECT_ID(N'tempdb..#test2') IS NOT NULL

    DROP TABLE #test2;

    CREATE TABLE #test2

    (indkey NVARCHAR(2),

    datakey NVARCHAR(4),

    landline NVARCHAR(11),

    mobile NVARCHAR(11),

    email NVARCHAR(20))

    INSERT INTO #test2

    VALUES ('1', '0001', '01234567890', '0712345679', '1@test.co.uk'),

    ('2', '0001', '01234567890', '', '1@test.co.uk'),

    ('3', '0002', '01234567890', '', '2@test.co.uk'),

    ('4', '0002', '01234567891', '', '2@test.co.uk'),

    ('5', '0002', '', '07123456789', ''),

    ('6', '0003', '01234567892', '07123456791', ''),

    ('7', '0004', '01234567893', '07123456792', ''),

    ('8', '0005', '01234567894', '07123456793', '2@test.co.uk'),

    ('9', '0008', '01234567895', '07123456793', '9@test.co.uk') ;

    SELECT 'Total' AS CountType, COUNT(*) AS Qty

    FROM #test2

    UNION ALL

    SELECT 'Datakeys', COUNT(DISTINCT datakey)

    FROM #test2

    UNION ALL

    SELECT 'Landlines', COUNT(DISTINCT landline)

    FROM #test2

    WHERE landline > ''

    AND datakey > ''

    UNION ALL

    SELECT 'Mobiles', COUNT(DISTINCT mobile)

    FROM #test2

    WHERE mobile > ''

    AND datakey > ''

    UNION ALL

    SELECT 'Emails', COUNT(DISTINCT email)

    FROM #test2

    WHERE email > ''

    AND datakey > '';

    If 0, then change the Select to this:

    SELECT 'Total' AS CountType, COUNT(*) AS Qty

    FROM #test2

    UNION ALL

    SELECT 'Datakeys', COUNT(DISTINCT datakey)

    FROM #test2

    UNION ALL

    SELECT 'Landlines',

    COUNT(DISTINCT landline)

    FROM (SELECT landline

    FROM #test2

    WHERE landline > ''

    AND datakey > ''

    GROUP BY landline

    HAVING COUNT(*) = 1) AS Sub

    UNION ALL

    SELECT 'Mobiles',

    COUNT(DISTINCT mobile)

    FROM (SELECT mobile

    FROM #test2

    WHERE mobile > ''

    AND datakey > ''

    GROUP BY mobile

    HAVING COUNT(*) = 1) AS Sub

    UNION ALL

    SELECT 'Emails',

    COUNT(DISTINCT email)

    FROM (SELECT email

    FROM #test2

    WHERE email > ''

    AND datakey > ''

    GROUP BY email

    HAVING COUNT(*) = 1) AS Sub;

    (All scripts tested in SQL 2008 R2 Dev Edition.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • stephen99999 (1/12/2012)


    My question is, how did you figure out to do it this way? I have not used row_number(), over, partition by very much and really don't understand it. Can anyone elaborate? Haven't found a good blog that explains this in "idiot terms."

    Ahem... http://qa.sqlservercentral.com/articles/T-SQL/69717/[/url]

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • ChrisM@home (1/13/2012)


    bicky1980 (1/13/2012)


    Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)

    We need to know more about the data distribution. What do you want to do if there are two rows (and only two out of 40M) with the same datakey, and each row has a unique (unique in 40M rows) landline value? According to this rule, one or both would have to be excluded.

    I've added a single row to your data set to explain:

    DROP table test2

    create table test2(indkey nvarchar(2),datakey nvarchar(4),landline nvarchar(11),mobile nvarchar(11),email nvarchar(20))

    insert into test2 values

    ('1','0001','01234567890','0712345679','1@test.co.uk'),

    ('2','0001','01234567890','','1@test.co.uk'),

    ('3','0002','01234567890','','2@test.co.uk'),

    ('4','0002','01234567891','','2@test.co.uk'),

    ('5','0002','','07123456789',''),

    ('6','0003','01234567892','07123456791',''), -- unique landline, same datakey

    ('7','0003','01234567897','07123456791',''), -- unique landline, same datakey

    ('8','0004','01234567893','07123456792',''),

    ('9','0005','01234567894','07123456793','2@test.co.uk'),

    ('10','0008','01234567895','07123456793','9@test.co.uk')

    SELECT

    indkey,

    datakey,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline),

    landline,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2

    WHERE landline <> ''

    ORDER BY datakey, landline

    The description of the problem is much better - thanks - but there are gaps as Tom has pointed out and as this test shows. Also, a much larger sample data set would go a long way.

    Cheers

    ChrisM

    Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys

    Thanks

  • GSquared (1/13/2012)


    The order of the rows in a dataset doesn't really matter to set-based SQL operations on it.

    I just need a clarification on this statement, "Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)"

    If a landline exists multiple times in the data, and has multiple datakey values, don't count it all, or count it only one time?

    E.g.:

    datakey landline pairs:

    1 100

    2 100

    Would that count as 1 or 0? Clarification of "If the same landline is anywhere else in the results its needs to be excluded from the figure". Does that mean exclude it entirely, or exclude the duplicates?

    If that should count as 1, then this seems to work:

    IF OBJECT_ID(N'tempdb..#test2') IS NOT NULL

    DROP TABLE #test2;

    CREATE TABLE #test2

    (indkey NVARCHAR(2),

    datakey NVARCHAR(4),

    landline NVARCHAR(11),

    mobile NVARCHAR(11),

    email NVARCHAR(20))

    INSERT INTO #test2

    VALUES ('1', '0001', '01234567890', '0712345679', '1@test.co.uk'),

    ('2', '0001', '01234567890', '', '1@test.co.uk'),

    ('3', '0002', '01234567890', '', '2@test.co.uk'),

    ('4', '0002', '01234567891', '', '2@test.co.uk'),

    ('5', '0002', '', '07123456789', ''),

    ('6', '0003', '01234567892', '07123456791', ''),

    ('7', '0004', '01234567893', '07123456792', ''),

    ('8', '0005', '01234567894', '07123456793', '2@test.co.uk'),

    ('9', '0008', '01234567895', '07123456793', '9@test.co.uk') ;

    SELECT 'Total' AS CountType, COUNT(*) AS Qty

    FROM #test2

    UNION ALL

    SELECT 'Datakeys', COUNT(DISTINCT datakey)

    FROM #test2

    UNION ALL

    SELECT 'Landlines', COUNT(DISTINCT landline)

    FROM #test2

    WHERE landline > ''

    AND datakey > ''

    UNION ALL

    SELECT 'Mobiles', COUNT(DISTINCT mobile)

    FROM #test2

    WHERE mobile > ''

    AND datakey > ''

    UNION ALL

    SELECT 'Emails', COUNT(DISTINCT email)

    FROM #test2

    WHERE email > ''

    AND datakey > '';

    If 0, then change the Select to this:

    SELECT 'Total' AS CountType, COUNT(*) AS Qty

    FROM #test2

    UNION ALL

    SELECT 'Datakeys', COUNT(DISTINCT datakey)

    FROM #test2

    UNION ALL

    SELECT 'Landlines',

    COUNT(DISTINCT landline)

    FROM (SELECT landline

    FROM #test2

    WHERE landline > ''

    AND datakey > ''

    GROUP BY landline

    HAVING COUNT(*) = 1) AS Sub

    UNION ALL

    SELECT 'Mobiles',

    COUNT(DISTINCT mobile)

    FROM (SELECT mobile

    FROM #test2

    WHERE mobile > ''

    AND datakey > ''

    GROUP BY mobile

    HAVING COUNT(*) = 1) AS Sub

    UNION ALL

    SELECT 'Emails',

    COUNT(DISTINCT email)

    FROM (SELECT email

    FROM #test2

    WHERE email > ''

    AND datakey > ''

    GROUP BY email

    HAVING COUNT(*) = 1) AS Sub;

    (All scripts tested in SQL 2008 R2 Dev Edition.)

    I would only require this to be counted as 1 record only...From your solution for this the results would be:

    Total9

    Datakeys6

    Landlines5

    Mobiles4

    Emails1

    The results should be:

    Total9

    Datakeys6

    Landlines6

    Mobiles5

    Emails3

  • bicky1980 (1/13/2012)


    ...

    Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys

    Thanks

    Try running this query with a larger dataset, and have a look at rows where drn=1 and lrn=1;

    SELECT

    indkey,

    datakey,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline DESC),

    landline,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2

    WHERE landline <> ''

    ORDER BY datakey, landline DESC


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (1/13/2012)


    bicky1980 (1/13/2012)


    ...

    Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys

    Thanks

    Try running this query with a larger dataset, and have a look at rows where drn=1 and lrn=1;

    SELECT

    indkey,

    datakey,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline DESC),

    landline,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2

    WHERE landline <> ''

    ORDER BY datakey, landline DESC

    Can you tell me what this query should be doing?

  • bicky1980 (1/16/2012)


    ChrisM@home (1/13/2012)


    bicky1980 (1/13/2012)


    ...

    Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys

    Thanks

    Try running this query with a larger dataset, and have a look at rows where drn=1 and lrn=1;

    SELECT

    indkey,

    datakey,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline DESC),

    landline,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2

    WHERE landline <> ''

    ORDER BY datakey, landline DESC

    Can you tell me what this query should be doing?

    Sure - here's an excellent article by Wayne Sheffield explaining the window functions[/url].


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Hi Everyone, Thanks for all your help with this query.

    With ChrisM's help, I think I now have the solution:

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    insert into test2

    values ('1','0001','01234567890','0712345679','1@test.co.uk')

    insert into test2

    values ('2','0001','01234567890','','1@test.co.uk')

    insert into test2

    values ('3','0002','01234567890','','2@test.co.uk')

    insert into test2

    values ('4','0002','01234567891','','2@test.co.uk')

    insert into test2

    values ('5','0002','','07123456789','')

    insert into test2

    values ('6','0003','01234567892','07123456791','')

    insert into test2

    values ('7','0004','01234567893','07123456792','')

    insert into test2

    values ('8','0005','01234567894','07123456793','2@test.co.uk')

    insert into test2

    values ('9','0008','01234567895','07123456793','9@test.co.uk')

    insert into test2

    values ('10','0008','01234567895','','5@test.co.uk')

    insert into test2

    values ('11','0008','01234567896','','6@test.co.uk')

    insert into test2

    values ('12','0009','01234567896','','6@test.co.uk')

    insert into test2

    values ('13','0009','01234567889','','6@test.co.uk')

    select 'Total' AS CountType,count(*) as AllRecs

    from test2

    UNION ALL

    select 'Datakeys',sum(drn)

    from (SELECT *,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY datakey)

    FROM test2) a

    where drn=1

    and drn!=''

    UNION ALL

    select 'Landlines',sum(lrn)

    from (SELECT *,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2) a

    where lrn=1

    and landline!=''

    UNION ALL

    select 'Mobiles',sum(mrn)

    from (SELECT *,

    mrn = ROW_NUMBER() OVER(PARTITION BY mobile ORDER BY datakey)

    FROM test2) a

    where mrn=1

    and mobile!=''

    UNION ALL

    select 'Emails',sum(ern)

    from (SELECT *,

    ern = ROW_NUMBER() OVER(PARTITION BY email ORDER BY datakey)

    FROM test2) a

    where ern=1

    and email!=''

    It does appear to collate the correct results - I will test more to ensure the figure is correct. In the meantime could someone assist me with yet another part of this query. A the moment the results are returned as rows, how would I go about returning the results in columns so rather than:

    Total13

    Datakeys7

    Landlines8

    Mobiles5

    Emails5

    I would have

    Total Datakeys Landlines Mobiles Emails

    13 7 8 5 5

    Thanks

  • bicky1980 (1/16/2012)


    Hi Everyone, Thanks for all your help with this query.

    With ChrisM's help, I think I now have the solution:

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    insert into test2

    values ('1','0001','01234567890','0712345679','1@test.co.uk')

    insert into test2

    values ('2','0001','01234567890','','1@test.co.uk')

    insert into test2

    values ('3','0002','01234567890','','2@test.co.uk')

    insert into test2

    values ('4','0002','01234567891','','2@test.co.uk')

    insert into test2

    values ('5','0002','','07123456789','')

    insert into test2

    values ('6','0003','01234567892','07123456791','')

    insert into test2

    values ('7','0004','01234567893','07123456792','')

    insert into test2

    values ('8','0005','01234567894','07123456793','2@test.co.uk')

    insert into test2

    values ('9','0008','01234567895','07123456793','9@test.co.uk')

    insert into test2

    values ('10','0008','01234567895','','5@test.co.uk')

    insert into test2

    values ('11','0008','01234567896','','6@test.co.uk')

    insert into test2

    values ('12','0009','01234567896','','6@test.co.uk')

    insert into test2

    values ('13','0009','01234567889','','6@test.co.uk')

    select 'Total' AS CountType,count(*) as AllRecs

    from test2

    UNION ALL

    select 'Datakeys',sum(drn)

    from (SELECT *,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY datakey)

    FROM test2) a

    where drn=1

    and drn!=''

    UNION ALL

    select 'Landlines',sum(lrn)

    from (SELECT *,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2) a

    where lrn=1

    and landline!=''

    UNION ALL

    select 'Mobiles',sum(mrn)

    from (SELECT *,

    mrn = ROW_NUMBER() OVER(PARTITION BY mobile ORDER BY datakey)

    FROM test2) a

    where mrn=1

    and mobile!=''

    UNION ALL

    select 'Emails',sum(ern)

    from (SELECT *,

    ern = ROW_NUMBER() OVER(PARTITION BY email ORDER BY datakey)

    FROM test2) a

    where ern=1

    and email!=''

    It does appear to collate the correct results - I will test more to ensure the figure is correct. In the meantime could someone assist me with yet another part of this query. A the moment the results are returned as rows, how would I go about returning the results in columns so rather than:

    Total13

    Datakeys7

    Landlines8

    Mobiles5

    Emails5

    I would have

    Total Datakeys Landlines Mobiles Emails

    13 7 8 5 5

    Thanks

    My original query

    SELECT COUNT(*) AS Total,

    COUNT(DISTINCT datakey) AS Uniquedatakey,

    COUNT(DISTINCT CASE WHEN landline<>'' THEN landline END) AS Uniquelandline,

    COUNT(DISTINCT CASE WHEN mobile<>'' THEN mobile END) AS Uniquemobile,

    COUNT(DISTINCT CASE WHEN email<>'' THEN email END) AS Uniqueemail

    FROM #test2

    give these results as well. Coincidence?

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Instead of using a where clause on the conditional values you need you may want to use a CASE statement incorporated into the select list of the select statement to provide the conditional you need to count the conditional records

Viewing 15 posts - 46 through 60 (of 120 total)

You must be logged in to reply to this topic. Login to reply