Help with Counts

  • Mark-101232 (1/16/2012)


    bicky1980 (1/16/2012)


    Hi Everyone, Thanks for all your help with this query.

    With ChrisM's help, I think I now have the solution:

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    insert into test2

    values ('1','0001','01234567890','0712345679','1@test.co.uk')

    insert into test2

    values ('2','0001','01234567890','','1@test.co.uk')

    insert into test2

    values ('3','0002','01234567890','','2@test.co.uk')

    insert into test2

    values ('4','0002','01234567891','','2@test.co.uk')

    insert into test2

    values ('5','0002','','07123456789','')

    insert into test2

    values ('6','0003','01234567892','07123456791','')

    insert into test2

    values ('7','0004','01234567893','07123456792','')

    insert into test2

    values ('8','0005','01234567894','07123456793','2@test.co.uk')

    insert into test2

    values ('9','0008','01234567895','07123456793','9@test.co.uk')

    insert into test2

    values ('10','0008','01234567895','','5@test.co.uk')

    insert into test2

    values ('11','0008','01234567896','','6@test.co.uk')

    insert into test2

    values ('12','0009','01234567896','','6@test.co.uk')

    insert into test2

    values ('13','0009','01234567889','','6@test.co.uk')

    select 'Total' AS CountType,count(*) as AllRecs

    from test2

    UNION ALL

    select 'Datakeys',sum(drn)

    from (SELECT *,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY datakey)

    FROM test2) a

    where drn=1

    and drn!=''

    UNION ALL

    select 'Landlines',sum(lrn)

    from (SELECT *,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2) a

    where lrn=1

    and landline!=''

    UNION ALL

    select 'Mobiles',sum(mrn)

    from (SELECT *,

    mrn = ROW_NUMBER() OVER(PARTITION BY mobile ORDER BY datakey)

    FROM test2) a

    where mrn=1

    and mobile!=''

    UNION ALL

    select 'Emails',sum(ern)

    from (SELECT *,

    ern = ROW_NUMBER() OVER(PARTITION BY email ORDER BY datakey)

    FROM test2) a

    where ern=1

    and email!=''

    It does appear to collate the correct results - I will test more to ensure the figure is correct. In the meantime could someone assist me with yet another part of this query. A the moment the results are returned as rows, how would I go about returning the results in columns so rather than:

    Total13

    Datakeys7

    Landlines8

    Mobiles5

    Emails5

    I would have

    Total Datakeys Landlines Mobiles Emails

    13 7 8 5 5

    Thanks

    My original query

    SELECT COUNT(*) AS Total,

    COUNT(DISTINCT datakey) AS Uniquedatakey,

    COUNT(DISTINCT CASE WHEN landline<>'' THEN landline END) AS Uniquelandline,

    COUNT(DISTINCT CASE WHEN mobile<>'' THEN mobile END) AS Uniquemobile,

    COUNT(DISTINCT CASE WHEN email<>'' THEN email END) AS Uniqueemail

    FROM #test2

    give these results as well. Coincidence?

    Unlikely - they should be the same. From the original description and "scope creep" I don't think we're there yet.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Mark-101232 (1/16/2012)


    bicky1980 (1/16/2012)


    Hi Everyone, Thanks for all your help with this query.

    With ChrisM's help, I think I now have the solution:

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    insert into test2

    values ('1','0001','01234567890','0712345679','1@test.co.uk')

    insert into test2

    values ('2','0001','01234567890','','1@test.co.uk')

    insert into test2

    values ('3','0002','01234567890','','2@test.co.uk')

    insert into test2

    values ('4','0002','01234567891','','2@test.co.uk')

    insert into test2

    values ('5','0002','','07123456789','')

    insert into test2

    values ('6','0003','01234567892','07123456791','')

    insert into test2

    values ('7','0004','01234567893','07123456792','')

    insert into test2

    values ('8','0005','01234567894','07123456793','2@test.co.uk')

    insert into test2

    values ('9','0008','01234567895','07123456793','9@test.co.uk')

    insert into test2

    values ('10','0008','01234567895','','5@test.co.uk')

    insert into test2

    values ('11','0008','01234567896','','6@test.co.uk')

    insert into test2

    values ('12','0009','01234567896','','6@test.co.uk')

    insert into test2

    values ('13','0009','01234567889','','6@test.co.uk')

    select 'Total' AS CountType,count(*) as AllRecs

    from test2

    UNION ALL

    select 'Datakeys',sum(drn)

    from (SELECT *,

    drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY datakey)

    FROM test2) a

    where drn=1

    and drn!=''

    UNION ALL

    select 'Landlines',sum(lrn)

    from (SELECT *,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2) a

    where lrn=1

    and landline!=''

    UNION ALL

    select 'Mobiles',sum(mrn)

    from (SELECT *,

    mrn = ROW_NUMBER() OVER(PARTITION BY mobile ORDER BY datakey)

    FROM test2) a

    where mrn=1

    and mobile!=''

    UNION ALL

    select 'Emails',sum(ern)

    from (SELECT *,

    ern = ROW_NUMBER() OVER(PARTITION BY email ORDER BY datakey)

    FROM test2) a

    where ern=1

    and email!=''

    It does appear to collate the correct results - I will test more to ensure the figure is correct. In the meantime could someone assist me with yet another part of this query. A the moment the results are returned as rows, how would I go about returning the results in columns so rather than:

    Total13

    Datakeys7

    Landlines8

    Mobiles5

    Emails5

    I would have

    Total Datakeys Landlines Mobiles Emails

    13 7 8 5 5

    Thanks

    My original query

    SELECT COUNT(*) AS Total,

    COUNT(DISTINCT datakey) AS Uniquedatakey,

    COUNT(DISTINCT CASE WHEN landline<>'' THEN landline END) AS Uniquelandline,

    COUNT(DISTINCT CASE WHEN mobile<>'' THEN mobile END) AS Uniquemobile,

    COUNT(DISTINCT CASE WHEN email<>'' THEN email END) AS Uniqueemail

    FROM #test2

    give these results as well. Coincidence?

    Because Neither of them work 🙁

    Added 2 addtional rows to check

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    This counts the records as 2 unqiue landlines but the dataskeys are the same, so should only count this as one record.

    Use following to see what I thought was the solution and why it doesnt work (look at datakey 0014)

    SELECT *,

    lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

    FROM test2

    where landline!=''

  • Can you give this a whirl please Bicky, using a decent-sized dataset rather than the somewhat diminutive sample data set? Cheers.

    DROP TABLE #test2

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    insert into #test2 values

    ('1','0001','01234567890','0712345679','1@test.co.uk')

    ,('2','0001','01234567890','','1@test.co.uk')

    ,('3','0002','01234567890','','2@test.co.uk')

    ,('4','0002','01234567891','','2@test.co.uk')

    ,('5','0002','','07123456789','')

    ,('6','0003','01234567892','07123456791','')

    ,('7','0004','01234567893','07123456792','')

    ,('8','0005','01234567894','07123456793','2@test.co.uk')

    ,('9','0005','01234567898','07123456793','2@test.co.uk')

    ,('10','0008','01234567895','07123456793','9@test.co.uk')

    ,('11','0008','01234567895','','5@test.co.uk')

    ,('12','0008','01234567896','','6@test.co.uk')

    ,('13','0009','01234567896','','6@test.co.uk')

    ,('14','0009','01234567889','','6@test.co.uk')

    SELECT Datakey, CAST(landline AS VARCHAR(20))

    INTO #landline

    FROM (

    SELECT

    datakey,

    landline,

    datakeyCount = COUNT(*) OVER(PARTITION BY datakey),

    landlineCount = COUNT(*) OVER(PARTITION BY landline)

    FROM #test2

    WHERE landline <> ''

    ) d

    WHERE datakeyCount = 1 AND landlineCount = 1

    SELECT datakey, mobile

    INTO #mobile

    FROM (

    SELECT

    datakey,

    mobile,

    datakeyCount = COUNT(*) OVER(PARTITION BY datakey),

    mobileCount = COUNT(*) OVER(PARTITION BY mobile)

    FROM #test2

    WHERE mobile <> ''

    --AND datakey NOT IN (SELECT datakey FROM #landline)

    ) d

    WHERE datakeyCount = 1 AND mobileCount = 1

    SELECT datakey, email

    INTO #email

    FROM (

    SELECT

    datakey,

    email,

    datakeyCount = COUNT(*) OVER(PARTITION BY datakey),

    emailCount = COUNT(*) OVER(PARTITION BY email)

    FROM #test2

    WHERE email <> '' --AND datakey NOT IN (SELECT datakey FROM #landline UNION SELECT datakey FROM #mobile)

    ) d

    WHERE datakeyCount = 1 AND emailCount = 1

    -- see how much we've captured by picking distinct values:

    SELECT Commtype = 'LANDLINE', Datakey, landline

    FROM #landline

    UNION ALL

    SELECT Commtype = 'MOBILE', Datakey, mobile

    FROM #mobile

    UNION ALL

    SELECT Commtype = 'EMAIL', Datakey, email

    FROM #email

    -- see how much is left to allocate:

    SELECT datakey

    FROM #test2

    EXCEPT

    SELECT datakey

    FROM #landline

    UNION ALL

    SELECT datakey

    FROM #mobile

    UNION ALL

    SELECT datakey

    FROM #email

    Edit: commented out redundant filters


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (1/16/2012)


    Can you give this a whirl please Bicky, using a decent-sized dataset rather than the somewhat diminutive sample data set? Cheers.

    DROP TABLE #test2

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    insert into #test2 values

    ('1','0001','01234567890','0712345679','1@test.co.uk')

    ,('2','0001','01234567890','','1@test.co.uk')

    ,('3','0002','01234567890','','2@test.co.uk')

    ,('4','0002','01234567891','','2@test.co.uk')

    ,('5','0002','','07123456789','')

    ,('6','0003','01234567892','07123456791','')

    ,('7','0004','01234567893','07123456792','')

    ,('8','0005','01234567894','07123456793','2@test.co.uk')

    ,('9','0005','01234567898','07123456793','2@test.co.uk')

    ,('10','0008','01234567895','07123456793','9@test.co.uk')

    ,('11','0008','01234567895','','5@test.co.uk')

    ,('12','0008','01234567896','','6@test.co.uk')

    ,('13','0009','01234567896','','6@test.co.uk')

    ,('14','0009','01234567889','','6@test.co.uk')

    SELECT Datakey, CAST(landline AS VARCHAR(20))

    INTO #landline

    FROM (

    SELECT

    datakey,

    landline,

    datakeyCount = COUNT(*) OVER(PARTITION BY datakey),

    landlineCount = COUNT(*) OVER(PARTITION BY landline)

    FROM #test2

    WHERE landline <> ''

    ) d

    WHERE datakeyCount = 1 AND landlineCount = 1

    SELECT datakey, mobile

    INTO #mobile

    FROM (

    SELECT

    datakey,

    mobile,

    datakeyCount = COUNT(*) OVER(PARTITION BY datakey),

    mobileCount = COUNT(*) OVER(PARTITION BY mobile)

    FROM #test2

    WHERE mobile <> ''

    --AND datakey NOT IN (SELECT datakey FROM #landline)

    ) d

    WHERE datakeyCount = 1 AND mobileCount = 1

    SELECT datakey, email

    INTO #email

    FROM (

    SELECT

    datakey,

    email,

    datakeyCount = COUNT(*) OVER(PARTITION BY datakey),

    emailCount = COUNT(*) OVER(PARTITION BY email)

    FROM #test2

    WHERE email <> '' --AND datakey NOT IN (SELECT datakey FROM #landline UNION SELECT datakey FROM #mobile)

    ) d

    WHERE datakeyCount = 1 AND emailCount = 1

    -- see how much we've captured by picking distinct values:

    SELECT Commtype = 'LANDLINE', Datakey, landline

    FROM #landline

    UNION ALL

    SELECT Commtype = 'MOBILE', Datakey, mobile

    FROM #mobile

    UNION ALL

    SELECT Commtype = 'EMAIL', Datakey, email

    FROM #email

    -- see how much is left to allocate:

    SELECT datakey

    FROM #test2

    EXCEPT

    SELECT datakey

    FROM #landline

    UNION ALL

    SELECT datakey

    FROM #mobile

    UNION ALL

    SELECT datakey

    FROM #email

    Edit: commented out redundant filters

    Could I just ask, how do you copy code from the code window, so I can simple copy and paste into management studio - I am haviing to reformat the code everytime as it usually always is just spanned over 1 line?

    I ran your query but I just got the following back:

    Commtype Datakey landline

    -------- ------- --------------------

    LANDLINE 0003 01234567892

    LANDLINE 0004 01234567893

    MOBILE 0002 07123456789

    MOBILE 0001 0712345679

    MOBILE 0003 07123456791

    MOBILE 0004 07123456792

  • bicky1980 (1/16/2012)This counts the records as 2 unqiue landlines but the dataskeys are the same, so should only count this as one record.

    This statement contradicts your counts. This would make it impossible to have a higher count of unique landlines than the count of unique datakeys, yet you show a count of 7 unique datakeys and 8 unique landlines.

    Jared
    CE - Microsoft

  • You have this:

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    Then this is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567127','','9@test.co.uk')

    This is 2 unique landlines?

    Is this correct?

    Jared
    CE - Microsoft

  • SQLKnowItAll (1/16/2012)


    bicky1980 (1/16/2012)This counts the records as 2 unqiue landlines but the dataskeys are the same, so should only count this as one record.

    This statement contradicts your counts. This would make it impossible to have a higher count of unique landlines than the count of unique datakeys, yet you show a count of 7 unique datakeys and 8 unique landlines.

    Sorry, started to confuse myself (Thanks for pointing this out),

    Results should be:

    Datasets=7

    Landlines=7

    Mobiles=5

    Emails=4

  • SQLKnowItAll (1/16/2012)


    You have this:

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    Then this is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567127','','9@test.co.uk')

    This is 2 unique landlines?

    Is this correct?

    Yes this is all correct

  • bicky1980 (1/16/2012)


    SQLKnowItAll (1/16/2012)


    You have this:

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    Then this is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567127','','9@test.co.uk')

    This is 2 unique landlines?

    Is this correct?

    Yes this is all correct

    Here is the problem with this logic...

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567127','','9@test.co.uk')

    insert into test2

    values ('20','0015','01234567126','','9@test.co.uk')

    insert into test2

    values ('21','0015','01234567127','','9@test.co.uk')

    So, for 0014 there is 1 unique landline, and for 0015 there is 1 unique landline. However, the landlines are duplicates between 0014 and 0015, so they get dropped... Which rule supercedes the other? Is this 1 unique landline or 2? You see, these are not concrete business rules as defined in the thread. Maybe I confused myself, but with the rules you gave this could come out to 1 unique landline or 2 depending on which rule is satisfied first.

    Jared
    CE - Microsoft

  • SQLKnowItAll (1/16/2012)


    bicky1980 (1/16/2012)


    SQLKnowItAll (1/16/2012)


    You have this:

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    Then this is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567127','','9@test.co.uk')

    This is 2 unique landlines?

    Is this correct?

    Yes this is all correct

    Here is the problem with this logic...

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567127','','9@test.co.uk')

    insert into test2

    values ('20','0015','01234567126','','9@test.co.uk')

    insert into test2

    values ('21','0015','01234567127','','9@test.co.uk')

    So, for 0014 there is 1 unique landline, and for 0015 there is 1 unique landline. However, the landlines are duplicates between 0014 and 0015, so they get dropped... Which rule supercedes the other? Is this 1 unique landline or 2? You see, these are not concrete business rules as defined in the thread. Maybe I confused myself, but with the rules you gave this could come out to 1 unique landline or 2 depending on which rule is satisfied first.

    OK, I see the issue here, I think I was over complecating it to start with. If we could prioritise the datakey initially, then the landline (or other field) afterwards.

  • SQLKnowItAll (1/16/2012)


    bicky1980 (1/16/2012)


    SQLKnowItAll (1/16/2012)


    You have this:

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    Then this is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567127','','9@test.co.uk')

    This is 2 unique landlines?

    Is this correct?

    Yes this is all correct

    Here is the problem with this logic...

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567127','','9@test.co.uk')

    insert into test2

    values ('20','0015','01234567126','','9@test.co.uk')

    insert into test2

    values ('21','0015','01234567127','','9@test.co.uk')

    So, for 0014 there is 1 unique landline, and for 0015 there is 1 unique landline. However, the landlines are duplicates between 0014 and 0015, so they get dropped... Which rule supercedes the other? Is this 1 unique landline or 2? You see, these are not concrete business rules as defined in the thread. Maybe I confused myself, but with the rules you gave this could come out to 1 unique landline or 2 depending on which rule is satisfied first.

    GSquared has already pointed this out.

    It looks to me like a modification of the boxes-and-balls or classrooms-and-students methods will be appropriate here.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (1/16/2012)


    SQLKnowItAll (1/16/2012)


    bicky1980 (1/16/2012)


    SQLKnowItAll (1/16/2012)


    You have this:

    insert into test2

    values ('18','0014','01234567125','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    Then this is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567126','','9@test.co.uk')

    This is 1 unique landline?

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0015','01234567127','','9@test.co.uk')

    This is 2 unique landlines?

    Is this correct?

    Yes this is all correct

    Here is the problem with this logic...

    You have this:

    insert into test2

    values ('18','0014','01234567126','','9@test.co.uk')

    insert into test2

    values ('19','0014','01234567127','','9@test.co.uk')

    insert into test2

    values ('20','0015','01234567126','','9@test.co.uk')

    insert into test2

    values ('21','0015','01234567127','','9@test.co.uk')

    So, for 0014 there is 1 unique landline, and for 0015 there is 1 unique landline. However, the landlines are duplicates between 0014 and 0015, so they get dropped... Which rule supercedes the other? Is this 1 unique landline or 2? You see, these are not concrete business rules as defined in the thread. Maybe I confused myself, but with the rules you gave this could come out to 1 unique landline or 2 depending on which rule is satisfied first.

    GSquared has already pointed this out.

    It looks to me like a modification of the boxes-and-balls or classrooms-and-students methods will be appropriate here.

    And what does that mean?

  • bicky1980 (1/16/2012)


    ...

    And what does that mean?

    They're solutions to a particular type of problem, well known and defined, and I think it's fair to say that one or two folks will be working on it right now. Since it's 10:59pm here in the UK, I'm not one of them - until tomorrow morning 😉


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (1/16/2012)


    bicky1980 (1/16/2012)


    ...

    And what does that mean?

    They're solutions to a particular type of problem, well known and defined, and I think it's fair to say that one or two folks will be working on it right now. Since it's 10:59pm here in the UK, I'm not one of them - until tomorrow morning ;-)[/quote

    Ok, I'm in uk too, so understand... Thanks for all your help do far

  • Hi Bickie

    Here's a solution which seems to work with an extended sample data set.

    Sample data:

    DROP TABLE #test2

    CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

    INSERT INTO #test2 VALUES

    ('1','0001','01234567890','0712345679','1@test.co.uk') -- YES

    ,('2','0001','01234567890','','1@test.co.uk') -- NO, dupe on indkey = 1 (datakey & landline)

    ,('3','0002','01234567890','','2@test.co.uk') -- NO, dupe on indkey = 1 (landline)

    ,('4','0002','01234567891','','2@test.co.uk') -- YES

    ,('5','0002','','07123456789','')-- NO, dupe on indkey = 4 (datakey)

    ,('6','0003','01234567892','07123456791','') -- YES

    ,('7','0004','01234567893','07123456792','') -- YES

    ,('8','0005','01234567894','07123456793','2@test.co.uk') -- YES

    ,('9','0005','01234567898','07123456793','2@test.co.uk') -- NO, dupe on indkey = 8 (datakey)

    ,('10','0008','01234567895','07123456793','9@test.co.uk') -- YES

    ,('11','0008','01234567895','','5@test.co.uk') -- NO, dupe on indkey = 10 (datakey)

    ,('12','0008','01234567896','','6@test.co.uk') -- NO, dupe on indkey = 10 (datakey)

    ,('13','0009','01234567896','','6@test.co.uk') -- NO, dupe on indkey = 10 (landline)

    ,('14','0009','01234567889','','6@test.co.uk') -- YES

    ,('15','0010','01189567889','','6@test.co.uk') -- YES

    ,('16','0010','01189567890','','6@test.co.uk') -- NO, dupe on indkey = 15 (datakey)

    ,('17','0001','03189567889','','6@test.co.uk') -- NO, dupe on indkey = 1 (datakey)

    ,('18','0002','03189567890','','6@test.co.uk') -- NO, dupe on indkey = 4 (datakey)

    ,('19','0003','03189567891','','6@test.co.uk') -- NO, dupe on indkey = 6 (datakey)

    Code:

    ;WITH Level1 AS (

    SELECT indkey, datakey, landline

    FROM (

    SELECT indkey, datakey, landline,

    rnll = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey),

    rndk = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline)

    FROM #test2

    WHERE landline <> ''

    ) d

    WHERE rnll = 1 AND rndk = 1

    ),

    Level2 AS (

    SELECT indkey, datakey, landline

    FROM (

    SELECT

    indkey, datakey, landline,

    rnll = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey),

    rndk = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline)

    FROM #test2 t

    WHERE landline <> ''

    AND NOT EXISTS (SELECT 1 FROM Level1 l WHERE l.landline = t.landline OR l.datakey = t.datakey)

    ) d

    WHERE rnll = 1 AND rndk = 1

    )

    SELECT * FROM Level1

    UNION ALL

    SELECT * FROM Level2

    ORDER BY datakey, landline

    What I suggest you do is this: run the code against a larger sample or your actual data and spool the results into a temp table. Check the temp table for dupes on either datakey or landline. If there are dupes, extend the sample data set such that when the code is run against it, the dupes show - then post the extended sample data set here.

    SELECT *

    INTO #ResultsForChecking

    FROM Level1

    UNION ALL

    SELECT * FROM Level2

    ORDER BY datakey, landline

    Cheers

    ChrisM


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Viewing 15 posts - 61 through 75 (of 120 total)

You must be logged in to reply to this topic. Login to reply