Finding NULL count

  • Experts,

    I have a table called Company. It has 42 columns. This is the key table for all SP's. I saw that lot of columns have only NULL values. I.e nothing is stored in that columns.

    My requirement is,say for example,Assume i have a table Called Emp.

    Emp:

    ........................................................

    Eno Ename Salary Phone Fax Email

    .........................................................

    1 AA 5000 4534 NULL aa@aa.com

    2 BB 4000 NULL NULL NULL

    3 CC 6000 NULL NULL cc@cc.com

    4 DD 2000 NULL NULL dd@dd.com

    5 EE 1000 NULL NULL NULL

    .........................................................

    Expected Output is

    ..........................................................

    ColumnName RowValue_Count NULL_Count

    ..........................................................

    Eno 5 0

    Ename 5 0

    Salary 5 0

    Phone 1 4

    Fax 0 5

    Email 3 2

    .........................................................

    I have some questions.

    1) Will NULL affect the performance ? Because i am going to involve myself in a SP tuning work ? If it affects, shall we remove that column ? or suppose if it is required by the users for future use shall we store some other value (for example 0) instead of NULL.

    2)If the answer for the above question is 'YES',In what way it will affect the performance ?

    Inputs are highly appreciable !

    karthik

  • The simpler approach would be to just pass through the table for each column and use SUM and CASE to count up your totals. However, if your "Kipp" table has a great many rows this will be inefficient. A better approach might be to make a single pass through the table and compute your summary data and then list it out in the appropriate rows. A first pass at this (albeit a bit ragged looking) might look something like:

    declare @company table

    ( Eno int,

    Ename varchar(7),

    Salary int,

    Phone varchar(6),

    Fax varchar(6),

    Email varchar(12)

    )

    insert into @company

    select 1, 'AA', 5000, 4534, NULL, 'aa@aa.com' union all

    select 2, 'BB', 4000, NULL, NULL, NULL union all

    select 3, 'CC', 6000, NULL, NULL, 'cc@cc.com' union all

    select 4, 'DD', 2000, NULL, NULL, 'dd@dd.com' union all

    select 5, 'EE', 1000, NULL, NULL, NULL

    --select * from @company

    ;with summary as

    ( select

    sum ( case when Eno is not null then 1 else 0 end ) as Eno_Val,

    sum ( case when Eno is null then 1 else 0 end ) as Eno_Null,

    sum ( case when Ename is not null then 1 else 0 end ) as Ename_Val,

    sum ( case when Ename is null then 1 else 0 end ) as Ename_Null,

    sum ( case when Salary is not null then 1 else 0 end ) as Salary_Val,

    sum ( case when Salary is null then 1 else 0 end ) as Salary_Null,

    sum ( case when Phone is not null then 1 else 0 end ) as Phone_Val,

    sum ( case when Phone is null then 1 else 0 end ) as Phone_Null,

    sum ( case when Fax is not null then 1 else 0 end ) as Fax_Val,

    sum ( case when Fax is null then 1 else 0 end ) as Fax_Null,

    sum ( case when EMail is not null then 1 else 0 end ) as Email_Val,

    sum ( case when EMail is null then 1 else 0 end ) as Email_Null

    from @company

    ), matchList as

    ( select 1 as n, 'Eno' as ColumnName union all

    select 2, 'Ename' union all

    select 3, 'Salary' union all

    select 4, 'Phone' union all

    select 5, 'Fax' union all

    select 6, 'Email'

    )

    select

    columnName,

    case n when 1 then eno_val

    when 2 then ename_val

    when 3 then Salary_val

    when 4 then Phone_val

    when 5 then Fax_val

    when 6 then Email_val

    else null

    end as rowValue_count,

    case n when 1 then eno_null

    when 2 then ename_null

    when 3 then Salary_null

    when 4 then Phone_null

    when 5 then Fax_null

    when 6 then Email_null

    end as null_count

    from matchList

    cross join summary

    /* -------- Sample Output: --------

    columnName rowValue_count null_count

    ---------- -------------- -----------

    Eno 5 0

    Ename 5 0

    Salary 5 0

    Phone 1 4

    Fax 0 5

    Email 3 2

    */

  • I think it won't work in Sql2000. I am using sql2000.

    karthik

  • It will modify to work with SQL Server 2000 fine; however, you will need to use derived tables instead of CTEs. I will look at this over lunch and get back to you.

  • Kent,

    Thanks for your prompt reply. Carry on !

    karthik

  • Karthik,

    Here is a version that ought to work for SQL Server 2000:

    select

    columnName,

    case n when 1 then Eno_val

    when 2 then eName_val

    when 3 then Salary_val

    when 4 then Phone_val

    when 5 then Fax_val

    when 6 then Email_val

    else null

    end as rowValue_count,

    case n when 1 then Eno_null

    when 2 then eName_null

    when 3 then Salary_null

    when 4 then Phone_null

    when 5 then Fax_null

    when 6 then Email_null

    else null

    end as null_count

    from

    ( select

    sum ( case when Eno is not null then 1 else 0 end ) as Eno_Val,

    sum ( case when Eno is null then 1 else 0 end ) as Eno_Null,

    sum ( case when Ename is not null then 1 else 0 end ) as Ename_Val,

    sum ( case when Ename is null then 1 else 0 end ) as Ename_Null,

    sum ( case when Salary is not null then 1 else 0 end ) as Salary_Val,

    sum ( case when Salary is null then 1 else 0 end ) as Salary_Null,

    sum ( case when Phone is not null then 1 else 0 end ) as Phone_Val,

    sum ( case when Phone is null then 1 else 0 end ) as Phone_Null,

    sum ( case when Fax is not null then 1 else 0 end ) as Fax_Val,

    sum ( case when Fax is null then 1 else 0 end ) as Fax_Null,

    sum ( case when EMail is not null then 1 else 0 end ) as Email_Val,

    sum ( case when EMail is null then 1 else 0 end ) as Email_Null

    from @company

    ) a

    cross join

    ( select 1 as n, 'Eno' as columnName union all

    select 2, 'Ename' union all

    select 3, 'Salary' union all

    select 4, 'Phone' union all

    select 5, 'Fax' union all

    select 6, 'Email'

    ) b

    /* -------- Sample Output: --------

    columnName rowValue_count null_count

    ---------- -------------- -----------

    Eno 5 0

    Ename 5 0

    Salary 5 0

    Phone 1 4

    Fax 0 5

    Email 3 2

    */

  • COUNT(column_name) can do this also, no need for all of the CASE functions. Just subtract it from COUNT(*) for the count of NULLs in the column.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung,

    Thanks for your reply.

    But some table has more than 150 columns. So do i have to hard code it manually ? Is there any way to include the column name automatically ?

    karthik

  • Going further,

    I have morethan 120 tables in a DB. I have to pass each table one by one as a parameter and i need the mentioned output.

    As each and evry table vary its column name we can't hard code it in our code. Our query has to read the column name dynamically.

    As I have to send this statistics report ( for all the tables) to my client,Please help me.

    karthik

  • Karthik

    I would advise you to write a script that uses INFORMATION_SCHEMA.COLUMNS to generate the query that will count the NULLS in each column in each database. It won't be pretty, but I think it's the best way and it has the further advantage of being reusable.

    John

  • Check out this article...

    http://qa.sqlservercentral.com/articles/Advanced+Querying/exoticsqlnullcolumnanalysis/1442/

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Ryan,

    Thanks for giving that URL.

    I have tried out the below code:

    ......................................................................................................................................

    --drop table #ColumnNames,#FinalResult

    Select ID = IDENTITY(1,1,int),name

    into #ColumnNames

    From syscolumns where object_name(id) = 'Stock'

    Declare @Sql varchar(2000),@Sql1 varchar(2000)

    Declare @min-2 int,@Max int

    Declare @curColName varchar(50)

    Declare @Rowcount int,@NULLCount int

    select @Rowcount = 0,@NULLCount = 0

    select @max-2 = max(ID) from #ColumnNames

    select @min-2 = 1

    Create table #FinalResult

    (

    TableName varchar(50),

    ColumnName varchar(50),

    Row_Value_Count int,

    Row_NULL_Count int

    )

    While @min-2 <= @max-2

    Begin

    Select @curColName = name

    From #ColumnNames

    Where ID = @min-2

    Select @Sql = 'select count('+@curColName+') from Stock where '+@curColName +' is not null'

    print @Sql

    exec (@Sql)

    Select @Sql1 = 'select count(*) from Stock where isnull('+@curColName+",'') ='' "

    print @Sql1

    exec (@Sql1)

    insert into #FinalResult

    Select 'Stock',@curColName,@Rowcount,@NULLCount

    select @min-2 = @min-2 +1

    End

    Select * from #FinalResult

    .....................................................................................................................

    But the problem is I am unable to insert @Sql values into @RowCount and @Sql1 values into @NULLCount.

    Anybody know how to insert it ? Please help me.

    karthik

  • And the error message you get is...?

  • No i am not getting any error message, But the results are not getting insert into that particluar table.

    karthik

  • Actually, It is working fine. But i want to capture the exec (@sql) & exec(@sql1) results in a seperate table.

    karthik

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply