Calculate physical space based upon total number of rows of data

  • With our current infrastructure we're experiencing a lot of ASYNC_NETWORK_IO waits...

    I know that the physical space a table occupies varies with the number of columns and the data types of those columns, but surely there has to be a way of estimating how much space a table needs based on the number of records you want to put into it, or, when querying data between servers, just how much those 1.5 million records amount to in size across the network.

    For instance, if I know I've copied 575,000,000 rows of data across the network in a given month, is there any way to calculate how much physical space that converts to? I.e., did we transfer over 690GB of data across our internal network? Is it less? More?
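
    (Presumably it boils down to something like rows × average bytes per row; a minimal sketch of that arithmetic, using a purely hypothetical 100-bytes-per-row figure:)

    -- 100 bytes/row is an assumed figure for illustration only
    SELECT 575000000 * 100.0 / 1024 / 1024 / 1024 AS approx_gb;   -- ~53.5 GB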

    Is there some sort of algorithm out there that can provide a decent estimate, or is this something that's just too "iffy" to tell without taking many other things into account?

    I kinda assume the answer is no, but if anyone can shed some light on it, it'd be greatly appreciated!

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

  • Microsoft documents how much storage each data type uses. http://msdn.microsoft.com/en-us/library/ms187745.aspx, for instance, covers int, bigint, smallint, and tinyint, and the documentation explains how to work out most of the other data types as well. For instance, for a varchar:

    varchar [ ( n | max ) ]

    Variable-length, non-Unicode character data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size is the actual length of data entered + 2 bytes. The data entered can be 0 characters in length. The ISO synonyms for varchar are char varying or character varying.
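
    (A quick way to see the actual data length, as opposed to the declared maximum, is DATALENGTH; a minimal sketch:)

    -- DATALENGTH returns the bytes of data actually stored; per the quote
    -- above, on-page storage adds 2 bytes of length overhead on top of that.
    DECLARE @v varchar(8000) = 'hello';
    SELECT DATALENGTH(@v) AS data_bytes;   -- 5 (stored as 5 + 2 = 7 bytes)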

    If all you use is fixed-length data types then you could estimate the size, but as soon as you start using variable-length data types it's just too "iffy". Varchar(max)/varbinary(max) can range from 2 bytes to 2^31-1 bytes.
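
    (A minimal sketch of that fixed-length estimate, for a hypothetical table of int + datetime + char(10) columns; note it ignores per-row overhead such as the row header and null bitmap:)

    -- Fixed-length bytes per row: int (4) + datetime (8) + char(10) (10) = 22
    DECLARE @rows bigint = 575000000;
    DECLARE @bytes_per_row int = 4 + 8 + 10;
    SELECT @rows * @bytes_per_row / 1024.0 / 1024 / 1024 AS estimated_gb;   -- ~11.8 GB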

    Instead of trying to calculate the amount of data, why not just monitor the network so you can see what the traffic is like and when/if you are having problems?

  • In my experience (which is a LOT when it comes to SQL Server perf analysis and tuning), the vast majority of ASYNC_NETWORK_IO waits are caused by slow client machines/apps not processing rows as quickly as SQL Server is sending them.
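
    (To see how much of that wait your instance has accumulated, sys.dm_os_wait_stats tracks it; a minimal sketch:)

    -- Cumulative time spent waiting for clients to consume result sets,
    -- since the last instance restart (or wait-stats clear).
    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type = 'ASYNC_NETWORK_IO';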

    If you want to know how much data was sent, you can set up a packet sniffer to capture the TDS packets sent by SQL Server.
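
    (Short of a full packet capture, sys.dm_exec_connections gives a rough per-connection figure from packet counts; a sketch, assuming most packets run close to the negotiated size, so treat it as an upper bound:)

    -- num_writes counts TDS packets sent to the client; many packets will be
    -- smaller than net_packet_size, so this is a rough upper-bound estimate.
    SELECT session_id, num_reads, num_writes, net_packet_size,
           num_writes * CAST(net_packet_size AS bigint) / 1048576.0 AS approx_mb_sent
    FROM sys.dm_exec_connections;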

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on Google's mail service

  • You can check the DMV sys.dm_db_index_physical_stats to find out the average record size:

    SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('MyHugeTable'), NULL, NULL, 'SAMPLED');

    This should give you a pretty good idea of the size of each row of the table, and then you can compute roughly how many bytes you'll send.
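
    (Building on the query above, a minimal sketch that multiplies the sampled average record size by the row count; 'MyHugeTable' is the placeholder name from the query, and SAMPLED-mode figures are themselves estimates:)

    -- index_id 0 = heap, 1 = clustered index, i.e. the table data itself
    SELECT avg_record_size_in_bytes, record_count,
           avg_record_size_in_bytes * record_count / 1048576.0 AS approx_mb
    FROM sys.dm_db_index_physical_stats
             (DB_ID(), OBJECT_ID('MyHugeTable'), NULL, NULL, 'SAMPLED')
    WHERE index_id IN (0, 1);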

  • TheSQLGuru (1/4/2011)

    In my experience (which is a LOT when it comes to SQL Server perf analysis and tuning), the vast majority of ASYNC_NETWORK_IO waits are caused by slow client machines/apps not processing rows as quickly as SQL Server is sending them.

    Absolutely agreed. I've seen one case where the waits were caused by misconfigured network hardware, but that's not common.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
