Table design

  • I have the following table in my DB, to which I want to add another 10 columns; all the new columns are the same data type: [char](5). I am just concerned whether this will affect performance, as the table already has 23 columns in it. The current DB size is around 600 MB, and this table grows by about 15,000 records each year.

    Is it better to create another table, or can I simply add the new columns to the existing table? How big can a table be? I know there is a row-size restriction in SQL 2000 (8000 bytes), but that is not the case in SQL 2005.

    Furthermore, please advise me on the timestamp column in the table. The table was designed by my predecessor, and I am wondering whether that column is really needed, as we also have a unique ID in this table.

    This DB was migrated from SQL 2000 to SQL 2005, which is why col 8 and col 9 are of the ntext data type. In SQL 2005 ntext is replaced by the varchar(max) data type, and I have heard this is much better where performance is concerned. Please advise me on the pros and cons of converting ntext to varchar(max), and how I can safely do the conversion.

    Any advice will be greatly appreciated.

    CREATE TABLE [dbo].[tbl_report1] (
    [ID] [int] IDENTITY (1, 1) NOT NULL ,
    [dat_repDate] [datetime] NULL ,
    [personID] [nvarchar] (13) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_1] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_2] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_3] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_4] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_5] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_6] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_7] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [txt_col_8] [ntext] COLLATE Latin1_General_CI_AS NULL ,
    [str_col_9] [ntext] COLLATE Latin1_General_CI_AS NULL ,
    [str_col_10] [nvarchar] (255) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_12] [nvarchar] (255) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_13] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_14] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [int_col_15] [nvarchar] (2) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_16] [nvarchar] (3) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_17] [nvarchar] (5) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_18] [nvarchar] (10) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_19] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_20] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_21] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_22] [nvarchar] (50) COLLATE Latin1_General_CI_AS NULL ,
    [str_col_23] [char] (3) COLLATE Latin1_General_CI_AS NULL ,
    [upsize_ts] [timestamp] NULL  -- <----------- the timestamp column in question
    )

  • Did you change the names to obfuscate things, or are the columns really called that?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The timestamp column isn't there to be a unique identifier. It's a versioning stamp in the database that some applications use to determine if something has changed in the row.
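
    For illustration only (an editorial sketch, not something posted in the thread): the kind of optimistic-concurrency check an application might perform with that timestamp column. The table and column names come from the DDL above; the ID value and the new string value are made up.

    -- Read a row and remember its version stamp (timestamp/rowversion is binary(8)).
    DECLARE @original_ts BINARY(8);

    SELECT @original_ts = upsize_ts
    FROM dbo.tbl_report1
    WHERE ID = 42;                        -- hypothetical row

    -- ... the user edits the data in the front end ...

    -- The update only succeeds if nobody else changed the row in the meantime.
    UPDATE dbo.tbl_report1
    SET str_col_1 = N'new value'
    WHERE ID = 42
      AND upsize_ts = @original_ts;

    IF @@ROWCOUNT = 0
        RAISERROR('Row was modified by another user.', 16, 1);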

    As far as ntext vs. nvarchar(max) goes, they store about the same thing, but you can do more with the varchar type. More T-SQL commands work with it, whereas text requires pointer commands in places. The text types have been deprecated, so if you can convert them to varchar(max), do that. It's simple to do in SSMS, though the table will be rebuilt and that will take time and resources.

    Without more information about what the data is and why you need to add these columns, I can't recommend whether they should be added here or in another table.

  • First of all forgive me for cross posting.

    Sorry, I didn't mean to confuse things by changing the names; I thought the names were irrelevant in this context as long as the data types were correct. Other than PersonID and str_col_17, which are foreign keys, all the other columns are part of a report that we create for each person. PersonID is the unique ID assigned to a person, and str_col_17 is the periodID (it comes from a table where periods are defined), so that different reports can be created for different periods of the year.

    Actually I need to add 10 more fields to the reports; they are all 3 characters long.

    As far as I can see, no application uses this timestamp column, and I am wondering whether it is taking up space unnecessarily. The DB is used by a web application front end to create reports and by the Access report builder for generating them. Can I just remove the field without any implications? Is there any way to check whether these columns are actually used?

    Can I just convert the ntext columns to varchar(max)? If so, how long would it take to convert a table with around 50,000 records?

    Thanks Steve for your advice!!!

  • The conversion time is hard to guess. It's a "build a new table, copy all data from the old one". You can test that with a fake table.

    As far as the design goes, I hesitate to create columns just for reporting, since it will eventually cause issues as the data gets munged up. The same kinds of values are spread across columns when a PersonID/Period/Value design would work (a sketch of that idea follows at the end of this post). That being said, if this isn't changed often and you get good performance, it might be the design that works for you. The issue is learning the pros and cons, thinking ahead about possible other reports, and then making an intelligent decision for your environment.

    If you'd like to post more, or maybe even write some more detailed specs, we could debate it here, or make it an article and solicit design ideas.

    On the timestamp, hard to tell. Access might use it to determine if a row has changed.

    No worries on the cross post.
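
    As a rough, editorial illustration of the PersonID/Period/Value idea mentioned above (not something posted in the thread; the table and column names below are invented):

    -- Hypothetical normalized layout: one row per person, period and report field,
    -- instead of one column per field in tbl_report1.
    CREATE TABLE dbo.tbl_report_value (
        PersonID   NVARCHAR(13)  NOT NULL,
        PeriodID   NVARCHAR(5)   NOT NULL,  -- would correspond to str_col_17
        FieldName  NVARCHAR(50)  NOT NULL,  -- e.g. 'Grade', 'TaskAssessment'
        FieldValue NVARCHAR(255) NULL,
        CONSTRAINT PK_tbl_report_value PRIMARY KEY (PersonID, PeriodID, FieldName)
    );

    -- Adding a new report field then becomes an INSERT, not an ALTER TABLE.
    INSERT INTO dbo.tbl_report_value (PersonID, PeriodID, FieldName, FieldValue)
    VALUES (N'A123456789012', N'T1', N'NewField1', N'ABC');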

  • Gosh... even though this is just a reporting table, I still wouldn't add the 10 columns to it unless it were truly normalized data. What is in the 10 columns you want to add?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • awp (7/18/2009)


    Sorry, I didn't mean to confuse things by changing the names; I thought the names were irrelevant in this context as long as the data types were correct.

    If those had been the actual column names, my advice would have been to scrap that design and produce a normalised design with specific columns.

    As it is, there is no way for us to say whether those columns should be in the same table, in two tables or in 10. Without knowing what's stored, any one of those could be right.

    Can I just remove the field without any implications?

    If nothing's using it, yes. If something is, that app will break. You'll have to do some impact analysis: examine in detail all the apps that use this table and ensure that none of them is using the column.
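
    One editorial sketch (not Gail's code) of a starting point for that impact analysis: searching the definitions of procedures, views, functions and triggers in the database for the column name. This only covers code stored inside the database; the web front end and the Access reports still have to be checked separately.

    -- List every module in the current database whose definition mentions the column.
    SELECT OBJECT_NAME(m.object_id) AS object_name
    FROM sys.sql_modules AS m
    WHERE m.definition LIKE N'%upsize_ts%';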

    Can I just convert the ntext to varchar(max), in that case, how long will it take to convert a table with around 50000 records.

    You can, though ntext should go to nvarchar(max) unless you're sure there's no unicode data in any of the columns. If there is, a conversion to varchar will fail or lose data.

    ALTER TABLE < Table name > ALTER COLUMN < Column name > NVARCHAR(max)

    No way to say how long it'll take. Depends on hardware, data sizes, server load and other things. Test it out on a dev server first.
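
    An editorial sketch (not from the thread) of the kind of dev-server test described above: copy the real table, then time the column conversion. The table and column names come from the DDL posted earlier.

    -- Throwaway copy, used only to measure the conversion time.
    SELECT *
    INTO dbo.tbl_report1_copy
    FROM dbo.tbl_report1;

    DECLARE @start DATETIME;
    SET @start = GETDATE();

    ALTER TABLE dbo.tbl_report1_copy ALTER COLUMN txt_col_8 NVARCHAR(MAX) NULL;
    ALTER TABLE dbo.tbl_report1_copy ALTER COLUMN str_col_9 NVARCHAR(MAX) NULL;

    -- Optional: rewriting the values stores them in the new (max) format;
    -- after an in-place ALTER the existing data can otherwise remain in the old LOB format.
    UPDATE dbo.tbl_report1_copy SET txt_col_8 = txt_col_8, str_col_9 = str_col_9;

    SELECT DATEDIFF(second, @start, GETDATE()) AS seconds_taken;

    DROP TABLE dbo.tbl_report1_copy;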

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Steve, you may be right about Access using the timestamp column; unless I find another way to create the reports, I will have to leave that column alone.

    I can't understand why you guys think this will lead to data corruption. Currently I am saving around 20 columns of data for that report, and now the users want to save more unique fields to it, so how else can I do that? Otherwise I would have to create another table with the PersonID/Period/Value design and inner join the two tables to create the whole report.

    Thanks again, guys, for the advice on this issue.

    Jeff, they are all 2-3 character strings.

  • No one said anything about data corruption.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks, Gail, for your input on this topic; it was really clear and on target. I thought Steve said the data would get mangled up... I may have misunderstood.

    What do you think I should do? I need to add around 10 more unique fields to the report.

  • Mangled up = bad data, misleading data, loss of integrity, etc.

    Can you maybe explain a bit more what this table is used for and how it's used? It's almost sounding like it's populated from elsewhere solely for a report to run off. Is that the case?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • awp (7/18/2009)


    Steve, you may be right about Access using the timestamp column; unless I find another way to create the reports, I will have to leave that column alone.

    I can't understand why you guys think this will lead to data corruption. Currently I am saving around 20 columns of data for that report, and now the users want to save more unique fields to it, so how else can I do that? Otherwise I would have to create another table with the PersonID/Period/Value design and inner join the two tables to create the whole report.

    Thanks again, guys, for the advice on this issue.

    Jeff, they are all 2-3 character strings.

    Yep... I know they're all 2-3 character strings from what you said before. What are they, though? What I'm trying to figure out is a way to gen this report "auto-magically" so you never have to add new columns. If they're based on dates, periods, or some such, we can do that quite easily. Even if they're not based on dates, we might be able to make this report more dynamic. We just need to know more about what the report looks like and what the source(s) of data are so that we can help better.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • In fact, take a look at the following article so you can see more of what I mean... these types of reports can be made to run very fast and never need a modification to a table...

    http://qa.sqlservercentral.com/articles/cross+tab/65048/
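
    A small editorial sketch of the cross-tab idea the article describes (not Jeff's code), assuming a normalized table like the dbo.tbl_report_value sketch earlier in the thread; the field names are invented. Each CASE expression turns one stored field name back into a report column, so new fields never require altering the table.

    SELECT  v.PersonID,
            v.PeriodID,
            MAX(CASE WHEN v.FieldName = N'Grade'          THEN v.FieldValue END) AS Grade,
            MAX(CASE WHEN v.FieldName = N'TaskAssessment' THEN v.FieldValue END) AS TaskAssessment,
            MAX(CASE WHEN v.FieldName = N'Comments'       THEN v.FieldValue END) AS Comments
    FROM dbo.tbl_report_value AS v
    GROUP BY v.PersonID, v.PeriodID;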

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Actually, they are assessment data for students, such as grades, assessments of a task, comments, etc. The data is quite varied and cannot be generated automatically; it is recorded by teachers during the term and constantly updated. Each term a new report is created.

  • awp (7/18/2009)


    Actually, they are assessment data for students, such as grades, assessments of a task, comments, etc. The data is quite varied and cannot be generated automatically; it is recorded by teachers during the term and constantly updated. Each term a new report is created.

    That's what I'm talking about... you shouldn't have to create a new report every term.

    You still haven't told me what the columns you want to add are for.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions
