Data types / lengths and primary key efficiency

  • Okay this is pretty straight forward but basically I want to know what the performance difference would be between using the following for a primary key :

    int

    varchar(10)

    char(10)

    And what would be some performance enhancing techniques for the better of the two characters. I was thinking of padding my strings to the full length.

    Basically I have 3 data sources,  2 use numeric PK's  in the main and another with strings (although most are numeric strings). Until now these have been maintained separately but to streamline web operations I want to merge the DTS and come up with a one for all database.

    To avoid conflicts I intend to prefix each PK with a letter which will mean that I will also have to adjust any FK's as well. Not a problem but would this have any impact on performance apart from the obvious fact that the field itself is larger so more data goes over the wire?

  • if the fields are indexed, then technically the CHAR fields would take up more space, which would mean less data per page of memory; more data per page is better, AFAIK.

    so in theory, an index on a varchar would pack a bit more data per page, unless all the actual values are ten digits in length.

    I don't know that the difference would be much as far as performance.

    I also seem to remember that because sorting chars/varchars as text in an index can cause problems when they vary in length, because items get sorted like this:

    10

    1001

    111

    20

    2002

    21

     

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • The fields will certainly be indexed as these are the PKs used for a number of lookups and joins.

    I read that fixed length would be better for sorts but does the same apply to joins and indexes? I can't see any scenario where I would sort on the PK.

    The application already has a wide range of sorts none of which involve the PK.

    But I would have thought joining on a varchar(10) had more overhead than an int.

  • it's the index seek you would worry about when it comes to sorting.

    If a SQl statement says WHERE PK='2002', the sorting is used in the PK's index to quickly lookup the value; no problem for one value, but I'm thinking that any time more than one record is selected, for example IN( '2002','101','1002')

    that because it is a varchar, then index seek would be replaced with a table scan. someone else, please chime in as to whether I'm wrong, because I'm no expert.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • That would be bad for this app as a page of records is obtained using an in list

    it is cached but there are 1000's of these types of pages and the cache would all expire at once so for a period during the cache rebuild lots of in list queries would be going on.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply