UniqueIdentifier as a Primary Key

  • alanspeckman (9/12/2012)


    Is it ever considered a best practice to use a UniqueIdentifier(GUID) as a primary key?

    Is there ever a case where you would want to use a GUID as a PK?

    There are use cases for everything - even GUIDs as a clustered primary key. The important point is to be aware of all the implications before making a choice. In most cases, only the very expert would choose a GUID clustering key, and even then there would have to be special reasons for doing so.

  • IgorMi (9/13/2012)


    The above from BOL has proved true in a test i did some weeks ago. I wanted to see the fragmentation, page counts, index sizes for int, bigint, newid() and newsequentialid() keys.

    I find your results too hard to read. You might like to compare them with those shown at http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/guid-vs-int-debate.aspx

  • Jeff Moden (9/12/2012)


    SQL Server now uses Type 4 GUIDs which are nothing more than pseudo-random numbers and there's no guarantee that two machines can't generate the same number. The old Type 1 GUIDs that SQL Server used to generate were guaranteed to be unique even between machines provided that things like the MAC address (IIRC) for the machine was different.

    NEWID() calls CoCreateGuid which calls UuidCreate. It issues type IV guids. The documentation for CoCreateGuid says:

    The CoCreateGuid function calls the RPC function UuidCreate, which creates a GUID, a globally unique 128-bit integer. Use CoCreateGuid when you need an absolutely unique number that you will use as a persistent identifier in a distributed environment. To a very high degree of certainty, this function returns a unique value – no other invocation, on the same or any other system (networked or not), should return the same value.

    NEWSEQUENTIALID() calls UuidCreateSequential and issues type I GUIDs. The documentation for UuidCreateSequential says:

    The UuidCreateSequential function returns RPC_S_UUID_LOCAL_ONLY when the originating computer does not have an ethernet/token ring (IEEE 802.x) address. In this case, the generated UUID is a valid identifier, and is guaranteed to be unique among all UUIDs generated on the computer. However, the possibility exists that another computer without an ethernet/token ring address generated the identical UUID. Therefore you should never use this UUID to identify an object that is not strictly local to your computer. Computers with ethernet/token ring addresses generate UUIDs that are guaranteed to be globally unique.

    All values which should be unique in a database must be enforced by a unique constraint or index, of course.

  • SQL Kiwi (9/14/2012)


    IgorMi (9/13/2012)


    The above from BOL has proved true in a test i did some weeks ago. I wanted to see the fragmentation, page counts, index sizes for int, bigint, newid() and newsequentialid() keys.

    I find your results too hard to read. You might like to compare them with those shown at http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/guid-vs-int-debate.aspx

    [font="Courier New"]

    Table Ind_level External Fragm(%) Avg Frag Size Page_count Internal Fragm(%) Ind_size (KB)

    identity 0 0.369160869242.8620697043 99.96470225 56344

    identity_bigint 0 0.475184794199.36842117576 99.45420064 60608

    newsequentialid 0 0.678521292140.13114758548 99.70427477 68384

    newid 0 99.27137306 1 12352 68.99110452 98816

    [/font]

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (9/14/2012)


    [font="Courier New"]

    Table Ind_level External Fragm(%) Avg Frag Size Page_count Internal Fragm(%) Ind_size (KB)

    identity 0 0.369160869242.8620697043 99.96470225 56344

    identity_bigint 0 0.475184794199.36842117576 99.45420064 60608

    newsequentialid 0 0.678521292140.13114758548 99.70427477 68384

    newid 0 99.27137306 1 12352 68.99110452 98816

    [/font]

    Thanks Chris, though I did manage to read them eventually. The 'internal fragmentation' results seem peculiar to me, given the SQL Escalation Engineer's results:

  • "Internal fragmentation" = "average page density", although the latter seems more sensible to me.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • SQL Kiwi (9/14/2012)


    ChrisM@Work (9/14/2012)


    [font="Courier New"]

    Table Ind_level External Fragm(%) Avg Frag Size Page_count Internal Fragm(%) Ind_size (KB)

    identity 0 0.369160869242.8620697043 99.96470225 56344

    identity_bigint 0 0.475184794199.36842117576 99.45420064 60608

    newsequentialid 0 0.678521292140.13114758548 99.70427477 68384

    newid 0 99.27137306 1 12352 68.99110452 98816

    [/font]

    Thanks Chris, though I did manage to read them eventually. The 'internal fragmentation' results seem peculiar to me, given the SQL Escalation Engineer's results:

    Thank you Chris and Paul.

    I really didn't know how to make it looks more presentable. Next time I'll know.

    Regards

    IgorMi

    Igor Micev,
    My blog: www.igormicev.com

  • CELKO (9/14/2012)


    Is it ever considered a best practice to use a UniqueIdentifier(GUID) as a primary key?

    By definition it cannot be a key; it is not an attribute of any entity in the logical data model. A well-designed key should have validation rules (a regular expression, check digit, etc) and verification (a trusted source, physical test, etc).

    I think it is funny that noobs will use a GUID when they have an industry standard identifier. And it is sooo much longer than the indusry standard, which you have to have anyway to make a valid model.

    Look at how people use them; they are mock pointers, used to mimic a 1970's network DB in SQL.

    The purpose of UUID and GUID was to provide a physical locator at the internet level.

    Please check your record, I think it is stuck.

    Perhaps, in a logical model it can't be a key. Unfortunately, there are times when you move from the logical model to the physical model and you need something different as a key, something not seen by the application or the user but is used within the database for whatever reason determined by the "moron" that architects the database.

    Hey, just means more opportunity for you clean up someone elses mess, why are you complaining?

  • IgorMi (9/14/2012)


    SQL Kiwi (9/14/2012)


    ChrisM@Work (9/14/2012)


    [font="Courier New"]

    Table Ind_level External Fragm(%) Avg Frag Size Page_count Internal Fragm(%) Ind_size (KB)

    identity 0 0.369160869242.8620697043 99.96470225 56344

    identity_bigint 0 0.475184794199.36842117576 99.45420064 60608

    newsequentialid 0 0.678521292140.13114758548 99.70427477 68384

    newid 0 99.27137306 1 12352 68.99110452 98816

    [/font]

    Thanks Chris, though I did manage to read them eventually. The 'internal fragmentation' results seem peculiar to me, given the SQL Escalation Engineer's results:

    Thank you Chris and Paul.

    I really didn't know how to make it looks more presentable. Next time I'll know.

    Regards

    IgorMi

    Use Paul's method, it's much nicer and probably quicker than dicking around with spaces. I'm not allowed to use it. Failed the flower-arranging course.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • CELKO (9/14/2012)


    By definition it cannot be a key; it is not an attribute of any entity in the logical data model. A well-designed key should have validation rules (a regular expression, check digit, etc) and verification (a trusted source, physical test, etc).

    My company makes and sells identical widgets. They ship with a GUID sticker so the owner can prove ownership. What is the key for my widgets table in my logical data model?

    I think it is funny that noobs will use a GUID when they have an industry standard identifier. And it is sooo much longer than the indusry standard, which you have to have anyway to make a valid model.

    That's a physical design consideration 😉 Besides, a GUID is only 16 bytes - same as an nchar(8) natural key.

  • CELKO (9/14/2012)


    Is it ever considered a best practice to use a UniqueIdentifier(GUID) as a primary key?

    By definition it cannot be a key; it is not an attribute of any entity in the logical data model.

    Heh... although my preference is to avoid GUIDs, they can, by definition, be used as a key. It can easily be reflected in a sticker or embossed in to the body of a product as a serial number. By definition, the serial number becomes an attribute of the product just like a VIN can be.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thank you everyone for contributing here.

    In Summary,

    1. In logical design, it cannot be viewed as a valid PK.

    2. In physical design, it is valid, but only in certain circumstances such as as using on a bar-code like sticker for widgets. Since the cluster index is included in all non-clustered indexes(something like that), performance really takes the biggest hit here.

    I appeciate the reference to this article

    http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/guid-vs-int-debate.aspx

    This part of the article, to me, hits home.

    The above considerations make the use of GUIDs unfavorable for a clustered index in environments which have large number of queries performing JOIN operations in OLTP and when referential integrity is enforced on the database among multiple tables. Throw non-clustered indexes that you created on the table as covering indexes for the frequently run queries against the database, you can have a significant performance bottleneck.

    Thanks again!

    Alan

  • CELKO (9/14/2012)


    By definition it cannot be a key; it is not an attribute of any entity in the logical data model. A well-designed key should have validation rules (a regular expression, check digit, etc) and verification (a trusted source, physical test, etc).

    Are you saying that every primary key must be based on a meaningful piece of data pertaining to the record? In essence, using an identity column is not proper?

  • jeffem (9/17/2012)


    CELKO (9/14/2012)


    By definition it cannot be a key; it is not an attribute of any entity in the logical data model. A well-designed key should have validation rules (a regular expression, check digit, etc) and verification (a trusted source, physical test, etc).

    Are you saying that every primary key must be based on a meaningful piece of data pertaining to the record? In essence, using an identity column is not proper?

    Huh, you didn't know? Here you are: http://qa.sqlservercentral.com/Forums/Topic1357081-392-1.aspx

    You can start reading J.C. stand on this subject from page 1, but you need to read to the end to find out what he really offers instead...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • ... {Edit}. Sorry... post removed. Responded to the wrong post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 16 through 30 (of 32 total)

You must be logged in to reply to this topic. Login to reply