Table Design, and Primary Key vs Clustered Index

  • I'm developing a Main Menu in Access, with BE on SQL Server. I think this situation is a good one for understanding table design and indexes. One aspect that confuses me is that (if I read this correctly) a Primary Key (which must be unique) is automatically the Clustered Index. [More on this at the end]

    TABLES:

    Users:

    ---NetworkID - User's network login ID (can easily retrieve this in Access, will use this to determine which reports are availbable to the user via table UserMenus.

    ---ERPID - User's ERP Login ID. There is a table in the ERP db Users that I will access to retrieve user information.

    ? Would it be prudent to include a Identity (AutoNumber) field, i.e. RecID, to have a non-data related id field?

    MenuItems: - table of all the reports, one for each report in the system. [Each user will have access to one or more MenuItems].

    MenuItemID - Identity

    Descr - Description that displays on the user's menu

    FormName - Access form to open when user selects this menu item.

    UserMenus: - table of the reports for each user

    NetworkID - User's Network login ID

    MenuItemID - MenuItems ID

    RecID - Unique record identifier (Is this necessary?)

    Frequent Queries (there's really only this one - to select the UserMenu records for the user in order to display the reports he can run):

    SELECT Descr, FormName FROM UserMenus WHERE NetworkID = {variable: user's network Login ID}

    - this query creates the value list for the Listbox which displays the user's menu choices.

    What Indexes Are Best?

    First - Is it good practice/prudent/highly recommended to always have a Clustered index?

    Here is my guess:

    Users: NetworkLoginID as a Clustered index because most frequent select WHERE criteria.

    Here's where I have a concern: It is possible to change a person's NetworkLoginID. If I understand how things work, the clustered index is stored physically in the order of the index key. If a person's ID is ABC and then changes to MNO, there would have to be an overhaul of the storage? (Although this is very infrequent in this case, it illustrates what can happen).

    MenuItems: MenuItemID as a Clustered index - it is unique, will not change (except for deletions), and will be used to join UserMenus with MenuItems.

    UserMenus: I'm really not sure. The statement I had at the beginning regarding Primary Key and Clustered Indexes causes me to not understand what is best?

    I need your help, oh wise ones.

    [font="Verdana"]"The Road To Success Is Always Under Construction."[/font]

  • >>a Primary Key (which must be unique) is automatically the Clustered Index

    By default, if you create a primary key in SQL Server it will be clustered. The primary key does not have to be clustered though.

    >>(Users Table) Would it be prudent to include a Identity (AutoNumber) field, i.e. RecID, to have a non-data related id field?

    I like the non-data related IDs (surrogate keys), but I don't believe there is necessarily a hard and fast rule on whether they should be used or not. Data related IDs (Natural Keys) are frequently larger and may in some cases involve multiple columns, making foreign keys larger and requiring more resources for joins. But adding another field to a table that has no "meaning" or purpose (the surrogate key) also adds to the table storage space. Since the NetworkID can change, it's not a good choice for the primary key. Using a surrogate key would allow you to change the network ID and not update the UserMenu table, so I think that is a good choice.

    >>(UserMenus table) RecID - Unique record identifier (Is this necessary?)

    No, the only purpose for this table is to connect the Users table and the MenuItems table, so all you need is the primary key from each table.

    >>SELECT Descr, FormName FROM UserMenus WHERE NetworkID = {variable: user's network Login ID}

    I think you left out the join to MenuItems - seems like just an omission though, I'm sure you have it in your live query since you're selecting data that is in that table

    >>What Indexes Are Best?

    As a general simplified rule, you want to index the columns you will be filtering by. In this case, you are filtering the Users table by NetworkID, the UserMenus table by either NetworkID or RecID and the MenuItems table by MenuItemsID. The join to MenuItems will be handled by the primary key on MenuItemID there. I'm going to guess that the tables are small enough (a few hundred rows?) that you won't see much performance difference.

    >>First - Is it good practice/prudent/highly recommended to always have a Clustered index?

    Yes. As a general rule it is good to have a clustered index. The primary key is not always the best choice for the clustered index, but in my opinion it is better than nothing. When trying to decide between a clustered and non-clustered index, think of it this way: both are good for looking up a single value (for example: ID 13), clustered indexes are good for picking up a range of values (for example: Dates 2/20/2009 through 10/1/2009). Clustered indexes also have the benefit of having all the data at the leaf level, so once you are there you don't have to go anywhere to get the rest of the table data.

    >>If I understand how things work, the clustered index is stored physically in the order of the index key. If a person's ID is ABC and then changes to MNO, there would have to be an overhaul of the storage? (Although this is very infrequent in this case, it illustrates what can happen).

    Technically it's stored logically in the order of the index key, not physically, but it usually works out to about the same thing. Yes, if you change a clustered key value, the data will likely have to physically move too, so try to avoid creating a clustered index on values that update frequently (frequently is vague on purpose).

    >> Users: NetworkLoginID as a Clustered index because most frequent select WHERE criteria.

    You will want to index NetworkLoginID and since there isn't another column that you will search by, you might as well make that the clustered index. You already know that if the NetworkLoginID changes the data will likely move. As a general rule, I would avoid this, but since it is infrequent and you probably don't have a lot of data in the table, I'm guessing the actual impact of moving a row every once in awhile will not be that great. Just remember that for your next project with 1,000 rows changing every minute, you probably DO NOT want to do this.

    >>MenuItems: MenuItemID as a Clustered index - it is unique, will not change (except for deletions), and will be used to join UserMenus with MenuItems.

    Sounds good. Since this will be your only index, it ought to be clustered.

    >>UserMenus: I'm really not sure. The statement I had at the beginning regarding Primary Key and Clustered Indexes causes me to not understand what is best?

    Your primary key for this table should be both columns - (NetworkID or RecID) and MenuItemID. You'll only have each possible combination in the table once, so it will be unique. Creating the primary key will also create the index you need there. I suggest putting the Users table key (NetworkID or RecID) first in the primary key/index since it will likely be more selective than MenuItemID and will match your where clause.

    Good luck!

    Chad

  • I would use UserID as the Primary Key. It will be unique, natural and by using that it reduces the probability of error.

    Indexes are usually created on columns that are defined as Foreign Key Constraints.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Chad - I really appreciate your taking the time and effort to give me a lesson in indexing! Wow. That was a comprehensive reply. I reread everything a few times to get it to sink in. (Almost got a good mental picture). I guess this was a good example for a lesson. You mentioned negligible performance difference because the files would be so small. True, but the principles are the same, and that's what is important.

    Welsh - Your reply sounds like a lead in for all this... Part II: Taking the understanding to the next level. I am rereading about Foreign Keys (I've only skimmed over information about them so far). Looks like the are too important to put on a back burner. Thanks!

    [font="Verdana"]"The Road To Success Is Always Under Construction."[/font]

  • http://qa.sqlservercentral.com/blogs/stratesql/archive/2010/10/13/return-of-index-analysis-part-1.aspx

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply